RE: whether ASCII for Korean language or not !!!

From: Jungshik Shin (jshin@mailaps.org)
Date: Tue Aug 13 2002 - 14:08:44 EDT


On Tue, 13 Aug 2002, Marco Cimarosti wrote:

> Jungshik Shin wrote:
> > [...] Besides, South Korea and North Korea agreed to
> > devise a new ISO 2022 compliant single byte character set for Korean
> > script (Hangul/Chosun-gul/Jeong-eum). (I don't know what it's
> > for though since we now have Unicode.)
>
> BTW, I always wondered whether North Korea uses the same standard(s) as
> South Korea. In Unihan.txt I see two sets of Korean standards: "KS C ..."
> and "PKS C ...". Are these both used in both Koreas?

  'PKS C ...' is just a 'pseudo' character set of Hanja (Chinese
characters) assembled by the South Korean delegation to the IRG for
the sole purpose of submitting Hanja found in Korean literature
to the IRG. South Korea has not created a new coded character set merely
to get some characters encoded separately in Unicode/ISO 10646 for the
sake of round-trip compatibility with 'legacy' character sets. Someone
might think differently. All right, it did when 'JOHAB' was added as an
annex to KS C 5601-1992.

  As for your question, you seem to have overlooked that there are KP
sources in addition to K sources in the Unihan DB. K sources are South
Korean and KP sources are North Korean. KPS 9566-97 is an ISO 2022
compliant 94 x 94 coded character set (roughly equivalent to KS X
1001:1998, formerly KS C 5601, of South Korea) and I believe KPS
10721-2000 is a 94 x 94 coded character set that supplements KPS 9566-97
(just as KS X 1002, formerly KS C 5657, does for KS X 1001). KPS 9566-97
has the notorious 'six emphasized syllables' for two North Korean
dictators along with vulgar fractions with horizontal bars. A more
important difference between KS X 1001:1998 and KPS 9566-97 is
Hangul/Choseon-gul/Jeong-eum* syllable collation. North Korea asked
ISO/IEC JTC1/SC2/WG2 to rearrange the precomposed
Hangul/Choseon-gul/Jeong-eum syllables (from U+AC00) and the conjoining
Jamos (from U+1100) to match North Korean dictionary order. Of course,
the request was rejected outright. There's absolutely no need for it
because there's no language/script for which naive code-point-based
sorting works without tailoring.
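
  To illustrate the point about tailoring, here is a minimal Python
sketch. It uses the standard Unicode arithmetic for precomposed syllables
(S = 0xAC00 + (L * 21 + V) * 28 + T) to split each syllable into its
leading consonant, vowel and trailing consonant, and then sorts with a
tailored rank for the leading consonant. The tailored consonant order
below is only an assumption for illustration (doubled consonants after
the plain ones), not an authoritative statement of North Korean
dictionary order, and the vowels and trailing consonants are left in code
point order. The names decompose and tailored_key are hypothetical.

```python
# Unicode Hangul syllable arithmetic: S = 0xAC00 + (L * 21 + V) * 28 + T
S_BASE, V_COUNT, T_COUNT, S_COUNT = 0xAC00, 21, 28, 11172

# Leading consonants in the order Unicode encodes them (index L = 0..18).
LEADS_UNICODE = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"

# ASSUMPTION: an illustrative tailored order with the doubled consonants
# placed after the plain ones. It is not taken from any standard.
LEADS_TAILORED = "ㄱㄴㄷㄹㅁㅂㅅㅈㅊㅋㅌㅍㅎㄲㄸㅃㅆㅉㅇ"

# LEAD_RANK[L] is the sort rank of Unicode lead index L under the tailoring.
LEAD_RANK = [LEADS_TAILORED.index(c) for c in LEADS_UNICODE]


def decompose(syllable):
    """Return (lead, vowel, tail) indices of one precomposed syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable: %r" % syllable)
    lead, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, tail = divmod(rest, T_COUNT)
    return lead, vowel, tail


def tailored_key(word):
    """Sort key that re-ranks only the leading consonant of each syllable."""
    return [(LEAD_RANK[l], v, t) for l, v, t in map(decompose, word)]


words = ["까", "나", "가"]
by_code_point = sorted(words)                    # plain code point order
by_tailoring = sorted(words, key=tailored_key)   # tailored order
```

  The code points never move; only the sort key changes, which is why
reshuffling the encoded repertoire is unnecessary.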

  I'm not sure how widely KPS 9566-97 is used in North Korea. If it's
used, the usage would be similar to the way KS X 1001 is used in
EUC-KR (that is, KPS 9566-97 is designated as G1 and invoked on GR, with
US-ASCII designated as G0 and invoked on GL). However, a few North Korean
web sites on the Net reportedly use EUC-KR or its MS extension CP949.
IIRC, some North Korean office products (e.g. the word processor
Chang-deok) and other programs for MS Windows also use (support) EUC-KR
or CP949. I guess they have no choice but to use South Korean standard(s)
for the sake of interoperability.
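
  As a rough sketch of what "G1 invoked on GR" means in bytes, the Python
below turns a row/cell position in a 94 x 94 set into its EUC form by
adding 0xA0 to each number, and checks the result against Python's
built-in euc_kr codec for KS X 1001. The helper names euc_bytes, is_gl
and is_gr are hypothetical, and nothing here is specific to KPS 9566-97,
for which I am not assuming any codec exists.

```python
def euc_bytes(row, cell):
    """EUC encoding of position (row, cell) in a 94 x 94 set on GR."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must be in 1..94")
    return bytes([0xA0 + row, 0xA0 + cell])


def is_gl(byte):
    """True if the byte lies in GL (0x21..0x7E), where G0 is invoked."""
    return 0x21 <= byte <= 0x7E


def is_gr(byte):
    """True if the byte lies in GR (0xA1..0xFE), where G1 is invoked."""
    return 0xA1 <= byte <= 0xFE


# The first Hangul syllable of KS X 1001 sits at row 16, cell 1.
ga = euc_bytes(16, 1)

# One US-ASCII letter (GL) followed by one KS X 1001 character (GR, GR).
encoded = "A가".encode("euc_kr")
```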

   BTW, the ISO/IEC JTC1/SC2/WG2 web page has a few reports produced by
the Korean ad-hoc group, in which South Korea, North Korea and the PRC
took part.

  Another BTW, AFAIK, there's no mapping table between KPS 9566-97 and ISO
10646/Unicode. It's trivial to generate one for Choseon-gul and Hanja,
but it's a bit tedious to map symbol characters (about 1000) in KPS
9566-97 to Unicode/10646. Moreover, there are some characters not yet
encoded in Unicode/10646. Some of them have been rejected while others
are still in the pipeline.
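
  For what it's worth, the mechanical part of building such a table can
be sketched as follows. Since I am not assuming any KPS 9566-97 codec or
data file is available, the sketch uses KS X 1001 through Python's
built-in euc_kr codec as a stand-in: it walks every row/cell position of
the 94 x 94 grid and records whatever decodes. build_table is a
hypothetical name, and a real KPS 9566-97 table would need its own source
data, especially for the symbol rows.

```python
def build_table(codec="euc_kr"):
    """Map (row, cell) in a 94 x 94 set to the Unicode character it decodes to."""
    table = {}
    for row in range(1, 95):
        for cell in range(1, 95):
            raw = bytes([0xA0 + row, 0xA0 + cell])
            try:
                table[(row, cell)] = raw.decode(codec)
            except UnicodeDecodeError:
                # Unassigned position, or one the codec will not decode
                # as a standalone two-byte sequence.
                continue
    return table


table = build_table()

# Positions that map to precomposed syllables in U+AC00..U+D7A3.
hangul = [ch for ch in table.values() if "\uac00" <= ch <= "\ud7a3"]
```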

   Jungshik Shin

* Hangul (한글) is the name used in South Korea, while North Korea refers
to it as Choseon-gul (조선글). Recently, the Korean ad-hoc group agreed
to propose to WG2 that Jeong-eum (정음 : 正音) be used in place of
Hangul/Choseon-gul in ISO 10646. Personally, I think 'Korean script'
would be a better compromise.


