Re: General category of Hangul Conjoining Jamos (U+1100 block)

From: Jungshik Shin
Date: Sun May 05 2002 - 15:25:13 EDT

On Sun, 5 May 2002, Mark Davis wrote:

Thank you for explaining the historical background.

> In the early days of Unicode development, there were two models. As an

> A. Non-spacing mark model.
> With this method, there are base jamo and non-spacing jamo. Base jamo
> are all consonants, while trailing were consonants and vowels. So the
> structure of a Hangul syllable was B N*. The B values would have been
> the existing independent jamo, plus an additional set of N values. For

> B. Conjoining jamo.
> This is the current mechanism
> (
> avior). Syllables are L+ V+ T*. For example:


> There were long discussions on the best model to use, but we finally
> ended up with model B to accommodate the requests of the Korean
> national body in the merger between ISO 10646 and Unicode. They
> explicitly did not want the characters classified as 'combining' or
> 'non-spacing', so the new term 'conjoining' was coined instead.

  I can sort of see why the South Korean national body liked model B
better than model A. I would have chosen model B, too. However, I have
little idea what its reasoning was when it explicitly requested that
Hangul Jamos (medial vowels and trailing consonants) not be
non-spacing/combining. Did they want to set Hangul apart from South
and Southeast Asian scripts? If so, why? Out of nationalistic 'zeal',
under the misguided belief that Hangul is unique and has to be different
from other scripts in Unicode? I hope not, but then what else? Hmm, was
it just an (accidental) side effect of adopting model B instead of model
A? As I wrote in my previous message and will argue again below, I don't
think using model B necessarily means that medial vowels and trailing
consonants are not Mn. Some linguists have presented (circumstantial)
evidence that the inventors of Hangul were influenced by Indic scripts
as well as by the Mongolian Phagspa script. Given this, it seems only
fitting to me that Hangul be treated in a manner similar to those
scripts in the Unicode standard. Could you tell me what their rationale
was, if you remember?

  It seems to me that model A is very similar to the model used
for the Tibetan script, where each consonant is encoded twice, once as
a head consonant and once as a subjoined consonant. Model
B appears to be a hybrid of the model used for Devanagari (and
most other South and Southeast Asian scripts) and the model for
Tibetan. In model B, leading consonants have two spacing properties
(at the beginning of a syllable they're spacing; otherwise they're
non-spacing/combining). This is also the case for Devanagari consonants,
which are classified as Lo despite the fact that they can be combining.
However, model B also has separately encoded trailing consonants, which
can be regarded as homologous(1) to the subjoined consonants of Tibetan
(classified as Mn). Considering this similarity, I believe Hangul trailing
consonants have to be categorized as Mn as well. In both model A and B,
vowels are clearly non-spacing/combining.

    (1) Not exactly, in that Hangul trailing consonants come at the
    syllable coda position while Tibetan subjoined consonants form a
    consonant cluster at the syllable onset position.
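The comparison above can be checked against the Unicode Character Database. The following is only a rough sketch using Python's standard unicodedata module; the sample code points (U+1100, U+1161, U+11A8, U+0915, U+0F90) and the names SAMPLES and categories() are my own choices for illustration and do not come from the discussion itself.

```python
import unicodedata

# Illustrative sample characters (an assumption of this sketch, not taken
# from the original message): one per class of character compared above.
SAMPLES = {
    "Hangul leading consonant U+1100": "\u1100",
    "Hangul medial vowel U+1161": "\u1161",
    "Hangul trailing consonant U+11A8": "\u11a8",
    "Devanagari consonant KA U+0915": "\u0915",
    "Tibetan subjoined KA U+0F90": "\u0f90",
}


def categories(samples):
    """Map each label to the General_Category of its sample character."""
    return {label: unicodedata.category(ch) for label, ch in samples.items()}


if __name__ == "__main__":
    for label, cat in categories(SAMPLES).items():
        print(f"{label}: {cat}")
```

With the character database shipped in Python, the three Hangul Jamos and the Devanagari consonant report Lo, while the Tibetan subjoined consonant reports Mn, which is exactly the asymmetry questioned here.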

 Whether model A or B is used, I think my rationale for classifying
Hangul medial vowels and trailing consonants as combining and non-spacing
still stands. They have to be treated as combining/non-spacing, as a
couple of implementations (ICU and Markus Kuhn's wcwidth()) already
do, and I don't see much reason not to classify them as Mn. The current
classification has had only a negative influence on full-fledged support
of Korean Hangul, by 'hiding' from developers the fact that Hangul Jamos
have a lot in common with South and Southeast Asian scripts and require
the same kind of complex text processing.
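To make the "treated as non-spacing" point concrete, here is a minimal sketch of a width function in the spirit of a wcwidth()-style implementation. It is not the code of ICU or of Markus Kuhn's wcwidth(); the ranges (U+1100..U+115F counted as two cells, U+1160..U+11FF as zero) and the function names are assumptions made for illustration only.

```python
def jamo_cell_width(ch: str) -> int:
    """Terminal cell width of one conjoining jamo from the U+1100 block.

    Assumption of this sketch: leading consonants occupy two cells, and
    medial vowels and trailing consonants occupy none, i.e. they are
    handled like non-spacing marks.
    """
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F:   # leading consonants (L)
        return 2
    if 0x1160 <= cp <= 0x11FF:   # medial vowels (V) and trailing consonants (T)
        return 0
    raise ValueError(f"U+{cp:04X} is not in the Hangul Jamo block")


def jamo_sequence_width(text: str) -> int:
    """Sum the cell widths of a string made up only of conjoining jamos."""
    return sum(jamo_cell_width(ch) for ch in text)


if __name__ == "__main__":
    # L V T sequence: U+1100, U+1161, U+11A8 (one syllable block)
    print(jamo_sequence_width("\u1100\u1161\u11a8"))
```

Under these assumptions an L V T sequence takes two cells in total, the same as a single precomposed syllable would, which is only possible if V and T contribute no width of their own.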

   Thanks again for your explanation,

   Jungshik Shin
