From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Wed Jan 14 2004 - 09:09:02 EST
[trying to catch up on *some* of the e-mails here...]
François Yergeau wrote:
> This little-known fact (along with the better-known fact that not all
> non-zero-ccc-characters do take part in existing precomposed
> characters) has prompted the W3C's Character Model spec to define
> "composing characters", a concept somewhat distinct from Unicode's
> combining characters. Appendix C at
....
> contains the definition as well as a list of the characters with
> ccc=0 that do take part in existing compositions; U+102E is there, of
> course, as well as the above-mentioned Hangul plus some others.
Hmm, Hangul. Now, the composition rules for Hangul ARE special.
That's why it's not just the case that V and T Jamos are combining,
and all the rest of Hangul characters just regular non-combining.
ALL of the L, V, T, LV, and LVT Hangul characters are CONJOINING.
E.g. an L followed by an LVT is a SINGLE Hangul syllable. The notion
of "composing characters" in that appendix C misses that point,
and goes back to an old proposed (but never in Unicode) model
where there were just the Ls, Vs, and Ts, with the latter two
combining, and L T V V T would be a single Hangul syllable.
Unfortunately, that is plain wrong in the adopted model for Hangul.
However, L L V V T *is* a single Hangul syllable, so is L LV T T, and
LVT T, and ... Indeed, an L LVT (e.g.) may normalise to (another) LVT.
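To make the conjoining rule concrete, here is a minimal sketch in Python
of a check for "is this sequence a single Hangul syllable?". It assumes
the syllable pattern L* (V+ | LV V* | LVT) T*, which is how I read the
adopted model described above. The function names (hangul_class,
is_single_syllable) are made up for illustration, and the jamo ranges are
taken from current Unicode code charts, so they are an assumption rather
than the repertoire of any particular Unicode version.

```python
import re

S_BASE = 0xAC00   # first precomposed Hangul syllable
S_COUNT = 11172   # number of precomposed Hangul syllables
T_COUNT = 28      # trailing slots per LV syllable (including "no T")

def hangul_class(ch):
    """Return 'L', 'V', 'T', 'LV', 'LVT', or None for one character.

    The jamo ranges below are assumptions based on current code charts.
    """
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
        return "L"
    if 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
        return "V"
    if 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
        return "T"
    if S_BASE <= cp < S_BASE + S_COUNT:
        # A precomposed syllable with no trailing consonant is an LV.
        return "LV" if (cp - S_BASE) % T_COUNT == 0 else "LVT"
    return None

# One letter per class so the syllable pattern can be a plain regex:
# A stands for LV, B for LVT, X for anything that is not Hangul.
_CODE = {"L": "L", "V": "V", "T": "T", "LV": "A", "LVT": "B"}
_SYLLABLE = re.compile(r"L*(?:V+|AV*|B)T*")

def is_single_syllable(text):
    """True if text is exactly one syllable under the assumed pattern."""
    codes = "".join(_CODE.get(hangul_class(ch), "X") for ch in text)
    return _SYLLABLE.fullmatch(codes) is not None

# Sample characters: one L, V, T, LV, and LVT.
L, V, T, LV, LVT = "\u1100", "\u1161", "\u11A8", "\uAC00", "\uAC01"

print(is_single_syllable(L + LVT))            # L LVT
print(is_single_syllable(L + L + V + V + T))  # L L V V T
print(is_single_syllable(L + LV + T + T))     # L LV T T
print(is_single_syllable(LVT + T))            # LVT T
print(is_single_syllable(L + T + V + V + T))  # L T V V T
```

Under that pattern the first four sequences are each one syllable, while
L T V V T is not, because a T cannot precede the vowel part.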
/kent k