Re: Mysteries in the BMP Roadmap

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri May 12 2006 - 19:19:16 CDT

    Karl Pentzlin asked:

    > On Friday, 12 May 2006 at 21:04, Kenneth Whistler wrote:
    >
    > KW> The ROK National Body has indicated interest in the context
    > KW> of the WG2 meetings in bringing in a proposal for adding
    > KW> more jamo characters. Given the approximate number of jamos
    > KW> they have been talking about -- which would be intended for
    > KW> representation of Old Hangul choseong, jungseong, and jongseong --
    > KW> the Roadmap additions are simply preliminary allocations of
    > KW> a matching number of columns.
    >
    > If only "Old" Hangul needs these jamos, and if they are in fact not
    > needed for contemporary Korean, why are these characters not
    > roadmapped to go into the SMP, as the available gaps in the BMP
    > have become somewhat small?

    Well, first of all, they will be needed for contemporary *implementations*
    of Korean, even if the jamos in question would only appear in
    Old Korean syllables. The problem is that nobody is going to
    implement Korean *twice* -- once for modern Korean and separately
    for Old Korean. The intent here -- as best I can tell -- is for
    Korean implementations to simply be able to deal with all of
    the repertoire, including the old syllabic forms.

    Second, all of the Old Korean syllables can be represented by
    existing combinations of jamos, which are already on the BMP
    in the 1100 block. The issue for the ROK has to do with the
    structure of resulting syllables, normalization, and so on, but
    whatever new combined jamos for Old Hangul might be added will be akin
    to jamos already existing in the 1100 block.
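
    As an illustration of the point about jamo sequences, here is a
    minimal Python sketch. It assumes only the standard unicodedata
    module; the helper name to_jamos and the sample strings are
    hypothetical choices for this example. A modern precomposed
    syllable decomposes canonically into conjoining jamos from the
    1100 block, while a syllable built on an old choseong has no
    precomposed form and simply stays a jamo sequence under
    normalization.

```python
import unicodedata


def to_jamos(text):
    """Return the code points of the canonical decomposition (NFD) of text."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]


# A modern precomposed syllable from the Hangul Syllables block.
modern_syllable = "\uD55C"
modern_jamos = to_jamos(modern_syllable)
print(modern_jamos)

# An Old Hangul syllable written as a jamo sequence: choseong PANSIOS
# (U+1140), jungseong A (U+1161), jongseong NIEUN (U+11AB).
old_syllable = "\u1140\u1161\u11AB"

# No precomposed code point exists for it, so NFC leaves it unchanged.
old_is_stable = unicodedata.normalize("NFC", old_syllable) == old_syllable
print(old_is_stable)

# Every jamo involved lives in the 1100 block (U+1100..U+11FF).
all_in_jamo_block = all(
    0x1100 <= ord(ch) <= 0x11FF
    for ch in unicodedata.normalize("NFD", modern_syllable) + old_syllable
)
print(all_in_jamo_block)
```

    The sketch says nothing about which new jamos the ROK might
    propose; it only shows that the existing jamo encoding already
    covers both cases.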

    Third, whenever possible a *script* is not split across planes.
    Of course, Han itself is an exception, because of the huge size
    of the Han repertoire, but no other *script* has had historic
    extensions relegated to the SMP simply because they were historic,
    at least not yet. And if you are looking for violations of plane
    functions based on contemporary versus historic usage, look first
    to the log in thy eye! The Latin script just got a huge collection
    of medievalist characters added, on the *BMP* in the Latin Extended-D
    block. Those, too, have no contemporary usage, but are on the BMP
    because the rest of Latin is there already.

    Fourth, what else is a reasonable candidate for filling the
    gap at the end of the Hangul Syllables block: U+D7A4..U+D7FF?
    That range is essentially dead air on the BMP unless used for
    these jamos. There are also 16 code points available in the
    U+1100 block itself. That probably won't be enough, however -- hence
    the roadmapping of some columns at U+A4C0..U+A4FF. We'll see what
    is actually needed if and when a real proposal shows up from
    Korea.
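
    For a rough sense of the space involved, the arithmetic can be
    sketched in Python. The ranges are the ones named above, and the
    figure of 16 free code points in the 1100 block is taken from
    this message rather than computed. The names COLUMN, range_size,
    and candidates are hypothetical, and a "column" is taken to be 16
    code points, as in the code charts.

```python
COLUMN = 16  # code points per chart column


def range_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


candidates = {
    "gap after Hangul Syllables (U+D7A4..U+D7FF)": range_size(0xD7A4, 0xD7FF),
    "roadmapped columns (A4C0..A4FF)": range_size(0xA4C0, 0xA4FF),
    "free in the 1100 block (count stated in the message)": 16,
}

for name, size in candidates.items():
    full_columns, remainder = divmod(size, COLUMN)
    print(f"{name}: {size} code points, "
          f"{full_columns} full columns plus {remainder}")

total = sum(candidates.values())
print(f"total: {total} code points")
```

    Whether that total is too much or too little depends entirely on
    the size of the eventual proposal.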

    --Ken


