Re: Why is U+17C1 of General category Mc while U+0E40 and U+0EC0 are of category Lo?

From: jcowan@reutershealth.com
Date: Mon Mar 29 2004 - 12:42:50 EST

    Patrick Andries scripsit:
    > Small question again.
    >
    > Why is U+17C1 KHMER VOWEL SIGN E of General category Mc (Mark, Spacing
    > Combining) while similar signs in Lao and Thai, related scripts, are of
    > General category Lo (Letter, Other) ?
    >
    > See U+0E40 THAI CHARACTER SARA E and U+0EC0 LAO VOWEL SIGN E, I believe
    > these signs are also placed on the left of the consonant affected.

    Thai (and Lao, whose encoding closely parallels that of Thai) is
    encoded in Unicode on a principle of its own: a straight left-to-right,
    typewriter-style encoding, in which characters are stored in the order
    they are written rather than the order they are pronounced. This was
    done for compatibility with the pervasive Thai 8-bit standard (TIS-620).
    It also means that for collation purposes what are historically
    left-side vowels must be moved after the following consonant.
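
    The reordering step can be sketched roughly as follows. This is only
    an illustration, not the algorithm any particular collation library
    uses: the function names are made up for this example, the consonant
    test is a simplified code point range check, and the left-side vowels
    are taken to be U+0E40..U+0E44 (Thai) and U+0EC0..U+0EC4 (Lao).

```python
# Hypothetical helper names; a simplified sketch, not a full collation algorithm.

# Left-side (preposed) vowels: Thai U+0E40..U+0E44 and Lao U+0EC0..U+0EC4.
PREPOSED_VOWELS = (
    {chr(cp) for cp in range(0x0E40, 0x0E45)}
    | {chr(cp) for cp in range(0x0EC0, 0x0EC5)}
)


def is_consonant(ch):
    """Rough range test (assumption): Thai U+0E01..U+0E2E, Lao U+0E81..U+0EAE."""
    cp = ord(ch)
    return 0x0E01 <= cp <= 0x0E2E or 0x0E81 <= cp <= 0x0EAE


def reorder_for_collation(text):
    """Move each left-side vowel after the consonant that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PREPOSED_VOWELS and is_consonant(chars[i + 1]):
            # Swap the stored (visual) order into consonant-first order.
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return "".join(chars)
```

    For instance, SARA E followed by KO KAI (U+0E40 U+0E01) comes out as
    U+0E01 U+0E40, so the string sorts under the consonant. Text with no
    left-side vowel is returned unchanged.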

    Note that the Thai characters are not labeled LETTER or VOWEL SIGN or
    what have you, but simply CHARACTER.
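
    The names and categories under discussion can be checked against the
    Unicode Character Database, for example with Python's standard
    unicodedata module. The describe helper below is just a name chosen
    for this sketch.

```python
import unicodedata


def describe(cp):
    """Return (code point label, character name, General_Category) for cp."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


# The three characters from the question: Khmer, Thai and Lao "E" vowels.
for cp in (0x17C1, 0x0E40, 0x0EC0):
    print(*describe(cp))
```

    The Khmer sign is reported as Mc, while the Thai and Lao characters are
    reported as Lo.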

    -- 
    Only do what only you can do.               John Cowan <jcowan@reutershealth.com>
      --Edsger W. Dijkstra's advice             http://www.reutershealth.com
        to a student in search of a thesis      http://www.ccil.org/~cowan
    


    This archive was generated by hypermail 2.1.5 : Mon Mar 29 2004 - 13:35:23 EST