Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)

From: John Cowan (cowan@mercury.ccil.org)
Date: Tue Nov 25 2003 - 08:23:32 EST

    Michael Everson scripsit:

    > Ridiculous. This happened centuries ago, and it is not "why" Ethiopic
    > was encoded as a syllabary. It was encoded as a syllabary because it
    > is a syllabary.

    Structurally it's an abugida, like Indic and UCAS.

    > You are, because the floodgates, while once open, have been closed by
    > normalization.

    Indeed, they were opened in Unicode 1.1, as a result of the merger with
    FDIS 10646; since then, only 46 characters with canonical decompositions
    have been added to Unicode (excepting compatibility ideographs, which
    are a special case).

    Specifically, 16 were added in Unicode 2.0, 29 in Unicode 3.0, and
    just one in Unicode 3.2 (the slashed version of a symbol added at the
    same time).

    -- 
    "What has four pairs of pants, lives            John Cowan
    in Philadelphia, and it never rains             http://www.reutershealth.com
    but it pours?"                                  jcowan@reutershealth.com
            --Rufus T. Firefly                      http://www.ccil.org/~cowan
    

