Re: Ideographic Description Characters

From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Mon Dec 08 2003 - 05:15:57 EST


    On Sun, 7 Dec 2003 11:25:01 -0700, Tom Gewecke wrote:
    >
    > Can anyone tell me whether ideographic description characters are ever
    > actually used?

    Well, I use them on a couple of my web pages to describe unencoded ideographs
    (try viewing http://uk.geocities.com/BabelStone1357/Alphabets/Zhuang.html with
    Code2000), but I can't recall ever having seen them used elsewhere.

    > I recently ran into a Han (Vietnamese Nôm) character
    > which does not seem to be encoded yet, "slice" radical on left and
    > "heart" radical on right, and was wondering whether it would make
    > practical sense to encode this as U+2FF0, U+2F5A, U+2F3C (⿰⽚⼼).

    Remember that IDCs *describe* ideographs; they are not used to *encode* them.
    One reason why IDCs can't be used to formally encode an ideograph is that there
    are usually several, or even many, different ways to describe the same ideograph
    with IDCs, depending upon how far you break down its constituent components and
    whether you use radicals or complete ideographs for those components.
    Even for your simple example, you could variously describe the character as
    <U+2FF0, U+2F5A, U+2F3C>, <U+2FF0, U+2F5A, U+5FC3>, <U+2FF0, U+7247, U+2F3C> or
    <U+2FF0, U+7247, U+5FC3>. All in all, this means that you cannot hope to
    successfully search for an IDC-described ideograph, or do most of the other
    operations you would expect to be able to do with formally encoded ideographs.
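
    To make the search problem concrete, here is a minimal Python sketch. It
    assumes U+2FF0, the left-to-right IDC that matches the ⿰ shown in the
    question, and builds the four descriptions by pairing each Kangxi radical
    with its corresponding unified ideograph. The names ids_variants and
    document are made up for the illustration and are not part of any library.

```python
from itertools import product

LEFT_TO_RIGHT = "\u2FF0"        # IDC: left-to-right composition
SLICE = ("\u2F5A", "\u7247")    # Kangxi radical "slice", unified ideograph U+7247
HEART = ("\u2F3C", "\u5FC3")    # Kangxi radical "heart", unified ideograph U+5FC3


def ids_variants():
    """Return every IDC description built from the component choices above."""
    return [LEFT_TO_RIGHT + left + right for left, right in product(SLICE, HEART)]


variants = ids_variants()

# Hypothetical text whose author happened to pick the first description.
document = "unencoded ideograph: " + variants[0]

# A plain substring search succeeds only for the description actually used.
hits = [v in document for v in variants]

for v, hit in zip(variants, hits):
    code_points = ", ".join(f"U+{ord(ch):04X}" for ch in v)
    print(f"<{code_points}> found: {hit}")
```

    All four strings describe the same shape, but they are four distinct code
    point sequences, so a search for any one of them misses text that uses
    another. Breaking a component down further would add still more variants.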

    Andrew


