Re: Awful Unicode character names (was Re: I-Ching Hexagrams)

From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Apr 10 2003 - 22:28:45 EDT

    Kevin Brown asked:

    > OK, now I'm really confused. Michael Everson earlier in this thread (8
    > April) stated:
    >
    > > Character names once assigned cannot be changed.
    >
    > Then Ken Whistler states (9 April) in relation to U+262B:
    >
    > >in Unicode 1.0 it was called "SYMBOL OF IRAN", which was closer
    > >to your description of its use. It was WG2 that insisted on renaming
    > >it "FARSI SYMBOL" to get "IRAN" out of the name...
    >
    > So do character names once assigned get changed or don't they?

    Ancient history. Hundreds -- maybe thousands -- of Unicode 1.0
    character names were changed in 1993 for Unicode 1.1 as part
    of the merger between the repertoires of Unicode and ISO/IEC 10646-1:1993
    (The Great Compromise). The gory details of all the changes can be
    found in UTR #4, The Unicode Standard, Version 1.1. It was *after*
    that point (which was *very* painful for some people) that we
    put in place the "never change a character name" rule.

    The whole reason for having a Unicode 1.0 Name field in the
    UnicodeData.txt file was to track that name change.
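
    For anyone who wants to look at that field directly, here is a
    rough Python sketch. It assumes the usual UnicodeData.txt layout
    of 15 semicolon-separated fields, with the Unicode 1.0 name at
    index 10. The sample record for U+262B is written out for
    illustration and should be checked against the real file, and
    parse_names is a made-up helper name.

```python
# Sample record for U+262B, written out for illustration only; check it
# against the real UnicodeData.txt before relying on it.
SAMPLE_RECORD = "262B;FARSI SYMBOL;So;0;ON;;;;;N;SYMBOL OF IRAN;;;;"


def parse_names(record):
    """Return (code point, current name, Unicode 1.0 name) for one record."""
    fields = record.split(";")
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got %d" % len(fields))
    # Field 0 is the code point in hex, field 1 the current name,
    # field 10 the Unicode 1.0 name (empty when the name never changed).
    return int(fields[0], 16), fields[1], fields[10]


code_point, name, old_name = parse_names(SAMPLE_RECORD)
print("U+%04X %s (Unicode 1.0: %s)" % (code_point, name, old_name))
```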

    > Also, as the originator of the original thread (I-Ching Hexagrams), and
    > at the risk of re-awakening the rather cranky sleeping dragon earlier
    > disturbed by Eric Rasmussen, I was wondering about Michael's other
    > comment...
    >
    > >characters have to have names, and we used what was available
    > >rather than making something up.
    >
    > I'm curious about whether the proposers at any stage considered using a
    > similar character name convention to that used for the naming of the
    > Braille characters? (U+2800 - U+28FF)?

    I don't think so. Since the hexagrams have established names
    (although they are in Chinese and the translations into English
    are problematical, at best), the proposers went with those.
    I don't think binary names like:

    HEXAGRAM-000000
    HEXAGRAM-111111
    HEXAGRAM-101110
    etc.

    would have been very palatable.
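
    For comparison, the Braille convention can be sketched in a few
    lines of Python. The first function rebuilds a Braille pattern
    name from the bits of its offset from U+2800 (dot n is bit n-1)
    and can be checked against the standard unicodedata module. The
    second shows what a similar binary scheme for hexagrams might
    have looked like. That scheme is purely hypothetical: no such
    names were ever assigned, and the sketch does not claim any
    mapping from bit patterns to code points.

```python
import unicodedata


def braille_name(code_point):
    """Build a Braille pattern name from the bits of its offset from U+2800."""
    if not 0x2800 <= code_point <= 0x28FF:
        raise ValueError("not a Braille pattern code point")
    bits = code_point - 0x2800
    if bits == 0:
        return "BRAILLE PATTERN BLANK"
    # Dot n is present when bit n-1 of the offset is set.
    dots = "".join(str(n + 1) for n in range(8) if bits & (1 << n))
    return "BRAILLE PATTERN DOTS-" + dots


def binary_hexagram_name(lines):
    """Hypothetical Braille-style hexagram name; never part of the standard."""
    if not 0 <= lines < 64:
        raise ValueError("a hexagram has six lines, so the value must be 0..63")
    return "HEXAGRAM-" + format(lines, "06b")


# The derived Braille names should agree with the character database.
mismatches = [cp for cp in range(0x2800, 0x2900)
              if braille_name(cp) != unicodedata.name(chr(cp))]
print("Braille mismatches:", len(mismatches))
print(binary_hexagram_name(0b101110))
```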

    Then there was the existing pattern of naming for the trigrams:

    U+2630 TRIGRAM FOR HEAVEN
    U+2631 TRIGRAM FOR LAKE

    etc. It was that existing pattern of naming that resulted in
    the names:

    U+4DC0 HEXAGRAM FOR THE CREATIVE HEAVEN
    U+4DC1 HEXAGRAM FOR THE RECEPTIVE EARTH
    U+4DC2 HEXAGRAM FOR DIFFICULTY AT THE BEGINNING
    etc.

    The people who review names for new characters are rather
    sensitive to patterns which have already been established,
    and tend to follow those patterns when similar characters
    are added to the standard, to minimize the randomizing which
    would otherwise result in even more confusing names.
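
    The shared pattern is easy to confirm with a short Python sketch
    against the unicodedata module. This assumes a Python build whose
    character database includes the hexagram block (any Python 3
    should); follows_pattern is a made-up helper name.

```python
import unicodedata


def follows_pattern(first, last, prefix):
    """True if every character name in first..last starts with prefix."""
    return all(unicodedata.name(chr(cp)).startswith(prefix)
               for cp in range(first, last + 1))


# Print a few of the names discussed above.
for cp in (0x2630, 0x2631, 0x4DC0, 0x4DC1, 0x4DC2):
    print("U+%04X %s" % (cp, unicodedata.name(chr(cp))))

# The eight trigrams and the sixty-four hexagrams use parallel prefixes.
trigrams_ok = follows_pattern(0x2630, 0x2637, "TRIGRAM FOR ")
hexagrams_ok = follows_pattern(0x4DC0, 0x4DFF, "HEXAGRAM FOR ")
print(trigrams_ok, hexagrams_ok)
```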

    At any rate, the last time I cast my yarrow sticks, I
    came up with the HEXAGRAM FOR AFTER COMPLETION, so
    maybe we've worried this particular dead horse enough.

    --Ken


