Re: Changes to Subranges in 4.1

From: Kenneth Whistler
Date: Mon Apr 11 2005 - 13:01:09 CST

    Kevin Brown asked:

    > How would I find these without checking through the whole database and comparing
    > every subrange in 4.1 with those in 4.0 (a daunting prospect I think you'll
    > agree!)?

    Thanks to Andrew West for doing the work with WinDiff and documenting
    what he found.

    > Is there somewhere on the Unicode site where such subtle changes are
    > listed?


    > Would the powers that be consider marking subrange name changes in the same way
    > that you mark the new characters?

    Actually, no.

    There is no normative status to any subrange of characters below
    the level of a Block in the standard. The concept of a "subrange"
    doesn't even exist in the standard.

    As you have noticed, if you are maintaining a datafile of subranges
    of the sort you are talking about, the addition of new characters
    often necessitates recategorizing groups of characters in particular
    ranges in the names list, as a group of characters of a particular
    type may be extended in either direction, or something may be
    encoded in a gap in the chart which then changes the nature of the
    collection of characters in its vicinity.

    What you are talking about are not actually subranges at all,
    but rather "SUBHEADER"s. See NamesList.html, which documents the
    format of NamesList.txt. The SUBHEADER fields are completely within
    the purview of the editorial committee, and are added, renamed,
    or removed based on their use in visually clarifying the printed
    version of the names list. They are not decided by the UTC, and
    have no normative status to *be* tracked.
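    For anyone who still wants to track these editorial changes locally,
    a rough sketch in Python follows. It assumes the layout documented in
    NamesList.html, where a SUBHEADER line starts with "@" followed by a
    tab and a character entry starts with a hexadecimal code point
    followed by a tab. The function names and the sample lines are
    illustrative only; the "Example subheader" strings are placeholders,
    not text taken from any released NamesList.txt.

```python
import re

# A character entry starts with a 4-6 digit hex code point and a tab.
CODE_POINT = re.compile(r"^([0-9A-F]{4,6})\t")


def subheaders(lines):
    """Return (first_code_point, subheader_text) pairs from names-list lines."""
    found = []
    pending = None  # subheader seen, waiting for the character that follows it
    for line in lines:
        if line.startswith("@\t"):
            # "@" + tab marks a SUBHEADER; "@@" (block header) and "@+"
            # (notice) lines do not match this prefix and are skipped.
            pending = line[1:].strip()
            continue
        match = CODE_POINT.match(line)
        if match and pending is not None:
            found.append((int(match.group(1), 16), pending))
            pending = None
    return found


def changed_subheaders(old_lines, new_lines):
    """List (code_point, old_text, new_text) wherever the subheader differs."""
    old = dict(subheaders(old_lines))
    new = dict(subheaders(new_lines))
    return [
        (cp, old.get(cp), new.get(cp))
        for cp in sorted(old.keys() | new.keys())
        if old.get(cp) != new.get(cp)
    ]


# Hypothetical excerpts standing in for two versions of NamesList.txt.
OLD_SAMPLE = [
    "@@\t2200\tMathematical Operators\t22FF",
    "@\t\tMathematical operators",
    "2200\tFOR ALL",
    "2201\tCOMPLEMENT",
    "2208\tELEMENT OF",
]
NEW_SAMPLE = [
    "@@\t2200\tMathematical Operators\t22FF",
    "@\t\tExample subheader A",
    "2200\tFOR ALL",
    "2201\tCOMPLEMENT",
    "@\t\tExample subheader B",
    "2208\tELEMENT OF",
]

for cp, before, after in changed_subheaders(OLD_SAMPLE, NEW_SAMPLE):
    print(f"U+{cp:04X}: {before!r} -> {after!r}")
```

    With real data, the two arguments would be the lines of the
    NamesList.txt files from the two versions being compared. Since the
    SUBHEADERs are editorial, any such report is only a convenience for
    a local datafile, not a record of changes to the standard.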

    A good example of how editorial decisions impact this can be seen
    in the treatment of the Mathematical Operators block, U+2200..U+22FF,
    as mentioned by Andrew. Prior to Unicode 4.1, the names list for
    that block had only one SUBHEADER, "Mathematical operators", for
    the entire block. But the later addition of various other blocks
    of mathematical symbols, Miscellaneous Mathematical Symbols-A,
    Miscellaneous Mathematical Symbols-B, and so on, was accompanied
    by a careful addition of SUBHEADERs to those blocks, to divide
    up the character lists into appropriate types, for easier reference.
    That resulted in an editorial inconsistency with the way the original
    Mathematical Operators block was presented. So for the Unicode 4.1
    charts, a number of SUBHEADERs were added to NamesList.txt, to
    divide that block into the same general categories of operators,
    relations, other symbols, etc.

