From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Apr 11 2005 - 13:01:09 CST
Kevin Brown asked:
> How would I find these without checking through the whole database and comparing
> every subrange in 4.1 with those in 4.0 (a daunting prospect I think you'll
> agree!)?
Thanks to Andrew West for doing the work with WinDiff and documenting
what he found.
> Is there somewhere on the Unicode site where such subtle changes are
> listed?
No.
>
> Would the powers that be consider marking subrange name changes in the same way
> that you mark the new characters?
Actually, no.
There is no normative status to any subrange of characters below
the level of a Block in the standard. The concept of a "subrange"
doesn't even exist in the standard.
As you have noticed, if you are maintaining a datafile of subranges
of the sort you are talking about, the addition of new characters
often necessitates recategorizing groups of characters in particular
ranges in the names list, as a group of characters of a particular
type may be extended in either direction, or something may be
encoded in a gap in the chart which then changes the nature of the
collection of characters in its vicinity.
What you are talking about are not actually subranges at all,
but rather "SUBHEADER"s. See NamesList.html, which documents the
format of NamesList.txt. The SUBHEADER fields are completely within
the purview of the editorial committee, and are added, renamed,
or removed based on their use in visually clarifying the printed
version of the names list. They are not decided by the UTC, and
have no normative status to *be* tracked.
A good example of how editorial decisions impact this can be seen
in the treatment of the Mathematical Operators block, U+2200..U+22FF,
as mentioned by Andrew. Prior to Unicode 4.1, the names list for
that block had only one SUBHEADER, "Mathematical operators", for
the entire block. But the later addition of various other blocks
of mathematical symbols, Miscellaneous Mathematical Symbols-A,
Miscellaneous Mathematical Symbols-B, and so on, was accompanied
by a careful addition of SUBHEADERs to those blocks, to divide
up the character lists into appropriate types, for easier reference.
That resulted in an editorial inconsistency with the way the original
Mathematical Operators block was presented. So for the Unicode 4.1
charts, a number of SUBHEADERs were added to NamesList.txt, to
divide that block into the same general categories of operators,
relations, other symbols, etc.
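
For anyone who wants to track these editorial changes without
WinDiff, the comparison can be scripted. What follows is only a
rough sketch, not an official tool. It assumes that a SUBHEADER line
in NamesList.txt starts with "@" followed by a tab, and that a
character line starts with a 4-6 digit uppercase hex code point
followed by a tab. The function names subheaders() and
diff_subheaders() are made up for this illustration, and the sample
lines are hand-written for the demo rather than copied from the
released files.

```python
import re

# Assumed shape of a character line: hex code point, then a tab.
CHAR_LINE = re.compile(r"^([0-9A-F]{4,6})\t")


def subheaders(lines):
    """Map the first code point under each SUBHEADER to its text."""
    result = {}
    pending = None
    for line in lines:
        line = line.rstrip("\n")
        # "@" + TAB marks a SUBHEADER; "@@" (block header) and "@+"
        # (notice) lines do not match this prefix and are skipped.
        if line.startswith("@\t"):
            pending = line[1:].strip()
            continue
        match = CHAR_LINE.match(line)
        if match and pending is not None:
            result[int(match.group(1), 16)] = pending
            pending = None
    return result


def diff_subheaders(old, new):
    """List (code point, old text, new text) wherever the two differ."""
    changes = []
    for cp in sorted(set(old) | set(new)):
        before, after = old.get(cp), new.get(cp)
        if before != after:
            changes.append((f"U+{cp:04X}", before, after))
    return changes


if __name__ == "__main__":
    # Hand-written sample lines (illustrative only).
    old_lines = [
        "@@\t2200\tMathematical Operators\t22FF",
        "@\t\tMathematical operators",
        "2200\tFOR ALL",
        "2208\tELEMENT OF",
    ]
    new_lines = [
        "@@\t2200\tMathematical Operators\t22FF",
        "@\t\tMiscellaneous mathematical symbols",
        "2200\tFOR ALL",
        "@\t\tSet membership",
        "2208\tELEMENT OF",
    ]
    for change in diff_subheaders(subheaders(old_lines),
                                  subheaders(new_lines)):
        print(change)
```

To run it against real data, read the two versions of NamesList.txt
into lists of lines and pass them in place of the samples. Keying on
the first code point under each SUBHEADER means that a renamed
header and a newly inserted header both show up as a change at a
specific code point.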
--Ken