Re: CESU-8 marches on

From: John H. Jenkins (jenkins@apple.com)
Date: Sat Dec 22 2001 - 10:34:48 EST


One small quibble.

On Saturday, December 22, 2001, at 01:31 AM, DougEwell2@cs.com wrote:

>
> The Han characters are generally thought to
> be less commonly used than those in the BMP; otherwise (so the story goes)
> they would have been encoded in Unicode sooner. Remember that none of these
> non-BMP characters could be conformantly used (e.g. stored in a database)
> until the publication of Unicode 3.1.
>
> So my question is: What supplementary characters are currently, TODAY,
> stored in Oracle or PeopleSoft databases that require the creation of a new
> encoding scheme to ensure they can continue to be sorted consistently?
>

The Han situation is a bit more complicated than that. There is in
Extension B (and therefore on Plane 2) a small number of characters from
HKSCS. These are characters which are required either for place or people
names in Hong Kong or for the writing of Cantonese. Their omission from
earlier versions of Unihan stems more from the fact that it took a while
for the HKSAR government to get their act together and start working on
the gathering of ideographs which they need without depending on other
people to stumble across those characters on their behalf.

Naturally, I cannot vouch for how many of these characters are in
databases at this moment, but there is a distinct possibility that people
will start to use them in existing databases at any time.

None of this has any effect on your overall argument, of course.

==========
John H. Jenkins
jenkins@apple.com
jenkins@mac.com
http://homepage.mac.com/jenkins/


