RE: [long] Use of Unicode in AbiWord

From: Chris Pratley (chrispr@microsoft.com)
Date: Tue Mar 23 1999 - 17:05:56 EST


Unfortunately, Rick, it is a little more real than that. The Hong Kong
Government provides fonts and input methods for its "GCCS" private extension
to Big5. Even when the duplications with Unihan are removed, there are 1800
characters in GCCS that are not in Unicode 3.0, even counting Ext A to Unihan.
These are mainly addresses, names of people, even names of horses from the
Jockey Club. The characters in this list were compiled from newspapers and
government documents and consist of characters these people had to create
manually since there was no way to encode them. So, they are actual,
real-world characters that people in Hong Kong use, know how to read and write,
etc.

Similarly, the Taiwan Government maintains its records in a private encoding
system that has more than 50,000 Han characters. These characters are needed in
order to handle the database of citizens' names. I don't believe work has
been completed to verify which characters lie outside of Unihan, and I've
never seen any data indicating frequency of use of characters outside
Unihan.

However, your point is well taken. The national encoding standards of each
of the Han-using countries obviously are sufficient for everyday usage.
However, the fact that tools for EUDC generation exist is evidence that
people do occasionally find a character they need outside of these
encodings. As I mentioned, even with Unicode, I know for a fact that in Hong
Kong characters outside even the expanded Unihan are in general usage. As
someone noted on this list a few days ago, work is ongoing to get the Hong
Kong characters into Unicode via Ext B or later.

Chris Pratley
Microsoft Word

-----Original Message-----
From: Rick McGowan [mailto:rmcgowan@apple.com]
Sent: March 19, 1999 3:45 PM
To: Unicode List
Subject: Re: [long] Use of Unicode in AbiWord

> I think for "ONE" you should substitute "TWO", since many people have
> non-Unihan characters in their names, or so we are told.

You mean TWO of 500 million, or TWO characters? ;-)

Anyway, it's all just an urban legend, John. Nobody, to my knowledge, has
EVER provided even one shred of public evidence that any such characters
exist. Supposedly there are one-of-a-kind made-up name characters,
supposedly used in Hong Kong or Taiwan or wherever. Lots of people
believe that, but I'm not convinced yet. I think it's just another urban
legend made up to scare people into thinking the issue is important.

The proof is in the documentation, not the rumor. Until then, I'll stick to
ONE.

        Rick


