Re: GB18030

From: Michael (michka) Kaplan (michka@trigeminal.com)
Date: Thu Sep 27 2001 - 19:05:13 EDT


From: Yung-Fong Tang

> Case mapping? You have no way to generate a mapping table for
> case mapping without knowing the character, unless you have already
> defined those characters as having no case or only one case.

Um, Unicode defines a behavior and even properties for unassigned code
points. If you choose not to implement this because you only handle
"assigned code points" then that is actually a problem with your software.
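As a rough illustration of that point, here is a small Python sketch using the standard unicodedata module. The helper first_unassigned() and its search range are my own assumptions for the example; the results depend on the Unicode version bundled with the interpreter, and Cn covers noncharacters as well as unassigned code points.

```python
import unicodedata

def first_unassigned(start=0x0370, stop=0x10000):
    """Return the first code point in [start, stop) whose General_Category is Cn.

    Hypothetical helper for this sketch; Cn covers both unassigned
    code points and noncharacters.
    """
    for cp in range(start, stop):
        if unicodedata.category(chr(cp)) == "Cn":
            return cp
    raise LookupError("no Cn code point found in range")

cp = first_unassigned()
ch = chr(cp)

# An unassigned code point still reports a General_Category.
category = unicodedata.category(ch)

# Case mapping is answerable too: with no mapping in the database,
# the code point maps to itself in both directions.
case_is_identity = (ch.upper() == ch == ch.lower())

# It has no character name, so the caller-supplied default comes back.
name = unicodedata.name(ch, "<unassigned>")

print(f"U+{cp:04X}", category, case_is_identity, name)
```

The lookups do not fail for the unassigned code point; they return interim answers that may change once a character is actually assigned there.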

No one is arguing with the point you make: until a code point is assigned,
its exact *FINAL* behavior is not completely understood with regard to
casing, collation, and everything else. So you do not need to continue
arguing this point -- I am sure everyone agrees with it.

But do you understand that there is certainly a defined behavior for it in
the interim? In the time before it is assigned an actual character? That is,
I think, the crux of the matter here.

> Don't tell me there are any people who implemented
> HanCharacterStokeNumber(U+20000) in 1996; nobody had an
> implementation of HanCharacterStokeNumber(U+20000) until
> U+20000 got defined.

Actually, several companies had mechanisms defined to convert that to a
surrogate pair, or to treat it as a single unassigned character for the
purposes of collation.
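Neither mechanism needs to know what character sits at U+20000. A minimal Python sketch follows; the function names are hypothetical. The implicit-weight formula is the one for unassigned code points in the current Unicode Collation Algorithm (UTS #10), which I am assuming here for illustration; it is not necessarily what any particular company shipped at the time.

```python
def to_surrogate_pair(cp):
    """Split a supplementary code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

def implicit_weights_unassigned(cp):
    """Primary implicit weights for a code point treated as unassigned.

    Assumes the UTS #10 derivation with base 0xFBC0.
    """
    aaaa = 0xFBC0 + (cp >> 15)
    bbbb = (cp & 0x7FFF) | 0x8000
    return aaaa, bbbb

pair = to_surrogate_pair(0x20000)               # (0xD840, 0xDC00)
weights = implicit_weights_unassigned(0x20000)  # (0xFBC4, 0x8000)

print([f"{u:04X}" for u in pair], [f"{w:04X}" for w in weights])
```

Both computations are pure arithmetic on the code point value, which is why they could be implemented before the assignment happened.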

The difference between them and you would be that you do not recognize the
existence of this state -- the time before direct assignment?

MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/


