Re: GB18030

From: Yung-Fong Tang (ftang@netscape.com)
Date: Wed Sep 26 2001 - 21:19:51 EDT


How can you implement tolower(U+4FF3A) without knowing what U+4FF3A is?
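
To make the question concrete, here is a minimal Python sketch. The helper name describe() is mine and purely illustrative, and the answers come from whatever Unicode tables ship with the Python build that runs it, so treat it as an illustration rather than a statement about any particular implementation. For an unassigned code point, the only thing a library can do is report "unassigned" and map the character to itself:

```python
import unicodedata

def describe(cp: int) -> dict:
    """Report what the local Unicode tables know about a code point."""
    ch = chr(cp)
    return {
        # General category 'Cn' means "not assigned" in these tables.
        "category": unicodedata.category(ch),
        # Unassigned code points have no name; None is returned instead.
        "name": unicodedata.name(ch, None),
        # With no case data, lowercasing leaves the character unchanged.
        "lower_is_identity": ch.lower() == ch,
    }

unassigned = describe(0x4FF3A)   # Plane 4, no character assigned
deseret = describe(0x10400)      # DESERET CAPITAL LETTER LONG I (Plane 1)
```

The Plane 1 letter has a real lowercase mapping because its properties are known. The identity result for U+4FF3A is only a placeholder: if a cased letter were ever assigned there, code built against older tables would quietly give the wrong answer.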

DougEwell2@cs.com wrote:

> In a message dated 2001-09-24 20:50:25 Pacific Daylight Time,
> dstarner98@aasaa.ofe.org writes:
>
> >> Does GB18030 DEFINE the mapping between GB18030 and the remaining 11 planes?
> >> I don't think so, since Unicode has not defined them yet, right?
> >
> > Unicode defined all the planes, a long long time ago. It's added
> > characters for 3 of them - Plane 1 (basically the overflow area for the
> > non-CJK part of the BMP), Plane 2 (more ideographs) and Plane 14
> > (special tag characters).
>
> David's absolutely right. This is another common misconception, about
> Unicode "not defining" the code space unless characters are actually assigned
> to all the code points.
>
> This kind of thinking led, in part, to all the complacency on the part of
> database vendors and others concerning the need to support surrogate code
> points. They thought that just because no characters had YET been assigned
> to non-BMP code points, they could safely ignore the whole issue of surrogate
> processing. Then, when non-BMP characters became a reality, we began to see
> kludges like CESU-8.
>
> -Doug Ewell
> Fullerton, California
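
For readers who have not looked at surrogate processing, a rough Python sketch of what is at stake follows. The function names are mine and hypothetical. It splits a non-BMP code point into its UTF-16 surrogate pair, then shows the CESU-8 approach of encoding each surrogate as its own three-byte sequence, in contrast to the single four-byte sequence that UTF-8 uses:

```python
def surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into UTF-16 high/low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    v = cp - 0x10000                      # 20-bit offset into planes 1..16
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def cesu8(cp: int) -> bytes:
    """CESU-8 style bytes: BMP as in UTF-8, otherwise 3 bytes per surrogate."""
    if cp < 0x10000:
        return chr(cp).encode("utf-8", "surrogatepass")
    # 'surrogatepass' lets Python emit the 3-byte form of a lone surrogate.
    return b"".join(
        chr(unit).encode("utf-8", "surrogatepass")
        for unit in surrogate_pair(cp)
    )

pair = surrogate_pair(0x10400)            # DESERET CAPITAL LETTER LONG I
utf8_bytes = chr(0x10400).encode("utf-8") # one 4-byte sequence
cesu8_bytes = cesu8(0x10400)              # two 3-byte sequences
```

Software that treats each 16-bit unit as a complete character ends up producing the six-byte form, which a conforming UTF-8 decoder will not accept.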


