From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Tue Oct 15 2002 - 09:55:16 EDT
On Tue, 15 Oct 2002, "Stefan Persson" wrote:
> That font also includes some characters mapped to the PUA: A € sign, and
> several 漢 characters, many of which look like radicals. Why? Is that
> something that's also required by that law?
>
It's my experience that many fonts include gunk in the Private Use Area. A quick check of some of
the CJK glyphs in the PUA of SimSun-18030 shows that they are not unique: the same glyphs are also
mapped to codepoints in the CJK Radicals Supplement and CJK-A (CJK Unified Ideographs Extension A)
blocks, for example.
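
A minimal sketch of how a codepoint can be classified as Private Use or not, assuming Python 3 and
its standard unicodedata module. The helper name is_private_use and the sample codepoints are
illustrative only; this does not inspect the SimSun-18030 font itself, which would need a font
tool.

```python
import unicodedata

# The three Private Use ranges defined by Unicode:
# the BMP Private Use Area plus Supplementary Private Use Areas A and B.
PUA_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch lies in a Private Use range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


# Illustrative samples: a PUA codepoint, a CJK Radicals Supplement codepoint,
# a CJK Extension A codepoint, and the euro sign.
for ch in ("\uE000", "\u2E80", "\u3400", "\u20AC"):
    print(f"U+{ord(ch):04X}", is_private_use(ch), unicodedata.category(ch))
```

Private Use codepoints carry the general category "Co", so unicodedata.category gives an
independent cross-check of the range test.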
I believe that the intention is to maintain a one-to-one correspondence between the GB18030
standard and Unicode, so there should be no need for any supplementary glyphs in the PUA.
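
The one-to-one correspondence can be spot-checked with a short round trip. This is only a sketch,
assuming Python 3 and its built-in "gb18030" codec, which may reflect a different edition of the
standard than the one the font targets. The helper name round_trips and the sample characters are
illustrative, and a handful of samples proves nothing about the full mapping.

```python
def round_trips(ch: str) -> bool:
    """Encode ch to GB18030 and check that decoding returns the same character."""
    encoded = ch.encode("gb18030")
    return encoded.decode("gb18030") == ch


# Illustrative samples: ASCII, the euro sign, a common ideograph, a CJK radical,
# a CJK Extension A ideograph, and a supplementary-plane ideograph.
samples = ["A", "\u20AC", "\u6F22", "\u2E81", "\u3400", "\U00020000"]

for ch in samples:
    encoded = ch.encode("gb18030")
    print(
        f"U+{ord(ch):04X} -> {encoded.hex()} "
        f"({len(encoded)} bytes), round trip: {round_trips(ch)}"
    )
```

GB18030 is a variable-width encoding using one, two or four bytes per character, so the byte count
printed for each sample shows which part of the code space it lands in.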
The new PRC law is, as you hint, overly restrictive and prescriptive, and is, I think, a serious
setback for the popularisation of Unicode on the Web. The intent is that GB18030 should replace
GB2312 and Big5, so that instead of the current mishmash of GB2312 (SC) and Big5 (TC) websites,
Traditional and Simplified Chinese sites (at least those hosted in China) will in future use the
same GB18030 encoding.
Where does this leave websites written in Unicode Chinese? Out in the cold!
At present web pages written in Unicode Chinese (some of mine for example) are not being indexed by
Google, and are ignored by both Yahoo China (SC) and Chinese Yahoo (TC). The situation will
certainly not be improved by the replacement of GB2312 and Big5 with GB18030.
Andrew