From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Thu Jan 06 2005 - 07:01:05 CST
On Wed, 5 Jan 2005 11:37:06 -0800 (PST), Kenneth Whistler wrote:
>
> Philippe said some interesting things about the status of
> EU recommendations, Directives, etc., but...
>
> > (Since now the mapping between GB18030 and ISO/IEC 10646 is well defined and
> > closed,
>
> False. Both GB18030 and ISO/IEC 10646 will be amended in the future,
> and mappings will change, and neither has (in principle) a closed repertoire.
>
I don't get this. I understand that neither GB18030 nor ISO/IEC 10646 has a
closed character repertoire, but (and I think this is the point that Philippe
was trying to make) they do both have a closed code point repertoire, and there
is a one-to-one mapping between all ISO/IEC 10646 code points in planes 0-16 and
GB18030 code points, and this code point mapping will never change. I've
implemented Unicode to/from GB18030 conversion using the widely available
mapping tables (e.g. from <http://developers.sun.com/dev/gadc/technicalpublications/articles/gb18030.html>),
and I don't expect to ever have to modify my conversion routine. The unassigned
Unicode code point A840 maps to GB18030 code point sequence "82 36 ED 35", and
if ever A840 is assigned as a character by ISO/IEC 10646 and Unicode I expect
that the corresponding character in GB18030 will still be at code point "82 36
ED 35" -- as long as the Standardization Administration of PRC recognises the
new character -- and I understood that China was committed to maintaining
synchronization with the ISO/IEC 10646 character repertoire. This is why the
large set of precomposed "BrdaRten" Tibetan characters recently standardized by
China are mapped to the PUA (i.e. the areas of the GB18030 code point space that
correspond to the Unicode PUA blocks) and not mapped to GB18030 code points
that correspond to the ISO/IEC 10646 code points in the BMP that China have been
trying unsuccessfully to get them assigned to over the last couple of years.
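
To illustrate the A840 example above, here is a minimal Python sketch of the
four-byte arithmetic. It is not my actual conversion routine. The function name
four_byte_sequence is hypothetical, and the two anchor values (U+9FA6 at
"82 35 8F 33" and U+10000 at "90 30 81 30") are assumptions taken from the
published GB18030-2000 range table; only the A840 value comes from this message.
The last lines cross-check the result against Python's built-in "gb18030" codec.

```python
def four_byte_sequence(code_point: int) -> bytes:
    """Sketch: GB18030 four-byte form for two linearly mapped ranges.

    Assumes the GB18030-2000 range table: U+9FA6..U+D7FF and
    U+10000..U+10FFFF each map to a contiguous run of four-byte codes.
    """
    if 0x9FA6 <= code_point <= 0xD7FF:
        # 19043 is the linear index of "82 35 8F 33" (assumed anchor for U+9FA6)
        linear = 19043 + (code_point - 0x9FA6)
    elif 0x10000 <= code_point <= 0x10FFFF:
        # 189000 is the linear index of "90 30 81 30" (assumed anchor for U+10000)
        linear = 189000 + (code_point - 0x10000)
    else:
        raise ValueError("code point is outside the ranges covered by this sketch")

    # Byte ranges: 0x81..0xFE, 0x30..0x39, 0x81..0xFE, 0x30..0x39
    linear, b4 = divmod(linear, 10)
    linear, b3 = divmod(linear, 126)
    b1, b2 = divmod(linear, 10)
    return bytes([0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


seq = four_byte_sequence(0xA840)
print(seq.hex(" ").upper())  # 82 36 ED 35

# Cross-check with Python's built-in codec, in both directions
assert "\ua840".encode("gb18030") == seq
assert seq.decode("gb18030") == "\ua840"
```

The point of the sketch is that the result depends only on the code point
value, not on whether a character has been assigned there.
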
>
> > the only way for the repertoire associated to GB18030 to be extended
> > is that the repertoire in ISO/IEC 10646 is extended).
>
> False. What appears (in the future) in GB18030 will depend on what
> the Chinese standards organization decides to put in it. That is up
> to them, as it is their standard, and not an ISO standard.
>
Again, I think Philippe is referring to code point repertoire. Of course it is
always possible for China to desynchronize the character repertoire of GB18030
with ISO/IEC 10646, but I would have thought that this would be highly unlikely,
and as a general statement I don't think that it is misleading to say that
GB18030's character repertoire is expected to extend in parallel with the
extension of ISO/IEC 10646's character repertoire. The only caveat is that China
is free to define GB18030 code points that correspond to ISO/IEC 10646 PUA code
points as abstract characters that must be supported by GB18030-compliant
software, which is indeed what it has done with the precomposed BrdaRten
characters.
Andrew