Re: ISO 10646 compliance and EU law

From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Thu Jan 06 2005 - 07:01:05 CST

    On Wed, 5 Jan 2005 11:37:06 -0800 (PST), Kenneth Whistler wrote:
    >
    > Philippe said some interesting things about the status of
    > EU recommendations, Directives, etc., but...
    >
    > > (Since now the mapping between GB18030 and ISO/IEC 10646 is well defined and
    > > closed,
    >
    > False. Both GB18030 and ISO/IEC 10646 will be amended in the future,
    > and mappings will change, and neither has (in principle) a closed repertoire.
    >

    I don't get this. I understand that neither GB18030 nor ISO/IEC 10646 has a
    closed character repertoire, but (and I think this is the point that Philippe
    was trying to make) they do both have a closed code point repertoire, and there
    is a one-to-one mapping between all ISO/IEC 10646 code points in planes 0-16 and
    GB18030 code points, and this code point mapping will never change. I've
    implemented Unicode to/from GB18030 conversion using the widely available
    mapping tables (e.g. from <http://developers.sun.com/dev/gadc/technicalpublications/articles/gb18030.html>),
    and I don't expect to ever have to modify my conversion routine. The unassigned
    Unicode code point A840 maps to GB18030 code point sequence "82 36 ED 35", and
    if ever A840 is assigned as a character by ISO/IEC 10646 and Unicode I expect
    that the corresponding character in GB18030 will still be at code point "82 36
    ED 35" -- as long as the Standardization Administration of PRC recognises the
    new character -- and I understood that China was committed to maintaining
    synchronization with the ISO/IEC 10646 character repertoire. This is why the
    large set of precomposed "BrdaRten" Tibetan characters recently standardized by
    China is mapped to the PUA (i.e. the areas of the GB18030 code point space that
    correspond to the Unicode PUA blocks) rather than to the GB18030 code points
    that correspond to the ISO/IEC 10646 code points in the BMP where China has been
    trying unsuccessfully to get them encoded over the last couple of years.
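
    To illustrate the fixed code point arithmetic described above, here is a
    minimal sketch in Python. It is not the conversion routine mentioned in this
    message. The function names are hypothetical, and the range start (U+9FA6 at
    "82 35 8F 33", linear index 19043) is an assumption taken from the commonly
    published four-byte mapping ranges; it covers only the linear range
    U+9FA6..U+D7FF and the supplementary planes, not the table-driven parts of
    the BMP.

```python
# Sketch only: helper names are hypothetical, not from any published library.

def linear_to_four_bytes(index: int) -> bytes:
    """Turn a linear four-byte index into a GB18030 four-byte sequence."""
    index, b4 = divmod(index, 10)    # fourth byte runs 0x30..0x39
    index, b3 = divmod(index, 126)   # third byte runs 0x81..0xFE
    b1, b2 = divmod(index, 10)       # second byte runs 0x30..0x39
    return bytes([0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


# Assumption: U+9FA6..U+D7FF is one linear range starting at 82 35 8F 33,
# whose linear index is (1 * 12600) + (5 * 1260) + (14 * 10) + 3 = 19043.
BMP_RANGE_FIRST = 0x9FA6
BMP_RANGE_LAST = 0xD7FF
BMP_RANGE_INDEX = 19043

# Supplementary planes start at 90 30 81 30, linear index 15 * 12600 = 189000.
SUPPLEMENTARY_INDEX = 189000


def bmp_range_to_gb18030(cp: int) -> bytes:
    """Map a code point in U+9FA6..U+D7FF, assigned or not, to GB18030."""
    if not BMP_RANGE_FIRST <= cp <= BMP_RANGE_LAST:
        raise ValueError("code point outside the linear range in this sketch")
    return linear_to_four_bytes(BMP_RANGE_INDEX + cp - BMP_RANGE_FIRST)


def supplementary_to_gb18030(cp: int) -> bytes:
    """Map a code point in planes 1-16 to its GB18030 four-byte sequence."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    return linear_to_four_bytes(SUPPLEMENTARY_INDEX + cp - 0x10000)


if __name__ == "__main__":
    print(bmp_range_to_gb18030(0xA840).hex(" ").upper())
    print(supplementary_to_gb18030(0x10000).hex(" ").upper())
```

    Because the mapping depends only on the code point value, the result for
    A840 is the same whether or not a character is assigned there, which is the
    sense in which the code point mapping is closed.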
     
    > > the only way for the repertoire associated to GB18030 to be extended
    > > is that the repertoire in ISO/IEC 10646 is extended).
    >
    > False. What appears (in the future) in GB18030 will depend on what
    > the Chinese standards organization decides to put in it. That is up
    > to them, as it is their standard, and not an ISO standard.

    Again, I think Philippe is referring to code point repertoire. Of course it is
    always possible for China to desynchronize the character repertoire of GB18030
    with ISO/IEC 10646, but I would have thought that this would be highly unlikely,
    and as a general statement I don't think that it is misleading to say that
    GB18030's character repertoire is expected to extend in parallel with the
    extension of ISO/IEC 10646's character repertoire. The only caveat is that China
    is free to define GB18030 code points that correspond to ISO/IEC 10646 PUA code
    points as abstract characters that must be supported by GB18030-compliant
    software, which is indeed what it has done with the precomposed BrdaRten
    characters.

    Andrew


