Kenneth Whistler wrote:
> Frank,
>
> Yes. Absolutely it does. It is spelled out in the standard
> itself.
>
> GB 18030 <--> Unicode conversion is basically like a big
> UTF, with an enormous table for all the GBK part of the
> encoding, and a bunch of offset ranges to convert all the
> other code points.
I know. I have already implemented the Unicode BMP <-> GB18030 conversion
(in both directions) in Mozilla. The four-byte GB18030 to Unicode BMP
table takes only about 1488 bytes (see
http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvcn/gb180304bytes.ut
). The Unicode BMP to GB18030 four-byte table (not including the
two-byte part) takes only 1036 bytes (see
http://lxr.mozilla.org/seamonkey/source/intl/uconv/ucvcn/gb180304bytes.uf
). I got the original mapping from Sun Microsystems. Unfortunately, I
did not find a mapping table for code points beyond the BMP. You don't
need to explain the concept of GB18030 to me. My question is about the
detailed mapping information.
>
>
> > Unless you
> > can say YES and show me the specification for how to
> > map between
> > them, there is no way people can implement code set
> > conversion between GB18030 and Unicode.
>
> http://www-106.ibm.com/developerworks/library/u-china.html
>
> Markus Scherer's excellent documentation of GB 18030, with
> code snippets and pointer to a complete ICU implementation.
That paper itself does not specify any detailed mapping table.
I looked at
http://oss.software.ibm.com/cvs/icu/charset/data/xml/gb-18030-2000.xml .
It is interesting that the mapping between U+10000 and U+10FFFF was
checked in only 5 weeks ago, in version 1.3:
    <range uFirst="10000" uLast="10FFFF"
           bFirst="90 30 81 30" bLast="E3 32 9A 35"
           bMin="81 30 81 30" bMax="FE 39 FE 39"/>
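
To illustrate what that range element implies (this is not taken from the standard, ICU, or Mozilla): if a four-byte code is read as a mixed-radix number whose digits are bounded by bMin and bMax, the whole U+10000..U+10FFFF range reduces to a single offset. A minimal Python sketch under that assumption follows; all function names are hypothetical.

```python
# Per-byte limits taken from bMin="81 30 81 30" and bMax="FE 39 FE 39".
# Bytes 1 and 3 have 126 possible values, bytes 2 and 4 have 10.
BYTE_MIN = (0x81, 0x30, 0x81, 0x30)
BYTE_MAX = (0xFE, 0x39, 0xFE, 0x39)


def gb18030_linear(code: bytes) -> int:
    """Turn a four-byte GB18030 code into its linear (mixed-radix) index."""
    if len(code) != 4:
        raise ValueError("expected exactly four bytes")
    for b, lo, hi in zip(code, BYTE_MIN, BYTE_MAX):
        if not lo <= b <= hi:
            raise ValueError("byte out of range for a four-byte code")
    b1, b2, b3, b4 = code
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


def gb18030_from_linear(index: int) -> bytes:
    """Inverse of gb18030_linear."""
    index, d4 = divmod(index, 10)
    index, d3 = divmod(index, 126)
    d1, d2 = divmod(index, 10)
    if not 0 <= d1 < 126:
        raise ValueError("index outside the four-byte code space")
    return bytes([d1 + 0x81, d2 + 0x30, d3 + 0x81, d4 + 0x30])


# Linear index of bFirst="90 30 81 30", which corresponds to uFirst=U+10000.
SUPPLEMENTARY_BASE = gb18030_linear(bytes([0x90, 0x30, 0x81, 0x30]))


def supplementary_to_gb18030(code_point: int) -> bytes:
    """Map U+10000..U+10FFFF to four bytes by a plain offset."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in U+10000..U+10FFFF")
    return gb18030_from_linear(SUPPLEMENTARY_BASE + (code_point - 0x10000))


def gb18030_to_supplementary(code: bytes) -> int:
    """Map four bytes in 90 30 81 30 .. E3 32 9A 35 back to a code point."""
    code_point = gb18030_linear(code) - SUPPLEMENTARY_BASE + 0x10000
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("bytes are outside the supplementary range")
    return code_point
```

Under these assumptions U+10000 comes out as 90 30 81 30 and U+10FFFF as E3 32 9A 35, which agrees with bFirst and bLast in the range element. Whether the standard itself prescribes this arithmetic is exactly the open question here.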

Can anyone tell me where I can find an online version of the GB18030
standard (yes, I want the STANDARD itself, not someone's paper about
the standard)? Or could anyone tell me where to get a copy of the
standard?

Is the U+10000 - U+10FFFF mapping between Unicode and GB18030 specified
in the GB18030 standard itself? Can someone fax me that page? Thanks.

Looks like I beat ICU by checking in my mapping table on April 9 (to
Mozilla), 10 days before they checked in their first version of the
GB18030 XML mapping table :) I can probably still claim the first open
source project that supports GB18030 to Unicode conversion, although I
didn't do anything beyond the BMP ....
>
>
> >
> > The question is not whether they should define the
> > relationship, but whether they have defined it yet.
>
> They have.
>
> --Ken
This archive was generated by hypermail 2.1.2 : Thu Sep 27 2001 - 14:32:02 EDT