RE: Code charts

From: Vaintroub, Wladislav (Wladislav.Vaintroub@softwareag.com)
Date: Mon Apr 09 2001 - 07:00:29 EDT


Tomas McGuinness wrote:
> The problem is that the UCS-2 hex representation for, say,
> 0x003C (<) is not present in GB2312 [the same glyph, I mean].
> It's not in the mapping table chart I have, anyway. Does the
> Simplified Chinese character set have this character, or is my
> mapping table incorrect?

Hi Tomas,

I assume the mapping table you got is incomplete.
GB2312, as it is actually used as an encoding, must contain the US-ASCII
characters anyway; otherwise it would be impossible to create an HTML/XML
document in GB2312.
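
A quick way to check this is to run the character through an existing
converter. The sketch below assumes Python's built-in "gb2312" codec (which
implements the EUC form of GB2312) is an acceptable stand-in for whatever
converter you are using:

```python
# Sketch: confirm that '<' (U+003C) survives a round trip through GB2312.
# Assumes Python's built-in "gb2312" codec, which is the EUC form.
encoded = "<".encode("gb2312")       # ASCII characters stay single bytes
decoded = encoded.decode("gb2312")

print(encoded.hex(), decoded)
```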

A mapping table for "extended" GB2312 or GBK can be found at

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT

Also note that the mapping in
http://www.unicode.org/Public/MAPPINGS/EASTASIA/GB/GB2312.TXT is not given
in the "usual" EUC form.

To get an EUC mapping out of GB2312.TXT, one should add 0x8080 to each
GB2312 code point (that is, set the high bit of both bytes) AND add
single-byte (SBCS) mappings for the US-ASCII characters.
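
As a rough sketch of that conversion: the function name build_euc_table and
the sample lines are only illustrative, and the code assumes each data line
of GB2312.TXT has the form "0xGGGG<tab>0xUUUU<tab># comment".

```python
def build_euc_table(gb2312_lines):
    """Build an EUC byte-sequence -> Unicode character table.

    gb2312_lines: iterable of lines in the GB2312.TXT layout (assumed).
    """
    table = {}

    # Single-byte (SBCS) mappings for US-ASCII, which GB2312.TXT omits.
    for byte in range(0x80):
        table[bytes([byte])] = chr(byte)

    for line in gb2312_lines:
        # Drop comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        gb_hex, uni_hex = line.split()[:2]
        # Adding 0x8080 sets the high bit of both bytes (7-bit -> EUC).
        euc = int(gb_hex, 16) + 0x8080
        table[euc.to_bytes(2, "big")] = chr(int(uni_hex, 16))

    return table


# Two sample lines in the assumed GB2312.TXT layout.
sample = [
    "0x2121\t0x3000\t# IDEOGRAPHIC SPACE",
    "0x3021\t0x554A\t# <CJK>",
]
table = build_euc_table(sample)
```

With this table, the byte 0x3C maps to "<", and the two-byte sequence
0xA1 0xA1 maps to U+3000.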

Regards,
Wlad
 


