From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Apr 17 2003 - 18:15:14 EDT
IBM Code Page 837 is the DBCS portion of the Host Simplified
Chinese CCSIDs. It defines all the wide characters.
You don't actually *use* Code Page 837 by itself. It is used,
together with Code Page 836, to define the IBM Host merged
code page: IBM Code Page 935. Code Page 836 is the SBCS
portion for Simplified Chinese: basically (in EBCDIC), the
ASCII repertoire plus the yen (yuan) sign, the pound (currency)
sign, and the broken bar.
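
To illustrate what "merged" means in practice, here is a minimal Python
sketch that splits a host byte string into its SBCS and DBCS runs. It
rests on an assumption not spelled out above: that, as in IBM host mixed
encodings generally, byte 0x0E (shift-out) switches to the double-byte
set and byte 0x0F (shift-in) switches back. The function name
split_runs is hypothetical, and the sketch only separates the runs; it
does not decode them.

```python
SO, SI = 0x0E, 0x0F  # assumed: shift-out enters DBCS mode, shift-in returns to SBCS


def split_runs(data):
    """Split a mixed host byte string into ("sbcs" | "dbcs", bytes) runs.

    Assumes SO/SI never occur as data bytes inside a double-byte run.
    """
    runs = []
    mode = "sbcs"          # a mixed string starts in single-byte mode
    buf = bytearray()
    for b in data:
        if b in (SO, SI):
            if buf:        # close the run collected so far
                runs.append((mode, bytes(buf)))
                buf = bytearray()
            mode = "dbcs" if b == SO else "sbcs"
        else:
            buf.append(b)
    if buf:
        runs.append((mode, bytes(buf)))
    return runs
```

Bytes in an "sbcs" run would be looked up in the single-byte table
(Code Page 836) and each byte pair in a "dbcs" run in the double-byte
table (Code Page 837).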
So if you test against the ICU Code Page 935 mapping (or anybody
else's implementation of Code Page 935), you will pick up
*all* of the Chinese characters for the DBCS portion (Code Page 837).
> Partially answering my own question,
> ICU says (on this page:
> http://www-124.ibm.com/icu/charset/roundtripIndex.html#ibm-837_X100-1995)
> that IBM 837 is a bit more than 98% similar to IBM 935.
>
> The ICU .ucm file for 837 doesn't exist (as far as I can tell
> from looking at the file names in the ICU 'mappings' directory).
> So, would it be safe to conclude that the ICU file
> ibm-935_P110-1999.ucm is 98% of what I need?
> Again, I'm looking for a list of all characters in IBM 837.
If you want that list explicitly, just grab all the double-byte
characters out of the ICU mapping for IBM 935.
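
A rough Python sketch of that extraction follows. It assumes the .ucm
file lists its mappings between CHARMAP and END CHARMAP, one per line,
in the form "<Uxxxx> \xhh\xhh |n"; check the file you actually have
before relying on it. The function name double_byte_entries is
hypothetical, and the double-byte values in the sample are made up for
illustration, not real ibm-935 mappings.

```python
import re

# One mapping line, e.g.:  <U4E00> \x59\xBA |0
# Lines that map a sequence of several code points do not match and are skipped.
_LINE = re.compile(r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2}\+?)+)")
_BYTE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def double_byte_entries(ucm_lines):
    """Yield (code_point, byte_sequence) for each two-byte mapping in CHARMAP."""
    in_charmap = False
    for raw in ucm_lines:
        line = raw.strip()
        if line == "CHARMAP":
            in_charmap = True
            continue
        if line == "END CHARMAP":
            break
        if not in_charmap or not line or line.startswith("#"):
            continue
        m = _LINE.match(line)
        if not m:
            continue
        data = bytes(int(h, 16) for h in _BYTE.findall(m.group(2)))
        if len(data) == 2:  # keep only the DBCS part
            yield int(m.group(1), 16), data


# Tiny stand-in for a real file; the two-byte values are illustrative only.
sample = r"""
<uconv_class> "EBCDIC_STATEFUL"
CHARMAP
<U0041> \xC1 |0
<U4E00> \x59\xBA |0
<U4E01> \x4B\xE9 |0
END CHARMAP
""".splitlines()

for cp, data in double_byte_entries(sample):
    print(f"U+{cp:04X} {data.hex().upper()}")
```

For the real list, pass the lines of ibm-935_P110-1999.ucm instead of
the sample.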
--Ken