On 5 Oct 2001, Lars Marius Garshol wrote:
> I've received data encoded in ISO 2022-JP that I am unable to figure
> out how to map to Unicode. The characters in question do not appear in
> the old JIS0208.TXT, and I can't find them in UniHan.txt either.
> These characters are the problem:
>
> (7A22, 7C22, 7964 and 7B64)
Assuming what I wrote below is correct (and I didn't make a mistake
in the arithmetic), they are:
U+605D, U+91DE, U+5953, U+FA21
> Does anyone know what characters these are, and how to map them to
> Unicode? Are they part of some vendor extension to JIS 0208? If so,
I'm not sure, but it seems they're part of the NEC Kanji character
set (ref. Ken Lunde, CJKV Information Processing, p. 592).
According to CJKV Information Processing, NEC Kanji adds 360 kanji and
14 non-kanji characters from the IBM Japanese character sets in rows
89-92. IBM Japanese is listed as 'kIBMJapan' in UniHan.txt.
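
As a quick check that the four codes really fall in rows 89-92, here
is a minimal Python sketch of the row/cell arithmetic. It assumes the
codes are two-byte, 7-bit JIS values as they appear in ISO-2022-JP;
the function name jis_to_kuten is made up for illustration.

```python
from typing import Tuple


def jis_to_kuten(code: int) -> Tuple[int, int]:
    """Split a two-byte 7-bit JIS code into (row, cell).

    Each byte lies in 0x21..0x7E; subtracting 0x20 from each gives the
    1-based row and cell numbers (1..94).
    """
    hi, lo = code >> 8, code & 0xFF
    if not (0x21 <= hi <= 0x7E and 0x21 <= lo <= 0x7E):
        raise ValueError(f"not a two-byte 7-bit JIS code: {code:#06x}")
    return hi - 0x20, lo - 0x20


# The four codes from the original question.
for code in (0x7A22, 0x7C22, 0x7964, 0x7B64):
    row, cell = jis_to_kuten(code)
    print(f"{code:04X} -> row {row}, cell {cell}")
```

This gives rows 90, 92, 89 and 91, which is consistent with the
89-92 range cited above.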
> does anyone know of a conversion table for that extension?
Provided that the above is the case, I guess you can fairly easily
construct a conversion table from UniHan.txt by comparing the table on
p. 585 of CJKV I.P. with the table on p. 594 of the same book.
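
A rough Python sketch of the first half of that job, pulling the
kIBMJapan entries out of UniHan.txt, might look like the following.
It assumes the file uses tab-separated lines of the form
"U+XXXX<TAB>field<TAB>value" with "#" comment lines; the function name
load_unihan_field is hypothetical. The IBM-to-NEC correspondence
itself still has to come from the two tables in the book and is not
covered here.

```python
from typing import Dict, Iterable


def load_unihan_field(lines: Iterable[str],
                      field: str = "kIBMJapan") -> Dict[str, int]:
    """Map each value of `field` (e.g. an IBM Japanese code as a hex
    string) to its Unicode code point.

    Assumed line format: U+XXXX <TAB> field <TAB> value
    """
    table: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) < 3 or parts[1] != field:
            continue  # some other field, or a malformed line
        if not parts[0].startswith("U+"):
            continue  # not a code point entry
        code_point = int(parts[0][2:], 16)
        table[parts[2].upper()] = code_point
    return table
```

It can be fed an open file object, for example
load_unihan_field(open("UniHan.txt", encoding="utf-8")), and the
resulting dictionary is keyed by the IBM Japanese code.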
Jungshik Shin