If you want to duplicate the IE mappings, you could write a quick little
program to see what the COM APIs map the PUA SJIS characters to.
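
As a baseline for such a comparison, here is a minimal Python sketch of the user-defined range from the quoted message below (0xF040..0xF9FC to U+E000..U+E757). It does not call any COM API. It assumes the mapping is sequential in Shift-JIS order, which the range sizes are consistent with (10 lead bytes of 188 trail bytes each gives 1880 = 0x758 code points). The function name sjis_udc_to_pua is made up for illustration.

```python
PUA_BASE = 0xE000
CELLS_PER_ROW = 188  # valid trail bytes: 0x40..0x7E (63) plus 0x80..0xFC (125)


def sjis_udc_to_pua(code: int) -> int:
    """Map a Shift-JIS user-defined code (0xF040..0xF9FC) to a PUA code point.

    Assumption: the mapping is linear in Shift-JIS order, skipping the
    invalid trail bytes 0x7F and 0xFD..0xFF.
    """
    lead, trail = code >> 8, code & 0xFF
    if not 0xF0 <= lead <= 0xF9:
        raise ValueError(f"lead byte 0x{lead:02X} is outside 0xF0..0xF9")
    if not (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFC):
        raise ValueError(f"trail byte 0x{trail:02X} is not a valid trail byte")
    # Trail bytes above 0x7F are shifted down by one to close the 0x7F gap.
    cell = trail - 0x40 if trail < 0x80 else trail - 0x41
    return PUA_BASE + CELLS_PER_ROW * (lead - 0xF0) + cell


if __name__ == "__main__":
    for code in (0xF040, 0xF07E, 0xF080, 0xF9FC):
        print(f"0x{code:04X} -> U+{sjis_udc_to_pua(code):04X}")
```

Any code whose value from the COM APIs differs from this table would be a place where the assumed sequential mapping does not hold.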
Mark
—————
Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
----- Original Message -----
From: "Lars Marius Garshol" <larsga@garshol.priv.no>
To: <unicode@unicode.org>
Sent: Friday, January 18, 2002 06:51
Subject: Re: Fun with UDCs in Shift-JIS
>
> * David Hopwood
> |
> | Presumably the "NT 4.0" mapping at
> | <http://www.autumn.org/etc/unidif.html>
> | (in Japanese, but the table is readable by non-Japanese-speakers).
> |
> | That mapping is a superset of CP932
> | (<ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT>),
> | with additional mappings from 0xF040..0xF9FC to U+E000..E757, from
> | 0x80 to U+0080 (why?), and from the other 4 reserved single-byte codes
> | to U+F8F0..F8F3.
>
> Hmmmm. It seems like this table has a little more information
> (provided I interpret it correctly).
>
> | I wouldn't know, but the private use codes can't be assumed to mean
> | anything in particular, regardless of what charset they start out as.
> | Such pages are broken, and should be using NCRs or images instead.
>
> OK, so it seems like these characters have no fixed interpretation.
>
> I did a little test, by generating a Shift-JIS test page and viewing
> it in MSIE. It turns out that MSIE supports only the 0xFA40 - 0xFC4B
> range, and not the rest of the 0xF0F0 - 0xFCFC range, which means that
> the tables I've already been referred to contain all the information I
> need.
>
> An interesting question is whether this range is particular to
> Shift-JIS, or whether these characters should be considered to have
> been added to JIS X 0208, so that EUC-JP and ISO-2022-JP can also use
> them. Does anyone know?
>
> --Lars M.
>
>
>
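
A test page like the one described in the quoted message could be generated along the following lines. This is only a sketch, not the script actually used: the function names, the page layout, and the default range 0xFA40..0xFC4B (taken from the message) are illustrative assumptions.

```python
def sjis_double_bytes(first: int, last: int):
    """Yield double-byte Shift-JIS codes from first to last, inclusive.

    Only valid trail bytes are produced: 0x40..0x7E and 0x80..0xFC.
    """
    trails = list(range(0x40, 0x7F)) + list(range(0x80, 0xFD))
    for lead in range(first >> 8, (last >> 8) + 1):
        for trail in trails:
            code = (lead << 8) | trail
            if first <= code <= last:
                yield code


def build_test_page(first: int = 0xFA40, last: int = 0xFC4B) -> bytes:
    """Return raw HTML bytes with one line per code: hex label, then the character."""
    head = (b'<html><head><meta http-equiv="Content-Type" '
            b'content="text/html; charset=Shift_JIS"></head><body><pre>\n')
    rows = []
    for code in sjis_double_bytes(first, last):
        # The label is plain ASCII; the two raw bytes are left for the browser to decode.
        rows.append(b"%04X " % code + bytes([code >> 8, code & 0xFF]) + b"\n")
    return head + b"".join(rows) + b"</pre></body></html>\n"


if __name__ == "__main__":
    with open("sjis_test.html", "wb") as f:
        f.write(build_test_page())
```

The page is written as bytes rather than encoded text so that the result does not depend on how any particular codec treats these codes.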