From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu May 29 2003 - 06:56:58 EDT
From: "Kazuhiro Kazama" <kazama@ingrid.org>
> From: Jane Liu <xjliu_ca@yahoo.com>
> Subject: Shift-JIS/Unicode mapping in JAVA
> Date: Wed, 28 May 2003 12:36:39 -0700 (PDT)
> Message-ID: <20030528193639.92471.qmail@web10707.mail.yahoo.com>
> > I am running a JAVA program on Japanese Windows 2000 system, looking
> > at the Unicode conversion of the following four characters from
> > Shift-JIS encoding (MS-CP932) in both JRE 1.3.1 and JRE 1.4.1, and
> > noticed some interesting changes:
>
> I guess that you used the charset name "Shift_JIS". Would you try to
> use "Windows-31J"?
I think that the canonical name of this encoding should be used, as "Windows-31J" is very uncommon.
So it seems better to designate the encoding with "CP932" or "windows-932", which Windows and Internet Explorer also prefer (as probably do many other browsers).
It is true that MS-CP932 is NOT Shift_JIS, even though it is mostly compatible with it. It was created a long time ago as an extension of an *old* version of the JIS standard, and it includes characters that were later integrated into Shift_JIS. The current version of Shift_JIS now has more characters than Microsoft codepage 932. However, MS-CP932 also includes some characters that are defined in all Microsoft codepages and are still missing from Shift_JIS. These will not be added, now that Shift_JIS has been superseded by a newer version that includes support for all UniHan and Unicode/ISO10646 characters.
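To see which mapping a given JRE actually applies, a small test along the following lines may help. It is only a sketch: the class name CharsetCompare and the helper decodeToHex are made up for illustration, and the sample byte pair 0x81 0x60 is my own choice, not necessarily one of the four characters from the original question. The sketch assumes the runtime ships both charsets, and it prints the canonical name the JRE reports for each alias so that the naming question can be checked at the same time.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class CharsetCompare {

    // Decode the bytes with the named charset and return the resulting
    // UTF-16 code units as "U+XXXX" tokens separated by spaces.
    static String decodeToHex(byte[] bytes, String charsetName) {
        String s = Charset.forName(charsetName)
                          .decode(ByteBuffer.wrap(bytes))
                          .toString();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample double-byte sequence (an assumption, chosen for the demo).
        byte[] sample = { (byte) 0x81, (byte) 0x60 };

        for (String name : new String[] { "Shift_JIS", "windows-31j" }) {
            if (Charset.isSupported(name)) {
                // name() returns the canonical name this JRE uses for the alias.
                System.out.println(name
                        + " (canonical: " + Charset.forName(name).name() + ") -> "
                        + decodeToHex(sample, name));
            } else {
                System.out.println(name + " is not supported by this runtime");
            }
        }
    }
}
```

If the two lines of output show different code points for the same bytes, the program is being affected by the Shift_JIS versus CP932 mapping difference, and the charset name passed to the decoder is what needs to change.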