From: Kazuhiro Kazama (kazama@ingrid.org)
Date: Wed May 28 2003 - 23:10:17 EDT
From: Jane Liu <xjliu_ca@yahoo.com>
Subject: Shift-JIS/Unicode mapping in JAVA
Date: Wed, 28 May 2003 12:36:39 -0700 (PDT)
Message-ID: <20030528193639.92471.qmail@web10707.mail.yahoo.com>
> I am running a JAVA program on Japanese Windows 2000 system, looking
> at the Unicode conversion of the following four characters from
> Shift-JIS encoding (MS-CP932) in both JRE 1.3.1 and JRE 1.4.1, and
> noticed some interesting changes:
I guess that you used the charset name "Shift_JIS". Could you try
"Windows-31J" instead?
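
A quick way to compare the two names is to decode the same bytes with
each and print the resulting code points. The following is only a
minimal sketch: the class name SjisDecodeDemo, its helper methods, and
the sample byte pair 0x81 0x60 (the wave dash position) are my own
illustrative choices, not the four characters from the original
question. It assumes a current JDK in which both charsets are
installed.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; the names here are illustrative only.
class SjisDecodeDemo {
    // Decode raw bytes with the named charset and return the resulting String.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    // Format each UTF-16 code unit of s as U+XXXX, separated by spaces.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample input: the double-byte sequence 0x81 0x60.
        byte[] sample = { (byte) 0x81, (byte) 0x60 };
        for (String name : new String[] { "Shift_JIS", "Windows-31J" }) {
            System.out.println(name + " -> " + toCodePoints(decode(sample, name)));
        }
    }
}
```

On a JRE with the separated aliases, I would expect "Shift_JIS" to
give U+301C and "Windows-31J" to give U+FF5E for this byte pair. If
both names print the same code point, the JRE is probably treating
them as the same encoding.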
Two Shift-JIS variations are registered in the IANA registry:
"Shift_JIS" and "Windows-31J". The former is for JIS X 0208 and the
latter is for Microsoft's CP932. "Windows-31J" was proposed by one of
Microsoft's Japanese engineers.
In JDK 1.1 through 1.1.7, "Shift_JIS" was an alias for the JIS X 0208
based encoding. From JDK 1.1.8 through J2SE 1.4 it was re-aliased to
CP932 ("Windows-31J" was also an alias for CP932). This caused two
problems: on J2EE platforms we could not select the right character
encoding, and the mappings differed between the JDK and Xerces (Xerces
has its own alias table, which maps "Shift_JIS" to JIS X 0208).
So we requested the following alias change and it was accepted in J2SE
1.4.1:
Shift_JIS -> JIS X 0208's shift-jis encoding.
Windows-31J -> Microsoft's CP932
See the J2SE 1.4.1 change notes:
http://java.sun.com/j2se/1.4.1/changes.html#Shift-JIS
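
To see which behavior a given JRE has, one can ask java.nio.charset
how it resolves each name. This is only a rough sketch: the class name
CharsetAliasCheck and its helpers are hypothetical, and it assumes a
JDK with java.nio.charset (J2SE 1.4 or later; the loop syntax below
needs a newer compiler).

```java
import java.nio.charset.Charset;

// Hypothetical helper class; the names here are illustrative only.
class CharsetAliasCheck {
    // Return the canonical name this JRE resolves the requested name to.
    static String canonicalName(String requested) {
        return Charset.forName(requested).name();
    }

    // True when both names resolve to the same charset on this JRE.
    static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        for (String name : new String[] { "Shift_JIS", "Windows-31J" }) {
            Charset cs = Charset.forName(name);
            // Print the canonical name and the aliases known to this JRE.
            System.out.println(name + " -> " + cs.name() + " " + cs.aliases());
        }
        System.out.println("same charset: "
                + sameCharset("Shift_JIS", "Windows-31J"));
    }
}
```

If the two names resolve to different charsets, the JRE follows the
alias assignment described above.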
Kazuhiro Kazama (kazama@ingrid.org) NTT Network Innovation Laboratories