From: Katsuhiko Momoi (momoi@alumni.indiana.edu)
Date: Sat Apr 02 2005 - 03:06:26 CST
Markus Scherer wrote:
>Charsets are a mess.
>
>
Agreed.
>Japanese charsets are particularly notorious, see "XML Japanese
>Profile" http://www.w3.org/TR/japanese-xml/
>
>
Thanks for the info. We checked, and it turns out that we mistakenly fed
one of the lookalike characters, \uFF0D (FULLWIDTH HYPHEN-MINUS) rather
than \u2212 (MINUS SIGN), to setContent with the target encoding,
ISO-2022-JP.
So, please disregard my query.
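
For anyone who runs into the same lookalike mix-up, here is a minimal
sketch of how one might check which of the two characters a given
converter accepts before handing text to setContent. The class name
LookalikeCheck and both helper methods are hypothetical, made up for this
illustration; only the java.nio.charset calls are standard. It assumes
the ISO-2022-JP charset is installed in the running JRE, and it makes no
claim about what any particular Java version will report.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of any library discussed in this thread.
public class LookalikeCheck {
    public static final char MINUS_SIGN = '\u2212';
    public static final char FULLWIDTH_HYPHEN_MINUS = '\uFF0D';

    /** Returns true if the named charset is installed and its encoder accepts c. */
    public static boolean canEncode(String charsetName, char c) {
        if (!Charset.isSupported(charsetName)) {
            return false; // charset not available in this JRE
        }
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(c);
    }

    /**
     * Replaces U+FF0D with U+2212. This is an assumed normalization policy
     * for illustration; whether it is appropriate depends on the data.
     */
    public static String normalizeMinus(String s) {
        return s.replace(FULLWIDTH_HYPHEN_MINUS, MINUS_SIGN);
    }

    public static void main(String[] args) {
        String target = "ISO-2022-JP";
        char[] candidates = { MINUS_SIGN, FULLWIDTH_HYPHEN_MINUS };
        for (char c : candidates) {
            // Report what this JRE's converter says; results may vary by version.
            System.out.printf("U+%04X encodable in %s: %b%n",
                    (int) c, target, canEncode(target, c));
        }
    }
}
```

Running the check against the actual target JRE, rather than relying on a
table, seems the safer route, since converter mappings for these
characters are not well documented.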
>ISO-2022-* are even worse than others because no one publishes
>comprehensive documentation for how they convert for these.
>
>Evidently, in this case the Java 1.4 and 1.5 converters are different.
>
>
As stated above, this was our error and not the fault of Java's converters.
>On Apr 1, 2005 12:24 AM, Katsuhiko Momoi <momoi@alumni.indiana.edu> wrote:
>
>
>>Using Java's native2ascii conversion utility -- I used the one that came
>>with SDK 1.5 for Windows, \u2212 converts to ISO-2022-JP. ...
>>... Java fails to convert \u2212 to ISO-2022-JP. (JDK version 1.4.x.)
>>
>>
>
>
>
>>Has anyone experienced this problem? I would appreciate a workaround or
>>a solution.
>>
>>
>
>Use UTF-8. Seriously.
>
>
Indeed. If only we could change national mail encoding (de facto)
standards overnight!
- Kat
-- Katsuhiko Momoi