Philippe de Rochambeau wrote:
> On the other hand, if I store the previous "go" character plus an unusual
> CJK ideogram whose Unicode equivalent is \u5439 (E5 90 B9 in UTF-8) in the
> DB and retrieve the data, JRun 3.1 will only display the first character
> in my form's textarea, plus a few invisible characters, and the database
> will contain the following hex values:
>
> E8 AA 9E E5 3F B9 20 20 20 20 20 20 0D 0A 0A
>
> As you can see, "go" is still there, but the following character (E5 3F B9)
> is not \u5439 (E5 90 B9). I cannot figure out how to fix this problem.
>
> Any help with this problem would be much appreciated.

I see what the problem is. As usual, it's all the fault of Bill Gate$. :-)

If you interpret <E5, 90, B9> according to Windows-1252, you see that E5 is
"å", B9 is "¹", but 90 is an unassigned slot! Unassigned characters are
normally turned into question marks, and the code for "?" is (guess what) 3F...

<E8, AA, 9E> survives only by chance, because all three bytes are valid
Windows-1252 characters: "è", "ª", and "ž", respectively.
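
The substitution can be reproduced in a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it assumes the JVM ships a "windows-1252" charset, which is usual but not guaranteed. It decodes the UTF-8 bytes as Windows-1252 and encodes the result back, which is roughly what seems to happen somewhere between the form and the database.

```java
import java.nio.charset.Charset;

public class Cp1252RoundTrip {
    // Misread the bytes as windows-1252, then write the String back out
    // in the same charset. Unassigned slots such as 0x90 do not survive.
    static byte[] roundTrip(byte[] utf8Bytes) {
        Charset cp1252 = Charset.forName("windows-1252");
        String misread = new String(utf8Bytes, cp1252);
        return misread.getBytes(cp1252);
    }

    // Format bytes as space-separated upper-case hex, e.g. "E5 90 B9".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] go = {(byte) 0xE8, (byte) 0xAA, (byte) 0x9E};    // "go" in UTF-8
        byte[] u5439 = {(byte) 0xE5, (byte) 0x90, (byte) 0xB9}; // \u5439 in UTF-8
        System.out.println(hex(roundTrip(go)));    // E8 AA 9E (unchanged)
        System.out.println(hex(roundTrip(u5439))); // E5 3F B9 (0x90 became '?')
    }
}
```

The first sequence comes back intact only because each of its bytes happens to be an assigned Windows-1252 character; the second loses its middle byte.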

I guess that the problem starts when you try to fool the system into
thinking that the text is ISO 8859-1:

    byte[] byt = (newQfLibelleArray[i]).getBytes( "ISO8859_1" );
    String tempUtf16 = new String( byt );

But, sorry, I can't help with a fix, because I don't know the Java APIs well
enough.
Can't you do something like <.getBytes("UTF-8")>? Or, even better, doesn't
(newQfLibelleArray[i]) have a method to return a <String> object directly?
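
For what it is worth, here is a rough sketch of the kind of thing I mean, untested against JRun. It assumes that (newQfLibelleArray[i]) is a String in which each char holds one raw UTF-8 byte, which is what the ISO 8859-1 trick suggests. The one-argument <new String(byt)> decodes with the platform's default encoding (Windows-1252 on a typical Western Windows box), so the sketch names UTF-8 explicitly instead. The class and helper names are hypothetical, and StandardCharsets needs Java 7 or later; on an older JVM the overloads that take the charset name "UTF-8" as a string should do the same job.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Decode {
    // Hypothetical helper: recover the real text from a String whose chars
    // are raw UTF-8 bytes that were misread as ISO 8859-1.
    static String fromLatin1Misread(String misread) {
        // ISO 8859-1 maps chars 0x00-0xFF one-to-one onto bytes,
        // so this step gets the original bytes back without loss.
        byte[] raw = misread.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes as what they really are.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // <E8, AA, 9E> and <E5, 90, B9>, one char per byte.
        String misread = "\u00E8\u00AA\u009E\u00E5\u0090\u00B9";
        String text = fromLatin1Misread(misread);
        for (int i = 0; i < text.length(); i++) {
            System.out.printf("U+%04X%n", (int) text.charAt(i));
        }
        // prints U+8A9E, then U+5439
    }
}
```

Whether the database driver then stores the result correctly is a separate question that this does not address.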
_ Marco