From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Fri Mar 14 2003 - 12:40:40 EST
Nooo - Java's old "UTF" functions do not process UTF-8! They are there for String serialization, a
Java-internal format.
Use the Java Reader/Writer classes instead of these old ones!
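As a rough sketch of what the Reader route looks like: wrap the InputStream in an InputStreamReader with an explicit UTF-8 charset and read characters from that. The class and method names here (Utf8ReaderSketch, readAll) are made up for illustration, and StandardCharsets assumes a newer JDK than 1.4; on older releases you would pass the charset name "UTF-8" as a String instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Utf8ReaderSketch {
    // Decode a byte stream that holds standard UTF-8 into a Java String.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 0xC3 0xA9 is the standard UTF-8 encoding of U+00E9.
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};
        String s = readAll(new ByteArrayInputStream(utf8));
        System.out.println(s.length());                        // prints 1
        System.out.println(Integer.toHexString(s.charAt(0)));  // prints e9
    }
}
```

The same pattern should apply to any InputStream, for example one obtained from a database result, as long as the bytes really are UTF-8.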
See the Java tutorials on Internationalization:
http://java.sun.com/docs/books/tutorial/i18n/text/convertintro.html
http://java.sun.com/docs/books/tutorial/i18n/text/index.html
http://java.sun.com/docs/books/tutorial/i18n/index.html
See the descriptions of readUTF() functions (highlighting with ***):
http://java.sun.com/j2se/1.4/docs/api/java/io/DataInputStream.html#readUTF(java.io.DataInput)
"Reads from the stream in a representation of a Unicode character string encoded in ***Java modified
UTF-8*** format; this string of characters is then returned as a String. The details of the
***modified UTF-8*** representation are exactly the same as for the readUTF method of DataInput."
http://java.sun.com/j2se/1.4/docs/api/java/io/DataInput.html#readUTF()
Java's *modified* UTF-8 in its "UTF" functions resembles CESU-8 (each supplementary character is written as two
three-byte surrogate sequences), and writes U+0000 with two bytes (0xC0 0x80) instead of one, as far as I remember.
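To make the difference visible, here is a small sketch that compares the bytes from DataOutputStream.writeUTF() with the bytes from a standard UTF-8 conversion. The class and helper names (ModifiedUtf8Sketch, writeUtfBytes, standardUtf8Bytes, hex) are invented for this example, and StandardCharsets again assumes a newer JDK. Note that writeUTF() also puts a two-byte length in front of the encoded string.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ModifiedUtf8Sketch {
    // Bytes produced by writeUTF: a 2-byte length prefix followed by
    // the string in Java's modified UTF-8.
    static byte[] writeUtfBytes(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeUTF(s);
        }
        return bos.toByteArray();
    }

    // Bytes produced by a standard UTF-8 conversion.
    static byte[] standardUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String nul = "\u0000";                  // U+0000
        String supplementary = "\uD801\uDC00";  // U+10400 as a surrogate pair

        System.out.println(hex(standardUtf8Bytes(nul)));            // 00
        System.out.println(hex(writeUtfBytes(nul)));                // 00 02 C0 80
        System.out.println(hex(standardUtf8Bytes(supplementary)));  // F0 90 90 80
        System.out.println(hex(writeUtfBytes(supplementary)));      // 00 06 ED A0 81 ED B0 80
    }
}
```

For plain ASCII text without U+0000 the two encodings agree apart from the length prefix, which is probably why the mismatch often goes unnoticed.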
markus
Yung-Fong Tang wrote:
> what is rsResult? Blob?
> you probably need to use
>
> BufferedInputStream
>
> and
>
> DataInputStream
>
> to pipe the InputStream
> and use readChar or readUTF in the InputStream interface instead.
> See http://www.webdeveloper.com/java/java_jj_read_write.html and
> http://java.sun.com/j2se/1.4/docs/api/java/io/DataInputStream.html#readUTF()
> for more info.
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.