Mark Davis scripsit:
> 2. For serializing strings internally, they use a byte format which is the same as
> UTF-8, except that they use two bytes for null (<C0, 80>). The standard algorithm
> for converting UTF-8 to Unicode will convert this correctly back to a null,
> unless special checks are made for shortest forms.
IMHO, the only real blunder the Javasoft folks made in this respect
was in the names of the methods DataInput.readUTF() and DataOutput.writeUTF(),
which suggest that these are general-purpose UTF-8 transput methods.
As I have said, they are meant to transput Java Strings in binary
contexts, include a length value, and should have been called
DataInput.readString() and DataOutput.writeString().
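
A minimal sketch of the behaviour under discussion, assuming a current JDK. The class name ModifiedUtf8Demo and its helper methods are made up for illustration; only writeUTF(), readUTF() and the java.nio.charset calls are real API. It serializes a String containing U+0000 and shows the two-byte length prefix followed by <C0, 80> for the null, then checks that a decoder which enforces shortest forms refuses the same payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of any library.
class ModifiedUtf8Demo {
    // Serialize a String with DataOutput.writeUTF():
    // a 2-byte big-endian byte count, then the modified UTF-8 bytes.
    static byte[] encode(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        return buf.toByteArray();
    }

    // Read the String back with DataInput.readUTF().
    static String decode(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        }
    }

    // Return true if a strict UTF-8 decoder accepts the bytes.
    static boolean strictUtf8Accepts(byte[] payload) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(payload));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Format bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = encode("a\0b");
        System.out.println(hex(bytes));                     // 00 04 61 c0 80 62
        System.out.println(decode(bytes).equals("a\0b"));   // true
        // Drop the 2-byte length prefix before handing the rest to a UTF-8 decoder.
        byte[] payload = Arrays.copyOfRange(bytes, 2, bytes.length);
        System.out.println(strictUtf8Accepts(payload));     // false
    }
}
```

The length prefix is why the output is not usable as a general-purpose UTF-8 stream even when the String contains no nulls: a reader expecting plain UTF-8 would see two stray bytes at the front.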
-- 
John Cowan   http://www.ccil.org/~cowan   cowan@ccil.org
You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)