Re: Strange UTF-8 in Java

From: John Cowan (cowan@locke.ccil.org)
Date: Thu Oct 01 1998 - 13:26:27 EDT


Elliotte Rusty Harold wrote:

> Actually the length is a two byte, big endian, unsigned integer; not a four
> byte integer.

You're right, and I'm shocked. It turns out that (although this
is not documented) an attempt to writeUTF() a String requiring
more than 65535 bytes in modified UTF-8 will throw a
UTFDataFormatException. So when trying to transput large
Strings to binary files with this mechanism, a higher-level
protocol may be needed to fragment and reassemble them.
The Java Language Specification is distressingly silent about
this point.
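
One way such a higher-level protocol might look is sketched below. This is
only an illustration under stated assumptions, not anything the class
library provides: the names ChunkedUTF, writeLongUTF and readLongUTF are
hypothetical, and the leading int chunk count is my own framing choice.
It relies on the fact that modified UTF-8 encodes each Java char
separately in at most 3 bytes, so a chunk of 65535 / 3 = 21845 chars can
never exceed the limit, and splitting between any two chars is safe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper: fragments a long String across several writeUTF() calls.
class ChunkedUTF {
    // Each char needs at most 3 bytes in modified UTF-8, so this many
    // chars always fit within the 65535-byte limit of one writeUTF().
    static final int MAX_CHARS = 65535 / 3;

    // Writes an int chunk count (an assumed framing), then each chunk.
    static void writeLongUTF(DataOutputStream out, String s) throws IOException {
        int chunks = (s.length() + MAX_CHARS - 1) / MAX_CHARS;
        out.writeInt(chunks);
        for (int i = 0; i < s.length(); i += MAX_CHARS) {
            int end = Math.min(s.length(), i + MAX_CHARS);
            out.writeUTF(s.substring(i, end));
        }
    }

    // Reads the chunk count, then reassembles the chunks in order.
    static String readLongUTF(DataInputStream in) throws IOException {
        int chunks = in.readInt();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < chunks; i++) {
            sb.append(in.readUTF());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Build a String far too long for a single writeUTF() call.
        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            big.append('\u4e2d'); // 3 bytes per char in modified UTF-8
        }
        String original = big.toString();

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        writeLongUTF(out, original);
        out.flush();

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
        System.out.println(original.equals(readLongUTF(in)));
    }
}
```

Sizing chunks by the 3-byte worst case wastes some of each chunk's
capacity for mostly-ASCII text, but it avoids having to compute the
encoded length of every candidate chunk first.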

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


