Re: Strange UTF-8 in Java

From: John Cowan (cowan@locke.ccil.org)
Date: Wed Sep 30 1998 - 10:59:45 EDT


Doug Ewell wrote:

> The only time a UTF-8 stream would contain an embedded 0x00 would be
> when the underlying Unicode text contains 0x0000. Why this perfectly
> normal and appropriate use of a NUL would have to be concealed in an
> escape sequence is beyond me.

Again, it is not the representation of streams that is at issue,
but of in-memory strings. Using the mutated UTF-8 representation
allows the use of the traditional C representation, in which 0x00
represents end-of-string.
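
To make the difference concrete, here is a minimal sketch, assuming that
java.io.DataOutputStream.writeUTF is an acceptable stand-in for the
in-memory form under discussion. It is documented to emit what the Java
documentation calls "modified UTF-8", preceded by a two-byte length. The
class name ModifiedUtf8Demo and its helper methods are hypothetical,
introduced only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Hypothetical helper: encode via DataOutputStream.writeUTF, which writes
    // a 2-byte length followed by modified UTF-8; the length prefix is dropped.
    public static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Hypothetical helper: render bytes as space-separated lowercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\u0000b"; // 'a', U+0000, 'b'

        // Standard UTF-8: U+0000 becomes a single 0x00 byte.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 61 00 62

        // Modified UTF-8: U+0000 becomes the two-byte sequence 0xC0 0x80.
        System.out.println(hex(modifiedUtf8(s)));                    // 61 c0 80 62
    }
}
```

Because the second form never contains a 0x00 byte, a 0x00 can be
appended as a terminator and C-style routines that scan for it will not
stop early at an embedded U+0000.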

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


