OK, I'm confused. My reading of the UTF-8 spec leads me to believe that
UTF-8 characters are encoded in a maximum of 4 bytes. Characters
from planes 0x1 through 0xF should always be handled as surrogates.
Yet, I've seen UTF-8 explanations that show planes 0x1 through 0xF encoded
as 5- and 6-byte sequences.
Are these 5- and 6-byte encodings valid UTF-8? ...or... do they fall under
the category of "Be generous in what you accept."?
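To make the byte counts concrete, here is a minimal Python sketch. It assumes the 5 and 6 byte forms come from the older 31-bit definition of UTF-8 (the ranges given in RFC 2279); the function name utf8_length is made up for illustration and is not from any spec or library.

```python
def utf8_length(code_point: int) -> int:
    """Bytes needed for a code point under the 31-bit UTF-8 scheme.

    Assumption: the upper limits below are the RFC 2279 ranges for
    1- through 6-byte sequences.
    """
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for length, limit in enumerate(limits, start=1):
        if code_point <= limit:
            return length
    raise ValueError("code point does not fit in 31 bits")


# First code point of plane 0x1 and last code point of plane 0x10.
print(utf8_length(0x10000), utf8_length(0x10FFFF))

# Cross-check against Python's own UTF-8 encoder for a plane 0x1 character.
print(len(chr(0x10000).encode("utf-8")))
```

Under these assumed ranges, everything up to 0x10FFFF fits in 4 bytes, and the 5 and 6 byte forms only start at 0x200000, so the sketch does not by itself explain encodings that use 5 or 6 bytes for planes 0x1 through 0xF.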
Sean O'Leary
oleary@awii.com
Automated Wagering International
973-594-5077
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:51 EDT