Re: PDUTR #26 posted

From: DougEwell2@cs.com
Date: Mon Sep 17 2001 - 11:36:51 EDT


In a message dated 2001-09-17 4:25:47 Pacific Daylight Time,
michka@trigeminal.com writes:

>> How should a UTF-8 application behave if it accidentally receives
>> a CESU-8 surrogate sequence? How does an application which
>> relies on CESU-8 binary sorting behave if it accidentally receives an
>> UTF-8 4-byte sequence?
>
> Both should error out. In practice, I wonder how common it would be and
> because of this how many people will actually do THAT in their parsers. I
> expect lots of non-compliant parsers.

If Michka is referring to non-compliant CESU-8 parsers, I really wouldn't
care much because CESU-8 is supposed to live in its own little private world.
But if people start compromising their UTF-8 parsers to accommodate CESU-8
"adaptively," it would be a great blow to UTF-8. It would essentially undo
all the tightening-up that was accomplished by the Corrigendum, and it would
revive all the old Bruce Schneier-style skepticism about the "security" of
Unicode.
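
As a rough illustration of the "error out" behavior discussed above, here is
a minimal sketch in Python. The function names are hypothetical and come from
neither PDUTR #26 nor any library. It assumes only the well-known byte
patterns: an encoded surrogate (U+D800..U+DFFF) begins with byte 0xED followed
by a byte in 0xA0..0xBF, and a UTF-8 4-byte sequence begins with a lead byte
in 0xF0..0xF4. The CESU-8 check is deliberately partial and does not validate
anything else about the input.

```python
def check_strict_utf8(data: bytes) -> str:
    """Decode UTF-8, erroring out on CESU-8-style surrogate sequences."""
    # 0xED followed by 0xA0..0xBF can only be the start of an encoded
    # surrogate, which well-formed UTF-8 never contains.
    for i in range(len(data) - 1):
        if data[i] == 0xED and 0xA0 <= data[i + 1] <= 0xBF:
            raise ValueError(f"surrogate sequence at byte offset {i}")
    # Strict decoding raises UnicodeDecodeError (a ValueError subclass)
    # on any other ill-formed input.
    return data.decode("utf-8")


def check_strict_cesu8(data: bytes) -> None:
    """Reject UTF-8 4-byte sequences in data that is supposed to be CESU-8."""
    # CESU-8 encodes supplementary characters as two 3-byte surrogate
    # sequences, so bytes 0xF0 and above must never appear.
    for i, b in enumerate(data):
        if b >= 0xF0:
            raise ValueError(f"4-byte UTF-8 lead byte at byte offset {i}")
```

For U+10000, the UTF-8 form is F0 90 80 80 and the CESU-8 form is
ED A0 80 ED B0 80. Each function accepts its own form and raises on the other,
rather than adapting to it.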

-Doug Ewell
 Fullerton, California


