Re: PDUTR #26 posted

From: David Hopwood (david.hopwood@zetnet.co.uk)
Date: Sun Sep 16 2001 - 16:28:34 EDT


-----BEGIN PGP SIGNED MESSAGE-----

"Carl W. Brown" wrote:
> Doug,
> > But if people start compromising their UTF-8 parsers to accommodate
> > CESU-8 "adaptively," it would be a great blow to UTF-8. It would
> > essentially undo all the tightening-up that was accomplished by the
> > Corrigendum, and it would revive all the old Bruce Schneier-style
> > skepticism about the "security" of Unicode.

Bruce Schneier's comments were about Intrusion Detection Systems not being
able to detect non-shortest form, and that never made sense: the appropriate
solution to that is to fix the IDS. (Note that a similar situation occurs for
any non-unique escaping mechanism.) However, there are other more plausible
security problems that can arise if you're converting between supposedly
equivalent character sequences with a transform that is not 1-1.

> You are right. Eliminating the non-shortest form, where a space, for
> example, could be specified as \xC0\xA0 instead of \x20, ensures that text
> screening programs only have one form of space to check for. CESU-8 reopens
> this security hole.

It doesn't reopen that specific type of security hole, because irregular
UTF-8 sequences (as defined by Unicode 3.1) can only decode to characters above
0xFFFF, and those characters are unlikely to be "special" for any application
protocol. However, I entirely agree that it's desirable that UTF-8 should only
allow shortest form; 6-byte surrogate encodings have always been incorrect.
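
To illustrate the distinction, here is a minimal Python sketch. It assumes
Python 3's built-in strict "utf-8" codec as the parser under test; the constant
and function names are made up for the example. It checks that such a parser
rejects both the non-shortest form of a space and the six-byte surrogate
encoding of U+10000, and accepts only the four-byte form.

```python
# Non-shortest (overlong) two-byte form of U+0020, as in the quoted example.
OVERLONG_SPACE = b"\xc0\xa0"
# U+10000 as two three-byte-encoded surrogates (D800 DC00): the six-byte form.
SIX_BYTE_U10000 = b"\xed\xa0\x80\xed\xb0\x80"
# U+10000 in its proper four-byte UTF-8 form.
FOUR_BYTE_U10000 = b"\xf0\x90\x80\x80"


def strict_utf8_accepts(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts `data`."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for name, sample in [
        ("overlong space", OVERLONG_SPACE),
        ("six-byte U+10000", SIX_BYTE_U10000),
        ("four-byte U+10000", FOUR_BYTE_U10000),
    ]:
        print(name, strict_utf8_accepts(sample))
```

A screening program sitting in front of a parser like this only has to look
for one byte sequence per character.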

> It would seem to me that if you either have to change the UTF-8 code to
> support CESU-8 or change the UTF-16 compare logic then changing the UTF-16
> logic to do code point order compares is a much more containable change with
> a much lower processing impact.
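
As a rough sketch of how small that change can be, the following Python
compares strings by their UTF-16 code units after a fix-up that moves the
surrogate range above U+E000..U+FFFF, so that code unit order agrees with code
point order. The fix-up is one common way to do this, not necessarily what any
particular product uses, and the function names are invented for the example.
It assumes well-formed UTF-16 (no unpaired surrogates).

```python
from typing import List


def utf16_units(s: str) -> List[int]:
    """Return the UTF-16 code units of `s` as integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def _fixup(unit: int) -> int:
    # Swap the relative order of surrogates and U+E000..U+FFFF:
    #   E000..FFFF -> D800..F7FF, D800..DFFF -> F800..FFFF.
    if unit >= 0xE000:
        return unit - 0x800
    if unit >= 0xD800:
        return unit + 0x2000
    return unit


def code_point_order_key(s: str) -> List[int]:
    """Sort key giving code point order from UTF-16 code units."""
    return [_fixup(u) for u in utf16_units(s)]


if __name__ == "__main__":
    a, b = "\uff5e", "\U00010000"
    # Raw code unit order puts U+10000 (D800 DC00) before U+FF5E.
    print(utf16_units(a) > utf16_units(b))
    # After the fix-up, U+FF5E sorts before U+10000, as in UTF-8.
    print(code_point_order_key(a) < code_point_order_key(b))
```

The per-unit cost is two comparisons and at most one addition, and only units
at or above 0xD800 are altered.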

Yes, exactly. PeopleSoft is being short-sighted here; if they go with CESU-8
then they will still have to implement CESU-8 <-> UTF-8 conversion in order
to interoperate with anything that uses UTF-8 and rejects irregular sequences,
so it's not as though this solution is a no-op.
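
For a sense of what that conversion involves, here is a hedged Python sketch.
Python has no built-in CESU-8 codec, so this leans on the "surrogatepass"
error handler; the function names are hypothetical. Note that the
CESU-8-to-UTF-8 direction as written also tolerates input that already
contains four-byte sequences, so it is a lenient converter, not a CESU-8
validator.

```python
def cesu8_to_utf8(data: bytes) -> bytes:
    """Convert CESU-8 bytes to strict UTF-8 (raises on unpaired surrogates)."""
    # Decode three-byte surrogates as lone surrogate code points ...
    text = data.decode("utf-8", "surrogatepass")
    # ... then round-trip through UTF-16 to join each pair into one character.
    text = text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")
    return text.encode("utf-8")


def utf8_to_cesu8(data: bytes) -> bytes:
    """Convert strict UTF-8 bytes to CESU-8."""
    out = bytearray()
    for ch in data.decode("utf-8"):  # strict: rejects irregular sequences
        cp = ord(ch)
        if cp > 0xFFFF:
            # Split into a surrogate pair and encode each half as three bytes.
            cp -= 0x10000
            pair = chr(0xD800 + (cp >> 10)) + chr(0xDC00 + (cp & 0x3FF))
            out += pair.encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


if __name__ == "__main__":
    utf8 = "a\U00010000z".encode("utf-8")
    cesu8 = utf8_to_cesu8(utf8)
    print(cesu8.hex(), cesu8_to_utf8(cesu8) == utf8)
```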

- --
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBO6T0CTkCAxeYt5gVAQFY/wf/dkNXdL3bjGpDMhR0bb70HT0Nej3AX8sO
Ouh95TRHwn6euAOKzmepsJzigR1Ym0ZYkmF7Km1gRMKwxXZ268GXhWzEuAgbVg1q
bp9liDXGss0B7rSXTNAnu7+0Y6Q/G42fch563WOBpkE1vwzv58b2ZnoFqqZ5jxkr
RR+SsanEVY1SkwcEYumk8S9DMLLO/SqpUMgD0UuvWSQ+twMfz34k9eLsI1VcoYpt
BVtezPGCplp2HlYcSuxopII4o4CDnxHYQRm9Y2HVAURHU2nDICjuUZnOJxA1UzLo
EkxhSmWDc6D24Jw8MgYxz9QVemd9JeaUYUnrCmZgdy1Bh0cpZO5AGg==
=QeJX
-----END PGP SIGNATURE-----


