Re: UTF-17

From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Jun 21 2001 - 20:45:21 EDT

Next message: Martin Duerst: "Re: XML Blueberry Requirements"
Previous message: Markus Scherer: "Re: UTF-17"
Maybe in reply to: Kenneth Whistler: "UTF-17"
Next in thread: Antoine Leca: "Re: UTF-17"
Reply: Antoine Leca: "Re: UTF-17"

Markus,

Thank you for your comment.

> Nice, but you have the same kind of shortest-form problem as in UTF-8:
> <38 30 30 30 30 30 30 30> could be mis-interpreted by a lenient decoder as U+0000.

Well, actually, that is not technically a "shortest-form problem". All
UTF-17 forms are exactly 8 bytes long, so any valid form is automatically
also a shortest form.

Furthermore, lenient decoders are not allowed for UTF-17.

<38 30 30 30 30 30 30 00> is specified as the *unique* representation
of U+0000. That means that <38 30 30 30 30 30 30 30> is ill-formed,
and therefore disallowed.
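
To illustrate what "no lenient decoders" means in practice, here is a minimal sketch in Python. It assumes only what is stated in this message: every form is exactly 8 bytes, and U+0000 has the one representation given above. The names FORM_LENGTH, U0000_FORM and is_u0000_strict are hypothetical, and this is not the decoding algorithm from the Internet Draft.

```python
# Hypothetical sketch: only the facts stated in this message are assumed.
FORM_LENGTH = 8  # every UTF-17 form is exactly 8 bytes long

# The single mapping given in this thread. A real decoder would have to
# cover every code point, which this sketch does not attempt.
U0000_FORM = bytes([0x38, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x00])


def is_u0000_strict(form: bytes) -> bool:
    """Return True only for the unique representation of U+0000.

    Raises ValueError when the input is not exactly 8 bytes, because
    such input cannot be a UTF-17 form at all.
    """
    if len(form) != FORM_LENGTH:
        raise ValueError("a UTF-17 form must be exactly 8 bytes")
    # Strict comparison: a near miss such as a trailing 0x30 is not
    # quietly accepted as U+0000.
    return form == U0000_FORM
```

With this check, <38 30 30 30 30 30 30 00> is accepted as U+0000, while <38 30 30 30 30 30 30 30> is not treated as U+0000.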

>
> Ts, ts...
>
> At least it sorts binary in code point order.

Yes, good point. Rick and I have added that to the Internet Draft
for UTF-17.

--Ken

>
> markus
>
>

This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:18 EDT