Markus,
Thank you for your comment.
> Nice, but you have the same kind of shortest-form problem as in UTF-8:
> <38 30 30 30 30 30 30 30> could be mis-interpreted by a lenient decoder as U+0000.
Well, actually, that is not technically a "shortest-form problem". All
UTF-17 forms are exactly 8 bytes long, so any valid form is automatically
also a shortest form.
Furthermore, lenient decoders are not allowed for UTF-17.
<38 30 30 30 30 30 30 00> is specified as the *unique* representation
of U+0000. That means that <38 30 30 30 30 30 30 30> is ill-formed,
and therefore disallowed.
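
For illustration only, a strict check along these lines might look like the
Python sketch below. It relies solely on what is stated in this message: the
fixed 8-byte length and the one form quoted for U+0000. The names
`FORM_LENGTH`, `KNOWN_FORMS`, `IllFormedError`, and `strict_decode_form` are
hypothetical, and the general code point mapping is not reproduced here, so
the table is deliberately incomplete.

```python
# Sketch under stated assumptions; not taken from the UTF-17 Internet Draft.
FORM_LENGTH = 8  # per this message, every UTF-17 form is exactly 8 bytes long

# Only the form spelled out in this message is listed. A real decoder would
# derive the full mapping from the specification (assumption: not shown here).
KNOWN_FORMS = {
    bytes([0x38, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x00]): 0x0000,
}


class IllFormedError(ValueError):
    """Raised when a byte sequence is not an accepted form."""


def strict_decode_form(form):
    """Return the code point for one 8-byte form, or raise IllFormedError.

    Strict means an exact match only: no trimming, no padding, and no
    "close enough" interpretation of a near-miss sequence.
    """
    form = bytes(form)
    if len(form) != FORM_LENGTH:
        raise IllFormedError("a form must be exactly 8 bytes long")
    if form not in KNOWN_FORMS:
        # In this sketch, anything outside the (incomplete) table is rejected.
        raise IllFormedError("not an accepted form: " + form.hex(" "))
    return KNOWN_FORMS[form]
```

With this table, <38 30 30 30 30 30 30 00> decodes to U+0000, while
<38 30 30 30 30 30 30 30> raises an error instead of being silently accepted.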
>
> Ts, ts...
>
> At least it sorts binary in code point order.
Yes, good point. Rick and I have added that to the Internet Draft
for UTF-17.
--Ken
>
> markus
This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:18 EDT