From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Dec 29 2009 - 17:06:25 CST
On 12/29/2009 2:03 PM, Phillips, Addison wrote:
> No, that's not it.
>
> UTF-7, BOCU, and SCSU are banned either because they auto-detect as something other than themselves or because an otherwise "innocuous" byte sequence detects as being one of them, thus serving as the basis for an XSS attack. UTF-32 is banned apparently because naïve implementations might detect it as UTF-16.
>
> None of these encodings encode the same (full) sequence of code points in more than one way,
Hmmm. Not what I remember about SCSU. The encoder has wide latitude in
picking compression techniques in SCSU, if I remember from my sample
implementation. The result would be that two strings that were identical
in UTF-16 can become two different byte strings if encoded with two
different encoders for SCSU.
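
A minimal sketch of that latitude, in Python. This is not a full SCSU
implementation: the hypothetical helper decode_scsu_subset handles only
single-byte mode with the default dynamic window 0, the SQU tag, and the
SCU tag. The tag values and the window offset are my reading of UTS #6
and should be checked against the spec; it also assumes well-formed,
non-surrogate BMP input.

```python
# Partial SCSU decoder: a sketch covering only the features used below.
# Tag values and the window offset are assumptions taken from UTS #6.
SQU = 0x0E         # single-byte mode: quote one UTF-16 code unit (2 bytes follow)
SCU = 0x0F         # single-byte mode: switch to Unicode (two-byte) mode
WINDOW0 = 0x0080   # default offset of dynamic window 0, active at start


def decode_scsu_subset(data: bytes) -> str:
    """Decode the small SCSU subset described above (BMP only)."""
    out = []
    i = 0
    unicode_mode = False
    while i < len(data):
        b = data[i]
        if unicode_mode:
            # In Unicode mode, lead bytes 0xE0..0xF2 are tags; not handled here.
            if 0xE0 <= b <= 0xF2:
                raise ValueError("Unicode-mode tags are outside this sketch")
            out.append(chr((b << 8) | data[i + 1]))  # big-endian code unit
            i += 2
        elif b == SQU:
            out.append(chr((data[i + 1] << 8) | data[i + 2]))
            i += 3
        elif b == SCU:
            unicode_mode = True
            i += 1
        elif b >= 0x80:
            # Upper half maps into the active dynamic window.
            out.append(chr(WINDOW0 + (b - 0x80)))
            i += 1
        elif b >= 0x20 or b in (0x00, 0x09, 0x0A, 0x0D):
            out.append(chr(b))  # ASCII and the pass-through controls
            i += 1
        else:
            raise ValueError(f"tag 0x{b:02X} is outside this sketch")
    return "".join(out)


# Three byte strings that different encoders could plausibly emit for "café":
via_window = b"caf\xe9"                        # é taken from dynamic window 0
via_quote = b"caf\x0e\x00\xe9"                 # é quoted with SQU
via_unicode = b"\x0f\x00c\x00a\x00f\x00\xe9"   # whole string in Unicode mode

for encoded in (via_window, via_quote, via_unicode):
    print(encoded.hex(" "), "->", decode_scsu_subset(encoded))
```

All three inputs differ byte for byte, yet each decodes to the same four
code points, which is why a byte-level comparison of SCSU data says
nothing about whether the underlying text is equal.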
A./
> unless you mean that some of them encode identical subsequences of a larger document using different byte values? But that's not the same thing.
>
> Addison
>
> Addison Phillips
> Globalization Architect -- Lab126
>
> Internationalization is not a feature.
> It is an architecture.
>
>
>
>> -----Original Message-----
>> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
>> On Behalf Of Andrew Lipscomb
>> Sent: Tuesday, December 29, 2009 1:01 PM
>> To: unicode@unicode.org
>> Subject: The "prohibited" encodings...
>>
>> I think I just realized what they have in common--each one has the
>> ability to represent binary-identical strings in *more than one*
>> way.