From: Doug Ewell (doug@ewellic.org)
Date: Mon Dec 28 2009 - 08:17:10 CST
"verdy_p" <verdy underscore p at wanadoo dot fr> wrote:
> But anyway, isn't there a default ordering in UTF-32 when no BOM is
> present? Why does HTML5 want to change the default ordering and still
> maintain its name as "UTF-32", in contradiction with TUS? Shouldn't
> HTML5 rename its modified encoding as "HTML5-UTF-32" (even if it then
> requires using the BOM... which was also proposed, and also
> contradicts TUS, which only allows optional BOMs in UTF-32 and forbids
> all BOMs in UTF-32BE and UTF-32LE)...
The HTML5 draft uses disclaimers such as this one to justify such
decisions:
"This algorithm is a willful violation of the HTTP specification, which
requires that the encoding be assumed to be ISO-8859-1 in the absence of
a character encoding declaration to the contrary, and of RFC 2046, which
requires that the encoding be assumed to be US-ASCII in the absence of a
character encoding declaration to the contrary. This specification's
third approach is motivated by a desire to be maximally compatible with
legacy content. [HTTP] [RFC2046]"
This is from a table in Section 9.2.2.1, where browser developers are
encouraged to choose a default encoding based on "the user's locale,"
which two-thirds of the time is not a Unicode encoding.
More "willful violations" appear in Section 9.2.2.2, in which browsers
are required to "misinterpret for compatibility" ISO and
national-standard character sets as Windows code pages, even when the
author specified the ISO or national character set.
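As a rough illustration of what this means in practice, consider a
document labeled ISO-8859-1 that a browser decodes as windows-1252. The
sketch below assumes Python's built-in codecs as a stand-in for a
browser's decoder; the sample bytes are invented for illustration and
do not come from the draft.

```python
# Bytes 0x80-0x9F are C1 control characters in ISO-8859-1, but most of
# them are printable characters (curly quotes, dashes, the euro sign)
# in windows-1252. The sample bytes below are hypothetical.
raw = b"\x93quoted\x94 \x96 5\x80"

# Decoding exactly as the author's label says.
as_declared = raw.decode("iso-8859-1")

# Decoding the way the draft asks browsers to treat that same label.
as_browser = raw.decode("windows-1252")

# Same bytes, same label, two different character sequences.
print([hex(ord(c)) for c in as_declared if ord(c) > 0x7F])
print([hex(ord(c)) for c in as_browser if ord(c) > 0x7F])
```

The first list contains code points in the C1 control range, while the
second contains the punctuation and currency characters that
windows-1252 assigns to those byte values.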
The implications are that (1) the authors of the present draft know
better than authors of previous works on character encoding and (2)
compatibility with existing, incorrectly or incompletely marked HTML
documents is more important than adherence to standards. This is a
departure from all other HTML and XHTML specifications I've ever seen
from the W3C.
--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s