From: Tim Greenwood (timg1952@aol.com)
Date: Thu May 06 2004 - 14:18:19 CDT
This situation is rather analogous to the case where an HTTP response is sent with no charset parameter, either in the Content-Type header itself or in an HTML META element. RFC 2616 is explicit in section 3.7.1:
" When no explicit charset
parameter is provided by the sender, media subtypes of the "text"
type are defined to have a default charset value of "ISO-8859-1" when
received via HTTP. "
However, every browser that I have examined violates this and instead guesses the character set from other information available to it, such as the locale of the machine or an explicit user setting. To my mind the browser manufacturers are correct and the standard is wrong.
One thing the RFC does get right, in correcting some earlier deviant behavior of browsers, is in section 3.4.1:
"3.4.1 Missing Charset
Some HTTP/1.0 software has interpreted a Content-Type header without
charset parameter incorrectly to mean "recipient should guess."
Senders wishing to defeat this behavior MAY include a charset
parameter even when the charset is ISO-8859-1 and SHOULD do so when
it is known that it will not confuse the recipient.
Unfortunately, some older HTTP/1.0 clients did not deal properly with
an explicit charset parameter. HTTP/1.1 recipients MUST respect the
charset label provided by the sender; and those user agents that have
a provision to "guess" a charset MUST use the charset from the
content-type field if they support that charset, rather than the
recipient's preference, when initially displaying a document. See
section 3.7.1."
I.e., if it is there, do as it says. Here the standard is almost, but not quite, admitting that the previous RFC 2068 was wrong, and the clients correct, about what to do in the absence of a charset parameter. It is a pity that it did not correct the error rather than repeating it in section 3.7.1, but this is of little practical concern since that section is ignored in practice.
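
To make the precedence concrete, here is a rough Python sketch of the order a client might follow: an explicit charset it supports wins (section 3.4.1), then whatever guess the client has, and only then the ISO-8859-1 default of section 3.7.1. The function name resolve_charset and the guess parameter are made up for illustration; no real browser is claimed to work exactly this way.

```python
import codecs
from email.message import Message

# Default that RFC 2616 section 3.7.1 assigns to text/* with no charset.
RFC_2616_DEFAULT = "iso-8859-1"


def resolve_charset(content_type, guess=None):
    """Return (charset, reason) for a text/* body.

    content_type: raw Content-Type header value, e.g. "text/html; charset=UTF-8".
    guess: hypothetical client-side guess (locale, user setting), or None.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset parameter lowercased, or None.
    explicit = msg.get_content_charset()
    if explicit:
        try:
            codecs.lookup(explicit)  # "if they support that charset"
            return explicit, "explicit"
        except LookupError:
            pass  # unsupported label: fall through to guess/default
    if guess is not None:
        # What browsers do in practice when no usable charset is sent.
        return guess, "guessed"
    # What section 3.7.1 says a recipient should assume.
    return RFC_2616_DEFAULT, "rfc-default"


if __name__ == "__main__":
    print(resolve_charset("text/html; charset=Shift_JIS", guess="windows-1252"))
    print(resolve_charset("text/html", guess="windows-1252"))
    print(resolve_charset("text/html"))
```

The only point of disagreement between the RFC and the browsers is the middle branch: a strictly conforming recipient would skip the guess and go straight to the default.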
- Tim