From: YTang0648@aol.com
Date: Tue Nov 04 2003 - 18:05:23 EST
In a message dated 11/4/2003 2:31:00 PM Pacific Standard Time, JD@BD8.COM
writes:
At 5:18 pm -0500 4/11/03, YTang0648@aol.com wrote:
> According to the HTML standard (see
> http://www.w3.org/TR/html4/struct/global.html#h-7.4.4 )
> the right way to specify the charset in html is to use the
> http-equiv attribute in META tag with a value "Content-Type" and
> put the charset value after the "text/html; charset=" in the value
> of the content attribute. The HTML specification does not specify
> the order between http-equiv and content attribute
I believe this part is still true.
> neither does it
> prohibit other attributes (such as charset=UTF-8) from being placed there.
I think I was wrong about this part.
Having a charset=UTF-8 attribute in the META element makes the document an
"invalid HTML document".
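To make the distinction concrete, here is a minimal sketch in Python (not a
real validator) that checks a META tag against the attributes the HTML 4.01
DTD declares for it and reads the charset from the content value. The names
META_ATTRS, MetaChecker and check_meta are made up for illustration, and the
standard html.parser module is far more lenient than an SGML parser.

```python
from html.parser import HTMLParser

# Attributes the HTML 4.01 DTD declares for META (including the i18n pair).
META_ATTRS = {"http-equiv", "name", "content", "scheme", "lang", "dir"}


class MetaChecker(HTMLParser):
    """Collect undeclared META attributes and the declared charset."""

    def __init__(self):
        super().__init__()
        self.undeclared = []
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        self.undeclared += [n for n in attr_map if n not in META_ATTRS]
        if attr_map.get("http-equiv", "").lower() == "content-type":
            # Look for "charset=" among the parameters of the content value.
            for param in attr_map.get("content", "").split(";")[1:]:
                key, _, value = param.strip().partition("=")
                if key.lower() == "charset":
                    self.charset = value.strip()


def check_meta(html):
    """Return (undeclared attribute names, charset or None)."""
    checker = MetaChecker()
    checker.feed(html)
    checker.close()
    return checker.undeclared, checker.charset
```

Calling check_meta on the tag discussed in this thread should flag "charset"
as undeclared while still finding utf-8 in the content attribute.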
The interesting part is the following note in HTML 4.01:
http://www.w3.org/TR/html401/appendix/notes.html#h-B.1
[beginning of quote]
B.1 Notes on invalid documents
This specification does not define how conforming user agents handle general
error conditions, including how user agents behave when they encounter
elements, attributes, attribute values, or entities not specified in this document.
However, to facilitate experimentation and interoperability between
implementations of various versions of HTML, we recommend the following behavior:
If a user agent encounters an element it does not recognize, it should try to
render the element's content.
If a user agent encounters an attribute it does not recognize, it should
ignore the entire attribute specification (i.e., the attribute and its value).
If a user agent encounters an attribute value it doesn't recognize, it should
use the default attribute value.
If it encounters an undeclared entity, the entity should be treated as
character data.
We also recommend that user agents provide support for notifying the user of
such errors.
Since user agents may vary in how they handle error conditions, authors and
users must not rely on specific error recovery behavior.
[end of quote]
So such an HTML document is invalid. HTML user agents are recommended to
ignore the unrecognized "charset=" attribute, and are also recommended to
notify the user of the error.
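As a rough sketch of that recommended recovery (only one possible behavior,
since B.1 says user agents may vary), the Python function below drops
attributes it does not recognize and collects a message for each one. The
function name recover_attributes and the message wording are my own
assumptions, not anything defined by the specification.

```python
def recover_attributes(attrs, recognized):
    """Drop unrecognized attributes and report each one.

    attrs is a list of (name, value) pairs; recognized is a set of
    lowercase attribute names the user agent knows for this element.
    Returns (kept pairs, warning messages).
    """
    kept, warnings = [], []
    for name, value in attrs:
        if name.lower() in recognized:
            kept.append((name, value))
        else:
            # B.1: ignore the entire attribute specification (name and value),
            # and preferably notify the user of the error.
            warnings.append(f'ignored unrecognized attribute "{name}"')
    return kept, warnings
```

For the META tag in question, the stray charset attribute would be dropped
and reported, and the http-equiv and content attributes would be kept.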
Well then, have it from the horse's mouth:
<http://validator.w3.org/>
Below are the results of attempting to parse this document with an SGML
parser.
Line 4, column 14: there is no attribute "CHARSET".
<META charset=UTF-8 http-equiv=Content-Type content="text/html;
charset=utf-8">
==================================
Frank Yung-Fong Tang
System Architect, Iñtërnâtiônàl Dèvélôpmeñt, AOL Intèrâçtívë Sërviçes
AIM:yungfongta mailto:ytang0648@aol.com Tel:650-937-2913
Yahoo! Msg: frankyungfongtan