Hello Unicoders,
this should finally go to the FAQ.
Hello JG,
on Thursday, September 13, 2001 1:23 AM, James Gardner
wrote:
> I have a Microsoft Active Server Page which is saved
> as an ANSI file.
Note that older Microsoft documentation abused the term
ANSI for Microsoft's proprietary CP 1252 code, cf.
<http://czyborra.com/charsets/codepages.html#CP1252>.
(The lower half of that code table equals ASCII, hence
it is not repeated in codepages.html; you can see it in
<http://czyborra.com/charsets/iso646.html>.) Though I do
not know Microsoft Active Server and you did not mention
the editor you were using, I guess your page is in CP 1252.
If you would give a sample URL, I could verify this conjecture.
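The relationship between ASCII, Latin-1 and CP 1252 mentioned above can be checked in a few lines. This is only a minimal sketch, assuming a Python interpreter with its standard codecs (`cp1252`, `latin-1`, `ascii`); the variable names are mine and purely illustrative.

```python
# The lower half (0x00-0x7F) of CP 1252 is plain ASCII.
lower_half = bytes(range(0x80))
same_lower_half = lower_half.decode("cp1252") == lower_half.decode("ascii")

# In the range 0x80-0x9F the two codes differ: CP 1252 has printable
# characters there, whereas Latin-1 (ISO 8859-1) has C1 control codes.
euro_cp1252 = b"\x80".decode("cp1252")    # EURO SIGN, U+20AC
ctrl_latin1 = b"\x80".decode("latin-1")   # control character, U+0080

print(same_lower_half, repr(euro_cp1252), repr(ctrl_latin1))
```

So a file that uses, say, the Euro sign or the typographic quotes from that range is CP 1252 and not Latin-1, whatever the editor's menu calls it.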
> In this file I specify that it should use UTF8 encoding.
Does that mean you have a line in your page saying
<meta http-equiv=Content-Type content="text/html; charset=UTF-8">
?
If so, that line does specify the encoding to the browser,
but does not cause the server to convert it to UTF-8 (as
you apparently are assuming). In other words, if you include
that line in a CP1252 coded file, you are sending the browser
astray. Read more about HTML Document Representation in
<http://www.w3.org/TR/html401/charset.html>.
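To illustrate what "sending the browser astray" means, here is a small sketch, assuming the page was saved as CP 1252 and contains a Euro sign and an accented letter (the sample text is made up). It imitates a reader that trusts the UTF-8 label and tries to decode the bytes accordingly.

```python
text = "Price: 5 \u20ac, caf\u00e9"      # hypothetical page content

# What an editor saving "as ANSI" (really CP 1252) writes to disk.
cp1252_bytes = text.encode("cp1252")

# A strict UTF-8 decoder rejects these bytes outright ...
try:
    cp1252_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ... and a lenient one substitutes U+FFFD for every offending byte.
as_reader_sees = cp1252_bytes.decode("utf-8", errors="replace")

# Only if the file itself is stored in UTF-8 does the label tell the truth.
round_trip = text.encode("utf-8").decode("utf-8")

print(strict_ok, as_reader_sees, round_trip == text)
```

The Meta line changes nothing about the bytes; it only changes how the reader interprets them.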
Note also that (according to the official specifications)
pre-4.0 HTML could only contain Latin-1 characters, cf.
<http://czyborra.com/charsets/iso8859.html>.
What you really have to do:
- write your source in HTML 4.0 (or later) or in XML,
including an appropriate document type declaration, cf.
<http://www.w3.org/TR/html401/struct/global.html#h-7.2>;
- include the above-mentioned Meta tag in your HTML source;
- store the HTML source file in UTF-8 encoding;
- make sure that your server does not generate an
HTTP header field that would contradict your charset
setting.
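The steps above can be sketched roughly as follows. This is not meant as the way to do it on Microsoft Active Server (which I do not know); it is a generic illustration in Python. The helper names `header_charset`, `contradicts` and `write_page`, as well as the sample page, are hypothetical. The only library call relied upon is `email.message.Message.get_content_charset()`, used here to parse a Content-Type value.

```python
from email.message import Message

# A minimal HTML 4.01 page with a document type declaration and the Meta tag.
PAGE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '  "http://www.w3.org/TR/html4/strict.dtd">\n'
    '<html><head>\n'
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
    '<title>Euro test</title>\n'
    '</head><body><p>Price: 5 \u20ac</p></body></html>\n'
)

def write_page(path):
    """Store the HTML source in UTF-8 encoding, as the Meta tag promises."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(PAGE)

def header_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

def contradicts(content_type, declared="utf-8"):
    """True if the HTTP header names a charset other than the declared one."""
    charset = header_charset(content_type)
    return charset is not None and charset != declared.lower()

print(contradicts("text/html; charset=ISO-8859-1"))  # header disagrees
print(contradicts("text/html; charset=UTF-8"))       # header agrees
print(contradicts("text/html"))                      # header is silent
```

The header check matters because, according to the HTML 4.01 charset rules cited above, a charset parameter in the HTTP header takes precedence over the Meta tag.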
Then, a suitable up-to-date browser should properly display
your page, provided that all required characters are contained
in the font used for display. Cf.
<http://www.hclrss.demon.co.uk/unicode/browsers.html>,
and <http://www.hclrss.demon.co.uk/unicode/fonts.html>,
respectively.
You may wish to study some examples:
<http://www.rz.uni-konstanz.de/y2k_uralt/test/Euro-UTF.htm>,
<http://www.rz.uni-konstanz.de/y2k_uralt/test/Go-UTF.htm>.
Furthermore, I recommend having your HTML syntax
(including the proper specification of the encoding)
checked by <http://validator.w3.org/>.
Cf. also
<http://www.hclrss.demon.co.uk/unicode/htmlunicode.html>.
> The data (text) that is put into the page when it is created
> by the server is stored as unicode.
This seems to contradict the first sentence quoted above.
Now, I am completely at a loss about your real problem.
> Do I need to save a file as unicode as well as specifying utf8
> encoding to properly display unicode on the web?
Definitely yes. You have to create a standard-compliant file
and you have to tell the reader with which of several possible
standards your file actually complies, so the reader can make
head or tail of it.
> This electronic communication is confidential and for the
> exclusive use of the addressee.
So you post it to a list read world-wide??
Best wishes,
Otto Stolz
This archive was generated by hypermail 2.1.2 : Mon Sep 17 2001 - 09:32:25 EDT