RE: UTF-8 in web pages

From: Tim Greenwood (greenwood@openmarket.com)
Date: Fri Feb 05 1999 - 15:00:46 EST


The modern browsers all support UTF-8; the problem is more with font
mappings. You can explicitly map UTF-8 to a font. That font may cover a
reasonable subset of Unicode characters (e.g. Lucida Sans Unicode, Bitstream
Cyberbit), or it may cover a more limited set of scripts (e.g. Times Roman,
MS Gothic).

Most users will never set up this mapping, so what matters is the default
font that each browser picks. When I last looked (which was a while ago), on
a Japanese system Netscape Communicator sensibly picked a Japanese font, but
Internet Explorer chose Courier, which rendered all Japanese text illegibly.
Because of this I chose not to serve web pages in UTF-8, but to have our
software convert back to a native encoding first.
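
A minimal sketch of such a conversion step, in Python purely for
illustration. The function names and the Shift_JIS target are assumptions,
not details of the actual software. Characters that the native encoding
cannot represent are written as HTML numeric character references rather
than causing a failure.

```python
# Hypothetical helpers: the names and the choice of target encoding are
# illustrative assumptions, not taken from any real product.

def to_native(utf8_bytes: bytes, native_encoding: str) -> bytes:
    """Transcode UTF-8 page content to a legacy (native) encoding.

    Characters the target encoding cannot represent are emitted as
    decimal numeric character references (for example &#8364;).
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(native_encoding, errors="xmlcharrefreplace")


def content_type(native_encoding: str) -> str:
    """Build the Content-Type value that matches the transcoded body."""
    return f"text/html; charset={native_encoding}"


if __name__ == "__main__":
    page = "<p>日本語のテキスト</p>".encode("utf-8")
    body = to_native(page, "shift_jis")
    print(content_type("shift_jis"))
    print(body.decode("shift_jis"))
```

The charset announced in the header (or in the meta tag) has to match the
encoding the bytes were actually converted to, otherwise the browser will
misinterpret the page regardless of the font it picks.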

Tim Greenwood
Open Market Inc.

> -----Original Message-----
> From: schererm@us.ibm.com [mailto:schererm@us.ibm.com]
> Sent: Friday, February 05, 1999 1:47 PM
> To: Unicode List
> Subject: Re: UTF-8 in web pages
>
>
>
>
> current versions of internet explorer, netscape, and lynx all support
> unicode encodings.
> unicode is _the_ html character set since version 3.2, i.e., all unicode
> characters are supported by html. for example, (hexa)decimal numbers in
> numeric character references (such as &#233;) are resolved as unicode
> code points.
> the default charset is still iso 8859-1 - which is a subset of unicode,
> code-point-wise.
> i guess you know
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>
> the xml standard requires that clients are able to handle utf-8
> and utf-16.
>
>
> best regards,
>
> markus
>
> Markus Scherer IBM RTP +1 919 486 1135 Dept. Fax +1 919 254 6430
> schererm@us.ibm.com
> Unicode is here! --> http://www.unicode.org/
>
>
>
> "John O'Conner" <joconner@geocities.com> on 99-02-05 12:15:33
>
> To: Unicode List <unicode@unicode.org>
> Subject: UTF-8 in web pages
>
>
>
>
>
> I have a client that has a requirement to support several
> languages on their website and e-commerce store. I want to
> help them manage the storage of information and dynamic web
> pages by suggesting a common character set for all
> languages...Unicode.
>
> It seems like a no-brainer to select Unicode for my database
> character set because of their multi-language needs.
> However, I'm concerned about Unicode in web pages. I have
> browsed several UTF-8 pages with success, but I notice that
> the industry hasn't really picked up on UTF-8 as an HTML
> content encoding. Do any of you have any success/failure
> stories that you can share? How comfortable would you be
> recommending UTF-8 for HTML content? Oh, here's one more
> piece of information...the customer has traditionally used
> Big 5 for all their encoding needs. Actually...they've used
> an extension for their special chars in Hong Kong that don't
> seem to be available in Big 5.
>
> Regards,
> John O'Conner
>
>
>
>
>


