Re: UTF-8 code in HTML

From: Lars Marius Garshol
Date: Sun Apr 16 2000 - 05:29:13 EDT

* Addison Phillips
| 2. Our scripting and cgi languages need fixing. Perl 5.6 has the
| requisite support (but it is *brand* new). So many other Web
| technologies do not. For example, I've got a guy busy next week
| lobotomizing PHP using ICU...

It's not as bad as it may seem. Tcl has Unicode support already,
Python 1.6 will have it (release due in June this year), Perl has it,
Java has it, C/C++ has it, some of the Common Lisp implementations
have it, etc.

A bigger problem is that hardly any developers understand character
set issues or even want to think about them. That's not really
surprising, because basic information on the subject is almost
impossible to find.

So if anyone wants to do something, writing character set tutorials
would be very useful, probably including tips on how to correctly
signal which encoding you use in HTTP/HTML/XML.
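
To give an idea of what such a tip might cover, here is a rough
sketch in Python of the three usual places the encoding gets
declared: the HTTP Content-Type header, the HTML META element, and
the XML declaration. The helper names (http_content_type, html_meta,
xml_declaration, encode_page) are made up for illustration and are
not part of any library; the one real requirement is that the bytes
you send are actually encoded in the charset you announce.

```python
# Hypothetical helpers for illustration only; the names are assumptions.

def http_content_type(charset="UTF-8", media_type="text/html"):
    """HTTP header line: the charset parameter names the encoding."""
    return f"Content-Type: {media_type}; charset={charset}"


def html_meta(charset="UTF-8"):
    """HTML META element repeating the label inside the document."""
    return (
        '<meta http-equiv="Content-Type" '
        f'content="text/html; charset={charset}">'
    )


def xml_declaration(charset="UTF-8"):
    """XML declaration; it has to come first in the document."""
    return f'<?xml version="1.0" encoding="{charset}"?>'


def encode_page(text, charset="UTF-8"):
    """Encode the body with the same charset that the labels announce."""
    return text.encode(charset)


if __name__ == "__main__":
    print(http_content_type())
    print(html_meta())
    print(xml_declaration())
    print(encode_page("Bj\u00f8rn"))
```

If the label and the actual byte encoding disagree, the receiver will
decode the text wrongly, so it seems safest to derive both from one
charset value as above.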

