Re: Sick HTML

From: Christopher John Fynn
Date: Thu Apr 20 2000 - 20:51:56 EDT ...

> Christopher and Andreas, everybody in this mailing list knows the best way
> to encode Indic scripts (quite easy an answer: just look at the mailing
> list's name:-)

> But the way you quashed Sinnathurai's and Jim's links makes me think that
> you know of viable Unicode solutions usable *now* -- April 20, 2000 -- and
> will share this information with us.

I overreacted - mainly because the link was posted to *this* list. Anyone
providing useful information on the shaping rules for a complex script should be
thanked, not discouraged, and I apologise to Sinnathurai. However, I also hope that
in future anyone subscribing to the Unicode list who creates such a page will at
least provide a note on the page to the effect that a non-standard font-based
encoding has been used, and hopefully also provide an alternate version of the
page where the characters are properly encoded.

AFAIK clients using IE and running Win2K with fonts for Devanagari, Tamil etc.
installed should be able to display properly encoded pages for those scripts
correctly - and hopefully people using other operating systems and other free
browsers will be able to do so in the very near future. Even if there is no
system to render them correctly, properly encoded pages for these scripts
provide useful test data for people trying to build such applications and fonts.

If subscribers to *this* list don't make an effort to start using Unicode,
rather than font-based encodings or "sick HTML", on their web pages, what hope is
there of getting other people to do so?

Right now I'm in the process of writing some similar pages on Tibetan, and I've
been making the Tibetan script examples as .gif images *and* UTF-8 encoded
Tibetan text - although I don't know of anything available right now which is
going to render that part of these pages correctly. Sure, it would be easier to
use a non-standard font-based encoding or make .PDF files using software with
existing 8-bit fonts, as I've done myself in the past - but I think the time has
come when at least those of us on this list should practice what we preach. For
instance, I'd like to see UTF-8 versions of the code charts on the web
site right alongside the existing HTML with .GIFs and PDF versions.

> Would you thus be so kind to provide:

> - links to *free* fonts supporting Indic scripts (*really* supporting them,
> not just the glyphs needed for making Unicode charts);

There are a number of public domain Indic script fonts available, and the
software tools needed to convert them to OpenType (OT) are also available at no
cost.

> - links to *free* browsers supporting Indic scripts now (*really* supporting
> them, no compromises please);

> - links to *free or very cheap* software to automatically convert existing
> HTML text from the currently used "font-base encoding" to Unicode;

Since font-based encodings tend not to follow any recognized standard,
off-the-shelf converters are going to be hard to find - but Perl is good for
this sort of thing.
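
To give a rough idea of what such a converter involves, here is a minimal
sketch. It is written in Python rather than Perl purely for illustration, and
the byte values in the table are invented: they belong to an imaginary 8-bit
Devanagari font, not to any real one. A real converter would need a table built
by hand for each font, and probably more reordering rules than the single one
shown (moving the vowel sign I, which such fonts typically store before the
consonant it is drawn in front of, to after that consonant as Unicode expects).

```python
# Hypothetical table for an imaginary 8-bit Devanagari font encoding.
# The byte values are made up; every real font needs its own table.
LEGACY_TO_UNICODE = {
    0xC1: "\u0915",  # DEVANAGARI LETTER KA
    0xC2: "\u092E",  # DEVANAGARI LETTER MA
    0xC3: "\u0932",  # DEVANAGARI LETTER LA
    0xC4: "\u093E",  # DEVANAGARI VOWEL SIGN AA
    0xC5: "\u093F",  # DEVANAGARI VOWEL SIGN I (drawn before the consonant)
}

# Vowel signs assumed to be stored in visual order (before the consonant).
PREBASE_VOWEL_SIGNS = {"\u093F"}


def is_consonant(ch):
    """True for the Devanagari consonants KA..HA (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def convert(legacy):
    """Convert bytes in the imaginary font encoding to a Unicode string."""
    out = []
    pending = ""  # pre-base vowel sign waiting for its consonant
    for byte in legacy:
        if byte < 0x80:
            ch = chr(byte)  # assume the lower half is plain ASCII
        else:
            ch = LEGACY_TO_UNICODE.get(byte, "\uFFFD")  # unknown byte
        if ch in PREBASE_VOWEL_SIGNS:
            out.append(pending)  # flush an earlier sign that found no consonant
            pending = ch
            continue
        if pending and not is_consonant(ch):
            out.append(pending)  # nothing to attach to; keep it where it was
            pending = ""
        out.append(ch)
        if pending:
            out.append(pending)  # logical order: consonant, then vowel sign
            pending = ""
    out.append(pending)
    return "".join(out)
```

With this table, convert(bytes([0xC5, 0xC1])) gives KA followed by the vowel
sign I, which is the order Unicode wants. The result can then be written out
as UTF-8.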

> - links to *free or very cheap* authoring tools to write HTML pages in Indic
> scripts.

You can write HTML pages with UTF-8 characters using almost any text editor.
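
The one thing to get right is declaring the encoding in the page itself. As a
small illustration (again a sketch in Python, with function names and the file
name chosen only for this example), the following builds a page that labels
itself as UTF-8 and saves it with that encoding. The sample string is the
Tibetan for "Tibetan script", given as escapes so it survives any mailer.

```python
import html

# Tibetan sample text, written as escapes: U+0F56 U+0F7C U+0F51 U+0F0B
# U+0F61 U+0F72 U+0F42.
TIBETAN_SAMPLE = "\u0F56\u0F7C\u0F51\u0F0B\u0F61\u0F72\u0F42"


def build_page(title, body_text):
    """Return a small HTML page that declares itself as UTF-8."""
    return (
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"<p>{html.escape(body_text)}</p>\n"
        "</body>\n</html>\n"
    )


def save_page(path, page):
    """Write the page to disk as UTF-8, matching the declared charset."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(page)
```

For example, save_page("tibetan.html", build_page("Tibetan", TIBETAN_SAMPLE))
produces a file whose bytes match what the meta element promises. Whether a
given browser then shapes the text correctly is a separate matter.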

BTW, why does everything have to be *free or very cheap*? Creating good fonts
for complex scripts, and software to make use of them, is a lot of work.
Developers of these fonts and software deserve to get paid by the people who use
them just as much as anyone else (and probably more than many dot-com
millionaires). If some of these people or their employers choose to make their
work freely available, that's nice - but I don't think we should expect it. In
the past, free software and fonts have usually appeared only once commercial
software that does more or less the same thing has been around for a while. Of
course, it might do a lot for the take-up of Unicode if more of the big
corporations who are members of the Unicode Consortium sponsored the development
of fonts and software which worked with complex scripts and then made all this
available *free or very cheap*.

- Chris

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:02 EDT