On Tue, Feb 20, 2001 at 00:41:58 -0800, Otto Stolz wrote:
> > Yes, there are CS (classical sanskrit), CSX (CS eXtended) and now CSX+
> > 8-bit character sets for transliteration of Indic languages. CSX+
> > covers all the essential characters needed for ISO 15919, the draft
> > standard for transliteration of Indian languages.
> > http://ourworld.compuserve.com/homepages/stone_catend/translit.htm
> > Find attached a mapping file for CSX I wrote to convert a Pali
> > dictionary to Unicode (with perl's Unicode::Map module).
>
> 8-bit encodings and font switching clearly are yesterday's
> technology;
But CSX *is* yesterday's technology. Indologists have been using
CS/CSX for quite some time, since well before deploying Unicode was
practical.
> If it turns out that the required characters are indeed available in
> Unicode, I'd suggest that new texts should exploit this technology,
> particularly if you are planning to publish them via the WWW.
That's why I made and posted the CSX mapping. There is a LOT of old
CSX-encoded material. With this mapping I can use existing software
(like the Perl module mentioned above) to convert it to Unicode and
use Emacs to view/edit it.
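For anyone who has not used such a mapping, here is a minimal sketch
of what a table-driven conversion amounts to, written in Python rather
than Perl. The two-column text format and the byte values are
assumptions made up for illustration; they are NOT the real CSX table
and this is not the Unicode::Map interface.

```python
# Sketch of a table-driven 8-bit -> Unicode conversion.
# ASSUMPTION: the "0xBYTE 0xCODEPOINT # comment" layout and the byte
# values below are hypothetical placeholders, not the real CSX table.

SAMPLE_MAP = """\
0xE0 0x0101  # LATIN SMALL LETTER A WITH MACRON (hypothetical byte)
0xE3 0x012B  # LATIN SMALL LETTER I WITH MACRON (hypothetical byte)
"""


def parse_map(text):
    """Parse the mapping text into a dict {byte value: one-character str}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        byte_hex, cp_hex = line.split()
        table[int(byte_hex, 16)] = chr(int(cp_hex, 16))
    return table


def decode(data, table):
    """Convert bytes to str: mapped bytes use the table, bytes below 0x80
    are taken as ASCII, unmapped high bytes become U+FFFD."""
    return "".join(
        table.get(b, chr(b) if b < 0x80 else "\ufffd") for b in data
    )


table = parse_map(SAMPLE_MAP)
text = decode(b"p\xe0li", table)  # 'p' + mapped byte 0xE0 + 'li'
```

Each 8-bit character maps to exactly one code point here, which is the
case simple mapping-file tools are built around.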
Dr. Smith wrote a CSX+ mapping, but it requires up to 4 Unicode
characters for some CSX+ characters, and I'm not sure any existing
program that uses mapping files can handle this map. OTOH, Dr. Smith
wrote a dedicated CSX+ to UTF-8 converter (available from his site).
So I don't think it's fair to blame Indologists for shunning Unicode ;-).
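To show why the one-to-many case is awkward for simple mapping tools,
here is a rough Python sketch in which a single input byte expands to
a base letter plus combining marks. The byte values are hypothetical
placeholders, not the real CSX+ assignments, and this is not
Dr. Smith's converter; only the Unicode combining characters are real.

```python
# Sketch of a one-to-many conversion: one 8-bit character may become a
# sequence of several Unicode code points (base letter + combining marks).
# ASSUMPTION: the byte values are hypothetical, not the real CSX+ table.

MULTI_MAP = {
    0x80: "r\u0325",        # r + COMBINING RING BELOW
    0x81: "r\u0325\u0304",  # r + COMBINING RING BELOW + COMBINING MACRON
    0x82: "m\u0310",        # m + COMBINING CANDRABINDU
}


def to_utf8(data, table):
    """Convert bytes to UTF-8, expanding mapped bytes to whole sequences."""
    out = []
    for b in data:
        if b in table:
            out.append(table[b])   # may append more than one code point
        elif b < 0x80:
            out.append(chr(b))     # plain ASCII passes through
        else:
            out.append("\ufffd")   # unmapped high byte
    return "".join(out).encode("utf-8")


utf8 = to_utf8(b"k\x80ta", MULTI_MAP)
```

The table values are strings rather than single code points, so a tool
that stores exactly one code point per input byte cannot hold them.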
SY, Uwe
-- 
uwe@ptc.spbu.ru               |  Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/  |  Ist zu Grunde gehen