discontent about Indic scripts and Unicode

From: Hietaniemi Jarkko (NRC/Boston) (jarkko.hietaniemi@nokia.com)
Date: Tue Sep 18 2001 - 16:03:06 EDT


I happened across these links:

http://acharya.iitm.ac.in/multi_sys/exist_codes.html
http://acharya.iitm.ac.in/multi_sys/uni_iscii.html

which do contain a nice discussion about ISCII but then they
discuss Unicode in, ummm, somewhat negative terms.

Knowing next to nothing about Indic scripts myself, I would like
to hear comments from someone who does know.

I do notice some misunderstanding about Unicode in the above links,
quoting from the first one:

> Unicode, besides permitting an 8 bit representation for each language, adds
> an 8 bit identifier as a most significant byte to make the code 16 bits.
> Data processing software using Unicode will be able to identify the Language
> of the text for each character and use appropriate fonts to display them.
>
> Technically, Unicode can handle 256 different languages but in practice,
> this number is significantly smaller. Unicode has allowed nearly 24000 characters
> of Chinese, Japanese and Korean scripts to be included in a single set.
> Currently fewer than a hundred languages are included in the Unicode.
>
> Even though it is a sixteen bit code, Unicode usually provides for about
> 128 characters for each language.

A messy conflation of "languages" and "characters" and "fonts". Not to
forget "sixteen bit code".
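
To make the point concrete, here is a rough sketch (my own illustration,
not anything from the quoted site) that uses Python's standard unicodedata
module to look at the most significant byte of a few code points. The
helper name high_byte and the sample characters are my own choices. If the
high byte really were a "language identifier", one script could not span
several high bytes, several languages could not share one, and no code
point could lie above U+FFFF.

```python
import unicodedata


def high_byte(ch):
    """Return the bits of the code point above the low 8 (hypothetical helper)."""
    return ord(ch) >> 8


# Characters are encoded by script, not by language: Devanagari KA is
# shared by Hindi, Marathi, Nepali, Sanskrit and others.
ka = "\u0915"

# English 'A' and French 'e with acute' share the same high byte (0x00).
latin_a, latin_e_acute = "A", "\u00e9"

# Two CJK unified ideographs: same script, very different high bytes.
cjk_first, cjk_later = "\u4e00", "\u9fa5"

# A code point beyond the 16-bit range (supplementary plane).
beyond_16_bits = "\U00010400"

for ch in (ka, latin_a, latin_e_acute, cjk_first, cjk_later, beyond_16_bits):
    cp = ord(ch)
    print(f"U+{cp:04X}  high byte 0x{high_byte(ch):02X}  {unicodedata.name(ch)}")
```

The "128 characters for each language" figure presumably comes from the
fact that blocks such as Devanagari (U+0900 to U+097F) happen to occupy
128 code points; that is a property of those blocks, not a limit of the
standard.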

The web site was updated in July.


