Re: Bangla: [ZWJ], [VIRAMA] and CV sequences

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Oct 08 2003 - 15:31:05 CST


Gautam said:

> > The encoding of most Indic scripts is based on ISCII - and that's
> > not going to change. It was adopted since ISCII was the pre-existing
> > Indian national character encoding standard for these scripts.
>
> I understand that this is so. But perhaps it is
> worthwhile for us to be aware of the flaws in ISCII
> that were inherited by Unicode. It is also necessary
> to recognize the fact that the bureaucrats in a
> government are not necessarily the most competent
> people to adjudicate on how a script should be
> encoded. I wonder whether the Dept of Electronics,
> Govt of India, would have any reasons to offer
> justifying the placement of Assamese /r/ and /v/ and
> the long syllabic /r/ and /l/ in their current
> positions.

Why should they? The positions of these characters in
the Unicode code chart for the Bengali script have nothing
to do with the ISCII chart, in any case. They are
*additions* beyond the ISCII chart. In the case of
the Assamese letters, these additions separate out
the *distinct* forms for Assamese /r/ and /v/ from
the Bangla forms, and *enable* correct sorting, rather
than inhibiting it. The addition of the long syllabic
/r/ and /l/ *enables* the representation of Sanskrit
material in the Bengali script, and the code position in
the charts is immaterial.
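
As a rough illustration of that last point (a sketch, not anything taken
from the text of the standard): Python's unicodedata module can list the
characters in question, and a small keyed sort shows that ordering need
not follow code point position. The letter order in ASSUMED_ORDER is a
hypothetical chosen for the example, not an authoritative Assamese
collation, and the helper names are invented here.

```python
import unicodedata

# Characters added beyond the ISCII-derived layout, as discussed above:
# the long syllabic /r/ and /l/, and the two Assamese letters.
ADDITIONS = [0x09E0, 0x09E1, 0x09F0, 0x09F1]


def describe(cp):
    """Return 'U+XXXX NAME' using Python's bundled Unicode data."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


# ASSUMPTION: a hypothetical letter order for illustration only.
# It is not an authoritative collation for any language.
ASSUMED_ORDER = ["\u09AF", "\u09F0", "\u09B2", "\u09F1"]


def sort_by_order(letters, order=ASSUMED_ORDER):
    """Sort letters by position in `order` instead of by code point."""
    rank = {ch: i for i, ch in enumerate(order)}
    # Letters missing from the table sort after the listed ones, by code point.
    return sorted(letters, key=lambda ch: (rank.get(ch, len(rank)), ch))


if __name__ == "__main__":
    for cp in ADDITIONS:
        print(describe(cp))
    letters = ["\u09F1", "\u09B2", "\u09F0", "\u09AF"]
    # Plain sorted() follows code point order; sort_by_order() follows the table.
    print([f"U+{ord(c):04X}" for c in sorted(letters)])
    print([f"U+{ord(c):04X}" for c in sort_by_order(letters)])
```

The keyed sort places U+09F0 ahead of U+09B2 even though its code point
is higher, which is the sense in which the chart position does not
dictate sort order.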

By the way, the relevant organization now would be
TDIL, within the Indian Ministry of Communications and
Information Technology -- not the Dept. of Electronics.
But be that as it may, they have nothing to do with
the code point choices in the range U+09E0..U+09FF,
as should be clear from the documentation of the
Unicode Standard. See The Unicode Standard, Version 4.0,
p. 219, available online.

--Ken


