RE: Encoding Bengali Vowel forms (again)

From: Apurva Joshi (apurvaj@microsoft.com)
Date: Mon May 01 2000 - 21:23:34 EDT


Please see my response below.
Thanks,
-apurva

-----Original Message-----
From: Dhrubajyoti Banerjee [mailto:dhruba@cdac.ernet.in]
Sent: Friday, April 28, 2000 9:36 PM
To: Unicode List
Subject: Re: Encoding Bengali Vowel forms (again)

From a Bengali's point of view (as someone who studied Bengali in
kindergarten from 'Barnparichay' [a Bengali learner's primer]), and after
queries to some students of Bengali, I find that including A_zophola_aa as a
valid form is not justified. It is far better to include it as a glyph in
the font.
In the original Bengali script it is an (unmentioned) rule that there is no
zophola after a character in the 'Swarabarna' (Bengali list of vowels).
Zophola is allowed after characters in the 'Banjonbarna' (Bengali list of
consonants).
[apurva:] Indeed. The fact that Ya_phola [aka zophola, jophola] exists in
the 'Byanjonbarna' and not in the 'Swarabarna' is a reason to think about
considering it as a newer 'specific addition' to the Bangla block, since it
is not just a matter of character 'order' in predetermined equivalent
sequences, but goes to the basic level of defining the 'contents' of the
sequence itself.
From my limited understanding, the option of including it at the level of a
glyph in a font transfers the responsibility for its implementation into the
hands of a shaping engine that handles Indic. While this means flexibility
at the shaping engine level, it also means each such shaping engine can do
so using totally different rules. This, I guess, could also imply different
backing store contents [character sequences] for each. Thus opening
documents originally shaped by shapingEngineX in an application that uses
shapingEngineY [although both might be based on Unicode] has a good chance
of producing undesired [or worse, mangled] results. Please correct me if I'm
wrong here, or if there is merit in transferring such responsibility to a
shaping engine.
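
A minimal sketch of the backing-store concern, in Python. It assumes one
plausible spelling of A_zophola_aa from existing Bengali code points
(A + VIRAMA + YA + AA-sign) and, purely for illustration, a second engine
that stores the whole form as a single private-use code point. U+E000 and
the names SEQ_ENGINE_X / SEQ_ENGINE_Y are hypothetical and describe no real
product.

```python
import unicodedata

# Assumed spelling from existing Bengali block code points:
# LETTER A + SIGN VIRAMA + LETTER YA + VOWEL SIGN AA.
SEQ_ENGINE_X = "\u0985\u09CD\u09AF\u09BE"

# Hypothetical alternative: the whole form stored as one private-use
# code point (U+E000 is an arbitrary choice for illustration).
SEQ_ENGINE_Y = "\uE000"


def describe(text):
    """Return 'U+XXXX NAME' for each code point in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<no name>')}"
        for ch in text
    ]


def same_backing_store(a, b):
    """Compare two strings after canonical (NFC) normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    for label, seq in (("X", SEQ_ENGINE_X), ("Y", SEQ_ENGINE_Y)):
        print(label, describe(seq))
    # Normalization cannot reconcile the two stored forms.
    print("interchangeable:", same_backing_store(SEQ_ENGINE_X, SEQ_ENGINE_Y))
```

Under these assumptions the two stored forms never compare equal, so a
search, sort, or re-shaping step written for one would not recognize the
other.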

As in most Indic scripts (e.g. Devanagari), the 'at' symbol was incorporated
in Bengali to include English sounds and to be able to write words like
'America'. In original Bengali there was no requirement for such a 'forced'
vowel. In this case Bengali was lucky because it already had a zophola with
which to take care of words like B-at, C-at etc., whereas Devanagari had to
include the 'candra' (0945) sign for the same purpose to be able to write
English words.
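
The contrast can be sketched in Python for the word 'bat'. The two
spellings below are assumptions chosen for illustration: a Devanagari form
using the single candra sign U+0945, and a Bengali form using the zophola
sequence (VIRAMA + YA) followed by the AA-sign.

```python
import unicodedata

# Assumed Devanagari spelling of "bat": BA + VOWEL SIGN CANDRA E + TTA.
DEVANAGARI_BAT = "\u092C\u0945\u091F"

# Assumed Bengali spelling of "bat": BA + VIRAMA + YA + AA-sign + TTA.
BENGALI_BAT = "\u09AC\u09CD\u09AF\u09BE\u099F"


def names(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    for label, word in (("Devanagari", DEVANAGARI_BAT),
                        ("Bengali", BENGALI_BAT)):
        print(label, len(word), "code points")
        for name in names(word):
            print("   ", name)
```

In this sketch the vowel sound takes one dedicated code point in Devanagari
and a three-code-point sequence in Bengali.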
[apurva:] Although linguists in Devanagari seem to have preferred devising a
newer form [to permit one-to-one mapping], I would not imply that the use of
Ya_phola is inferior in any way. I certainly think the use of Ya_phola in
Bengali is a smart workaround using the existing Bangla alphabet.
However, when its use first began, I guess it was hand-composed on many
metal_type printing presses. This meant that it did not require the
typesetter to 'select a sequence of characters' and 'replace them with the
Ya_phola'. Later typesetting systems have followed their own preferred way
of generating the Ya_phola.

The following words using zophola (or jophola as it is pronounced in Bengali)
from an exhaustive Bengali dictionary will illustrate my point better.
1) Anh (A_zophola_aa_candrabindu) :: a word to express a sound of surprise or
shock.
2) Advance
3) Advertisement
4) Advocate
5) Amplifier
6) Aluminium
7) Acetylene
All the above words are written in Bengali using A_zophola_aa, where only the
first word (which is a sound) can actually be considered a Bengali word and
the rest are all derived from English.

As such there are exceptions to the common grammar in some scripts, which
are, however, included in the font encoding by many vendors (e.g. CDAC has a
single glyph for A_zophola_aa) and can be handled as an 'exception' to the
general form.
However I agree with you that the A-zophola- and E-zophola have not been
mentioned in ISCII or Unicode and at least a reference needs to be present.

D. Banerjee


