Here are my replies to the relevant responses to my post on the
subject "Assamese and Bengali controversy in Unicode ::: Solutions".
*Mr Ewell
1.
The names of characters do not cause any kind of technical problem in
using them. Letters called “Latin” in Unicode are used to write
hundreds of languages that are not Latin. Different languages
sometimes call the same letter by different names, and this is also
not a technical problem.
*Mr Kolehmainen
The various scripts to write the languages of Europe are indeed
different scripts, some of which are used to write many different
languages.
*Mr Shoulson
If that truly is the concern here, then surely English should feel at
least as slighted. The word "ENGLISH" appears nowhere in the Unicode
database as the description of any character. Nor does "ITALIAN",
"DUTCH", or "FINNISH". "FRENCH" appears only in U+20A3 FRENCH FRANC
SIGN (a currency symbol) and in U+1F35F FRENCH FRIES.
Even "AMERICAN" shows up only in the emoji U+1F3C8 AMERICAN FOOTBALL.
I think this demonstrates that having a name on a character in
Unicode does not indicate anything about how literate a language is
or should be perceived. Conversely, whatever script the Phaistos disc
is written in has its entire known literature consisting of a single
document, but it gets a whole section in the standard.
*Mr Everson
The Latin script is named as Latin, and Germans are forced to use it,
and the Irish are forced to use it, and even in India where English
is one of the official languages, the Bengalis and the Assamese are
forced to use it.
You used the Latin script in your e-mail. But you were writing in
English, not in Latin.
Why are you not coming out shouting about THE ENGLISH AND LATIN
CONTROVERSY IN THE UNICODE STANDARD?
BENGALI LETTER RA WITH MIDDLE DIAGONAL could be named ASSAMESE LETTER
RO. But it hasn't been, because Bengali is spoken by 230 million
speakers, and Assamese is spoken by 13 million. Moreover, the script
was encoded about two decades ago, because it had been brought in
because of its standardization in ISCII.
Do you really think it is unfair that, 230 million speakers vs 13
million speakers, the name Bengali has been preferred? Well, tough.
Grow up. YOU DON'T KNOW HOW LUCKY YOU ARE to have your script already
encoded.
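The character names cited in the responses above can be checked
against the Unicode Character Database. The following is a minimal
sketch, assuming Python 3 and its standard unicodedata module; the
helper name describe is only illustrative. It suggests that a name is
a fixed label attached to a code point, while the stored text is the
code point itself.

```python
import unicodedata

def describe(code_point):
    """Return 'U+XXXX NAME' for a code point (illustrative helper)."""
    ch = chr(code_point)
    return "U+%04X %s" % (code_point, unicodedata.name(ch))

# Code points taken from the messages quoted above.
for cp in (0x20A3, 0x1F35F, 0x1F3C8, 0x09F0):
    print(describe(cp))

# The name is only a label: what gets stored is the code point,
# so the encoded bytes are the same whatever the letter is called.
assamese_ro = "\u09F0"
print(assamese_ro.encode("utf-8"))
```

The lookup only reports what the database records; it says nothing
about which name is historically appropriate.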
Reply :
The answer to these responses is exactly and accurately provided by
Mr Everson. I am talking about "THE ENGLISH AND LATIN CONTROVERSY IN
THE UNICODE STANDARD". The Latin script developed in ancient Roman
civilisation, and two nationalities are inheritors of the Roman
heritage: the Italians and the Romanians. The number of English
speakers using the Latin script is far greater than that of the
Italians and the Romanians put together. How would it be if the Latin
script were called the English script, as it is by many ignorant
people in third world countries?
This is exactly what has happened: the script that historically
developed in ancient Assam, then called Kamrup, is internationally
named Bengali. The Bengalis got it from the ancient Assamese and
adapted it to their own usage, because the Assamese use it in a
different way.
It is worth mentioning that Bengali may be considered a language of
Sanskrit origin, but Assamese is not. In the process they have
omitted one important letter, making it phonetically incomplete. Is
it right for any responsible international organisation, be it
Unicode or ISO, to misrepresent something on the ground that one
community is larger and more influential than the other? I have
explained this truth in my report sent to the Unicode Consortium in
November last year; it can be found here and also here.
The contents of Mr Michael Everson's statements, discriminating
against a smaller linguistic group in favour of a larger one, are in
clear violation of the provisions enshrined in the UNIVERSAL
DECLARATION ON LINGUISTIC RIGHTS, links to which are provided by Mr
Everson himself on his personal website.
*Mr Everson
I'd like to say one more thing about this waste of time.
> Dr Satyakam Phukan
> General Surgeon
> Jorpukhuripar, Uzanbazar
> Guwahati, Assam
Dr Phukan is clearly making a lot of noise on his own behalf.
I do not believe he speaks for most Assamese.
In fact, here is what I believe:
The Assamese are already using Unicode and printing newspapers and
magazines and books and posters and all sorts of things and are not
worried about this cosmetic issue.
And they have been doing so for years.
Reply :
Mr Everson is grossly misinformed about the status of this issue
among the Assamese people. Not only the conscious public but even the
Government of the state of Assam is seized of this issue. In February
this year the Government of Assam requested the Government of India
to move the Unicode Consortium to obtain a separate slot/range/block
for the Assamese script. You can find the official communication
here. A proposed Code Chart for the purpose has been prepared and
sent along with it; you can find the proposed Code Chart here.
*Mr Ewell
2.
Latin, Greek, and Cyrillic are different scripts, not just different
alphabets within the same script, and the analogy with
Bengali/Assamese is inappropriate. See Technical Note #26 for more
information.
Reply : I have read Technical Note #26 and have come to know why they
are not unified. I am highlighting the presence of a large number of
duplicate characters between these interrelated scripts. Although you
do not accept Assamese and Bengali as two different scripts, they are
so, as I have described in detail in my report to the Unicode
Consortium sent in November last year; it can be found here and also
here.
*Mr Everson
This analogy (Greek-Latin-Cyrillic duplication) is false. Only *one*
letter has the same shape in all three of those scripts, O o. And
those letters are found also in the Deseret script.
And that would cause chaos and confusion and internet theft on a
massive scale. It would be the greatest disservice we could do to the
people of Assam. It would be monstrously irresponsible.
Reply :
Mr Everson's assertion is totally false; there is massive duplication
of characters between the Greek, Latin and Cyrillic scripts. The
Deseret script was devised by the Mormons of Utah, and I do not find
any relevance to the issue in question. To see the extent of
duplication of characters between the Greek, Latin and Cyrillic
scripts, see this Chart.
The duplication of characters between the Greek, Latin and Cyrillic
scripts is exploited by unscrupulous elements to indulge in phishing
and other nefarious activities. If the views expressed above are to
be followed, it would mean that:
Allowing duplication between Greek, Latin and Cyrillic scripts is a
great and responsible service to the entire humanity, but allowing
duplication between Assamese and Bengali "would be the greatest
disservice we could do to the people of Assam. It would be
monstrously irresponsible."
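As a rough illustration of the point about look-alike letters, the
sketch below (Python 3, standard unicodedata module) lists three
capitals that look alike but are encoded separately in Latin, Greek
and Cyrillic, and then flags a string that mixes scripts. Taking the
first word of the character name as the script is a simplifying
assumption made here, not the Unicode Script property; the function
name scripts_used and the sample string are hypothetical.

```python
import unicodedata

# Three visually similar capitals, each with its own code point:
# Latin A, Greek Alpha, Cyrillic A.
LOOKALIKES = ["\u0041", "\u0391", "\u0410"]
for ch in LOOKALIKES:
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))

def scripts_used(text):
    """Guess the scripts in text from the first word of each letter's name."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}

# Hypothetical spoofed word: the second letter is Cyrillic U+0430,
# not the Latin letter "a".
spoofed = "p\u0430ypal"
print(sorted(scripts_used(spoofed)))

# More than one script in a single word is a hint of a possible spoof.
print(len(scripts_used(spoofed)) > 1)
```

A check of this kind is only a heuristic; it is meant to show how
separately encoded look-alikes can be told apart by software even
when the eye cannot.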
*Mr Ewell
3.
The order of characters in a code chart does not cause any kind of
collation problem, because binary code point order is never assumed
to be correct for language-appropriate collation.
*Mr Everson
Collation is important, but it is not handled by the code table.
Another standard handles collation (The Unicode Collation Algorithm,
ISO/IEC 14651) and your requirements can be met there.
*Mr Kolehmainen
Your misunderstanding related to collation is even more surprising.
The
sequence of the code points is not the evident basis for the
collation, nor does the default collation (as defined for the full
UCS covering multilingual, multiscript texts by the Unicode Collation
Algorithm, UCA, and the ISO/IEC 14651) apply as such to all languages
written in the same script. The best examples of this are the wildly
different collation sequences of the many languages written using the
Latin script. The Unicode Common Locale Data Repository (CLDR) is an
excellent vehicle to publish the proper collation sequence for any
given language (and script) and region combination.
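The distinction drawn in the responses above, between code point
order and language-appropriate order, can be sketched as follows.
This is a simplified illustration in Python 3, not the Unicode
Collation Algorithm and not CLDR data: the alphabet order in
ASSUMED_ORDER is an assumption made for the example, and the name
tailored_key is hypothetical.

```python
# Letters from the Bengali block, written as escapes.
YA, LA, SHA, SSA, SA, HA = "\u09AF", "\u09B2", "\u09B6", "\u09B7", "\u09B8", "\u09B9"
RA, WA = "\u09F0", "\u09F1"  # encoded at the end of the block

# ASSUMED alphabet order for this fragment (illustration only):
# RA after YA, and WA after LA, regardless of where they are encoded.
ASSUMED_ORDER = [YA, RA, LA, WA, SHA, SSA, SA, HA]
RANK = {ch: i for i, ch in enumerate(ASSUMED_ORDER)}

def tailored_key(word):
    """Sort key built from the assumed order; unknown letters sort last."""
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in word]

letters = [HA, WA, LA, RA, YA]

# Plain sorting compares code points, so RA and WA land after HA.
by_code_point = sorted(letters)

# Sorting with the key follows the assumed alphabet order instead.
by_tailoring = sorted(letters, key=tailored_key)

print(["U+%04X" % ord(ch) for ch in by_code_point])
print(["U+%04X" % ord(ch) for ch in by_tailoring])
```

In practice the tailoring would be supplied through a CLDR collation
for the language rather than a hand-written table; the sketch only
shows that the position of a letter in the code chart does not fix
its position in a sorted list.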
Reply:
I welcome your response clearing up my misconception regarding
collation, but collation errors have been a problem with Assamese and
still persist. Why this is so, the experts can answer better. There
is a topic on this subject in the Unicode forum, started by an
anonymous poster.
One character of the Assamese alphabet is not in the Unicode Code
Chart; can this be a reason?
*Mr Everson
The National Bodies who participate in ISO also maintain the same
standard, through ISO/IEC JTC1/SC2.
*Mr Kolehmainen
You either ignore or are surprisingly unaware of the fact that the
Unicode Standard is developed in co-operation with the International
Organization for Standardization (ISO) and the International
Electrotechnical Commission (IEC), specifically with their Joint
Technical Committee 1 for Information Technology (JTC1), more
specifically with its SC2 (Coded Character Sets) / WG2 (Universal
Coded Character Set) that produces the ISO/IEC 10646. The Unicode
Consortium is thus not at liberty to make changes to the standard on
its own.
Reply :
Assamese is represented in ISO 639 by the language codes "as" and
"asm". Further information is provided below.
*Mr Wordingham
Isn't the correct way of translating 'BENGALI' in Character names
into Assamese to use the word normally used to mean Assamese? What
problems does this approach leave?
Don't you think the Mons are offended by the Mon script being called
the 'Myanmar' script?
Reply : The translation and transliteration systems of Assamese and
Bengali are quite different. Bengali follows Sanskrit, but Assamese
is very different. Please see here and also here; I have described it
in as much detail as I can.
Since the basic difference between Bengali and Assamese has been
ignored, the transliteration of Assamese as per ISO 15919 and Unicode
is totally erroneous. Please see chart 1, chart 2 and chart 3, and
compare them with this chart, which shows the actual transliteration
of Assamese.
The case of the Mon and the Burmese (Myanmarese) is different from
that of the Assamese and the Bengali. The Mons are older inhabitants
of Myanmar than the Burmese, called Bamah. The script that all the
communities of Myanmar use is derived or copied from the southern
Indian scripts. Similar scripts are used by Tai groups, many of which
are in Assam.
The script encoded as Bengali in the Unicode Standard originated in
ancient Kamrup but has been encoded in the name of the borrower, the
Bengalis, who are now greater in number than the rightful inheritors.
*Mr Everson
I am going to say this ten times, so that you understand it: The block
name and character names cannot be changed. The block name and
character names cannot be changed. The block name and character names
cannot be changed. The block name and character names cannot be
changed. The block name and character names cannot be changed. The
block name and character names cannot be changed. The block name and
character names cannot be changed. The block name and character names
cannot be changed. The block name and character names cannot be
changed. The block name and character names cannot be changed.
Reply : The attitude reflected in these repeated assertions reminds
one of another controversy in the Unicode Standard, relating to
Khmer. The problem may be different, as the problems of two groups
cannot always be similar. But the attitude reflected is similar.
Readers can read this piece of writing by a Khmer-speaking person
describing their experience with Unicode; it is interesting and may
prompt a lot of introspection among those involved.
*Mr Everson
Do you think it is fair for you to be yelling and screaming about
something COSMETIC ("a rose by any other name would smell as
sweet") like BENGALI LETTER RA WITH MIDDLE DIAGONAL and ASSAMESE
LETTER RO? Don't you think you are wasting everybody's time?
Reply :
Experts can say how much of the problem cited by us is technical and
how much cosmetic, but for us the main grievance is the
misrepresentation of facts, which has resulted in incomplete and
erroneous representation of our script.
Another example is the description of the "Bengali" script as being
similar to Devnagari in Unicode 6.1. Not a single character of
Assamese and Devnagari is similar except the "i" sign. The similarity
is actually with the Tibetan alphabet: both these scripts use angles,
more precisely acute angles, in the characters. No other script
evolved in the Indian subcontinent uses forms with acute angles; this
fact was conveyed to Unicode in the report sent in November last
year. The chart showing the similarity between Assamese and Tibetan
compared to Devnagari can be found here.
Just as the UNIVERSAL DECLARATION ON LINGUISTIC RIGHTS says "All
languages are equal", all responsible international organisations
taking up the task of codification of the world's languages must see
to it that no linguistic group is misrepresented, discriminated
against, neglected or deprived of its rightful place on the ground of
being lesser in numbers or less influential, or because of being
neglected by the central government of a large country where it may
be in a minority.
Dr Satyakam Phukan
Received on Tue Jul 10 2012 - 04:14:18 CDT