Re: Cultural bias

From: Glenn Adams (glenn@spyglass.com)
Date: Sun Jan 12 1997 - 03:47:09 EST


At 08:21 AM 1/11/97 -0800, unicode@Unicode.ORG wrote:

>I haven't looked at Unicode 2.0

You should if you're interested in being informed rather than merely
opinionated.

>A acute, a acute, E acute, e acute, I acute, i acute etc., etc.,

As I believe has been pointed out, what constitutes a logical entity
is language specific. For instance, in Vietnamese, A WITH CIRCUMFLEX
is a single grapheme (a particular vowel) while A WITH ACUTE is two graphemes
(A denoting a vowel, and ACUTE denoting a rising tone).

In my opinion, it is reasonable to encode as single entities all
basic graphemes of this sort even when they may be decomposed into
constituent graphical components.
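
To make the distinction concrete, here is a minimal sketch of how one
precomposed letter relates to its decomposed sequence. It assumes Python
and its standard unicodedata module, neither of which is part of this
thread; the helper name code_points is hypothetical and exists only for
illustration.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as U+XXXX strings (illustrative helper)."""
    return ["U+%04X" % ord(ch) for ch in text]

# A WITH CIRCUMFLEX encoded as a single entity.
precomposed = "\u00C2"

# The same letter broken into base letter + combining circumflex.
decomposed = unicodedata.normalize("NFD", precomposed)

print(code_points(precomposed))   # ['U+00C2']
print(code_points(decomposed))    # ['U+0041', 'U+0302']

# Recomposing the sequence yields the single encoded entity again.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

Both forms denote the same text; the question in this thread is only
which of them deserves a code point of its own.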

In the case of Devanagari conjuncts, there is little merit in arguing
that they are graphemic units rather than merely allographic presentations of
multiple underlying graphemes. Show me a dictionary which lists conjuncts
as distinct entries. (Perhaps you can make a case for K.SSA here).
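
As a rough illustration, the K.SSA conjunct is stored as a sequence of
three characters, and the rendering system is expected to produce the
ligature from them. The sketch below assumes Python and its standard
unicodedata module, which are not part of this thread, and only inspects
the encoded sequence; it says nothing about how a given font draws it.

```python
import unicodedata

# K.SSA as encoded text: KA, then VIRAMA, then SSA.
kssa = "\u0915\u094D\u0937"

# List the underlying characters that make up the conjunct.
names = [unicodedata.name(ch) for ch in kssa]
for ch, name in zip(kssa, names):
    print("U+%04X %s" % (ord(ch), name))

# One visual conjunct, three encoded characters.
print(len(kssa))  # 3
```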

>Why is then Devanagari forced to represent its ligatures as multiple
>characters, to be deduced from the character encoding, and with the
>requirement of (paraphrasing Glen Adams' words) of "complex character
>encoding to glyph translation" schemes ?

For one thing, the script is more complicated than Latin; are you saying
it is not?

>If Latin was encoded with the same regard that is given to Devanagari,
>then there would be no A acute character, it would have to be entered
>as <A> + <acute sign>.

Some people would advocate this level of graphical decomposition irrespective
of the correspondence between basic logical units and encoded units. As I've
mentioned above, you cannot generalize in your comparison between Latin and
Devanagari based on limited knowledge of how actual written languages employ the
script.

>For any other language, Arabic or Hindi, to have a glyph encoding,
>however, is a no-no, and we are told to consider the allographic
>versus the graphemic, to stop thinking like font designers, etc. etc.
>There is no rationality in this.

Actually, a great deal of practical experience in building systems to support
these scripts and perform multifarious text operations on such encoded data
went into the design. The gray edges you are complaining about are based on
the need to remain compatible with history.

>I hope the purveyors of graphemic purity have the grace to blush.

I never blush. Nor do I advocate purism of any sort. I maintain that what's
there is both necessary and sufficient to accomplish the goals at hand. I'm
always open to counterarguments based on facts; on the other hand, opinions don't
sway me much.

>Right now, the software industry and its standards are in
>the custody of the western nations, but that will not be forever.
>In other parts of the world, scripts even have religious significance;
>I do not think people will take poor representations of them lightly.

You should be praising the efforts of the Unicode Consortium to make sure
that this change comes about; namely, bringing software to the entire world.
I know that is the fervent goal and commitment of all of its participants.
Rather than criticize it, use it. Then come complaining. In the meantime,
cease this nonsense about cultural bias.

Regards,
Glenn Adams


