Re: To submit or not to submit

From: John Hudson (
Date: Mon May 13 2002 - 02:07:50 EDT

At 22:57 5/12/2002, Amir Herman wrote:

>If this is the case, why in Unicode it have Arabic Presentation A & C to
>present the final, medial, and initial form of Arabic characters?

Ah, this is an historical oddity. Character inclusion in Unicode is
governed by a number of principles, which are not necessarily mutually
compatible. So the 'purity' of the character encoding model is sometimes
sacrificed in the interests of another principle, such as the need to
provide one-to-one backwards compatibility mappings with older character
sets. So Unicode effectively inherits some of the poor design or technical
limitations of pre-existing character sets. I'm sure Ken Whistler or one of
the other UTC members can explain exactly why the Arabic presentation forms
ended up being encoded. I seem to recall that there was some political
pressure (from Egypt?) that resulted in the unfortunate encoding of an
arbitrary subset of potential ligature forms, but I don't know the details
or whether this pressure was also responsible for the inclusion of the
final, medial and initial forms.
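
For anyone who wants to see the compatibility mappings themselves, here is a
small sketch using Python's standard unicodedata module. The helper name
compat_mapping is my own, purely for illustration. It prints the compatibility
decomposition that the Unicode Character Database records for the four
positional forms of BEH and for one lam-alef ligature in Arabic Presentation
Forms-B.

```python
import unicodedata

def compat_mapping(cp):
    """Return the UCD decomposition field for a code point, e.g. '<initial> 0628'."""
    return unicodedata.decomposition(chr(cp))

# Isolated, final, initial and medial BEH, then LAM WITH ALEF (isolated ligature).
for cp in (0xFE8F, 0xFE90, 0xFE91, 0xFE92, 0xFEFB):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {compat_mapping(cp)}")
```

Each positional form points back to the single basic-block letter U+0628, and
the ligature points back to the two letters U+0644 U+0627.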

Some existing software, e.g. Adobe PageMaker and InDesign ME (made by
WinSoft in France), makes use of the presentation form encodings to enable
a very basic kind of Arabic shaping. Developers of such software to whom I
have spoken seem to be aware that this is a bit of a hack, and that a more
sophisticated and elegant Arabic shaping methodology requires only the
codepoints in the basic Arabic block. Everything else can be handled at the
glyph processing stage.
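
To make the 'hack' concrete, here is a toy sketch in Python of shaping by
substituting presentation form codepoints. It is not how PageMaker or InDesign
ME actually work; the function names build_form_table and shape are
hypothetical, and the joining behaviour is guessed from which forms happen to
be encoded (a letter with an initial form is assumed to join forwards, a letter
with a final form is assumed to join backwards). It ignores combining marks,
lam-alef ligatures, ZWJ/ZWNJ and everything outside Arabic Presentation
Forms-B.

```python
import unicodedata

POSITIONS = ("<isolated>", "<final>", "<initial>", "<medial>")

def build_form_table():
    """Map each basic-block letter to its encoded positional forms (Forms-B only)."""
    table = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep single-letter mappings such as '<initial> 0628'; skip ligatures
        # and unassigned code points.
        if len(parts) == 2 and parts[0] in POSITIONS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[parts[0].strip("<>")] = chr(cp)
    return table

def shape(text, table):
    """Replace basic Arabic letters (logical order) with presentation forms."""
    def joins_prev(ch):  # assumption: a final form implies joining backwards
        return "final" in table.get(ch, {})

    def joins_next(ch):  # assumption: an initial form implies joining forwards
        return "initial" in table.get(ch, {})

    out = []
    for i, ch in enumerate(text):
        forms = table.get(ch)
        if forms is None:
            out.append(ch)  # not a letter we know how to shape
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        before = joins_prev(ch) and joins_next(prev_ch)
        after = joins_next(ch) and joins_prev(next_ch)
        if before and after:
            key = "medial"
        elif before:
            key = "final"
        elif after:
            key = "initial"
        else:
            key = "isolated"
        out.append(forms.get(key, ch))
    return "".join(out)

table = build_form_table()
word = "\u0628\u0627\u0628"  # BEH ALEF BEH
shaped = shape(word, table)
print([f"U+{ord(c):04X}" for c in shaped])
```

Note that the shaped string no longer contains the characters the user typed.
Compatibility normalisation (NFKC) folds it back to the basic Arabic block,
which is one reason to keep the text in basic codepoints and leave shaping to
glyph processing.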

John Hudson

Tiro Typeworks
Vancouver, BC

Last words of Jesuit grammarian Dominique Bouhours:
"I am about to or I am going to die; either expression is used."
