From: William J Poser (wjposer@ldc.upenn.edu)
Date: Mon Mar 26 2007 - 14:15:00 CST
Arabic characters fall into two classes. So-called "connectors"
potentially link up both to the left and to the right. So-called
"non-connectors" potentially link only to the preceding character,
not to the following character. One says "potentially" because
whether linkage actually takes place depends on whether there is
a character to the left or right and what its own class is.
The result is that connectors have four variants: (a) isolated;
(b) left-linked; (c) right-linked; (d) doubly-linked.
(In most descriptions of the writing system the misleading
terms "initial", "medial", and "final" are used for "left-linked",
"doubly-linked", and "right-linked".)
Non-connectors have just two variants: (a) isolated; (b) right-linked.
These variants are shown in any textbook of Arabic.
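
The rule above can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not a complete shaper: the
names letter_class and positional_variants are made up for this
example, the non-connector set covers only the basic letters
U+0622 through U+064A, text is taken in logical (typing) order so that
the "preceding" character is the one to the right, and anything that
is not one of these letters (spaces, hamza, diacritics) is simply
treated as a break.

```python
# Assumption: an illustrative subset of the non-connectors (alef and its
# hamza/madda variants, waw with hamza, teh marbuta, dal, thal, reh, zain, waw).
NON_CONNECTORS = frozenset(
    "\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def letter_class(ch):
    """Return 'connector', 'non-connector', or None for anything else."""
    if ch in NON_CONNECTORS:
        return "non-connector"
    cp = ord(ch)
    # Remaining basic letters; hamza (U+0621) and tatweel (U+0640) are excluded.
    if cp == 0x0626 or 0x0628 <= cp <= 0x063A or 0x0641 <= cp <= 0x064A:
        return "connector"
    return None


def positional_variants(text):
    """Name the variant of each character, using the terms from this post."""
    classes = [letter_class(ch) for ch in text]
    result = []
    for i, cls in enumerate(classes):
        if cls is None:
            result.append(None)
            continue
        prev_cls = classes[i - 1] if i > 0 else None
        next_cls = classes[i + 1] if i + 1 < len(classes) else None
        # Linked on the right: the preceding letter must be able to link forward.
        links_right = prev_cls == "connector"
        # Linked on the left: this letter must be a connector with a letter after it.
        links_left = cls == "connector" and next_cls is not None
        if links_left and links_right:
            result.append("doubly-linked")
        elif links_left:
            result.append("left-linked")
        elif links_right:
            result.append("right-linked")
        else:
            result.append("isolated")
    return result


# kaf, teh, alef, beh ("kitab")
print(positional_variants("\u0643\u062A\u0627\u0628"))
# ['left-linked', 'doubly-linked', 'right-linked', 'isolated']
```

Note that the final beh comes out isolated because the alef before it
is a non-connector and so cannot link forward.
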
In normal Unicode usage the rendering engine is supposed to take
care of this, but if you need to compute it abstractly,
the positional variants are also encoded in Unicode in the
block "Arabic Presentation Forms-B", U+FE70 through U+FEFF.
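
One way to get at those code points without typing in a table by hand,
offered as a sketch: the Unicode character database records each
presentation form with a compatibility decomposition tagged
<isolated>, <initial>, <medial>, or <final>, and Python's standard
unicodedata module exposes it. The helper names build_forms_table and
presentation_form and the TAG_FOR_TERM mapping are assumptions made
for this example.

```python
import unicodedata

# Map the terms used in this post onto the tags used in the Unicode data.
TAG_FOR_TERM = {
    "isolated": "isolated",
    "left-linked": "initial",
    "doubly-linked": "medial",
    "right-linked": "final",
}


def build_forms_table():
    """Collect {(base letter, tag): presentation form} from U+FE70..U+FEFF."""
    table = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-letter forms, e.g. "<initial> 0628".
        # Unassigned code points and multi-letter entries are skipped.
        if len(parts) != 2 or not parts[0].startswith("<"):
            continue
        tag = parts[0].strip("<>")
        base = chr(int(parts[1], 16))
        table[(base, tag)] = chr(cp)
    return table


def presentation_form(base, term, table):
    """Return the presentation form, or the base letter if none is encoded."""
    return table.get((base, TAG_FOR_TERM[term]), base)


forms = build_forms_table()
# beh, left-linked ("initial") form
print(hex(ord(presentation_form("\u0628", "left-linked", forms))))  # 0xfe91
```

A non-connector such as alef has no left-linked or doubly-linked entry
in the table, so the fallback returns the base letter unchanged.
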
If your rendering engine does not handle this, you will also
need to take into account the fact that certain characters
combine irregularly. The ligatures are to be found under
"Arabic Presentation Forms-A", U+FB50 through U+FDFF, except that
the lam-alef ligatures sit in Forms-B at U+FEF5 through U+FEFC.
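
The ligatures can be listed the same way, again only as a sketch: a
ligature's compatibility decomposition names two or more code points
after the positional tag. The function name build_ligature_table is
made up for this example. It scans both presentation blocks, since the
lam-alef ligatures are encoded in Forms-B, and it will also pick up
the spacing forms of the diacritics, which decompose into two code
points but are not ligatures in the usual sense.

```python
import unicodedata


def build_ligature_table(ranges=((0xFB50, 0xFDFF), (0xFE70, 0xFEFF))):
    """Collect {(letter sequence, tag): [code points]} for multi-letter forms."""
    table = {}
    for first, last in ranges:
        for cp in range(first, last + 1):
            parts = unicodedata.decomposition(chr(cp)).split()
            # Need a tag plus at least two code points, e.g. "<isolated> 0644 0627".
            if len(parts) < 3 or not parts[0].startswith("<"):
                continue
            tag = parts[0].strip("<>")
            sequence = "".join(chr(int(p, 16)) for p in parts[1:])
            table.setdefault((sequence, tag), []).append(chr(cp))
    return table


ligatures = build_ligature_table()
# lam followed by alef, isolated form
print([hex(ord(c)) for c in ligatures[("\u0644\u0627", "isolated")]])
```
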
Bill