From: Gregg Reynolds (unicode@arabink.com)
Date: Fri Jun 03 2005 - 14:03:29 CDT
Andreas Prilop wrote:
> In  http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf
> I read on p. 15 (= p. 204)
> 
> | In some cases, characters occur only at the end of words
> | in correct spelling; they are called trailing characters.
> | Examples include teh marbuta, alef maksura, and dammatan.
> | When trailing characters are joining (such as teh marbuta),
> | they are classified as right-joining, even when similarly
> | shaped characters are dual-joining.
> 
> In  http://www.unicode.org/Public/UNIDATA/ArabicShaping.txt
> however, the trailing characters U+0649
> http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata06.html#x0649
> and U+06BA
> http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata06.html#x06BA
> are classified as dual-joining.
> Why?
> 
FYI, to add to what Rick sent:
ALEF MAKSURA is incorrectly named.  Or you could also say the glyph is 
incorrect.  The term "alef maksura", in Arabic, denotes a grammatical 
category, not a character.  It means the (implicit) preceding vowel "a" 
should not be lengthened.  It occurs at the end of words, and it takes 
two forms:  U+0649 (dotless yeh), and U+0627 (alef).  Both are called 
(denote) alef maksura in the right circumstances.  If Unicode wants to 
call U+0649 ALEF MAKSURA then it should also allow for the alef 
letterform.  But it would be better to call it DOTLESS YEH.  The 
_letterform_ DOTLESS YEH occurs in all four forms in written Arabic. 
Note that students learning Arabic are usually taught that final dotless 
yeh _is_ alef maksura (it's simpler that way) and don't learn what it 
means nor that final alef is also sometimes called maksura.  This is 
likely true for the average student in the Arab world too, I would guess.
Regarding orthography, it is very common to see dotless yeh in final 
position used interchangeably with the character (dotted) YEH (U+064A), 
especially in printed material from Egypt.  This is frequent in the 
Quran as well; for example, the common particle "fy" (FEH YEH) 
(pronounced "fee") may be written with dotless yeh instead of dotted 
YEH; in no way could this be considered an ALEF MAKSURA.  I suspect 
this reflects typographic aesthetics; it lightens the page a bit.  I 
also see this occasionally in office correspondence, which you might 
take as evidence that U+0649 is most "naturally" construed as a YEH form 
by native speakers.
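
To make the "fy" example concrete, here is a minimal Python sketch of the two spellings as code-point sequences. The constant names FY_DOTTED and FY_DOTLESS and the helper names() are made up for illustration; the only library call is unicodedata.name from the standard library.

```python
import unicodedata

# Two spellings of the particle "fy" discussed above.
# FY_DOTTED / FY_DOTLESS are illustrative names, not standard terms.
FY_DOTTED = "\u0641\u064A"   # FEH + (dotted) YEH
FY_DOTLESS = "\u0641\u0649"  # FEH + the character Unicode names ALEF MAKSURA


def names(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    for label, word in (("dotted", FY_DOTTED), ("dotless", FY_DOTLESS)):
        print(label, [f"U+{ord(ch):04X}" for ch in word], names(word))
```

The two strings differ only in the final code point, so any software that compares them code point by code point treats them as different words, even though a reader takes both as "fy".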
Thomas Milo pointed out that dotless yeh occurs in the middle of words 
in the Quran.  I don't know about the "middle", but it does commonly 
carry a "stacked" element at the end of words.  For example, it is often 
surmounted by U+0670 (small "dagger" alef above) and/or U+0653 (the 
MADDA mark).  You would think of it as the final character in the word, 
but of course the Unicode representation would place it before the 
"stacker" elements.
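
As a rough illustration of that ordering, the following Python sketch lists the stored (logical) sequence for a word-final dotless yeh carrying a dagger alef. The variable word_final and the helper describe() are hypothetical names; unicodedata.name and unicodedata.combining are standard-library calls.

```python
import unicodedata

# Stored (logical) order: the base letter comes first, then the mark
# that is drawn on top of it. word_final is an illustrative name.
word_final = "\u0649\u0670"  # dotless yeh, then small "dagger" alef above


def describe(text):
    """Return (code point, Unicode name, combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for row in describe(word_final):
        print(row)
```

The base letter has combining class 0 and the mark has a non-zero class, so in the encoded text the dotless yeh is not the last code point of the word even though it looks like the last letter on the page.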
Since the name cannot be changed, I would suggest removing the 
language referring to ALEF MAKSURA as a trailing character and adding a 
note about its relation to YEH phonological semantics.  It might also be 
a good idea to add a note to the definition of YEH indicating that the 
dots are a stylistic matter and are optional in certain circumstances.
-gregg