From: Behnam (behnam.rassi@gmail.com)
Date: Tue Aug 29 2006 - 20:56:45 CDT
I don't want to hijack this thread for a non-Kurdish issue, but since
it might also be beneficial to the Kurdish case, I will simply say
that Unicode didn't resolve the issue of U+0647. It did, however,
recognize that this letter has five forms and not four. The fifth
form has no contextual behavior whatsoever, and therefore it requires
a separate code. But since this exceptional case doesn't fit the
general pattern of 'one code for one letter > related contextual
shapes', Unicode did not honor the exception by assigning a separate
code to a form of the same letter.
But it did apparently recognize that two isolated forms can't live
together under one single code and then scratched one of them... the
good one.
And the reasoning behind this choice seems pretty lame to me. Nobody
chooses small alpha instead of Greek capital alpha (or whatever) to
avoid similarity with English A.
U+0647 IS wrongly represented, because that shape doesn't join to
anything. How can it be representative of the dual-joining Arabic
letter heh?
To understand this, you must understand that the shape of the initial
form of Arabic heh has nothing to do with the shape of the two-eyed
isolated form. The visual similarity is superficial and the
functionality is completely different.
Behnam
On 29-Aug-06, at 8:57 PM, Kenneth Whistler wrote:
> Not being an Arabic script expert, I cannot comment
> meaningfully on the details of Kurdish shaping or the
> other claims in this thread, but...
>
>> This is not a Persian letter issue. It's an Arabic letter U+0647 issue
>> for Arabic, old Turkish, Persian... and now perhaps Kurdish, and there
>> may be more.
>> What is called the two-eyed initial form is only used as an initial
>> form and doesn't need a control character.
>> What is produced by control character is only because Unicode doesn't
>> allow any other option but the real intended shape,
>
> That claim seems to me to be incorrect. The Unicode Standard
> provides information about Arabic shaping, but there is
> certainly nothing in the standard which "doesn't allow any
> other option" -- including doing the "right thing" when shaping
> for Kurdish or some other language using the Arabic script.
>
> The encoded presentation forms for Arabic and for Urdu are
> simply compatibility forms, and should certainly not be
> taken as constraining how one should shape the actual
> U+06XX Arabic letters in appropriate contexts. And the
> joining groups displayed in Tables 8-7 and 8-8 of the
> standard should *guide* basic Arabic implementations, but
> again should not be taken as tying anyone's hands from doing
> proper shaping for various styles or languages using the script.
>
>> 'abbreviated
>> form', which BTW is wrongly presented as U+0647 in Unicode PDF, is
>> never joined from the left or right.
>
> The glyph used in U+0647 was chosen deliberately as of Unicode 2.0,
> when production constraints no longer allowed the use of more
> than one representative glyph per character in the chart. Since
> Unicode 2.0, this choice has always been explained in the text of
> the standard. See TUS 4.0, p. 204. It is not wrongly presented -- it
> is merely *a* choice of *a* glyph for HEH, attempting to
> visually distinguish it from other related letters and U+0665
> ARABIC-INDIC DIGIT FIVE in the chart.
>
> --Ken
>
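For readers who want to check the code points mentioned in this thread,
here is a minimal sketch, assuming only Python's standard unicodedata
module. It lists U+0647, the four compatibility presentation forms that
decompose to it, and U+0665. It only reports what the character database
records; it says nothing about how a font or shaping engine should draw
heh for a particular language or style.

```python
import unicodedata

# U+0647 and the four compatibility presentation forms (Arabic
# Presentation Forms-B) whose decomposition maps back to it.
HEH = "\u0647"
PRESENTATION_FORMS = ["\uFEE9", "\uFEEA", "\uFEEB", "\uFEEC"]
DIGIT_FIVE = "\u0665"  # the digit the chart glyph is meant to differ from


def describe(ch):
    """Return (code point, character name, decomposition mapping)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
    )


rows = [describe(ch) for ch in [HEH] + PRESENTATION_FORMS + [DIGIT_FIVE]]
for row in rows:
    print(*row, sep="  ")
```

The decomposition column shows that each presentation form is tagged as
a compatibility mapping to U+0647, which matches the point in the quoted
message that those forms are compatibility characters rather than a
constraint on shaping.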