From: Maha Hassan (maha.hassan96@yahoo.com)
Date: Fri May 09 2008 - 18:41:52 CDT
Thanks for the references.
But why does U+06C7 have no decomposition? From an Arabic keyboard I can enter the sequence U+0648 U+0619 and get exactly the glyph of U+06C7. How come U+0623 has a decomposition and U+06C7 does not?
What are the criteria?
Thanks,
Maha
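
The difference between the two characters can be checked directly. The following is only a minimal sketch, assuming Python's standard unicodedata module, whose data follows whichever Unicode version the interpreter ships with; the helper name describe is made up for illustration.

```python
import unicodedata

def describe(cp):
    """Hypothetical helper: return (label, raw decomposition field) for a code point."""
    return f"U+{cp:04X}", unicodedata.decomposition(chr(cp))

# U+0623 carries a canonical decomposition; U+06C7 carries none.
alef_hamza = describe(0x0623)
letter_u = describe(0x06C7)

# Because U+06C7 has no decomposition, WAW + SMALL DAMMA is not
# canonically equivalent to it: NFC leaves the sequence alone,
# and NFD leaves U+06C7 alone.
waw_damma = "\u0648\u0619"
nfc_of_sequence = unicodedata.normalize("NFC", waw_damma)
nfd_of_letter_u = unicodedata.normalize("NFD", "\u06C7")

print(alef_hamza, letter_u)
print(nfc_of_sequence == "\u06C7", nfd_of_letter_u == waw_damma)
```

Looking alike in the chart and being canonically equivalent are separate questions; only the second is recorded in the character data.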
----- Original Message ----
From: Kenneth Whistler <kenw@sybase.com>
To: maha.hassan96@yahoo.com
Cc: unicode@unicode.org
Sent: Friday, May 9, 2008 2:45:54 PM
Subject: Re: Arabic Normalization chart
> I am trying to understand the normalization chart for Arabic.
> Why are certain glyphs not decomposed entirely under KD? For example:
> \FBF0 ==> has KD = \064A\0654\06C7 instead of \064A\0654\0648\0619
> \FBDB ==> has KD = \06C8 instead of \0648\0670
> Am I missing something?
Yes.
U+06C7 and U+06C8 have no decompositions.
06C7;ARABIC LETTER U;Lo;0;AL;;;;;N;ARABIC LETTER WAW WITH DAMMAH;;;;
                            ^^
06C8;ARABIC LETTER YU;Lo;0;AL;;;;;N;ARABIC LETTER WAW WITH ALEF ABOVE;;;;
                             ^^
You cannot infer formal decompositions for letters --
particularly for Arabic -- simply by looking at the
characters in the chart. To get the normative decomposition
status of any particular character (which determines
what its NFD or NFKD or NFC or NFKC normalizations will be),
you have to look at the decomposition field in
UnicodeData.txt (or check in NormalizationTest.txt).
--Ken
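
As a rough illustration of the lookup described above, the decomposition field can be read straight out of a UnicodeData.txt record. This is a sketch under stated assumptions: the two records are the ones quoted in this message, the helper name decomposition_field is hypothetical, and the NFKD results come from Python's standard unicodedata module rather than from the data file itself.

```python
import unicodedata

# The two records quoted in the message, copied verbatim.
RECORDS = [
    "06C7;ARABIC LETTER U;Lo;0;AL;;;;;N;ARABIC LETTER WAW WITH DAMMAH;;;;",
    "06C8;ARABIC LETTER YU;Lo;0;AL;;;;;N;ARABIC LETTER WAW WITH ALEF ABOVE;;;;",
]

def decomposition_field(record):
    """Hypothetical helper: return the decomposition field of a
    semicolon-separated UnicodeData.txt record (index 5, counting from 0)."""
    return record.split(";")[5]

# Map each code point (field 0) to its decomposition field.
fields = {r.split(";")[0]: decomposition_field(r) for r in RECORDS}

def nfkd_code_points(ch):
    """Return the NFKD form of ch as a list of 4-digit hex code points."""
    return [f"{ord(c):04X}" for c in unicodedata.normalize("NFKD", ch)]

# The two presentation forms from the original question.
nfkd_fbf0 = nfkd_code_points("\uFBF0")
nfkd_fbdb = nfkd_code_points("\uFBDB")

print(fields)
print(nfkd_fbf0, nfkd_fbdb)
```

An empty field means the character has no decomposition, so normalization stops at U+06C7 and U+06C8 and never reaches a WAW plus combining mark sequence.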
This archive was generated by hypermail 2.1.5 : Fri May 09 2008 - 18:45:04 CDT