From: Mete Kural (metekural@yahoo.com)
Date: Fri Feb 28 2003 - 13:05:07 EST
Hello Folks,
I have a question for those of you who have
Unicode Arabic knowledge. We have this website
http://www.quranreader.org where we are trying to
display the text of the Quran with accurately encoded
Unicode text rather than the traditional images. Some
of the characters in the Quran aren't rendered
correctly. We let the browser use its
default Unicode font on the website, which I think is
Times New Roman Unicode for the newer versions of
Internet Explorer. If we used a high-quality Unicode
font for Arabic, would this solve the problem? Or is
this a bigger problem that has to do with the
rendering engine provided by the operating system?
I would like to give you an example. In Arabic, when
a Lam and an Alef occur together, they are rendered as
a special ligature instead of with their regular
shapes, which looks roughly like this:
\  /
 \/
 /\
 \/
Figure 1
In the Quran, this combination of characters
sometimes occurs: Lam-Hamza-Alef
In such a case, the Lam and Alef are still rendered
the way they would be had there been no hamza
in between, and the hamza is simply placed above the
alef and lam, in the middle, which looks roughly like this:
 c
\  /
 \/
 /\
 \/
Figure 2
Note that this differs from the case illustrated
in Figure 3, where the hamza is directly
above the alef and not "in between" the lam and alef.
c
\  /
 \/
 /\
 \/
Figure 3
So the subtle difference is that the hamza is not
directly above the alef but rather in between the alef
and the lam. I am attaching a small GIF file named
"Sample.gif" that demonstrates the subtle
difference in the positioning of the hamza. It contains
two words from the Quran. Look at the second word,
where the hamza is in between the alef and the lam
instead of directly above the alef.
When we encode this case with the following sequence of
Unicode characters: 0644-0627-0621,
Internet Explorer, instead of showing it like
Figure 2, completely separates all the letters and shows
it like this:
| |
| |
| C \__/
which is totally wrong.
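To make clear exactly what is being sent to the browser, the sequence above can be inspected code point by code point. The following is only a minimal sketch, assuming Python 3 and its standard unicodedata module; the helper name describe is hypothetical. It prints the name, general category, and combining class of each character in 0644-0627-0621. For comparison only, it also prints U+0654, a combining hamza mark that the sequence above does not use; whether that character is the right choice for this case is exactly the open question.

```python
import unicodedata


def describe(text):
    """Return (code point, name, general category, combining class) per character.

    'describe' is a hypothetical helper name used only for this illustration.
    """
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch),
        )
        for ch in text
    ]


# The sequence quoted in the message, in logical (typing) order: 0644-0627-0621.
sequence = "\u0644\u0627\u0621"
for row in describe(sequence):
    print(row)

# Comparison only (an assumption, not something the message uses):
# U+0654 is a combining mark rather than a stand-alone letter.
for row in describe("\u0654"):
    print(row)
```

All three characters in the quoted sequence are reported as stand-alone letters with combining class 0, in the order lam, alef, hamza, while U+0654 is reported as a nonspacing mark. This only describes the encoded data; it does not by itself show whether the font or the rendering engine is at fault.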
Which one do you think is the problem here?
1) We are not encoding this combination of characters
in the correct way.
2) This is a font-related problem.
3) This is a bigger problem for which the rendering
engine on the operating system has to be modified.
Thank you very very much,
Mete Kural