From: John Hudson (john@tiro.ca)
Date: Sat Jan 05 2008 - 01:32:38 CST
arno wrote:
> a chairless hamza after a dual joining Arabic letter followed by a
> joining Arabic letter is ALWAYS either transparent (between lam and
> alef) or inserts a tatweel like connection between the two letters
> ALWAYS
Yes, I understand all of this. I avoid the term tatweel, though, because elongation models
in Arabic are style-specific and insertion of a horizontal extender is
technology-specific, i.e. there are styles in which tatweel is inappropriate and there are
technologies that implement elongation without inserting extender glyphs.
> = in MSA it is a typo, that's why your fonts do not behave
> properly, because the designers do not envision the case (whenever
> somebody write it on the machine, she immediately corrects it);
I don't understand this section of your message.
> As far as Arabic is concerned -- and this is of course an important
> qualification -- all your arguments against modifying the official
> joining behaviour of chairless hamza are baseless.
Note the careful distinction between character and glyph in the following discussion:
Let's say we have a typical Arabic font. And we want to display the sequence lam +
chairless hamza + alif. You suggest that the definition of U+0621 be changed so that it
will not interrupt the joining of lam + alif, i.e. that it will be transparent (which we
all agree, I think, is how it should be). What happens in most current fonts is that there
is a single glyph associated with U+0621, and this *glyph* interrupts the formation of the
lam+alif combination, which is most often handled as a ligature glyph (although not in
the SIL fonts). So one ends up with a disconnected sequence:
لءا
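To make the character-level half of this concrete, here is a minimal sketch of how a layout engine might pick contextual forms from joining types. It is a toy model, not any real engine's code: the function name `joining_forms` and the form labels are my own, and only three characters are covered. The joining types used for the current case (lam = D, alif = R, U+0621 = U) are the ones in the Unicode data; the "proposed" table simply swaps U+0621 to T (transparent).

```python
# Toy model of contextual form selection; names and labels are illustrative.
LAM, HAMZA, ALIF = "\u0644", "\u0621", "\u0627"

# D = dual-joining, R = right-joining, U = non-joining, T = transparent
CURRENT = {LAM: "D", ALIF: "R", HAMZA: "U"}
PROPOSED = {**CURRENT, HAMZA: "T"}  # assumption: the change under discussion


def joining_forms(text, joining):
    """Return one form label per character, in logical order."""
    forms = []
    for i, ch in enumerate(text):
        jt = joining.get(ch, "U")
        if jt == "T":
            forms.append("mark")  # transparent: takes no part in joining
            continue
        # Nearest non-transparent neighbours on either side.
        before = [joining.get(c, "U") for c in reversed(text[:i])]
        after = [joining.get(c, "U") for c in text[i + 1:]]
        prev = next((t for t in before if t != "T"), "U")
        nxt = next((t for t in after if t != "T"), "U")
        joins_prev = jt in ("D", "R") and prev == "D"
        joins_next = jt == "D" and nxt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms
```

With `CURRENT`, `joining_forms(LAM + HAMZA + ALIF, CURRENT)` leaves all three characters isolated, which is the disconnected sequence shown above. With `PROPOSED`, lam becomes initial and alif final, with the hamza passed over.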
This is not merely an issue of the current Unicode properties for this character, but also
of existing implementations of those properties in fonts and other software. Simply
changing the Unicode properties doesn't make these implementations go away or magically
make them work with the new properties.
So let's say that a layout engine that implements your proposed new properties for U+0621
encounters a typical Arabic OpenType font that does not. The layout engine treats U+0621
as transparent, and applies appropriate shaping to the lam and alif. Even so, there is no
guarantee that the lam+alif ligature in the font will correctly form, because the hamza
*glyph* interrupts the lookup sequence:
Lam.init Alif.fina -> Lam_Alif
The sequence of glyphs needs to be present in order for the ligature to form, but you are
probably going to end up with:
Lam.init Hamza Alif.fina
In order for the hamza to be ignored in this sequence, it needs to be treated in the font
as a non-spacing mark -- which is of course exactly what you want it to be in this context
-- but that means it has to be defined as such in the font GDEF table. Since the font has
been built around the assumption that U+0621 is a non-joining character, the glyph is not
defined as a non-spacing mark and is not ignored during ligature formation.
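The dependence on the glyph class can be sketched as follows. This is a simplified simulation of a ligature lookup that skips marks, not OpenType Layout itself: `apply_ligature` and the `glyph_classes` dictionary (standing in for the GDEF glyph class definitions) are hypothetical names, and the glyph names are the ones used above.

```python
# Toy model of a ligature lookup with an "ignore marks" flag; not a real shaper.
BASE, MARK = "base", "mark"


def apply_ligature(glyphs, components, ligature, glyph_classes, ignore_marks=True):
    """Replace the first match of `components` in `glyphs` with `ligature`.

    Glyphs classed as MARK are skipped while matching (after the first
    component) when ignore_marks is set; skipped marks are emitted after
    the ligature glyph. Unlisted glyphs are assumed to be BASE.
    """
    for start in range(len(glyphs)):
        matched, skipped, pos = 0, [], start
        while pos < len(glyphs) and matched < len(components):
            g = glyphs[pos]
            if g == components[matched]:
                matched += 1
            elif ignore_marks and matched > 0 and glyph_classes.get(g, BASE) == MARK:
                skipped.append(g)
            else:
                break  # an intervening base glyph interrupts the match
            pos += 1
        if matched == len(components):
            return glyphs[:start] + [ligature] + skipped + glyphs[pos:]
    return list(glyphs)


run = ["Lam.init", "Hamza", "Alif.fina"]
rule = (["Lam.init", "Alif.fina"], "Lam_Alif")

typical_font = {"Hamza": BASE}  # hamza built as an ordinary spacing glyph
updated_font = {"Hamza": MARK}  # hamza classed as a non-spacing mark
```

Under `typical_font`, `apply_ligature(run, *rule, typical_font)` returns the run unchanged, because the hamza glyph is a base glyph and breaks the match. Under `updated_font` the same call returns `["Lam_Alif", "Hamza"]`: the ligature forms and the skipped hamza comes out after it.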
But let's pretend for a moment that the lam+alif ligature does form correctly: you are
left with this (typically quite large) hamza glyph that is now reordered after the
ligature glyph (which is what happens when a ligature forms while ignoring an intermediary
glyph) and has no information in the font telling the layout engine what to do with this
glyph: no substitution information that will convert it into a non-spacing mark glyph, no
positioning information that will locate it correctly relative to the lam+alif.
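A rough sketch of the missing information, under the assumption that a font is represented as a plain dictionary: the keys `mark_forms` and `ligature_anchors`, the glyph name `Hamza.mark`, the function `place_hamza`, and the anchor coordinates are all invented for illustration and do not correspond to any real font or API.

```python
# Toy lookup of the data a font would need in order to handle a hamza
# that ends up after the lam+alif ligature. All names are hypothetical.

def place_hamza(font, ligature, mark, component=0):
    """Return the glyph to draw and where to attach it, if the font says so."""
    mark_form = font.get("mark_forms", {}).get(mark)
    anchors = font.get("ligature_anchors", {}).get(ligature)
    if mark_form is None or anchors is None:
        # No substitution or positioning data: the full-size glyph is left
        # as-is, unattached, after the ligature.
        return {"glyph": mark, "attached": False, "anchor": None}
    return {"glyph": mark_form, "attached": True, "anchor": anchors[component]}


typical_font = {}  # built on the assumption that U+0621 never joins
updated_font = {
    "mark_forms": {"Hamza": "Hamza.mark"},
    # One anchor per ligature component; coordinates are made up.
    "ligature_anchors": {"Lam_Alif": [(310, 620), (90, 700)]},
}
```

For `typical_font`, `place_hamza(typical_font, "Lam_Alif", "Hamza")` reports the original glyph with no attachment, which is the situation described above. Only a font that carries both pieces of data, as `updated_font` pretends to, gives the layout engine something to work with.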
Whether chairless hamza is addressed at the new character level (as suggested by Khaled)
or at the character properties level (as suggested by you) or at the display level (as
suggested by me), layout engines and fonts are going to need to be updated to handle
it correctly. My concern is that the solution should avoid actually breaking existing
implementations, and the way to avoid this seems to be to not tamper with U+0621. I
wouldn't mind seeing U+0621 formally deprecated -- i.e. maintained in the standard, but
not recommended to be used -- and a new character with correct properties and
implementation guidelines introduced to replace it. That would effectively freeze the
current implementations for that character, allow them to continue to function as they
have been, and not introduce new demands for the handling of this character that existing
layout engines and fonts cannot meet.
John Hudson
--
Tiro Typeworks www.tiro.com
Gulf Islands, BC tiro@tiro.com

The Lord entered her to become a servant.
The Word entered her to keep silence in her womb.
The thunder entered her to be quiet.
-- St Ephrem the Syrian