Re: RTL PUA? from Philippe Verdy on 2011-08-23 (Unicode Mail List Archive)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Tue, 23 Aug 2011 23:05:39 +0200

2011/8/23 John Hudson <john_at_tiro.ca>:
> Behdad Esfahbod wrote:
>
>>> I can see the advantages of such an approach -- performing GSUB prior to
>>> BiDi would enable cross-directional contextual substitutions, which are
>>> currently impossible -- but the existing model in which BiDi is applied to
>>> characters *not glyphs* isn't likely to change. Switching from processing
>>> GSUB lookups in logical order rather than reading order would break too
>>> many things.
>
>> You can't get cross-directional-run GSUB either way because by definition
>> GSUB in an RTL run runs RTL, and GSUB in an LTR run runs LTR. If you do it
>> before Bidi, you get, eg, kerning between two glyphs which end up being
>> reordered far apart from eachother. You really want GSUB to be applied on
>> the visual glyph string, but which direction it runs is a different issue.
>
> Kerning is GPOS, not GSUB.
>
> But generally I agree. My point was that Philippe's suggestion, although it
> could be the basis of an alternative form of layout that might have some
> benefits if fully worked out, is a radical departure from how OpenType
> works.

Rereading the OpenType spec closely, I in fact don't see any major
problem even if the Bidi algorithm is applied last, after applying
not only the GSUB lookups (ligatures, custom Indic reordering of
multipart vowels or ra forms) but also the GPOS lookups (yes, this
covers kerning, i.e. base-to-base adjustment, but also mark-to-base
and mark-to-mark positioning).

I admit that this would violate some existing rules implied in some
implementations, but at least it would offer some additional benefits.
However, if one really wants to implement kerning between LTR runs and
RTL runs (e.g. between an Arabic letter and a Latin letter), one would
need to make sure that Bidi reordering has been performed before GPOS
(and this really is the case...).

Processing such kerning pairs would require a convention other than
the "resolved" direction: the pairs would have to be scanned so that
the first item of each pair is always the leftmost glyph. GPOS is in
fact more powerful than that, because it can involve more than simple
pairs, using longer contexts on both the left and the right of the
tested glyphs.
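
To make the "leftmost item first" convention concrete, here is a
minimal Python sketch. Everything in it is hypothetical: the glyph
names, the advance widths, the kerning values and both helper
functions are invented for illustration, and the reordering step only
handles embedding levels 0 and 1 (it is not the full Bidi algorithm).
It only shows that a pair such as (r, alef) can be matched when the
glyphs are scanned in visual order, and is missed when they are
scanned in logical order.

```python
from typing import Dict, List, Tuple

# Hypothetical advance widths, in font units.
ADV: Dict[str, int] = {"T": 600, "o": 500, "r": 400, "alef": 300, "beh": 700}

# Hypothetical kerning table keyed by (left glyph, right glyph),
# i.e. the first item of each pair is always the leftmost on the line.
KERN: Dict[Tuple[str, str], int] = {
    ("T", "o"): -40,
    ("r", "alef"): -15,  # cross-script pair: Latin on the left, Arabic on the right
}

def reorder_simple(glyphs: List[str], levels: List[int]) -> List[str]:
    """Reverse each maximal run of odd-level glyphs.

    Simplification: correct only for levels 0 and 1, not the full UAX #9 rules.
    """
    out: List[str] = []
    i = 0
    while i < len(glyphs):
        j = i
        while j < len(glyphs) and levels[j] == levels[i]:
            j += 1
        run = glyphs[i:j]
        out.extend(reversed(run) if levels[i] % 2 else run)
        i = j
    return out

def kern_positions(glyphs: List[str]) -> List[int]:
    """Lay glyphs out left to right, kerning each glyph against its left neighbour."""
    positions: List[int] = []
    x = 0
    for i, g in enumerate(glyphs):
        if i > 0:
            x += KERN.get((glyphs[i - 1], g), 0)
        positions.append(x)
        x += ADV[g]
    return positions

# Logical order: Latin "Tor" followed by an Arabic run stored as beh, alef.
logical = ["T", "o", "r", "beh", "alef"]
levels = [0, 0, 0, 1, 1]

visual = reorder_simple(logical, levels)  # the Arabic run is reversed for display
print(visual)
print(kern_positions(visual))   # the (r, alef) pair is found at the run boundary
print(kern_positions(logical))  # scanning logical order sees (r, beh) and misses it
```

A real GPOS implementation would of course work on glyph IDs and
lookup tables rather than on names in a dictionary, and could use
longer contexts than pairs; the sketch only isolates the ordering
question.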

But the existence of such complex positioning rules would create
difficulties for the actual readers of the rendered text, because they
would not know from which side to start reading a word that displays,
for example, a run of Latin letters on one side and a run of Arabic
letters on the other. Suppose they start with the Arabic part, in its
normal order: how should they then read the Latin part of this
strange «word»?

Still, this is not a silly case: such positioning problems occur at
the boundaries of words, where there are whitespaces. Once you have
"resolved" the direction of those whitespaces, there is a boundary
with the next word, which may use another direction. What happens at
those whitespaces is that you may find typographic elements (such as
swashes) which should not overflow onto the next part.

Currently it is assumed that writers will use a larger whitespace
character if needed, to avoid collisions. But if the whitespace is
very narrow, or is zero-width, the problem of kerning immediately
resurfaces. I mean kerning in its traditional typographic definition,
whose purpose is to improve the legibility of the rendered text by
exhibiting visually constant spacing between words and between
letters, so that inter-letter separation will not be confused with
inter-word separation.

I admit that this (extremely rare) problem is much less critical with
the Arabic script (because it is always cursive and most letters in
the same word are joined), but this means that the problem may be more
significant between Latin and Hebrew, or more probably between Greek
and Hebrew (in very old historic texts, where even the Greek script
did not have a strong LTR directionality, and where whitespace was not
always used between words).
Received on Tue Aug 23 2011 - 16:08:26 CDT
