Re: RTL PUA?

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Mon, 22 Aug 2011 16:23:27 +0200

2011/8/22 Shriramana Sharma <samjnaa_at_gmail.com>:
> On 08/22/2011 08:24 AM, Peter Constable wrote:
>>
>> I'm not saying that there shouldn't be _some_ software that can do
>> what you expect. But there will likely be some different views on
>> what ought to be included within that "some".
>
> Peter, given that both AAT and Graphite have provisions for assigning custom
> properties including BC to PUA characters, it seems Uniscribe is the only
> one missing out. Those advocating RTL PUA areas seem to reject AAT and
> Graphite as "hacks" or "wow *one* application" [*].

I don't really know whether AAT allows storing custom character
properties for PUAs. Graphite has some provisions, but it's not
technically documented as a directional character property, in the
sense given by Unicode, because Graphite also works at the level of
glyph IDs and not at the level of code points and characters.

In addition, your statement about Uniscribe is incorrect. This
concerns in fact OpenType in a more general way. But I suspect that
the strong opposition given by Peter Constable is not really about how
or if OpenType can integrate some table to convey the direction
property of PUAs, but that he is more concerned about how the
Uniscribe implementation is layered in its architecture.

I don't think that even the Uniscribe API (in Windows) has to be
modified. In fact when Peter says that the Bidi processing and the
OpenType layout engine are in separate layers (so that the OpenType
layout works in a lower layer and all BiDi processing is done before
any font details are inspected), I think that this is a perfect lie:

Uniscribe's layout already has to inspect the content of any OpenType
font, at least to process its "cmap" table and implement the font
fallback mechanism, just to see which font will match the characters
in the input string to render.
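
To make the fallback step concrete, here is a minimal sketch in Python.
It is not Uniscribe's actual code: the font names, the coverage sets and
the functions pick_font and font_runs are all hypothetical, and a real
engine reads coverage from each font's "cmap" table.

```python
# Hypothetical fonts: each entry is (name, set of code points its cmap covers).
FONTS = [
    ("LatinFont", set(range(0x20, 0x7F))),
    ("HebrewFont", set(range(0x05D0, 0x05EB))),
    ("PuaFont", {0xE000, 0xE001}),
]


def pick_font(ch, fonts=FONTS):
    """Return the name of the first font whose cmap covers ch, else None."""
    cp = ord(ch)
    for name, cmap in fonts:
        if cp in cmap:
            return name
    return None


def font_runs(text, fonts=FONTS):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for ch in text:
        name = pick_font(ch, fonts)
        if runs and runs[-1][0] == name:
            # Same font as the previous character: extend the current run.
            runs[-1] = (name, runs[-1][1] + ch)
        else:
            runs.append((name, ch))
    return runs
```

The point is only that the font is already consulted per character
before any glyph is shaped.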

If it can do that, it can also later inspect a table in the selected
font to see which PUAs are RTL or LTR. And it can use that as a source
of information for BiDi reordering, which does not really reorder
characters but assigns them ordering indices that are later tuned
using knowledge of other character properties. In fact these indices
are also marked with additional attributes (for example when a
character is multipart, each part becoming a separate glyph that
reorders differently from the other part, such as with some Indic
vowels).
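
As a rough illustration of that idea, the sketch below looks up a Bidi
class and lets a per-font table override it for PUA code points only.
The table (font_pua_bidi) is purely hypothetical: no such OpenType table
exists today, which is the whole point of the discussion. Without an
override, PUA code points get their Unicode default class, L.

```python
import unicodedata

# The three Private Use ranges defined by Unicode.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))


def is_pua(cp):
    """True if the code point lies in one of the Private Use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def bidi_class(ch, font_pua_bidi=None):
    """Return the Bidi class of ch.

    font_pua_bidi is a hypothetical {code point: Bidi class} mapping
    supplied by the selected font. It is consulted for PUA code points
    only; all other characters keep their standard Unicode property.
    """
    cp = ord(ch)
    if is_pua(cp):
        overrides = font_pua_bidi or {}
        return overrides.get(cp, "L")  # Unicode default for PUA is L
    return unicodedata.bidirectional(ch)
```

With a font declaring {0xE000: "R"}, U+E000 would enter the BiDi
algorithm as a strong RTL character, while U+E001 would stay L.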

There's already a strong entanglement of glyph reordering and BiDi
reordering. Glyph reordering is also font-dependent (it depends on
whether some features are present). All this entanglement needs to be
part of the OpenType layout engine (in Windows, Uniscribe is not the
only layout engine, even if it is used in full or in part).
Uniscribe is in fact structured in several layers in its API, and the
application using Uniscribe can implement the BiDi algorithm
separately, including between those layers (for example after
Uniscribe's text segmentation). Some complex cases (notably managing
line breaks when formatting paragraphs) require these separate
layers of processing, with which the BiDi algorithm is interleaved.
The BiDi algorithm does not really reorder the characters of a
complete text: it computes an even/odd indicator (based on BiDi
embedding levels) that helps determine runs of characters sharing
the same resolved direction, RTL or LTR. But those runs can be split,
notably by line breaks, whose position depends heavily on glyph
metrics and on the OpenType substitutions of default glyph IDs (from
the cmap) by ligatures or by alternate forms. Some of these
substitutions are mandatory for specific scripts; others are optional
and selected only at the application's discretion (the application
enables some features specifically, or selects a precise alternate).
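
The relation between embedding levels, runs and line breaks can be
sketched as follows. This is only an illustration under simplifying
assumptions: the levels are taken as already resolved, the line break
positions are given by the caller (in practice they depend on glyph
metrics after shaping), and the function name directional_runs is made
up for the example.

```python
def directional_runs(levels, line_breaks=()):
    """Split resolved embedding levels into (start, end, direction) runs.

    levels      -- one resolved embedding level per character
    line_breaks -- indices at which a new line starts (assumed known)
    end is exclusive; an odd level means RTL, an even level means LTR.
    """
    breaks = set(line_breaks)
    runs = []
    start = 0
    for i in range(1, len(levels) + 1):
        at_end = i == len(levels)
        # Close the current run at the end of the text, at a level
        # change, or at a line break falling inside the run.
        if at_end or levels[i] != levels[i - 1] or i in breaks:
            direction = "RTL" if levels[start] % 2 else "LTR"
            runs.append((start, i, direction))
            start = i
    return runs
```

For levels [0, 0, 1, 1, 1, 0], a line break at index 4 cuts the single
RTL run in two, and each piece is then reordered within its own line.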

-- Philippe.
Received on Mon Aug 22 2011 - 09:28:57 CDT