Re: RTL PUA?

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Sun, 21 Aug 2011 22:31:38 +0200

2011/8/21 Peter Constable <petercon_at_microsoft.com>:
> From: verdyp_at_gmail.com [mailto:verdyp_at_gmail.com] On Behalf Of Philippe Verdy
>
>> A GSUB operation will only be used if it is specified in the correct feature
>> table. The problem here is which feature to use: "rtlm" or "ltrm" ? It's
>> impossible to know because it first depends on the layout engine to KNOW
>> exactly if the run of text is RTL or LTR.
>
> The layout engine already _has_ to know the bidi level of a run regardless.
>
>
>> Without a font-level support of BiDi properties of PUAs (or unassigned
>> characters),
>
> I'm trying to tell you that, wrt mirroring, that's already defined in the OpenType spec.
>
>
>> the layout engine will assume the wrong guess from the "default" property
>> value. And then it won't find the expected GSUB operation, because it won't
>> match it in the correct feature subtable.
>
> As I explained in an earlier message, the layout engine doesn't use the "default" property value but the resolved bidi level.
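
For readers following the thread, the selection Peter describes can be sketched as below. This is only an illustration, assuming the layout engine has already run the bidi algorithm and holds a resolved embedding level for the run; the function name `mirroring_feature` is made up for this example and is not part of any OpenType API.

```python
def mirroring_feature(resolved_level: int) -> str:
    """Return the mirroring feature tag for a run at this resolved bidi level.

    Hypothetical helper for illustration; real engines do this internally.
    """
    if resolved_level < 0:
        raise ValueError("bidi embedding levels are non-negative")
    # In the Unicode bidi algorithm, even levels are left-to-right
    # and odd levels are right-to-left.
    return "rtlm" if resolved_level % 2 else "ltrm"
```

The point of disagreement in this thread is not this step, but how the level of a run made of PUA characters gets resolved in the first place.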

Once again, you refuse to understand my arguments. What I'm saying is
that OpenType CANNOT resolve the bidi level of PUAs (except when we
add BiDi controls, which remains a hack, because it adds unnecessary
invisible markup around the encoded text and complicates the handling
of strings and substrings).

You can turn the problem around any way you want, but PUAs (as well as
unknown characters) still have default properties that will ultimately
get used in the absence of a more precise definition (i.e. an explicit
override) of the actual BiDi property needed for the character.
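
A minimal sketch of that fallback, under stated assumptions: the `overrides` mapping stands for whatever explicit source of properties might exist (it is hypothetical; no such table exists in fonts today), and `pua_bidi_class` is a name invented for this example. The only fact taken from the standard is that Private Use code points default to Bidi_Class "L".

```python
# The three Private Use ranges defined by Unicode.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))


def is_private_use(cp: int) -> bool:
    """True if the code point lies in one of the Private Use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def pua_bidi_class(cp: int, overrides: dict) -> str:
    """Bidi class used for a PUA code point.

    `overrides` is a hypothetical explicit override table (code point ->
    bidi class); without an entry, the default "L" is what gets used.
    """
    if not is_private_use(cp):
        raise ValueError("not a Private Use code point")
    return overrides.get(cp, "L")
```

With an empty override table, U+E000 is treated as left-to-right; only an explicit entry such as `{0xE000: "R"}` changes that.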

> Btw, in the past few weeks, you've written several posts in which you make assertions about how rendering implementations work and, in some cases, why more is needed. And then I or others have to spend a bunch of time writing responses so that you get the correct understanding and, more importantly, so that others don't get misled. It would be a lot easier if you just asked, "How is this done?"

Ok, you've replied, but not completely.

And at least on this point, Michael Everson is also right when he says
that PUAs do not properly handle RTL scripts, solely because of their
default BiDi property value. But I don't support his idea of encoding
new PUAs, when in fact we can provide the additional character
properties needed, for example in fonts, without changing the default
property of PUAs (I don't support that at all, and you probably don't
either), without allocating more (unneeded) PUA block(s) for RTL
scripts, and without hacking on top of another existing set of
assigned RTL characters.

I did not post any assertion about how OpenType could be used; I just
wanted to explain that, with the current specifications, it cannot
*currently* resolve the problem (and Michael Everson certainly fully
agrees with that, but he can reply as well if he thinks that I have
misinterpreted his last few messages).

We really need a reliable way to transport a PUA agreement in a form
that a computer can understand. A font can transport this information
reliably, provided it includes at least the necessary character
property values. It also offers a smooth transition throughout the
encoding process of new scripts (notably during experimentation), as
well as afterwards, while they are adopted for more general use (before
a large majority of users can run updated text renderers that
automatically provide those properties for newly encoded characters and
scripts).

This is simply because it's MUCH easier to upgrade a font (especially a
PUA font which is not part of the core fonts of the operating system)
than to upgrade a rendering engine (which is bound to the OS in the
case of Microsoft APIs and libraries in Windows). An extensible set of
properties, managed with a good rule of priorities to avoid hacks or
non-compliant implementations, can certainly accelerate development
and adoption by many years, increase the number of experiments
possible, and help avoid errors during the encoding process for new
characters and scripts.
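
One possible rule of priorities could look like the sketch below. Everything in it is an assumption made for illustration: `engine_ucd` stands for the properties the engine's own UCD version assigns, `font_table` stands for a transitional property table carried by the font, and neither name corresponds to an existing API or font table.

```python
def resolve_property(cp: int, engine_ucd: dict, font_table: dict, default: str) -> str:
    """Pick a property value using a fixed priority order.

    1. What the engine's own UCD version assigns (a newer engine thus
       ignores the font-supplied table for characters it already knows).
    2. What the font's transitional table supplies (hypothetical).
    3. The default value for the code point.
    """
    if cp in engine_ucd:
        return engine_ucd[cp]
    if cp in font_table:
        return font_table[cp]
    return default


# Illustrative data only: a font declares U+E000 as right-to-left.
font_table = {0xE000: "R"}
old_engine = {}             # knows nothing about this code point
new_engine = {0xE000: "AL"}  # pretend a later engine has its own value

print(resolve_property(0xE000, old_engine, font_table, "L"))
print(resolve_property(0xE000, new_engine, font_table, "L"))
print(resolve_property(0xE001, old_engine, font_table, "L"))
```

Putting the engine's own data first is a design choice for this sketch: it keeps the font table purely transitional, so it cannot contradict properties the engine already has.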

It could reduce this delay from about 10 years (during which, even if
the script or characters are encoded, they will not be available or
reliably usable) to just a few months. Before the final encoding in the
UCS, the script can be represented reliably as PUAs managed with the
help of a PUA font; after it is encoded in the UCD, a font can provide
the upgrade for older layout engines that only know an older UCD
version.

I am completely convinced that the continuing lack of support in
existing software does not mean we need more PUAs. Software can
still be updated to provide that support with the help of transitional
subtables in fonts (which newer engines that don't require such
extension tables can easily ignore) for integrating the additional
character properties.

Philippe.
Received on Sun Aug 21 2011 - 15:36:34 CDT