From: John Hudson (john@tiro.ca)
Date: Tue Nov 18 2008 - 14:43:13 CST
Andreas Stötzner wrote:
> It is highly unrealistic to assume that at least some of the principal 
> fonts will come with sufficient anchor point programming ever. Few 
> well-sponsored specialists like John Hudson may be so lucky to labour on 
> this for months...
It may be helpful to get some idea of the actual amount of work involved 
in adding GPOS mark attachment positioning for arbitrary base+mark 
sequences to an OT font. Doing the job well, for a typical family of four 
fonts (roman, italic, bold, bold italic) supporting three scripts 
(Latin, Greek, Cyrillic) and all the combining mark characters up to 
Unicode 5.0, should take two to four weeks, depending on the nature of 
the design and how quickly one works.
The marks need to be categorised based on shared anchor positions, e.g. 
above-centre, above-centre.cap*, below-centre, above-right, with 
separate anchors for ogoneks and other marks that attach to the base. In 
practical terms, the above-centre and below-centre anchors are the most 
important and will probably account for almost all the real-world uses. 
Even if one added only these to fonts, one would have gone a long way to 
supporting arbitrary combinations that might occur in any text.
* If, as I do, one has variant forms of marks for above uppercase and 
other tall letters, these need to be substituted contextually in the 
<ccmp> GSUB feature. This means, of course, that one only needs to 
provide anchors for such mark variants above uppercase and tall letters, 
and not for the regular mark glyphs.
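As a rough illustration of the categorisation step, here is a minimal Python sketch that groups combining marks by their Unicode canonical combining class. The anchor-class names and the mapping from combining class to anchor class are assumptions made for this example; the combining class is only a first approximation, and the final grouping remains a design decision for each typeface.

```python
import unicodedata

# Assumed mapping from canonical combining class to an anchor class name.
# The names are hypothetical; a real font project would choose its own.
CCC_TO_ANCHOR = {
    230: "above-centre",    # e.g. U+0301 COMBINING ACUTE ACCENT
    220: "below-centre",    # e.g. U+0323 COMBINING DOT BELOW
    232: "above-right",     # e.g. U+0315 COMBINING COMMA ABOVE RIGHT
    202: "attached-below",  # e.g. U+0328 COMBINING OGONEK
}

def anchor_class(ch):
    """Return the assumed anchor class for a single combining character."""
    return CCC_TO_ANCHOR.get(unicodedata.combining(ch), "other")

def group_marks(chars):
    """Group characters by assumed anchor class, as lists of U+XXXX labels."""
    groups = {}
    for ch in chars:
        groups.setdefault(anchor_class(ch), []).append(f"U+{ord(ch):04X}")
    return groups

if __name__ == "__main__":
    # Group the Combining Diacritical Marks block, U+0300..U+036F.
    for name, members in sorted(group_marks(map(chr, range(0x0300, 0x0370))).items()):
        print(name, len(members))
```

Marks that land in "other" (overlays, double diacritics and so on) would need to be reviewed by hand.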
One needs to decide how to handle the precomposed diacritic characters 
as bases for additional marks. My preference is to contextually 
decompose these into simple bases plus marks when followed by a 
combining mark, e.g. (in VOLT syntax):
        Aacute -> A acutecomb
        | <MARK-any>
[Note that I decompose when any mark follows, rather than just another 
above mark, since I ensure that the anchored mark position is the same 
as the composite accent position in the precomposed glyph: there should 
be no visual difference between the /Aacute/ precomposed glyph and the 
GPOS rendering of /A/acutecomb/.]
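To make the behaviour of that contextual rule concrete, the following Python sketch simulates it on a list of glyph names. It is not VOLT or OpenType code; the glyph names and the two lookup tables are hypothetical stand-ins for what a real font would define.

```python
# Hypothetical decomposition table: precomposed glyph -> simple base + mark.
DECOMPOSE = {
    "Aacute": ["A", "acutecomb"],
    "aacute": ["a", "acutecomb"],
}

# Hypothetical stand-in for the <MARK-any> glyph group.
MARKS = {"acutecomb", "gravecomb", "dotbelowcomb"}

def decompose_before_marks(glyphs):
    """Decompose a precomposed glyph only when the next glyph is a mark."""
    out = []
    for i, glyph in enumerate(glyphs):
        following = glyphs[i + 1] if i + 1 < len(glyphs) else None
        if glyph in DECOMPOSE and following in MARKS:
            out.extend(DECOMPOSE[glyph])
        else:
            out.append(glyph)
    return out

if __name__ == "__main__":
    print(decompose_before_marks(["Aacute", "dotbelowcomb", "B"]))
    print(decompose_before_marks(["Aacute", "B"]))
```

In the first call /Aacute/ is followed by a mark and is split; in the second it is left as the precomposed glyph.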
This approach greatly reduces the number of bases on which one needs to 
define anchors. However, if one is concerned about some layout engines 
(incl. at least some of Adobe's) that have problems processing 
one-to-many glyph substitutions (decompositions), then one will need to 
put anchors on all potential bases, including precomposed glyphs such as 
/Aacute/. This is significantly more work.
As I wrote earlier, I think the bottleneck on increasing support for 
GPOS mark positioning is a workflow and tool issue. While it is nice to 
have real-world test cases and hence some knowledge about what base+mark 
sequences will actually occur in text, the benefit of the GPOS anchor 
approach is that it does not rely on such knowledge: it can, and should, 
be able to handle arbitrary combinations. What font developers need to 
figure out are ways to leverage existing data to derive GPOS anchor 
positioning and/or to automate parts of the workflow. One obvious way to 
do this, since one generally wants GPOS mark positioning to accurately 
mimic positioning within precomposed glyphs, is to leverage component 
x,y offset positions as GPOS anchor locations.
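A minimal Python sketch of that idea follows, assuming one already has the component offsets of a composite and the attachment anchor defined on the mark glyph itself. The data structure, function name and the numbers are invented for illustration and do not reflect any particular font tool.

```python
from typing import NamedTuple

class Component(NamedTuple):
    """A component reference inside a composite glyph (hypothetical structure)."""
    glyph: str
    dx: int  # x offset of the component within the composite
    dy: int  # y offset of the component within the composite

def derive_base_anchor(base, mark, mark_anchor):
    """Derive the base glyph's anchor from a composite's component offsets.

    GPOS mark-to-base attachment places the mark so that its own anchor
    coincides with the base anchor. The base anchor that reproduces the
    composite is therefore the mark's offset relative to the base, plus
    the mark's anchor expressed in the mark glyph's own coordinates.
    """
    mx, my = mark_anchor
    return (mark.dx - base.dx + mx, mark.dy - base.dy + my)

if __name__ == "__main__":
    # Invented numbers: /Aacute/ built from /A/ at the origin and /acutecomb/.
    base = Component("A", 0, 0)
    mark = Component("acutecomb", 640, 200)
    print(derive_base_anchor(base, mark, mark_anchor=(-300, 500)))
```

The same derivation could be repeated for one composite per base and anchor class, which only gives a single answer if the component offsets are consistent.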
This is made very much easier if combining mark glyphs, rather than 
spacing accents, are used as components in precomposed diacritics, e.g. 
the /Aacute/ glyph should be a composite of /A/ and /acutecomb/ (U+0301) 
not /acute/ (U+00B4). This is contrary to the evolved practice of many 
font developers and to some tool preconceptions, but these are easily 
revised.
Obviously it is also necessary for mark components to be positioned 
consistently, both on their own zero-widths and within composites that 
share the same base. In other words, the x,y offsets of components such 
as /acutecomb/, /gravecomb/, /circumflexcomb/ and other above-centre 
marks should be identical when applied to the same base letter, such as /a/.
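This kind of consistency can be checked mechanically before any anchors are derived. The Python sketch below assumes the composite data has already been extracted into a simple dictionary; the layout of that dictionary, the glyph names and the offsets are all hypothetical.

```python
from collections import defaultdict

def inconsistent_bases(composites, mark_class):
    """Report (base, anchor class) pairs whose mark offsets disagree.

    composites: composite glyph name -> (base name, mark name, (dx, dy))
    mark_class: mark glyph name -> anchor class name
    """
    seen = defaultdict(set)
    for base, mark, offset in composites.values():
        seen[(base, mark_class[mark])].add(offset)
    # More than one distinct offset means no single anchor can mimic them all.
    return {key: sorted(offsets) for key, offsets in seen.items() if len(offsets) > 1}

if __name__ == "__main__":
    # Invented data: the circumflex sits 8 units to the left of the others.
    composites = {
        "aacute": ("a", "acutecomb", (520, 0)),
        "agrave": ("a", "gravecomb", (520, 0)),
        "acircumflex": ("a", "circumflexcomb", (512, 0)),
    }
    mark_class = {
        "acutecomb": "above-centre",
        "gravecomb": "above-centre",
        "circumflexcomb": "above-centre",
    }
    print(inconsistent_bases(composites, mark_class))
```

An empty result means every composite on a given base agrees, so a single anchor per class can reproduce the precomposed glyphs.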
[This was not the case in recent font data I was working on for a 
client, which made the work of anchor definition much more difficult 
than it should have been; I had to abandon the goal of always having 
the GPOS mark positioning mimic the positioning within the precomposed 
glyphs.]
John Hudson
-- 
Tiro Typeworks www.tiro.com
Gulf Islands, BC tiro@tiro.com
You can't build a healthy democracy with people who
believe in little green men from Venus. -- Arthur C. Clarke