From: John Hudson (john@tiro.ca)
Date: Wed Jun 14 2006 - 23:18:26 CDT
Michael Everson wrote:
>> My approach has been to handle such things at the font level using
>> either ligatures, if dealing with a known and discrete set of
>> sequences for a particular language (e.g. ch with underscore for
>> Ethiopic transcription), or using contextually substituted beginning,
>> middle and end glyphs for arbitrary sequences (e.g. overscores for
>> Greek nomina sacra). In these cases, I think it is certainly desirable
>> to handle the underties or overties in character encoding.
> So you would handle sch with triple undertie how?
Before I suggest some answers to that, let me clarify that I was drawing attention, contra
Ken's post, to the fact that at least some instances of tied sequences might be
properly addressed in text encoding, rather than, as Ken suggests, by markup. In the case of
Amharic transcription, for example, you have separate C and H characters with combining
macrons below (U+0331) and then you have CH with an underscore, and these are clearly
related orthographic conventions, the latter being encodable as a sequence of C with
macron below and H with macron below and displayable as a ligature. So when something like
tied underscores has a logical place within an orthography, I think it makes sense to
deal with them in terms of text encoding.
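To make the Amharic example concrete, here is a minimal Python sketch of the character sequences involved. The function names are illustrative only; the sole assumption taken from the text is that each base letter carries its own U+0331.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW


def underscore_each(letters: str) -> str:
    """Follow every base letter with its own combining macron below."""
    return "".join(ch + MACRON_BELOW for ch in letters)


def describe(text: str) -> list[str]:
    """List the code points of a string with their Unicode names."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


single_c = underscore_each("c")   # C with macron below, on its own
tied_ch = underscore_each("ch")   # the sequence a font could show as a ligature

for line in describe(tied_ch):
    print(line)
```

The encoded text for the tied CH is simply the two individually underscored letters in a row; joining the two macrons into one underscore is left to the font.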
Ken suggests that
    Once spanning mechanisms go beyond two base characters,
    it is no longer useful to try to treat them as encoded
    characters, as it is increasingly unlikely that appropriate
    rendering mechanisms will be available for them.
and hence
    the representation of such in digital text should be
    handled by style and markup, rather than by seeking
    solutions in character encoding.
The trouble I have with this is that 'style and markup' is not a magic solution that makes
rendering problems and limitations disappear. Indeed, in my experience the relationship of
markup to rendering is often much more complicated and difficult than the relationship of
character encoding to rendering, which has well established mechanisms within font
technologies.
There are obvious cases in which tying marks should be handled at some level above text
encoding and typical font shaping, e.g. in music notation, whether for voice or
instrument, or mathematical layout. Nomina sacra is an interesting edge case, because one
can easily see how it might be sensibly handled in markup and tying lines drawn by
applications independent of text encoding. But as it happens we have a working mechanism
using the combining overline character (U+0305), and as I understand it the productive use
of this character is presumed in the Coptic encoding.
Of course, the productive use of the overline is easy, since the line is straight. Really
good typographic representation requires beginning, middle and end glyph variants to make
nomina sacra look nice, but a roughly acceptable rendering can be achieved simply by
putting the base glyphs and combining marks together in a row.
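Because each base letter carries its own U+0305, an overlined run of any length can be recovered from the plain text. The following Python sketch shows one way a process might locate such runs; the function name and the sample string are illustrative assumptions, not part of any existing implementation.

```python
OVERLINE = "\u0305"  # COMBINING OVERLINE


def overlined_runs(text: str) -> list[str]:
    """Return the base letters of each maximal run of base + U+0305 pairs."""
    runs: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(text):
        has_overline = (
            text[i] != OVERLINE
            and i + 1 < len(text)
            and text[i + 1] == OVERLINE
        )
        if has_overline:
            current.append(text[i])
            i += 2  # skip the base letter and its overline
        else:
            if current:  # an unmarked character ends the run
                runs.append("".join(current))
                current = []
            i += 1
    if current:
        runs.append("".join(current))
    return runs


# Theta and sigma, each overlined, between two unmarked letters.
sample = "\u03bf \u0398\u0305\u03a3\u0305 \u03ba"
print(overlined_runs(sample))
```

Knowing where a run starts and ends is what a font needs in order to pick beginning, middle and end variants of the overline.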
The case of curved ties, as in the sch example, is more complicated, but not insoluble. The
important thing, I think, is to give up on the idea of encoding combining tie marks, e.g.
U+0361, altogether, and instead encode sequences of tied characters with individual
combining marks (as in the Amharic example above) with a control character such as ZWJ
indicating desired ligation. So the sch with undertie might be encoded as e.g.
0073 032E ZWJ 0063 032E ZWJ 0068 032E
Obviously, this is only going to be optimally rendered using a specialised font, but such
sequences are specialised by nature.
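As a small Python sketch of how such a sequence could be built, assuming the scheme suggested here (this is a proposal, not established Unicode practice, and the function names are illustrative):

```python
ZWJ = "\u200D"          # ZERO WIDTH JOINER
BREVE_BELOW = "\u032E"  # COMBINING BREVE BELOW


def tied(letters: str, mark: str = BREVE_BELOW) -> str:
    """Give each letter its own combining mark and join the pairs with ZWJ."""
    return ZWJ.join(ch + mark for ch in letters)


def codepoints(text: str) -> str:
    """Render a string as space-separated hex code points, writing ZWJ by name."""
    return " ".join("ZWJ" if ch == ZWJ else f"{ord(ch):04X}" for ch in text)


print(codepoints(tied("sch")))
```

The same function handles a run of any length, and passing a different mark (for example U+0331) gives the underscore case.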
From a font design perspective, there are two methods for rendering such a sequence.
Either a whole sequence can be rendered as a ligature, or sequences of arbitrary length
can be handled by making the middle section of the tie straight and only the terminals
curved. This is similar to the approach taken with growing delimiters in mathematical
typesetting.
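The two methods can be sketched together in Python as a ligature lookup with a fallback to positional tie parts. All glyph names and the ligature table are hypothetical; in a real font this logic would live in the font's substitution rules rather than in application code.

```python
# Hypothetical ligature glyphs for a known, closed set of tied sequences.
LIGATURES = {
    "ch": "c_h.undertie",
    "sch": "s_c_h.undertie",
}


def glyphs_for_tied_run(letters: str) -> list[str]:
    """Map the base letters of a tied run to a list of glyph names."""
    # Method 1: the whole sequence is drawn as a single ligature glyph.
    if letters in LIGATURES:
        return [LIGATURES[letters]]
    if len(letters) < 2:
        raise ValueError("a tie needs at least two base letters")
    # Method 2: curved terminals at both ends, straight pieces in between.
    glyphs: list[str] = []
    last = len(letters) - 1
    for i, ch in enumerate(letters):
        if i == 0:
            part = "tie.left"    # curved beginning (hypothetical name)
        elif i == last:
            part = "tie.right"   # curved end (hypothetical name)
        else:
            part = "tie.mid"     # straight middle (hypothetical name)
        glyphs += [ch, part]
    return glyphs


print(glyphs_for_tied_run("sch"))
print(glyphs_for_tied_run("tsch"))
```

The ligature path gives the best shapes for known sequences, while the positional path trades a flattened middle for the ability to span any number of letters.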
John Hudson
--
Tiro Typeworks        www.tiro.com
Vancouver, BC        john@tiro.ca
I am not yet so lost in lexicography, as to forget
that words are the daughters of earth, and that things
are the sons of heaven. - Samuel Johnson