This message is cross-posted to the OpenType and Unicode lists. Feel free 
to forward it to any other lists or individuals to whom it might be of 
interest.
There has recently been discussion on the Unicode list of Latin ligatures 
and the appropriateness or inappropriateness of using the Zero Width Joiner 
(ZWJ) character to specify ligation in Latin-script documents. There is a 
difference of opinion between those who believe that
a) Ligatures are classifiable and some classes of ligatures should be 
active by default for normal Latin-script text (e.g. the f-ligatures that 
are designed to improve the spacing of letter sequences involving that 
letter). Ligatures should be activated or deactivated in line layout, via 
the application of layout features (e.g. OpenType GSUB) to normal text such 
that
        o f f i c e -> o ffi c e
This facilitates the typesetting of any electronic document, without 
reference to authorial decision regarding the use of ligatures, according 
to long-standing typographic traditions, publishing house-styles, etc.
Such classes of ligatures have been in continuous use as standard elements 
of Latin-script typography for more than 500 years. They improve the 
'colour' of text and while it is important for users to be able to 
deactivate them, particularly when text is set at large sizes, they should 
be active by default.
Other classes of ligatures are principally decorative, e.g. the historic ct 
and st ligatures, and these have not been in continuous use as standard 
elements of Latin-script typography. These should also be activated via 
layout features, but these features should not be active by default.
The current registered OpenType Layout features provide separate mechanisms 
for these different classes of ligatures: Standard Ligatures <liga> and 
Discretionary Ligatures <dlig>.
and those who believe
b) The use of all ligatures is discretionary and exceptional, and should be 
determined by the author of the document using the ZWJ character to signify 
ligation of two or more characters, such that
        o f f i c e -> o f f i c e
but
        o f ZWJ f ZWJ i c e -> o ffi c e
It is acknowledged that there are scripts and orthographic traditions, e.g. 
Runic and Old Hungarian, in which ligature use is not standard but is 
encountered as a freely applied manuscript element, i.e. the same sequence 
of letters may be ligated in one occurrence but not in the next. Such usage 
has been amply documented by Michael Everson. Latin ligatures should be 
treated in the same way, and their application should be explicit in text 
rather than in layout.
There are clearly documents in which the presence or absence of ligation in 
specific circumstances is important to the correct display and/or 
understanding of the text. Authors citing older documents, especially in 
palaeographic studies, need to be able to indicate whether ligatures were 
used or not used, or used inconsistently, in the original. This information 
needs to travel with the electronic document, and must not be subject to 
changes to layout in downstream applications.
My own opinion is that both views include valid needs. As a professional 
typographer and type designer, I fully endorse the first view: some 
ligature classes are standard elements of well-formed typography, as 
appropriate to the typeface in use, and are not optional in 'normal' texts 
(i.e. texts in which the presence or absence of ligatures is not 
significant to the content). On the other hand, while I reject the notion 
that all ligature use should be authorially determined, I believe that 
there are many legitimate circumstances for such determination, especially 
in the area of manuscript and document studies, and that there needs to be 
a mechanism for fonts to provide correct shaping *independent* of existing 
mechanisms for standard or discretionary ligatures in 'normal' text.
Since the use of the ZWJ character to signify ligation implies a clear 
authorial directive that ligation *must* be used to correctly represent the 
sequence, I would like to propose that font developers interested in 
supporting the use of ZWJ in ligation should do so in the Required 
Ligatures <rlig> feature (currently used principally for obligatory Arabic 
ligatures). This provides a separate mechanism for forming such ligatures 
that will not be affected by the deactivation of the Standard or Discretionary 
Ligature features. A set of lookups for a common Latin font might look like 
this:
<rlig>
        f ZWJ f ZWJ i -> ffi
        f ZWJ f ZWJ j -> ffj
        f ZWJ f ZWJ l -> ffl
        f ZWJ f -> ff
        f ZWJ i -> fi
        f ZWJ j -> fj
        f ZWJ l -> fl
<liga>
        f f i -> ffi
        f f j -> ffj
        f f l -> ffl
        f f -> ff
        f i -> fi
        f j -> fj
        f l -> fl
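
The interaction between the two features can be sketched in a few lines of Python. This is only a toy model of the lookups listed above, not a real layout engine: the glyph names (f_f_i etc.), the apply_lookups and shape functions, and the choice to drop any leftover ZWJ from the output are all assumptions made for illustration.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Each rule maps an input glyph sequence to one (hypothetical) ligature glyph.
# Longer sequences come first so that ffi is tried before ff.
RLIG = [
    (("f", ZWJ, "f", ZWJ, "i"), "f_f_i"),
    (("f", ZWJ, "f", ZWJ, "j"), "f_f_j"),
    (("f", ZWJ, "f", ZWJ, "l"), "f_f_l"),
    (("f", ZWJ, "f"), "f_f"),
    (("f", ZWJ, "i"), "f_i"),
    (("f", ZWJ, "j"), "f_j"),
    (("f", ZWJ, "l"), "f_l"),
]
LIGA = [
    (("f", "f", "i"), "f_f_i"),
    (("f", "f", "j"), "f_f_j"),
    (("f", "f", "l"), "f_f_l"),
    (("f", "f"), "f_f"),
    (("f", "i"), "f_i"),
    (("f", "j"), "f_j"),
    (("f", "l"), "f_l"),
]


def apply_lookups(glyphs, rules):
    """Scan left to right; at each position use the first rule that matches."""
    out = []
    i = 0
    while i < len(glyphs):
        for seq, ligature in rules:
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(ligature)
                i += len(seq)
                break
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(text, liga=True):
    """Apply <rlig> always, then <liga> only if the user has left it active."""
    glyphs = apply_lookups(list(text), RLIG)
    if liga:
        glyphs = apply_lookups(glyphs, LIGA)
    # Simplification: ZWJ is invisible and zero-width, so it is dropped here.
    return [g for g in glyphs if g != ZWJ]


plain = "office"
marked = "of" + ZWJ + "f" + ZWJ + "ice"
print(shape(plain))               # standard ligatures active
print(shape(plain, liga=False))   # standard ligatures switched off
print(shape(marked, liga=False))  # ZWJ ligation survives via <rlig>
```

With <liga> switched off, the plain text is set without ligatures, while the ZWJ-marked text still produces the ffi ligature, which is the separation the proposal is after.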
Similarly, I propose the use of the <rlig> feature for any script, e.g. 
Runic, in which the ZWJ is expected to be used to signify ligation.
(Note for font developers: the <rlig> lookups should precede the <liga> 
lookups. There may be circumstances, e.g. using a calligraphic font with 
many ligatures, in which a user may want to use the ZWJ character to assert 
a preference for, e.g. an r_d ligature over an i_r ligature in the word 
'bird', even though the i_r ligature may precede the r_d in the <liga> 
lookups.)
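
The effect of lookup ordering in the 'bird' case can be illustrated with a small Python model. It assumes a hypothetical calligraphic font whose <liga> lookups list i_r before r_d; the rule tables, glyph names and the shape function are invented for this sketch and do not describe any real font or engine.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Hypothetical calligraphic font: i_r precedes r_d in <liga>.
RLIG = [(("i", ZWJ, "r"), "i_r"), (("r", ZWJ, "d"), "r_d")]
LIGA = [(("i", "r"), "i_r"), (("r", "d"), "r_d")]


def apply_lookups(glyphs, rules):
    """Scan left to right; at each position use the first rule that matches."""
    out = []
    i = 0
    while i < len(glyphs):
        for seq, ligature in rules:
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(ligature)
                i += len(seq)
                break
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(text, feature_order):
    """Apply each feature's rules in the given order, then hide any ZWJ."""
    glyphs = list(text)
    for rules in feature_order:
        glyphs = apply_lookups(glyphs, rules)
    return [g for g in glyphs if g != ZWJ]


text = "bir" + ZWJ + "d"                 # author asks for the r_d ligature
rlig_first = shape(text, [RLIG, LIGA])   # recommended order
liga_first = shape(text, [LIGA, RLIG])   # wrong order
print(rlig_first)
print(liga_first)
```

In this model, applying <rlig> first yields b, i, r_d as the author requested. Applying <liga> first lets the i_r rule consume the r, so the r ZWJ d sequence is no longer there to match and the author's preference is lost.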
Note: use of ZWJ in a document obviously cannot be made with any 
expectation of an appropriate ligature being present in a font. A document 
might contain the sequence 'o r ZWJ d o', but this can only be correctly 
rendered in a font that contains an r_d ligature. If such a ligature is not 
available, the sequence will appear as 'ordo' to the reader, because the 
ZWJ is invisible and occupies no space. This is as intended. The important 
thing is that the desire for ligation travels with the text, and can be 
correctly rendered when an appropriate font is used.
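
The fallback behaviour can be sketched as follows. The render function and the representation of a font as a set of supported letter pairs are assumptions made purely for illustration; the sketch handles only two-letter ligatures.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def render(text, ligatures):
    """Return the visible glyphs for text.

    ligatures is the set of (left, right) letter pairs for which the
    hypothetical font contains a ligature glyph.
    """
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        has_request = i + 2 < len(chars) and chars[i + 1] == ZWJ
        if has_request and (chars[i], chars[i + 2]) in ligatures:
            out.append(chars[i] + "_" + chars[i + 2])  # ligature glyph
            i += 3
        elif chars[i] == ZWJ:
            i += 1  # invisible and zero-width: contributes nothing
        else:
            out.append(chars[i])
            i += 1
    return out


text = "or" + ZWJ + "do"
print(render(text, {("r", "d")}))  # font with an r_d ligature
print(render(text, set()))         # font without one
```

The same stored text gives o, r_d, o in the first font and the plain letters o, r, d, o in the second; the request for ligation stays in the text either way.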
Implementation issues:
In order to support the use of ZWJ ligation as outlined in this proposal, 
font developers need to include the ZWJ character in their fonts, and 
appropriate <rlig> lookups mapping ZWJ sequences to any ligatures present 
in the font. I also recommend that font developers include the Zero Width 
Non-Joiner (ZWNJ) character in their fonts, as this is likely to be used 
in tandem with ZWJ by authors who wish to explicitly indicate the absence 
of ligation. The presence of the ZWNJ is sufficient: no layout information 
is required to inhibit ligation. The addition of these characters and 
<rlig> lookups to any font that already contains <liga> or <dlig> lookups 
is a trivial task.
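
Why the mere presence of ZWNJ is enough can be seen in a toy Python model. It assumes, as the paragraph above does, that ZWNJ is an ordinary glyph in the run that the ligature rules do not skip over; the rule table, glyph names and shape function are hypothetical.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# A reduced, hypothetical <liga> table; no rule mentions ZWNJ.
LIGA = [
    (("f", "f", "i"), "f_f_i"),
    (("f", "f"), "f_f"),
    (("f", "i"), "f_i"),
]


def apply_lookups(glyphs, rules):
    """Scan left to right; at each position use the first rule that matches."""
    out = []
    i = 0
    while i < len(glyphs):
        for seq, ligature in rules:
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(ligature)
                i += len(seq)
                break
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(text):
    """Apply <liga>, then hide ZWNJ, which is invisible and zero-width."""
    glyphs = apply_lookups(list(text), LIGA)
    return [g for g in glyphs if g != ZWNJ]


print(shape("fi"))                 # ligated
print(shape("f" + ZWNJ + "i"))     # not ligated: ZWNJ breaks the sequence
print(shape("ff" + ZWNJ + "i"))    # ff ligates, i stays separate
```

Because f ZWNJ i is simply a different sequence from f i, no rule matches it and the letters stay separate without any extra lookup being written.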
Layout engines that provide text processing for the Latin script need to 
support the <rlig> feature and to apply it as they would for Arabic.
John Hudson
Tiro Typeworks		www.tiro.com
Vancouver, BC		tiro@tiro.com
Language must belong to the Other -- to my linguistic community
as a whole -- before it can belong to me, so that the self comes to its
unique articulation in a medium which is always at some level
indifferent to it.              - Terry Eagleton