From: Karl Pentzlin (karl-pentzlin@acssoft.de)
Date: Fri Jan 26 2007 - 07:57:44 CST
On Friday, 26 January 2007 at 02:05, John H. Jenkins wrote:
JHJ> In any event, I reiterate: Ligature formation in Latin is a matter of
JHJ> stylistic preference.
Not in any event. For typesetting German Fraktur, ligature formation
is a matter of orthographic rules.
In this context, the German term "Ligatur" has two closely related but
different meanings (both of which nevertheless translate into English
as "ligature"):
1.) A closed and orthographically exactly defined set of types showing
two or three letters visually linked in a standardized way. The use of
these ligature types is defined by orthographic rules. Thus, the whole
ligature is itself subject to the orthography.
I call this "O-ligature" within the remainder of this text.
2.) The usual meaning: compounds of two or more characters, which are
preferred over the single characters for aesthetic reasons, while
the single components remain subject to the orthography.
I call this "E-ligature" within the remainder of this text.
Simplified: in Fraktur, ligatures (in the first sense) are to be used
within syllables but not across syllable boundaries.
E.g., when typesetting "finden" (to find), using a "fi" O-ligature is
required, i.e. a type with a dotless i under the hook of the f, linked
to the crossbar of the f (this is the reference glyph, not a variant
chosen for aesthetic reasons).
When typesetting "Schilfinsel" (reed island), using a "fi"
O-ligature is an orthographic error. This does not prevent the font
designer from creating an E-ligature which contains a dotted i not
linked to the crossbar of the f, but which looks better than a
sequence of unrelated f and i.
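
A minimal sketch of the simplified rule above, in Python. It assumes
the syllable division is supplied by hand (nothing here computes it),
and it uses only "fi" and "tz" as an illustrative subset of the
O-ligature set; the function name is made up for this example.

```python
# Illustrative subset only: "fi" and "tz" are the two O-ligatures
# named in this message. A real Fraktur set is larger.
O_LIGATURES = ("fi", "tz")

def o_ligature_spans(syllables):
    """Return (start, end) offsets, relative to the joined word, of every
    O-ligature sequence that lies entirely within one syllable."""
    spans = []
    offset = 0
    for syl in syllables:
        for lig in O_LIGATURES:
            pos = syl.find(lig)
            while pos != -1:
                spans.append((offset + pos, offset + pos + len(lig)))
                pos = syl.find(lig, pos + 1)
        offset += len(syl)
    return sorted(spans)

# Syllable divisions are given by hand (an assumption of this sketch).
print(o_ligature_spans(["fin", "den"]))           # [(0, 2)]
print(o_ligature_spans(["Schilf", "in", "sel"]))  # []
```

In "Schilfinsel" the f and the i fall into different syllables, so no
O-ligature is reported, matching the example above.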
Now, how to encode Fraktur in Unicode plain text?
Premises:
a.) The presentation forms U+FB00...U+FB05 are to be avoided (as
usual); in any case, they cover only a proper subset of the
O-ligatures required for correct Fraktur typesetting, e.g. they
lack "tz".
b.) No semantic analysis shall be required to determine the
correct orthographic types.
c.) The representation of the text in non-Fraktur fonts shall
be interfered with as little as possible.
Premise b. rules out the possibility "let the presentation software
determine whether an O-ligature is appropriate to present a given
sequence of base characters".
Two possibilities:
1.) Require ZWJ to mark O-ligatures where required.
2.) Assume an O-ligature whenever the corresponding sequence of base
letters occurs, and require ZWNJ where the O-ligature would be
erroneous.
In my eyes, possibility 2. is to be strongly preferred. It does
not interfere with texts where the difference between O-ligatures and
E-ligatures is irrelevant, and does not affect the known semantics
of ZWJ/ZWNJ in any way. Thus, it fulfills premise c.
Moreover, it is easier to type in, as the cases where ZWNJ is to be
typed in 2. are rarer than those requiring ZWJ in 1., and they are
more likely to be perceived as exceptions that need special attention
than the normal cases of character sequences where O-ligatures are
appropriate.
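
To make the comparison of the two possibilities concrete, here is a
rough Python sketch that encodes the same hand-syllabified words both
ways. The function names are hypothetical, the syllable division is
again supplied by hand, and "fi"/"tz" stand in for the full O-ligature
set.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
O_LIGATURES = ("fi", "tz")  # illustrative subset only

def encode_option_1(syllables):
    """Possibility 1: put ZWJ inside every O-ligature within a syllable."""
    out = []
    for syl in syllables:
        for lig in O_LIGATURES:
            syl = syl.replace(lig, ZWJ.join(lig))
        out.append(syl)
    return "".join(out)

def encode_option_2(syllables):
    """Possibility 2: put ZWNJ only where a ligature sequence would
    otherwise form across a syllable boundary."""
    word = syllables[0]
    for syl in syllables[1:]:
        straddles = any(
            word.endswith(lig[:k]) and syl.startswith(lig[k:])
            for lig in O_LIGATURES
            for k in range(1, len(lig))
        )
        word += (ZWNJ if straddles else "") + syl
    return word

for syllables in (["fin", "den"], ["Schilf", "in", "sel"]):
    print(ascii(encode_option_1(syllables)), ascii(encode_option_2(syllables)))
# 'f\u200dinden' 'finden'
# 'Schilfinsel' 'Schilf\u200cinsel'
```

Under possibility 2 the common word "finden" stays untouched and only
the exceptional "Schilfinsel" carries a mark.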
Thus, a well-designed OpenType Fraktur font could have a "fi" glyph
appropriate as O-ligature for the sequence "f"+"i". This works
also for non-German texts where this glyph is expected in all cases.
It could have an E-ligature showing an unconnected but specially
designed "fi" glyph for the sequence "f"+ZWNJ+"i".
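
The glyph selection described here might be pictured as follows. This
is only a Python sketch of the lookup logic, not OpenType feature code;
the glyph names "f_i.oligature" and "f_i.eligature" are invented for
the illustration, and all other characters are passed through as
themselves.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Hypothetical glyph names; a real font would define its own.
# The longer sequence is listed first so that it is tried first.
SUBSTITUTIONS = [
    ("f" + ZWNJ + "i", "f_i.eligature"),  # unconnected, specially designed
    ("fi", "f_i.oligature"),              # linked, dotless i (O-ligature)
]

def shape(text):
    """Map a plain-text string to a list of glyph names."""
    glyphs = []
    i = 0
    while i < len(text):
        for seq, glyph in SUBSTITUTIONS:
            if text.startswith(seq, i):
                glyphs.append(glyph)
                i += len(seq)
                break
        else:
            # No ligature applies here; a stray ZWNJ has no glyph.
            if text[i] != ZWNJ:
                glyphs.append(text[i])
            i += 1
    return glyphs

print(shape("finden"))
# ['f_i.oligature', 'n', 'd', 'e', 'n']
print(shape("Schilf" + ZWNJ + "insel"))
# ['S', 'c', 'h', 'i', 'l', 'f_i.eligature', 'n', 's', 'e', 'l']
```

A font without such glyphs would simply show f and i for both inputs,
which is what premise c. asks for.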
I know of no existing Fraktur font using either of the two mechanisms.
Even the fonts which do have the required O-ligatures (and thus
are not merely pseudo-Fraktur fonts) have them as single
characters on arbitrary code points (that is, there seem to be no
truly Unicode-compatible Fraktur fonts unless you admit PUA use).
<dreaming>
Besides the ligature problem, premise c. could be fulfilled for
Fraktur texts if there were variation selectors for U+0073 s,
determining "round s in all cases" or "use the long form only when
using a broken-letter font; use the round form in all other cases",
leaving U+017F "long s" as a completely different letter which
inherently has the long form in all cases.
</dreaming>
To standardize Fraktur handling in Unicode, maybe something like
a UTR would be appropriate.
- Karl Pentzlin