Latin ligatures and Unicode

From: Eberhard Pehlemann (e.pehlemann@gmx.de)
Date: Sun Dec 19 1999 - 06:58:53 EST


Hi!

As a newcomer to the Unicode discussion list, I have been following the
discussion for a while. Today I would like to ask a question about the
definition of Unicode codes for ligatures in the Latin script. (This is probably
a 'historical' subject for Unicode, but I can't find a clear answer in the
printed version of "The Unicode Standard, Version 2.0".)

I am dealing with so-called "Fraktur" or "Blackletter" fonts that encode
character shapes used in historical European typesetting, especially in German
typesetting of the last century. These fonts must contain glyphs for the long s
and for several ligatures, at minimum for ch, ck, ff, fi, fl, ft, ll, sch, si,
ss, st (all using the long s), tt and tz.

All of you know about the problems of handling such glyphs with traditional
8-bit fonts. So I am trying to find the best way to handle these glyphs with
Unicode and 16-bit fonts.

Although chapter 2.2 (Unicode Design Principles) of the Unicode book states that
Unicode distinguishes between characters and glyphs and encodes only
characters, I find some strange character code definitions in later sections of
the book:

A. The letter long s:

    U+017F LATIN SMALL LETTER LONG S

Is this really a different character from U+0073 LATIN SMALL LETTER S?

B. Several ligatures:

    U+FB00 LATIN SMALL LIGATURE FF
    U+FB01 LATIN SMALL LIGATURE FI
    U+FB02 LATIN SMALL LIGATURE FL
    U+FB03 LATIN SMALL LIGATURE FFI
    U+FB04 LATIN SMALL LIGATURE FFL
    U+FB05 LATIN SMALL LIGATURE LONG S T
    U+FB06 LATIN SMALL LIGATURE ST

These code definitions seem to contradict Figure 2-1 (Characters Versus Glyphs)
on page 2-5 of the Unicode book.
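
One way to see how the character database treats these code points is to look
at their decomposition mappings. The following is only a minimal Python sketch
using the standard unicodedata module. It reflects whatever Unicode data
version ships with the Python interpreter, not the Version 2.0 book cited
above, and the helper name describe is purely illustrative.

```python
import unicodedata

# Code points quoted above: long s and the Latin ligatures U+FB00..U+FB06.
CODE_POINTS = [0x017F] + list(range(0xFB00, 0xFB07))

def describe(cp):
    """Return (name, raw decomposition field, NFKD form) for one code point."""
    ch = chr(cp)
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKD", ch),
    )

for cp in CODE_POINTS:
    name, decomp, nfkd = describe(cp)
    print(f"U+{cp:04X} {name}: [{decomp}] -> {nfkd!r}")
```

A decomposition field that starts with the <compat> tag marks a compatibility
character: normalization forms NFKD and NFKC replace it with the plain letters
it stands for.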

Having found these code definitions, several questions arise:

1. For what reasons have the glyphs mentioned in (A.) and (B.) become Unicode
characters?

2. Will other Latin ligature glyphs like ch, ck, sch and so on also become
Unicode characters in the future? Should I submit such a proposal? (I feel that
this would be the worst way of handling ligatures.)

3. How should people who create or use blackletter fonts handle the ligatures
mentioned above? Should they (in their text documents) e.g. encode c and h as
two characters, separated by a zero-width joiner (U+200D ZERO WIDTH JOINER) or
zero-width ligator (as proposed by ISO/IEC JTC1/SC2/WG2 in document N2141), and
access the desired glyph from an OpenType font file using the corresponding
tables in that file?

4. What is the future of handling Latin ligatures with Unicode?
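
To make the joiner-based idea in question 3 concrete, here is a rough Python
sketch of the text side only. Whether a ch ligature glyph actually appears
depends on the font and the rendering software, which this sketch does not
model. The function names request_ligature and plain_text are hypothetical,
and using ZWJ this way is the assumption under discussion, not an established
rule.

```python
ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER

def request_ligature(*letters):
    """Join letters with ZWJ to mark a requested ligature (assumed convention)."""
    return ZWJ.join(letters)

def plain_text(text):
    """Strip ZWJ so that searching and comparison see only the letters."""
    return text.replace(ZWJ, "")

# "Buch" with a joiner between c and h: five code points in the stored text.
word = "Bu" + request_ligature("c", "h")
print([f"U+{ord(c):04X}" for c in word])
print(plain_text(word) == "Buch")
```

The stored text keeps c and h as ordinary characters, so removing the joiner
recovers the plain spelling for searching.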

I hope that some of you will take a few minutes to answer these questions.

Thanks, Eberhard


