From: John Hudson (tiro@tiro.com)
Date: Sat Nov 02 2002 - 13:19:41 EST
At 07:24 11/2/2002, Thomas Lotze wrote:
>On Sat, 2 Nov 2002 07:18:43 -0000
>"William Overington" <WOverington@ngo.globalnet.co.uk> wrote:
>
> > In relation to regular Unicode the policy is that no more ligatures
> > are to be encoded. My own view is that this should change. However,
> > that is unlikely to do so.
>
>I agree with you. Ligatures may have semantics that can be composed from
>characters already Unicode encoded, but they are separate glyphs whose
>shape cannot be inferred from that of others but has to be designed
>separately and stored somewhere in a font.

Thomas, please go and read the FAQ and the relevant parts of the Unicode
Standard before you start agreeing with William. Yes, ligatures are
separate glyphs, but not every glyph in a font needs to be encoded. A ct
ligature is a variant glyph representation of the characters c and t; it
does not need to be encoded, because it is possible to display the
character sequence ct with a ligature using font layout features. Unicode
is a *character* encoding standard, not a glyph encoding scheme. As
previously noted, the handful of Latin ligatures in the Alphabetic
Presentation Forms block are included only for backwards compatibility with
non-Unicode standards that did not have a good character/glyph distinction.
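
To make the compatibility point concrete, here is a minimal sketch using Python's standard unicodedata module (the helper name fold_ligatures is only illustrative). It shows that the encoded Latin ligatures carry compatibility decompositions back to ordinary letters, so that compatibility normalization (NFKC) recovers the plain character sequence:

```python
import unicodedata

# U+FB00..U+FB06 are the Latin ligatures in the Alphabetic Presentation Forms block.
for cp in range(0xFB00, 0xFB07):
    ch = chr(cp)
    # Each one has a <compat> decomposition to ordinary letters.
    print(f"U+{cp:04X} {unicodedata.name(ch)}: {unicodedata.decomposition(ch)}")


def fold_ligatures(text):
    """Illustrative helper: replace compatibility characters with their
    plain equivalents via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


print(fold_ligatures("\ufb01sh"))  # the fi ligature followed by "sh"
```

Note that there is no encoded ct ligature to fold in the first place; that glyph is reached from the sequence c, t through font layout features.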
Please also note, and this is very important, that using Private Use Area
codepoints for elements that are meant to represent sequences of characters
in normal text, such as ligatures, is a REALLY BAD IDEA. This has been
explained to William dozens of times, but he appears to be too wrapped up
in his own erroneous brilliance to listen to reason. If you use PUA
codepoints for glyph variants in text, you immediately lose all the
benefits of a clean character/glyph distinction: you cannot sort text, you
cannot spellcheck text, you cannot search text, and you have absolutely no
guarantee that another user is going to be able to correctly display your text.
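
A short sketch of the search and sort problem, again in Python with the standard unicodedata module. The choice of U+E000 for a ct ligature is purely hypothetical (that is the nature of the PUA: the assignment exists only by private agreement), and the names below are illustrative:

```python
import unicodedata

# ASSUMPTION: some private font maps U+E000 to a ct ligature glyph.
HYPOTHETICAL_CT = "\ue000"

plain = "action"                       # characters a, c, t, i, o, n
pua = "a" + HYPOTHETICAL_CT + "ion"    # same appearance in that one font


def searchable(text, needle):
    """Search after compatibility normalization, the most forgiving case."""
    return needle in unicodedata.normalize("NFKC", text)


# The PUA codepoint has no decomposition, so normalization cannot recover "ct".
print(unicodedata.category(HYPOTHETICAL_CT))   # Co (private use)
print(searchable(plain, "ct"))                 # True
print(searchable(pua, "ct"))                   # False

# Sorting by code point: the PUA spelling lands after "adapt", not before it.
print(sorted(["adapt", plain, "acid"]))
print(sorted(["adapt", pua, "acid"]))
```

A spellchecker fails the same way, since its word list contains "action" and not a string with a private-use codepoint in the middle.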
Let me put it another way. Think about the paradigm you are working within
if you encode every glyph variant in a font. I see you standing in front of
a tray of metal type, hunting and picking for the little bit of lead that
*looks* correct. You pick up a bit of metal that has a ct ligature on the
end of it, and you put it in your composing stick. The semantic
relationship of that piece of metal to the letters c and t exists only in
your mind. The piece of metal is dumb: it carries no meaning. That is the
paradigm you are working in if you are typesetting text on a computer using
PUA codepoints for glyph variants. A PUA codepoint in a stream of text is
as meaningless as the piece of metal with a ligature on the end. You are
applying an analogue, metal type paradigm to digital text processing, and
in the process you are losing most of the benefits of using a computer.
Does that make any sense?

If you are interested in learning more about font layout features for glyph
variants, and how a smart font format like OpenType works with the Unicode
Standard, you might find this article at the Microsoft Typography website
useful:
http://www.microsoft.com/typography/developers/opentype/default.htm

John Hudson
Tiro Typeworks www.tiro.com
Vancouver, BC tiro@tiro.com

It is necessary that by all means and cunning,
the cursed owners of books should be persuaded
to make them available to us, either by argument
or by force. - Michael Apostolis, 1467