Re: Normalisation and font technology

From: Juliusz Chroboczek
Date: Wed May 29 2002 - 06:59:25 EDT

JH> Apple recently started applying normalisation to file names in Mac
JH> OS X, with the result that the content of folders can now only be
JH> correctly displayed with fonts that contain the necessary AAT
JH> table information

That's very surprising. Especially considering the excellent job they
did with Openstep 4.0.

Even if you work with fully decomposed characters internally, mapping
to precomposed glyphs at display time is a triviality.
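As a rough illustration of that display-time mapping, here is a minimal Python sketch. It assumes a font's coverage can be modelled as a plain set of characters; the name `precomposed_for_display` and the `available` parameter are hypothetical, not taken from any real rendering system.

```python
import unicodedata

def precomposed_for_display(decomposed, available):
    """Return a single precomposed character for a decomposed sequence,
    or None when no such character exists or the font lacks it.

    `available` is an assumed stand-in for a font's character coverage.
    """
    composed = unicodedata.normalize("NFC", decomposed)
    if len(composed) == 1 and composed in available:
        return composed
    return None

# "e" followed by COMBINING ACUTE ACCENT maps to U+00E9 if the font has it.
glyph = precomposed_for_display("e\u0301", {"\u00e9"})
```

The stored text stays decomposed; only the character handed to the renderer changes.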

And even if you don't find a suitable precomposed glyph or a suitable
entry in the smart font, for a large number of combining classes you
can provide legible albeit not necessarily typographically satisfying
output by semi-randomly positioning the components.
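One way to picture such a crude positioning fallback is sketched below in Python. The placement table, the box heights and the function name `place_marks` are all assumptions made for illustration; the only real data used is the canonical combining class reported by `unicodedata.combining`.

```python
import unicodedata

# Assumed rough placement for a few canonical combining classes.
PLACEMENT = {230: "above", 220: "below", 202: "below", 1: "overlay"}

def place_marks(ccs, base_height=10, mark_height=3):
    """Return (mark, y_offset) pairs for the marks following the base.

    Marks are stacked outwards from the base; a mark whose combining
    class is not in the table gets an offset of None.
    """
    above, below = base_height, 0
    placed = []
    for mark in ccs[1:]:
        where = PLACEMENT.get(unicodedata.combining(mark))
        if where == "above":
            placed.append((mark, above))
            above += mark_height
        elif where == "below":
            below -= mark_height
            placed.append((mark, below))
        elif where == "overlay":
            placed.append((mark, base_height // 2))
        else:
            placed.append((mark, None))
    return placed
```

The result is legible rather than well typeset: marks never collide, but no attempt is made at horizontal centring or at per-glyph anchor points.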

JH> Do you really want word processing applications or web browsers
JH> that can only correctly display text in a handful of fonts on a
JH> user's system?


Please note that this is not software meant for actual use; it is just
an experiment to show that we don't need heavy artillery in order to
implement reasonable typesetting for the GLC (Greek, Latin, Cyrillic)
subset of Unicode.

JH> This in turn suggests that if text is going to be decomposed in
JH> normalisation, it should be recomposed in a buffered character
JH> string prior to rendering.

The approach taken in Cedilla is different. The text is typeset as a
sequence of Combining Character Sequences (CCS). Given a (normalised)
CCS ``b c1 c2 ... cn'', Cedilla first attempts to find a precomposed
glyph; if that fails, it attempts to find a precomposed glyph for
``b c1 ... c(n-1)'', and compose it with the glyph for ``cn''.
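The fallback described above can be sketched in a few lines of Python. This is not Cedilla's code: the font is modelled as a set of available characters, `split_ccs` and `find_glyphs` are hypothetical names, and treating any character with a non-zero combining class as a mark is a simplification.

```python
import unicodedata

MISSING = "?"  # assumed placeholder for a glyph the font lacks

def split_ccs(text):
    """Split text into combining character sequences (base plus marks)."""
    sequences = []
    for ch in text:
        if unicodedata.combining(ch) and sequences:
            sequences[-1] += ch
        else:
            sequences.append(ch)
    return sequences

def find_glyphs(ccs, font):
    """Return the glyphs to overlay for one normalised CCS.

    Try a precomposed glyph for the whole sequence first; failing that,
    resolve ``b c1 ... c(n-1)'' and compose the result with ``cn''.
    """
    composed = unicodedata.normalize("NFC", ccs)
    if len(composed) == 1 and composed in font:
        return [composed]
    if len(ccs) == 1:
        return [MISSING]
    return find_glyphs(ccs[:-1], font) + find_glyphs(ccs[-1], font)

font = {"a", "e", "\u00e9", "\u1eb9", "\u0301"}
line = [find_glyphs(s, font) for s in split_ccs("e\u0323\u0301a\u0301")]
```

Here the first sequence has no fully precomposed form, so it resolves to the precomposed e-with-dot-below plus a separate acute; the second falls all the way back to base plus mark.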

All of that happens on the fly; there is never any need for
buffering. With suitable memoisation (caching), only a tiny fraction
of the execution time is spent on searching for the right glyphs.

Cedilla implements a number of other techniques for conjuring suitable
glyphs; the main difficulty was finding the right ordering of the
various fallbacks. It turns out that it is more important to avoid
the ransom-note effect (visibly mismatched glyphs mixed within a
word) than to find the best glyph.

