From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Jan 09 2006 - 03:32:50 CST
On Mon, 9 Jan 2006, António Martins-Tuválkin wrote:
> At < http://www.somtrans.be/ >, the logo of this shipping company shows
> the word "Rederij" set with widely spaced letters, but "IJ" is, as usual,
> kept together.
The logo is presented as an image, not as styled text, so we cannot know
whether they used "I" and "J" or the IJ ligature.
> Would it be wise to typeset this *always* as U+0049 U+200D U+004A? Or
> would U+0132 be a better choice? (Ditto for lower case.)
Theoretically, U+0132 is a compatibility character with U+0049 U+004A
as the compatibility decomposition. Being a compatibility decomposable
character, it is not recommended except in the representation of existing
data in conditions where you need or wish to retain the difference between
the character sequence "IJ" and the IJ ligature at the character level.
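The compatibility mapping can be checked against the Unicode Character
Database. The following is only a minimal sketch using Python's standard
unicodedata module; the helper names compat_info and survives are mine,
chosen for illustration. It shows that the canonical normalization forms
keep U+0132 as it is, whereas the compatibility forms replace it with the
two-letter sequence, so the character-level distinction is lost there.

```python
import unicodedata

LIGATURE_IJ = "\u0132"      # LATIN CAPITAL LIGATURE IJ
PLAIN_IJ = "\u0049\u004A"   # the two-letter sequence "IJ"


def compat_info(ch):
    """Return (character name, raw decomposition field) from the UCD."""
    return unicodedata.name(ch), unicodedata.decomposition(ch)


def survives(form, text):
    """True if normalizing `text` to `form` leaves it unchanged."""
    return unicodedata.normalize(form, text) == text


if __name__ == "__main__":
    print(compat_info(LIGATURE_IJ))
    # NFC and NFD apply canonical mappings only; NFKC and NFKD also apply
    # compatibility mappings such as the one for U+0132.
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, "keeps U+0132:", survives(form, LIGATURE_IJ))
```

So data that must retain the difference between "IJ" and the ligature
should not be passed through NFKC or NFKD.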
Note that although U+0132 is a ligature character, its
decomposition does not include U+200D (zero width joiner) or any other
indication of the ligature behavior. I'm not sure why this is so, but
I can understand it as a consequence of treating U+0132 as a _particular_
ligature of "I" and "J", following an orthographic and typographic
tradition. Its full typographic meaning could not be expressed formally
anyway, since it's not just a matter of applying _some_ ligature behavior.
The _specific_ ligature behavior might be expressible at other protocol
levels, such as typesetting instructions that map the sequence "IJ" to a
particular ligature glyph or render it in a particular style.
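To see what the decomposition actually contains, one can list its code
points. Again a rough sketch with Python's unicodedata module; the function
name decomposition_code_points and the JOINERS tuple are just illustrative
names of my own. It prints the names of the joiner-type controls for
reference and confirms that none of them occurs in the mappings of U+0132
and U+0133.

```python
import unicodedata

# Invisible controls that are easily confused with each other.
JOINERS = (0x200C, 0x200D, 0x2060)


def decomposition_code_points(ch):
    """Code points of the decomposition mapping, without the <compat> tag."""
    fields = unicodedata.decomposition(ch).split()
    return [int(field, 16) for field in fields if not field.startswith("<")]


if __name__ == "__main__":
    for cp in JOINERS:
        print("U+%04X %s" % (cp, unicodedata.name(chr(cp))))
    for ligature in ("\u0132", "\u0133"):
        cps = decomposition_code_points(ligature)
        has_joiner = any(cp in JOINERS for cp in cps)
        print(unicodedata.name(ligature),
              ["U+%04X" % cp for cp in cps],
              "contains a joiner:", has_joiner)
```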
Then the pragmatics. Using U+200D simply doesn't work here. I tried it on
Word 2002, creating a piece of text with I, U+200D, J and then setting the
letter spacing to a large value. The result does not differ from the one I
get without the U+200D. This does not surprise me, since Word treats
U+200C (zero width non-joiner) and U+200D (zero width joiner) as invisible
controls for _line breaking_ (for allowing or disallowing a line break
within a string), not according to their defined semantics for ligature or
cursive behavior. I also tried it in an HTML document, using
I&#x200D;J and setting letter-spacing to a large value in CSS.
On Internet Explorer, the U+200D has no effect. On Mozilla Firefox, the
spacing between I and J is _larger_ than between other letters, obviously
because Firefox applies the letter-spacing property to the spacing between
I and the invisible U+200D as well as to the spacing between U+200D and J.
(We can't really blame it, since the letter-spacing property is vaguely
defined, though it's natural to expect that invisible characters be
ignored.) On Opera, the situation is much the same as on Firefox, but U+200D
may appear as a visible symbol (thin vertical line with a cross-like
top), depending on the font - but in any case, there is _more_ spacing
than without the presence of U+200D.
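For anyone who wants to repeat the browser test, a page along the following
lines should do. This is not the document I used, only a sketch: the Python
function build_test_page, the class name "spaced", the default spacing and
the output file name are all assumptions made for illustration. It writes
the same word three ways (plain I and J, I with U+200D and J, and U+0132)
under the same letter-spacing value, so the gaps can be compared by eye.

```python
# Label -> HTML markup for the "IJ" part of the test word.
VARIANTS = {
    "plain I J": "IJ",
    "I, U+200D, J": "I&#x200D;J",
    "U+0132": "&#x0132;",
}


def build_test_page(spacing="0.5em"):
    """Return an HTML document with one letter-spaced line per variant."""
    rows = "\n".join(
        '<p><span class="spaced">REDER{}</span> ({})</p>'.format(markup, label)
        for label, markup in VARIANTS.items()
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="nl"><head><meta charset="utf-8">\n'
        "<title>IJ letter-spacing test</title>\n"
        "<style>.spaced { letter-spacing: " + spacing + "; }</style>\n"
        "</head><body>\n" + rows + "\n</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical file name; open the result in each browser to compare.
    with open("ij-spacing-test.html", "w", encoding="utf-8") as out:
        out.write(build_test_page())
```

The results will naturally depend on the browser version and the font.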
I guess the most relevant question is whether the use of the IJ ligature
is considered as the most appropriate way of writing Dutch. If it is, then
using U+0132 and U+0133 is adequate: there is no other way to represent
the ligatures, except perhaps in some publishing software. And the CLDR
data for the Dutch language,
http://unicode.org/cldr/data/common/main/nl.xml
includes the ij ligature (U+0133) in the exemplarCharacters set (which is
a rather strong position: it means that the character is required for
proper writing of Dutch, not just a desirable feature). Naturally,
U+0132 is implicitly included as an uppercase counterpart.
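The case pairing between U+0133 and U+0132 is part of the Unicode case
mappings, which can be verified with a few lines of Python. This is only an
illustrative sketch (the function name case_pair_ok is mine), and it says
nothing about what CLDR lists; it just shows that uppercasing text written
with the ligature yields U+0132, one character, not "I" followed by "J".

```python
LOWER_IJ = "\u0133"  # LATIN SMALL LIGATURE IJ
UPPER_IJ = "\u0132"  # LATIN CAPITAL LIGATURE IJ


def case_pair_ok(lower, upper):
    """True if the two characters map to each other under case conversion."""
    return lower.upper() == upper and upper.lower() == lower


if __name__ == "__main__":
    print("U+0133 and U+0132 form a case pair:",
          case_pair_ok(LOWER_IJ, UPPER_IJ))
    word = "reder" + LOWER_IJ  # "rederij" written with the ligature
    print(len(word), [hex(ord(ch)) for ch in word.upper()])
```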
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/