From: Doug Ewell (dewell@adelphia.net)
Date: Tue Feb 11 2003 - 00:27:26 EST

Kenneth Whistler <kenw at sybase dot com> wrote:
> Long ago
> it was decided that it would not be a good idea to extend
> formal character decomposition to such base letterform shape
> changes or bars across letters. (Note that Latin characters
> with bars: barred-b, barred-d, barred-i, barred-u, barred-l,
> and the like are also not decomposed formally. Similarly for
> Latin letters with hooks, and so on.)
>
> So formal canonical decompositions are almost entirely
> confined to separable, accent-like diacritics (acute,
> grave, diaeresis, and so on). The only significant exceptions are
> the cedilla and ogonek, which attach smoothly to letter
> bottoms without otherwise distorting them, and which
> often have graphic alternates that are, indeed, separated
> diacritics (comma-like and reverse-comma-like forms).

I always wondered why the with-acute and with-circumflex letters were
decomposable but something like U+0141 LATIN CAPITAL LETTER L WITH
STROKE was not. After all, Unicode has combining "overstruck
diacritics" like U+0337 COMBINING SHORT SOLIDUS OVERLAY; isn't that what
one would use to compose an L-stroke? Same for the Maltese and Sami
letters that use a horizontal stroke instead of a diagonal. It always
seemed kind of random to me.
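
For anyone who wants to see the difference in practice, here is a minimal sketch using Python's standard unicodedata module. The choice of Python, the helper name code_points, and the sample letter E-acute are assumptions of this example, not anything from the thread, and the results reflect whatever Unicode version the interpreter bundles. It normalizes an accented letter and L-stroke to NFD, and then checks whether L followed by U+0337 composes under NFC.

```python
import unicodedata

E_ACUTE = "\u00C9"        # LATIN CAPITAL LETTER E WITH ACUTE
L_STROKE = "\u0141"       # LATIN CAPITAL LETTER L WITH STROKE
SHORT_SOLIDUS = "\u0337"  # COMBINING SHORT SOLIDUS OVERLAY


def code_points(text):
    """Hypothetical helper: list the code points of a string as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFD applies canonical decomposition: E-acute splits, L-stroke does not.
e_nfd = unicodedata.normalize("NFD", E_ACUTE)
l_nfd = unicodedata.normalize("NFD", L_STROKE)

# NFC applies canonical composition: L + overlay stays a two-code-point sequence.
overlay_nfc = unicodedata.normalize("NFC", "L" + SHORT_SOLIDUS)

print(code_points(e_nfd))        # base letter plus combining acute
print(code_points(l_nfd))        # unchanged single code point
print(code_points(overlay_nfc))  # still L followed by the overlay
```

So a hand-built L plus overlay may look like U+0141 in some fonts, but the two are not canonically equivalent and will not compare equal after normalization.
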

Ken's reply explains why Cyrillic descenders and the like, which distort
or deform the base character in some way, are not decomposable, and I
can buy that, but I still don't see why stroke overlays are lumped in
with that group. They don't distort the base form any more than
cedillas and ogoneks do -- and isn't this a glyph issue anyway?
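
The grouping can be tabulated with a short sketch, again assuming Python's standard unicodedata module. The sample characters are picked for illustration, and has_canonical_decomposition is a hypothetical helper name. It reports which letters carry a canonical decomposition mapping in the character database: the cedilla and ogonek letters should, while the stroke, bar, and descender letters should not.

```python
import unicodedata


def has_canonical_decomposition(ch):
    """Hypothetical helper: True if ch has a canonical (untagged) decomposition."""
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings start with a tag such as "<compat>"; skip those.
    return bool(mapping) and not mapping.startswith("<")


SAMPLES = [
    "\u00C7",  # LATIN CAPITAL LETTER C WITH CEDILLA
    "\u0104",  # LATIN CAPITAL LETTER A WITH OGONEK
    "\u0141",  # LATIN CAPITAL LETTER L WITH STROKE
    "\u0126",  # LATIN CAPITAL LETTER H WITH STROKE (Maltese)
    "\u0166",  # LATIN CAPITAL LETTER T WITH STROKE (Sami)
    "\u0180",  # LATIN SMALL LETTER B WITH STROKE
    "\u049A",  # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
]

results = {ch: has_canonical_decomposition(ch) for ch in SAMPLES}

for ch, decomposable in results.items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {decomposable}")
```
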

Of course, the important thing is that they are NOT decomposable, for
whatever historical reason, and won't be in the future.

-Doug Ewell
Fullerton, California