From: Peter Kirk (peterkirk@qaya.org)
Date: Tue Dec 30 2003 - 16:13:16 EST
On 30/12/2003 11:44, John Hudson wrote:
> At 11:15 AM 12/30/2003, Peter Kirk wrote:
>
>>> Even if it were verified, it isn't a good case for encoding a
>>> separate character *equivalent* to a combination of two existing
>>> characters: that's a glyph variant ligature.
>>
>>
>> Actually, I don't think so. The separate character was not formed by
>> merging the dot into the letter, rather the distinction was made in a
>> different way.
>
>
> In modern digital font development, ligation refers to the mechanism
> of display, not the visual appearance, which is largely irrelevant. A
> ligature is any glyph that represents two or more characters,
> typically arrived at by a ligation lookup. If I wanted a special sin
> glyph *equivalent* to the character sequence <shin, sindot>, I would
> ligate the two characters to that single glyph, either directly
>
> shin sindot -> sin
>
> or via a two-stage stylistic variant lookup associated with a
> different typographic feature
>
> shin sindot -> shin_sindot
> and then
> shin_sindot -> sin
>
>
I understand this, and, as I answered separately, I don't think this is
the appropriate mechanism in this case, as the suggested ligature is not
fully equivalent to the sequence.
But if it were, this ligature would be very interesting and problematic
because it is a ligature between a base character and a diacritic. This
is not a problem if the ligature is always used in a particular font, but
it is problematic if the ligature is optional. The reason is that ZWNJ and
ZWJ cannot be used between a base character and a diacritic, because they
break the combining sequence. We came across this problem before with Hebrew
script, but in a rather different (and less ambiguous) context, that of
the need for a ligature between meteg and hataf vowels.
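
To illustrate the point about breaking the combining sequence, here is a minimal sketch using Python's standard `unicodedata` module. It uses the Latin example u plus combining diaeresis rather than the Hebrew case, and it relies on canonical composition only as one observable symptom of the break; the variable names are illustrative, not part of any proposal.

```python
import unicodedata

ZWNJ = "\u200c"       # ZERO WIDTH NON-JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS


def describe(text):
    """Return (code point, general category, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))
        for ch in text
    ]


plain = "u" + DIAERESIS           # base character directly followed by its mark
blocked = "u" + ZWNJ + DIAERESIS  # ZWNJ inserted between base and mark

# NFC composes base + mark only when nothing with combining class 0 intervenes.
composed_plain = unicodedata.normalize("NFC", plain)
composed_blocked = unicodedata.normalize("NFC", blocked)

print(describe(plain))
print(describe(blocked))
print(len(composed_plain), len(composed_blocked))
```

ZWNJ is a format character with combining class 0, so the diaeresis that follows it no longer composes with the u: the first string normalizes to a single precomposed character, while the second stays as three code points.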
I wonder if there are other, better defined, cases of ligatures between
base characters and diacritics in other scripts, i.e. cases where there
is an optional alternative to base character plus diacritic which does
not look like the base character plus the diacritic. Candidates like ø
as an alternative for ö are ruled out because they are already
separately encoded. I have certainly seen glyphs rather like U+0255 used
for c cedilla. In the light of recent discussions, I can easily imagine
a script or style like Sütterlin having a special ligated form for u
umlaut which must not be used in the name Saül, where the dots represent
a diaeresis rather than an umlaut; there, two dots should be written
above the letter as in normal Latin script.
OpenType and similar fonts are currently able to make these distinctions
consistently, with the mechanisms John described above; but these
mechanisms fail if the ligature needs to be optional, as ZWNJ and ZWJ
cannot be used.
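
As a rough illustration of the mechanisms John described, the following toy model applies ligature rules to a list of glyph names. It is only a sketch: `apply_lookup` and the `zwnj` glyph name are hypothetical, the second stage is modelled as a one-glyph rule, and no real shaping engine is claimed to work this way.

```python
def apply_lookup(glyphs, rules):
    """Apply ligature rules left to right.

    `rules` maps a tuple of glyph names to the single glyph that replaces it.
    """
    out = []
    i = 0
    while i < len(glyphs):
        for sequence, ligature in rules.items():
            if tuple(glyphs[i:i + len(sequence)]) == sequence:
                out.append(ligature)
                i += len(sequence)
                break
        else:
            # No rule matched at this position: keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


# Direct ligation: shin sindot -> sin
DIRECT = {("shin", "sindot"): "sin"}

# Two-stage variant: shin sindot -> shin_sindot, then shin_sindot -> sin
STAGE_1 = {("shin", "sindot"): "shin_sindot"}
STAGE_2 = {("shin_sindot",): "sin"}

run = ["shin", "sindot"]
direct = apply_lookup(run, DIRECT)
two_stage = apply_lookup(apply_lookup(run, STAGE_1), STAGE_2)

# A joiner glyph between the two would stop the rule from matching.
with_zwnj = apply_lookup(["shin", "zwnj", "sindot"], DIRECT)

print(direct, two_stage, with_zwnj)
```

In this model both routes give the same single glyph, and an intervening joiner does suppress the ligature at the glyph level. The difficulty raised here is one level up: placing ZWNJ between the base character and the diacritic in the text is what breaks the combining sequence.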
Are there any real examples where this might be necessary?
As this is a more general issue, I am copying it back to the main Unicode
list.
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/