From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Feb 11 2008 - 15:44:30 CST
Bob Hallissy wrote:
> Unicode 5.0 addresses this directly on page 500, where it says:
I guess you mean page 255 - at least the text is there (at the bottom of
the page) in the online version.
> the recommended way of representing such text is to place
> U+034F combining grapheme joiner (CGJ) between the ligature tie and
> the combining mark that follows it, as shown in Figure 7-10.
That sounds like a tricky ad hoc rule, effectively turning any normal
diacritic mark into a "double" diacritic mark (double in the sense of
associating with a pair of characters). But since it's in the standard,
it's standard (even if nobody implements it).
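To make the quoted recommendation concrete, here is a minimal Python sketch of the sequence it describes: first letter, U+0361, U+034F (CGJ), the ordinary combining mark, second letter. The base letters "t" and "s", the choice of U+0307 (combining dot above) as the mark, and the helper names are assumptions made for illustration only; they are not taken from the transliteration scheme under discussion. The sketch also checks one consequence that can be verified with the standard library: since CGJ has combining class 0, normalization does not reorder the mark in front of the tie, which it would do without CGJ.

```python
import unicodedata

TIE = "\u0361"  # COMBINING DOUBLE INVERTED BREVE (the ligature tie)
CGJ = "\u034f"  # COMBINING GRAPHEME JOINER
DOT = "\u0307"  # COMBINING DOT ABOVE (assumed example mark)


def tie_with_mark(first: str, second: str, mark: str = DOT) -> str:
    """Hypothetical helper: build first + tie + CGJ + mark + second."""
    return first + TIE + CGJ + mark + second


def codepoints(text: str) -> str:
    """Hypothetical helper: list the code points of a string as U+XXXX."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


with_cgj = tie_with_mark("t", "s")
without_cgj = "t" + TIE + DOT + "s"

# Canonical combining classes decide whether normalization may reorder marks.
for ch in (TIE, CGJ, DOT):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))

# With CGJ the sequence is left as typed; without it the dot (class 230)
# is sorted in front of the tie (class 234).
print(codepoints(unicodedata.normalize("NFD", with_cgj)))
print(codepoints(unicodedata.normalize("NFD", without_cgj)))
```

This only shows that the encoded sequence is stable under normalization; whether a given font or renderer actually draws the dot above the tie is a separate question.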
> Because this is a relatively recent Unicode ruling, there are not yet
> many fonts that will render this sequence correctly, but there is no
> question that this is the correct encoding.
In this particular case, the document describing the transliteration
scheme seems to say that the ligature tie consists of two separate
diacritic marks, though they are apparently assumed to join visually.
This is a character-level issue (at the level of abstract characters),
and there is no defined correspondence between those marks and U+0361.
Consequently, at the level of a coded character set, we would have the
problem of two common letters, each with a diacritic, and a dot above
that should appear above this pair.
And I don't see how this could be encoded in Unicode as currently
defined; to encode it, we would need a new diacritic. Well, in line with
the principle mentioned above, a dummy double diacritic would do: if
there were a combining diacritic mark that acts as a "double" diacritic
but has no visual rendering, then we could put it and a normal combining
diacritic between any characters and expect the visible diacritic to
appear as associated with the pair of characters!
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/