From: Doug Ewell (dewell@roadrunner.com)
Date: Sat Feb 23 2008 - 16:15:42 CST
Karl Pentzlin <karl dash pentzlin at acssoft dot de> wrote:
> Which of the possible solutions is to be preferred (assuming that
> there is clear evidence presented for a superscript ):
>
> 1. Encode a COMBINING LATIN SMALL LETTER U UMLAUT (which implies that
> such a letter is not considered as precomposed, as there is no obvious
> decomposition now - U+0367 U+0308 does not apply)
> 2. Encode a COMBINING SMALL DIAERESIS (or COMBINING SUPERSCRIPT
> DIAERESIS) with an informative note: suited for combinations with
> combining letters, e.g. to mark them as umlaut
> 3. Expand the semantics of ZWJ/ZWNJ in a way
> - that U+006F U+0367 ZWJ U+0308 yields the wanted entity,
> - that ZWNJ after such entities "switches back" to the application
> of subsequent diacritics to the whole entity.
> 4. something completely different.
>
> I prefer 2. as it handles this case without inventing any new
> mechanism and also enables superscript / with a single new character,
> and does not raise any questions about precomposedness of combining
> letters.
I prefer (1) because there don't seem to be enough of these cases
(outside of this one work by Hotzenköcherle) to justify a productive
mechanism, and because the whole notion of stacking combining marks in
this "Russian doll" way adds a great deal of implementation complexity
in exchange for a small edge-case benefit.
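
To illustrate the "no obvious decomposition" point with existing characters, here is a minimal sketch using Python's standard unicodedata module. The variable names are only illustrative, and the results depend on the Unicode data shipped with the Python version in use. It shows that U+0367 and U+0308 are both combining marks of class 230, that U+0367 has no decomposition, and that normalization leaves the three-character sequence untouched, so nothing at the encoding level binds the diaeresis to the superscript u rather than to the base letter.

```python
import unicodedata

BASE = "\u006F"        # LATIN SMALL LETTER O
SUP_U = "\u0367"       # COMBINING LATIN SMALL LETTER U
DIAERESIS = "\u0308"   # COMBINING DIAERESIS


def describe(ch):
    """Return code point, name, combining class and decomposition of ch."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.combining(ch),
        unicodedata.decomposition(ch),
    )


sequence = BASE + SUP_U + DIAERESIS
for ch in sequence:
    print(describe(ch))

# Both marks have the same combining class, so normalization neither
# reorders them nor composes the diaeresis with the base letter.
nfc = unicodedata.normalize("NFC", sequence)
nfd = unicodedata.normalize("NFD", sequence)
unchanged = (nfc == sequence == nfd)

# Without the intervening superscript u, the diaeresis composes with
# the base letter into the precomposed U+00F6.
composes_with_base = unicodedata.normalize("NFC", BASE + DIAERESIS) == "\u00F6"

print(unchanged, composes_with_base)
```

How a renderer positions the diaeresis in such a sequence is a separate question that this sketch does not address.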
--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages