RE: missing characters: combining marks above runs of more than 2 base letters

From: Peter Constable <petercon_at_microsoft.com>
Date: Sun, 20 Nov 2011 19:50:25 +0000

From: unicode-bounce_at_unicode.org [mailto:unicode-bounce_at_unicode.org] On Behalf Of Philippe Verdy

> This arc is a true phonetic mark of a contextual elision...

> Exactly similar to other phonetic symbols like the elision tie

There are two kinds of arc shown in the image:

- arcs that span a space
- arcs that span a range of letters

Whether IPA or other phonetic notations contain anything similar to the latter is irrelevant to the question of whether it should be represented in plain text. Even if a phonetic notation does graphically mark spans of text in certain cases, the fact that it is phonetic notation does not imply that the convention should be representable in plain text.

Note that UTR 20 discusses which semantic and presentation effects are suitable for representation as characters and which as markup. It makes the point that, in XML, effects that involve spans of text should be represented using markup rather than characters that set and unset state. Those are, of course, recommendations about a markup language, not plain text. But the argument works in both directions: things that involve spans of text are best handled as markup, while things that are very local (e.g. spanning no more than a grapheme cluster) may be more suitable for representation as characters.
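
As a rough illustration of that distinction (a sketch only, not something taken from UTR 20): U+0361 COMBINING DOUBLE INVERTED BREVE is an existing character for a tie over exactly two base letters, while an arc over a longer run is modeled here as a range kept in a separate markup layer. The `Span` type and the `arc_over` helper are hypothetical names invented for this example.

```python
import unicodedata
from dataclasses import dataclass

TIE = "\u0361"  # COMBINING DOUBLE INVERTED BREVE

# Local case: the tie is a combining mark placed between two base letters,
# so the whole thing is still just a short character sequence.
tied = "t" + TIE + "s"


@dataclass(frozen=True)
class Span:
    """Hypothetical markup record: an arc drawn over text[start:end]."""
    start: int
    end: int
    kind: str = "arc"


def arc_over(text: str, run: str) -> Span:
    """Return a Span covering the first occurrence of `run` in `text`."""
    start = text.index(run)  # raises ValueError if `run` is absent
    return Span(start, start + len(run))


# Span case: the text itself is untouched; the arc is stored out of band.
plain = "abcdef"
span = arc_over(plain, "bcde")

print(unicodedata.name(TIE))
print(len(tied), unicodedata.combining(TIE) != 0)
print(plain[span.start:span.end])
```

In the second case the plain text stays unchanged and the arc lives entirely in the out-of-band record, which is the sense in which a span is markup rather than a character.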

Peter
Received on Sun Nov 20 2011 - 13:54:01 CST