Re: missing characters: combining marks above runs of more than 2 base letters

From: Ken Whistler <kenw_at_sybase.com>
Date: Fri, 18 Nov 2011 17:53:37 -0800

On 11/18/2011 5:24 PM, Philippe Verdy wrote:
> This arc in the example is definitely NOT mathematics

Nor did I say it was.

> (even if you
> have read a version where it was attempted to represent it using a
> Math TeX notation in this page, an obvious error because it used an
> angular \widehat and not the appropriate sign).

Irrelevant.

> This arc is a true
> phonetic mark of a contextual elision (the intermediate letter(s) are
> not to be pronounced, even though they are still written to explicit
> the phonetically elided word(s) and keep their usual orthography).

The fact that the function of the mark is to indicate a contextual elision
is also essentially irrelevant to the analysis of whether such marking
consists of a mark (character) in text or a mark-up (non-character) of text.

The issue to pay attention to is whether the scoping of the modification of
text is cleanly delimited to a single character at a time, or is in
principle extensible across n characters.
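
To make the scoping distinction concrete, here is a minimal sketch in Python.
It assumes a simplified model in which every character with a nonzero
canonical combining class attaches to the base character before it; the
helper name mark_scopes is made up for illustration and is not part of any
standard API.

```python
import unicodedata

def mark_scopes(text):
    """Pair each base character with the combining marks encoded after it.

    Simplified model: any character with a nonzero canonical combining
    class is attached to the nearest preceding base character.
    """
    scopes = []
    for ch in text:
        if unicodedata.combining(ch) and scopes:
            # Combining mark: its scope is the single preceding base.
            scopes[-1][1].append(unicodedata.name(ch))
        else:
            scopes.append([ch, []])
    return scopes

# U+0302 COMBINING CIRCUMFLEX ACCENT modifies exactly one base letter.
single = mark_scopes("e\u0302tre")

# U+0361 COMBINING DOUBLE INVERTED BREVE is drawn over two letters, but it
# is still encoded once, after the first of them. Nothing in the encoding
# lets it be stretched over a run of n letters.
double = mark_scopes("t\u0361s")
```

In both cases the mark is tied to one position in the character sequence. An
arc over an arbitrary run of letters has no such fixed attachment point.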

>
> Exactly similar to other phonetic symbols like the elision tie (an arc
> adjoining two words to elide its separating space), or the apostrophe
> (which replaces completely the elided letters).
>
> And obviously a true candidate for plain-text: it provides
> simultaneously two readings of the text, one is purely phonetic (and
> accurate for poems that have an essential and very strong rhythmic
> structure), another is semantic (by the orthography kept). All letters
> have to be present in some way, even if some of them are marked for
> the expected phonetic.

And is obviously *not* a true candidate for plain text representation. This
kind of markup for simultaneous alternative readings of text is precisely
where representation by a richer mechanism makes sense. And this is merely
the veriest toe in the water for what I am referring to as "text scoring".

For an example of the complexity of various approaches to these kinds of
problems, see:

http://www.ilc.cnr.it/EAGLES/spokentx/node31.html

And here is an example of a well worked-out, systematic, multi-level scoring
system for prosodic information, the ToBI annotation conventions:

http://www.cs.columbia.edu/~agus/tobi/labelling_guide_v3.pdf

--Ken
Received on Fri Nov 18 2011 - 19:56:05 CST
