From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Sep 10 2004 - 18:59:42 CDT
From: "Asmus Freytag" <asmusf@ix.netcom.com>
> On the other hand, all aspects to *coloring* of characters
> do not belong in the plain text stream - but that was not
> the question.
>
> I think suggested solutions that define markup that apply to
> combining characters but place that markup outside of the
> combining sequence would be a better answer than protocols
> trying to put markup inside the combining character sequence.
>
> My personal take is that the UTC might make a recommendation
> to that effect, but it's not part of the standard proper.
> It's not clear that the issue has practical urgency - if
> I should be mistaken on that, I'd like to find out how and why.
Placing markup outside the combining sequence seems attractive at first,
but it raises another difficulty: how to refer to parts of a combining
sequence (I did not say "parts of characters", because I agree that
combining characters are not parts of characters but true abstract
characters in their own right, per the Unicode definition) when combining
sequences are themselves subject to transformations such as normalization.
A solution would be to specify in the markup which normalization to apply to
the combining sequence before referring to its component characters, with
some syntax like:
<font style="color:red nfd(2,1);">e&combining-acute;</font>
which would survive a normalization of the document, such as NFC, into:
<font style="color:red nfd(2,1);">&e-with-acute;</font>
Here some syntax in the markup style indicates an explicit NFD normalization
to apply to the plain-text fragment encoded in the text element, before
specifying a range of characters to which the style applies. (Here it says
that color:red applies to only 1 character, starting at the second one in
the enclosed text fragment, after that fragment has been forced to NFD.)
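
To make the intended semantics concrete, here is a minimal sketch in Python
of how a renderer might resolve such a range. The function name nfd_range
and its 1-based (start, length) convention are assumptions read off the
hypothetical nfd(2,1) syntax above; only unicodedata.normalize is a real
library call.

```python
import unicodedata


def nfd_range(text, start, length):
    """Split text into (before, selected, after) after forcing NFD.

    start is 1-based, mirroring the hypothetical nfd(start, length)
    style syntax; this convention is an assumption, not a standard.
    """
    decomposed = unicodedata.normalize("NFD", text)
    begin = start - 1
    end = begin + length
    return decomposed[:begin], decomposed[begin:end], decomposed[end:]


# The same range is selected whether the document stored the
# decomposed sequence or its NFC-composed equivalent.
decomposed_input = "e\u0301"   # e + COMBINING ACUTE ACCENT
composed_input = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
result_decomposed = nfd_range(decomposed_input, 2, 1)
result_composed = nfd_range(composed_input, 2, 1)
```

Both calls select the combining acute accent alone, which is the point of
normalizing before counting characters.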
This may seem tricky, but simpler solutions could be implemented in a style
language, for example by providing more basic restrictions through new
markup attributes:
<font style="combining-color:red">&e-with-acute;</font>
where the new "combining-color" attribute implies this prenormalization and
the automatic selection of the character ranges to which the coloring
applies. There may be better solutions that do not require augmenting the
style language schema with lots of new attribute names, such as:
<font style="color:combining(red)">&e-with-acute;</font>
Here too, Unicode itself is not affected. But markup languages and renderers
would have to be substantially modified to take the new markup property
names or values into account.
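
As a rough illustration of the "automatic selection" that combining-color or
color:combining(...) would imply, the following Python sketch decomposes the
text and flags which characters would receive the color. The function name
mark_combining is hypothetical, and treating every character whose general
category starts with "M" as "combining" is my assumption; a real style
language would have to define that precisely.

```python
import unicodedata


def mark_combining(text):
    """Return (character, is_combining) pairs for the NFD form of text.

    A character counts as combining here when its general category is
    a Mark (Mn, Mc or Me); this criterion is an assumption.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return [
        (ch, unicodedata.category(ch).startswith("M"))
        for ch in decomposed
    ]


# For a precomposed e-with-acute, only the accent is flagged.
pairs = mark_combining("\u00e9")
```

A renderer would then apply the requested color only to the flagged
characters and leave the base characters in the inherited color.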