Is there a definition or guideline for the distinction between plain
text and rich text?
For example, in the expression 3², the exponent is a single character,
"superscript two". Semantically, this expression is equivalent to
3^2, using a visible character to indicate exponentiation and then
leaving the exponent in normal notation. Both seem to me clear
examples of plain text.
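
A minimal sketch of how the two spellings relate in the character database, assuming Python's standard unicodedata module (the variable names are only illustrative). It shows that superscript two carries a compatibility decomposition tagged <super>, and that compatibility normalization discards the distinction, so the exponent is lost:

```python
import unicodedata

SUPERSCRIPT_TWO = "\u00b2"

# Compatibility decomposition recorded for U+00B2: a <super> tag plus the
# code point of the ordinary digit two.
decomposition = unicodedata.decomposition(SUPERSCRIPT_TWO)

# NFKC folds the superscript into a plain digit, so "3 squared" turns into
# the two-character string "32".
folded = unicodedata.normalize("NFKC", "3" + SUPERSCRIPT_TWO)

print(unicodedata.name(SUPERSCRIPT_TWO))
print(decomposition)
print(folded)
```

Nothing comparable exists for 3^2, since the circumflex there is an ordinary visible character with no decomposition at all.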
But if the circumflex were replaced by an invisible character that
meant "the following number should be superscripted", would that still
be plain text? Or would it be formatting that should be relegated to
markup?
What about a character that inhibited the composition of following
Hangul jamo into a syllable? That seems to me to be markup, but if it
could be replaced by a medial ZWNJ, I'm no longer sure.
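
As a rough illustration only, the sketch below uses NFC normalization in Python's unicodedata module as a stand-in for jamo composition. That is an assumption on my part: what a renderer does with the same sequence depends on the font and shaping engine, which this does not test. The jamo chosen (U+1100 and U+1161) are just an example pair.

```python
import unicodedata

CHOSEONG_KIYEOK = "\u1100"  # leading consonant jamo
JUNGSEONG_A = "\u1161"      # vowel jamo
ZWNJ = "\u200c"             # ZERO WIDTH NON-JOINER

# Adjacent jamo compose into one precomposed syllable under NFC.
adjacent = unicodedata.normalize("NFC", CHOSEONG_KIYEOK + JUNGSEONG_A)

# With a ZWNJ in the middle, the jamo are no longer adjacent and NFC
# leaves all three code points as they are.
separated = unicodedata.normalize("NFC", CHOSEONG_KIYEOK + ZWNJ + JUNGSEONG_A)

print(len(adjacent), len(separated))
```
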
Is the ZWNJ another tricky case? One could say that it's an invisible
formatting character whose only role is to control how other
characters are displayed, and thus that it should be markup. For that
matter, perhaps the normal space is a type of markup, especially when
it triggers the use of a final variant in the previous character.
Finally, aren't the left-to-right and right-to-left directional
characters (LRM, RLM, and the like) markup? What if we wanted
characters that put a run of text into vertical directionality?
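
For what it's worth, the character database itself already separates these cases. The sketch below, again assuming Python's unicodedata module and assuming that LEFT-TO-RIGHT MARK and RIGHT-TO-LEFT MARK are the directional characters meant, prints the General_Category and Bidi_Class of each candidate. The labels in the dictionary are mine.

```python
import unicodedata

candidates = {
    "SPACE": "\u0020",
    "ZWNJ": "\u200c",
    "LRM": "\u200e",
    "RLM": "\u200f",
}

# General_Category: Cf is "format", Zs is "space separator".
categories = {label: unicodedata.category(ch) for label, ch in candidates.items()}

# Bidi_Class: how each character takes part in bidirectional ordering.
bidi_classes = {label: unicodedata.bidirectional(ch) for label, ch in candidates.items()}

for label in candidates:
    print(label, categories[label], bidi_classes[label])
```

The space is classed as a separator, while ZWNJ, LRM and RLM are all classed as format characters. That says how Unicode labels them, not whether they ought to count as markup.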
One candidate guideline would be that plain text never include
anything that affects non-adjacent characters. But isn't that just
the equivalent of requiring repetition of markup for each character?
For example, if you wanted to write 3²⁽ⁿ⁺¹⁾ with m instead of n, the
plain text would be 3^2^(^m^+^1^), using ^ as a superscripting prefix.
If that is acceptable as plain text, then perhaps the Unicode
superscripted characters should all decompose into a superscripting
prefix.
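
The per-character prefix idea can be sketched as follows. This is a hypothetical helper, not anything Unicode defines: to_prefixed and its prefix parameter are names I made up, and it relies on the <super> tag that Python's unicodedata module reports in compatibility decompositions.

```python
import unicodedata


def to_prefixed(text, prefix="^"):
    """Rewrite each superscript character as prefix + its base character(s)."""
    out = []
    for ch in text:
        parts = unicodedata.decomposition(ch).split()
        if parts and parts[0] == "<super>":
            # The remaining fields are hex code points of the base characters.
            base = "".join(chr(int(cp, 16)) for cp in parts[1:])
            out.append(prefix + base)
        else:
            # No <super> decomposition: copy the character through unchanged.
            out.append(ch)
    return "".join(out)


# 3, superscript 2, then superscript ( n + 1 )
print(to_prefixed("3\u00b2\u207d\u207f\u207a\u00b9\u207e"))  # 3^2^(^n^+^1^)
```

Note that a few characters with a <super> decomposition expand to more than one base character, in which case this sketch emits a single prefix for the whole group.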
Maybe I just need more sleep...
Received on Fri Oct 14 2011 - 00:54:07 CDT