From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Thu Jun 26 2008 - 14:53:43 CDT
On 6/26/2008 11:15 AM, Kent Karlsson wrote:
>> super- or subscript digits in cases like H2O, m², ¹4C, or 10²³
>> is even remotely wrong and should use markup instead. That
>> would be quite unnecessary overkill, when font coverage of
>> these characters is quite sufficient (and survives markup
>> stripping, though not compatibility mapping; hoping that
>> my examples will survive e-mail).
>>
The formatting required for simple superscripts like exponents or
chemical formulae is widely supported and does not require MathML.
The upside of using plain-text superscript characters for simple
situations in otherwise plain text is that they normally don't suffer
from format conversion (e.g. HTML to plain text) and thus retain their
semantics.
In practice, however, that holds only for superscript 1, 2 and 3, because
there's apparently still a lot of transcoding between Unicode and 8859-1
going on, which kills all the others. Your example is perfect. When I
first got your message, all super/subscripts were as you intended them,
but your reply to yourself translated all except the above mentioned
three into their plain digit equivalents.
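The effect described above can be reproduced with a minimal Python sketch. It assumes a transcoder that replaces any character outside 8859-1 with its compatibility equivalent; the actual behavior of the mail software involved is not known, and the helper names survives_latin1 and lossy_to_latin1 are made up for illustration.

```python
import unicodedata

SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
SUBSCRIPT_DIGITS = "₀₁₂₃₄₅₆₇₈₉"


def survives_latin1(ch: str) -> bool:
    """Return True if the character can be encoded in ISO 8859-1."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def lossy_to_latin1(text: str) -> str:
    """Hypothetical transcoder: characters outside 8859-1 are replaced by
    their compatibility (NFKC) equivalent, or by '?' if even that does
    not fit. This is an assumed fallback, not a documented one."""
    out = []
    for ch in text:
        if survives_latin1(ch):
            out.append(ch)
            continue
        folded = unicodedata.normalize("NFKC", ch)
        if all(survives_latin1(c) for c in folded):
            out.append(folded)
        else:
            out.append("?")
    return "".join(out)


# Only superscript 1, 2 and 3 exist in 8859-1.
print("".join(c for c in SUPERSCRIPT_DIGITS if survives_latin1(c)))  # ¹²³

# Subscript 2 and superscript 4 fall back to plain digits.
print(lossy_to_latin1("H₂O, m², ¹⁴C, 10²³"))  # H2O, m², ¹4C, 10²³
```

Under that assumption the output matches what arrived in the quoted reply: the subscript 2 and the superscript 4 are flattened, while m² and 10²³ come through intact.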
Also, I noticed that the superscript 4, when I saw it, came from a
different font that uses slightly higher and smaller glyphs, making the
14 look almost like a 1 to the fourth power. Also, the spacing of these
is just not right when you use a monospaced font. In a true superscript,
the effective font size would be reduced to match the size of the small
glyphs, but in a plain text case, the characters must fit into the full
sized display cell, meaning that using more than one of them will look
odd (2 3 instead of 23).
The plain text ones have their uses for quick and dirty footnote symbols
and for indicating squared units in otherwise non-mathematical texts as
well as similar *simple* usages. Such fallbacks are best limited to
single digits of the 8859-1 subset to avoid the surprises you ran into.
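One way to apply that advice is to check a text before sending it. The sketch below is an illustration only: the function name risky_scripts is hypothetical, and it relies on the compatibility decomposition tags <super> and <sub> from the Unicode Character Database, as exposed by Python's unicodedata module, to recognize super- and subscript characters.

```python
import unicodedata


def risky_scripts(text: str) -> list[tuple[int, str]]:
    """Return (index, character) for every super- or subscript character
    that lies outside the 8859-1 range and so may not survive transcoding."""
    risky = []
    for i, ch in enumerate(text):
        decomp = unicodedata.decomposition(ch)
        is_script = decomp.startswith(("<super>", "<sub>"))
        if is_script and ord(ch) > 0xFF:
            risky.append((i, ch))
    return risky


print(risky_scripts("m²"))           # []
print(risky_scripts("m² and ¹⁴C"))   # [(8, '⁴')]
print(risky_scripts("H₂O"))          # [(1, '₂')]
```

The check covers only the transcoding risk. It says nothing about the font and spacing problems that appear when several such characters are used in a row.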
In addition, as you had noted earlier, the full repertoire of super- and
subscript characters is the proper choice for phonetic notations (e.g.
digits used as tone marks). Such notations require preservation of
specific semantics across formatting languages. They require much more
extensive Unicode support as well as special fonts, and they wouldn't
survive transcoding anyway, meaning the issues you encountered with your
examples aren't as relevant in that field of application.
A./
PS: in the late 90's a request had been forwarded from people
maintaining a chemical database to add a small number of additional
Greek subscripts. The rationale was that that type of database was not
able to handle any markup. The request never went anywhere, for lack of
specific input from the submitters beyond an initial discussion, and it
is unknown how they solved their problem. The database was intended for
regulatory purposes, so one assumes that some solution was found, but
there has been no information.