Why incomplete subscript/superscript alphabet ?
doug at ewellic.org
Mon Oct 3 13:47:09 CDT 2016
Steve Swales wrote:
>> I happen to think this would be exactly the wrong thing to do,
>> completely contrary to the principles of plain text that Unicode was
>> founded upon. But you never know what might gain traction, so stay
> I guess I don’t see how it is fundamentally different from other
> variant selector uses within Unicode,
Good question. Other variation selectors -- I assume this means U+FE00
through U+FE0F, plus the Plane 14 variation selectors, plus the
Mongolian and ideographic selectors -- are defined and registered for
use with specific, individual base characters. There are a lot of
combinations defined for "text style" and "emoji style," with more
probably on the way, but even in this seemingly open-ended field,
variation selectors are valid only in defined combinations.
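
To make "valid only in defined combinations" concrete, here is a minimal Python sketch. The selector ranges are the ones named above. `SAMPLE_DEFINED` is a hypothetical, hand-picked stand-in for the real registry, which is published in the Unicode data files and is far longer. The function names are illustrative only.

```python
# Code point ranges of the variation selectors mentioned above.
VARIATION_SELECTOR_RANGES = [
    (0x180B, 0x180D),    # Mongolian free variation selectors FVS1..FVS3
    (0xFE00, 0xFE0F),    # VS1..VS16
    (0xE0100, 0xE01EF),  # VS17..VS256 in Plane 14
]

# Assumption: a tiny illustrative sample, NOT the full registered list.
SAMPLE_DEFINED = {
    (0x2764, 0xFE0E),  # HEAVY BLACK HEART, text style
    (0x2764, 0xFE0F),  # HEAVY BLACK HEART, emoji style
    (0x263A, 0xFE0E),  # WHITE SMILING FACE, text style
    (0x263A, 0xFE0F),  # WHITE SMILING FACE, emoji style
}


def is_variation_selector(cp: int) -> bool:
    """Return True if the code point lies in one of the selector ranges."""
    return any(lo <= cp <= hi for lo, hi in VARIATION_SELECTOR_RANGES)


def is_defined_sequence(base: str, selector: str, registry=SAMPLE_DEFINED) -> bool:
    """Return True only if this exact base + selector pair is in the registry."""
    return (ord(base), ord(selector)) in registry
```

The point of the lookup is that a selector following an arbitrary base, say "A", is simply not a defined sequence; nothing generalizes from one base character to another.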
The concept here was to invent combining characters for superscript,
subscript, blackletter, etc. that could be applied to any base
character. This is fundamentally different from "valid only in defined
combinations."
> and the ability to write properly formatted mathematical and chemical
> formulas (for example) in a plain text environment like text messaging
> seems like a fairly compelling use case.
It certainly does. That's why UTC did the extensive research, way back
in the 2000 time frame, to determine what characters were appropriate in
mathematical contexts before encoding the Mathematical Alphanumeric
Symbols. They came up with Latin letters for a wide variety of styles,
and digits, Greek letters, and a few others for a subset of those
styles, that were agreed to have special meaning in mathematical
notation. They did not make the set open-ended, as if arbitrary
characters such as & or ₰ had similar special meaning.
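
One rough way to see that the set is closed is to walk the Mathematical Alphanumeric Symbols block (U+1D400 through U+1D7FF) and collect the ordinary characters its members fold to under NFKC. This is only a sketch using Python's standard `unicodedata` module; the helper name `math_alnum_bases` is made up for illustration, and the result depends on the Unicode version shipped with the interpreter.

```python
import unicodedata


def math_alnum_bases() -> set:
    """Collect the plain characters that styled math symbols normalize to."""
    bases = set()
    for cp in range(0x1D400, 0x1D800):
        ch = chr(cp)
        # Unassigned code points in the block have no name; skip them.
        if unicodedata.name(ch, "").startswith("MATHEMATICAL "):
            bases.add(unicodedata.normalize("NFKC", ch))
    return bases


BASES = math_alnum_bases()
```

Letters, digits, and Greek letters show up in `BASES`; characters like & and ₰ do not, because no styled counterparts were encoded for them.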
Basic chemical formulas like H₂SO₄ or [ClO₂]⁺ can be written in
plain Unicode text. At some point the line between basic and non-basic
has to be drawn, just as with arbitrarily stacked superscripts in math,
and some sort of fancy-text solution has to take over.
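
As a small illustration of the "basic" side of that line, the following Python sketch builds such formulas from the existing subscript and superscript digits and signs. The helper names `sub` and `sup` are hypothetical. Anything without an encoded subscript or superscript form passes through unchanged, which is exactly where plain text stops and a fancy-text solution would have to take over.

```python
# Map ASCII digits and signs to their encoded subscript forms
# (U+2080..U+2089, U+208A, U+208B).
SUBSCRIPT = str.maketrans(
    "0123456789+-",
    "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089\u208A\u208B",
)

# Superscript forms are scattered: 1, 2, 3 live in Latin-1 for legacy reasons.
SUPERSCRIPT = str.maketrans(
    "0123456789+-",
    "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079\u207A\u207B",
)


def sub(text: str) -> str:
    """Replace digits and +/- with subscript characters where they exist."""
    return text.translate(SUBSCRIPT)


def sup(text: str) -> str:
    """Replace digits and +/- with superscript characters where they exist."""
    return text.translate(SUPERSCRIPT)


sulfuric_acid = "H" + sub("2") + "SO" + sub("4")
chloryl_ion = "[ClO" + sub("2") + "]" + sup("+")
```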
Doug Ewell | Thornton, CO, US | ewellic.org