Being able to do "plain text" math is one of the goals of the Unicode
Technical Committee now. Since the publication of Unicode 2.0, three years
ago, we have had a lot of expert input on what plain text math capabilities
are needed, and also on where our existing repertoire of math operators is
insufficient. (We are, incidentally, also interested in evaluating and
improving our other technical symbol collections, but so far have not had
the long and sustained input from experts in other fields that we had for
mathematics.)
Full layout of mathematical expressions will need some form of markup,
although many formulas that do not need the full generality can be laid out
correctly if the mathematical operator characters in Unicode are
interpreted semantically.
Semantics for formatting means that one needs to distinguish, e.g., between
summation sign and sigma. They look the same, but summation sign can take
limit expressions etc.
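To make the distinction concrete, here is a small sketch using Python's standard unicodedata module. The helper name describe is made up for illustration; the two code points are the ones Unicode assigns to the summation sign and to capital sigma.

```python
import unicodedata

# Two characters that usually look alike but have different code points.
SUMMATION = "\u2211"  # the large summation operator
SIGMA = "\u03A3"      # the Greek capital letter

def describe(ch):
    """Return (code point, character name, general category) for one character.

    Hypothetical helper, used only for this illustration.
    """
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

summation_info = describe(SUMMATION)
sigma_info = describe(SIGMA)

print(summation_info)  # ('U+2211', 'N-ARY SUMMATION', 'Sm')
print(sigma_info)      # ('U+03A3', 'GREEK CAPITAL LETTER SIGMA', 'Lu')
```

The general category differs as well (math symbol versus uppercase letter), which is the kind of information a layout process could use to decide whether limit expressions may attach.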
Another aspect of semantics is the mathematical semantics. Here it's
necessary to make enough distinctions so that, if a small and large form of
an operator can occur in the same text, they can be distinguished by
their character code without recourse to font information. Doing so allows
plain text searches for math formulas.
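As a rough illustration of such a search, the sketch below uses the small and large intersection operators, which have separate code points. The sample string and the helper name positions are invented for this example.

```python
SMALL_CAP = "\u2229"  # INTERSECTION, the small (binary) form
LARGE_CAP = "\u22C2"  # N-ARY INTERSECTION, the large form

# Invented sample text that contains both forms of the operator.
text = "A \u2229 B = \u22C2 S"

def positions(haystack, needle):
    """Return every index at which the single character `needle` occurs."""
    return [i for i, ch in enumerate(haystack) if ch == needle]

small_hits = positions(text, SMALL_CAP)
large_hits = positions(text, LARGE_CAP)

print(small_hits)  # [2]
print(large_hits)  # [8]
```

Because the two forms differ by code point, the search needs no font or styling information to tell them apart.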
Caveat: If and where mathematicians have used 'operator overloading', to
borrow a C++ term, and deliberately used the same operator with different
mathematical meaning in another sub-discipline, we would not sub-divide the
character, as the larger context would be enough to determine its meaning.
Our foremost goal has therefore been to complete our repertoire and where
necessary introduce additional distinctions for the two reasons I mentioned.
In the case of ASTERISK, the analysis that is needed, and that, as far as I
have seen, has not been made, is to present evidence that cases exist (or
are easily conceivable) where *both* the ASCII asterisk and yet another
asterisk are needed in the same text, and with consistent distinction in
use or formatting.
Ricardo has said that one could use the proposed asterisk in conjunction with
the ASCII asterisk to denote a regular expression of zero or more asterisks.
This is the one example that cannot serve, since by extension, it would
require an infinite series of asterisks (suppose I wanted to define a
regular expression consisting of zero or more instances of the proposed
asterisk!).
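For comparison, and not as anything proposed in this thread, regular expression syntaxes conventionally avoid the regress by escaping rather than by adding characters. A minimal sketch with Python's standard re module (the variable names are made up):

```python
import re

STAR = "*"

# re.escape turns the literal asterisk into "\*"; the appended "*" is the
# quantifier, so the pattern reads "zero or more literal asterisks".
zero_or_more_stars = re.compile(re.escape(STAR) + "*")

matches_empty = zero_or_more_stars.fullmatch("") is not None
matches_three = zero_or_more_stars.fullmatch("***") is not None
matches_other = zero_or_more_stars.fullmatch("**a") is not None

print(matches_empty, matches_three, matches_other)  # True True False
```

The same escape works for any character, so no second asterisk is needed for this purpose.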
Typographically, asterisk may indeed show a variation between full-size and
superscript forms. For standard text fonts, the full-size form of asterisk
occurs only occasionally.
In the vast majority of fonts on my system, as well as in the Unicode
Standard, and ISO/IEC 10646-1, ASTERISK is clearly depicted as a
superscripted symbol (i.e. it's 1/2 height and extends upwards from the
centerline of the font, which is just slightly below the x height). The
asterisk and superscript 2 have the same location and dimension. Therefore,
unless Ricardo is proposing a character that has the same dimension as a
*superscripted* SUPERSCRIPT TWO, my conclusion would be that we already
_have_ the character he wants, and that he is using a poor font for his
purpose.
A./