From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Fri May 23 2003 - 00:57:29 EDT
At 12:33 AM 5/23/03 +0200, Philippe Verdy wrote:
>From: "Asmus Freytag" <asmusf@ix.netcom.com>
> > Styled text uses markup. However, for specialized texts, such as
> > mathematics, where loss of style-markup can completely eradicate the
> > meaning of the text, several symbol sets have been added to Unicode, where
> > the symbols look like styled letters, but function very differently (i.e.
> > as mathematical symbols).
>
>This creates some conflicting interest: which semantic for characters
>encoded in fonts: a style semantic if one wants to present Latin or
>Cyrillic text with a Gothic style?
There's no conflict: if it's text, you use style. If it's a symbol, you use
the character.
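
The text-versus-symbol distinction is visible in the character data itself. As a minimal sketch, using Python's standard unicodedata module: a mathematical alphanumeric symbol carries a <font> compatibility decomposition to the plain letter, so compatibility normalization (NFKC) folds it to ordinary text, while canonical normalization (NFC) leaves the symbol alone. The choice of U+1D504 as the sample and the helper name describe are illustrative assumptions, not anything taken from the TR.

```python
import unicodedata

# Sample symbol chosen for illustration: MATHEMATICAL FRAKTUR CAPITAL A.
FRAKTUR_A = "\U0001D504"

def describe(ch):
    """Return (name, compatibility decomposition, NFKC form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch),
    )

name, decomposition, folded = describe(FRAKTUR_A)
print(name)           # character name from the Unicode Character Database
print(decomposition)  # a "<font>" mapping to the plain letter
print(folded)         # what survives compatibility normalization

# Canonical normalization does not touch the symbol.
print(unicodedata.normalize("NFC", FRAKTUR_A) == FRAKTUR_A)
```

In other words, a process that applies NFKC treats the symbol as a styled letter and discards the distinction, which is exactly the loss that plain-text mathematics cannot afford.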
>Also there are exceptions in the new mathematic block, where some letters
>were not encoded considering that they are already available in other
>blocks, either as Letter-like symbols or as plain characters.
The existing Letterlike Symbols already contained the most frequently used
of these symbols. Not re-encoding the existing ones underscores that
they map to a single set of mathematical symbols and are not alphabets for
styled text.
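
The gaps left in the mathematical alphanumeric block can be inspected directly. The following is a small sketch, again assuming Python's unicodedata module; the HOLES table is a hand-picked sample of reserved positions and their Letterlike Symbols counterparts, not a complete list, and is_assigned is a hypothetical helper.

```python
import unicodedata

# (reserved code point in the mathematical alphanumeric block,
#  Letterlike Symbols character that fills that slot). Sample only.
HOLES = [
    (0x1D455, 0x210E),  # italic small h        -> PLANCK CONSTANT
    (0x1D49D, 0x212C),  # script capital B      -> SCRIPT CAPITAL B
    (0x1D506, 0x212D),  # fraktur capital C     -> BLACK-LETTER CAPITAL C
    (0x1D53A, 0x2102),  # double-struck capital C -> DOUBLE-STRUCK CAPITAL C
]

def is_assigned(code_point):
    """True if the code point has an assigned character (category is not Cn)."""
    return unicodedata.category(chr(code_point)) != "Cn"

for hole, letterlike in HOLES:
    print(
        f"U+{hole:04X} assigned: {is_assigned(hole)}; "
        f"use U+{letterlike:04X} {unicodedata.name(chr(letterlike))}"
    )
```

Each reserved slot reports as unassigned, and the corresponding symbol is found only in its original Letterlike Symbols position.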
>What would happen to a mathematical text rendered with a Gothic style? One
>could not make a semantic distinction between plain character symbols
>and Gothic style symbols.
Please read http://www.unicode.org/reports/tr25/, which addresses this issue.
>The current encoding assumes that mathematical text uses only fonts in a
>basic style using only the "representative glyphs" shown in charts.
>Depending on the fonts available to render a particular text style
>(independently of its abstract character semantics), such distinction will
>be hard to make.
Again, see the TR.
>This just proves that mathematical symbols use also the plain standard
>scripts, whose rendered style is then suddenly important. If accuracy in
>semantics was needed, clearly we would need to define separate mathematic
>characters for the basic style, but Unicode chose to unify them...
In the context of mathematics, not all font choices are suitable. I think
that's something the mathematicians understand already, and the explanation
for the non-experts is in the TR.
>Conclusion: mathematical symbols are a separate script, but Unicode unifies
>this set incoherently as it assumes a default style for all scripts. So
>can we say that general purpose fonts for extended Latin with Gothic style
>are Unicode-compliant?
Yes, but they may not be usable for mathematics. There are many fonts that
are not usable for many purposes. Publishing an academic thesis with a
'wedding invitation' style font (fancy script style) would not be apropos;
in fact, many institutions regulate the acceptable style for the text
portion of such documents rather minutely.
>Also it is not clear how serif and sans-serif variants of mathematical
>symbols will behave with other non mathematic text, and where we can say
>that the encoded text is mathematic and where it is not, so where a
>required style MUST be applied.
As usual, you will find that character encoding as such never solves *ALL*
possible problems. Character encoding is concerned with being able to
express the core semantic differentiation required to carry the content. A
full document will usually require additional information (typically markup).
>Maybe this should require defining new "BEGIN MATHS" and "END MATHS" (or
>"BEGIN TEXT") abstract characters and encode them (as format control
>characters) for the same semantic reasons Unicode defined and encoded the
>"Invisible Function Application" or "Invisible Comma" or "Invisible
>Multiplication Operator" (I'm not sure if those are their exact names, so
>look in the UCD if you need them).
It's FUNCTION APPLICATION, INVISIBLE SEPARATOR and INVISIBLE TIMES. Which
you would have known if you had read the TR.
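
The exact names are easy to check against the character database. A minimal sketch, assuming Python's unicodedata module; the code points U+2061 through U+2063 are the invisible mathematical operators, and summarize is a hypothetical helper.

```python
import unicodedata

# The invisible mathematical operators discussed above.
INVISIBLE_OPERATORS = ["\u2061", "\u2062", "\u2063"]

def summarize(ch):
    """Format one character as 'U+XXXX NAME (general category)'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

for ch in INVISIBLE_OPERATORS:
    print(summarize(ch))
```

All three report general category Cf, that is, they are format characters, which is the same class the proposed "BEGIN MATHS" and "END MATHS" controls would have fallen into.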
A./