RE: Clones (was RE: Hexadecimal)

From: Jim Allan (jallan@smrtytrek.com)
Date: Tue Aug 19 2003 - 11:18:47 EDT


    Jill Ramonsky posted on the minus sign:

    > Yeah, I know. But like I said, who uses this?

    Books are normally produced today using computer typesetting. Look in
    any mathematics text or any well-printed book for minus signs. Hyphens
    and minus signs are distinct (except when showing computer programming
    in a monospaced font). Hyphen and minus sign have always been different
    characters.
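
    As a small illustration of the distinction in Unicode terms, the
    following Python sketch prints the code point and character name of
    four characters that are easily confused. The list of characters and
    the helper name describe() are chosen here for illustration only.

```python
import unicodedata

# Characters that are often confused with one another.
CANDIDATES = ["\u002D", "\u2010", "\u2013", "\u2212"]

def describe(ch):
    """Return (code point label, Unicode character name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch))

for ch in CANDIDATES:
    # Prints, for example: U+2212 MINUS SIGN
    print(*describe(ch))
```

    The keyboard "-" is U+002D HYPHEN-MINUS, a separate character from
    both U+2010 HYPHEN and U+2212 MINUS SIGN.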

    TeX, SGML, and other pre-Unicode legacy typographical systems support
    this difference, which has always existed.

    On common computer systems like the Macintosh and Windows, which didn't
    support the difference globally in their standard character sets in
    pre-Unicode days, it was customary to use the en dash instead of a minus
    sign in formatted text, or to switch to special math-symbol fonts
    when entering mathematical signs and other symbols.

    Style sheets and books of tips for word processing and desktop
    publishing almost always go into some detail about the various kinds of
    dashes and the minus sign. So does the Unicode manual in its section on
    punctuation.

    > And I also have to ask ... if I am actually WRITING a C++ compiler, should I
    > allow the use of MINUS SIGN to mean minus sign? (Actually, that question may
    > be answered by the specification of C++, so let's push it a bit further. If
    > I am inventing some successor language to C++, and am free to invent my own
    > specification, should I _then_ allow the use of MINUS SIGN?)

    The symbols to be used for any computer language are part of the
    definition of that computer language. Currently you can't legally use
    U+2212 as the minus operator in any computer language I know of.

    However, I will be surprised if computer languages do not start to take
    advantage of the additional characters that are universally available
    through Unicode.
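
    For a language whose specification is yours to invent, one possible
    approach is for the lexer to accept U+2212 as a synonym for the ASCII
    hyphen-minus. The Python sketch below assumes a purely hypothetical
    toy grammar (unsigned integers and four operators); tokenize() and
    the token names are made up for illustration and do not describe any
    existing compiler.

```python
import re

MINUS_SIGN = "\u2212"    # U+2212 MINUS SIGN
HYPHEN_MINUS = "\u002D"  # U+002D HYPHEN-MINUS

# Hypothetical token grammar: unsigned integers and four arithmetic operators.
TOKEN_RE = re.compile(r"\s*(?:([0-9]+)|([-+*/\u2212]))")

def tokenize(source):
    """Split source into tokens, treating U+2212 as a synonym for '-'."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        number, operator = match.groups()
        if number is not None:
            tokens.append(("NUM", int(number)))
        else:
            # Normalize so later stages see a single minus operator.
            if operator == MINUS_SIGN:
                operator = HYPHEN_MINUS
            tokens.append(("OP", operator))
        pos = match.end()
    return tokens
```

    With this sketch, tokenize("7 \u2212 2") and tokenize("7 - 2") yield
    the same token list. Whether to normalize in the lexer or to keep the
    two characters distinct is a design decision for the language
    specification.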

    > I only ask that the charts make clear what each
    > character is FOR, in sufficient detail that the answer to questions like the
    > above becomes obvious.

    Currently the manual assumes that a user who wants to use a character
    will mostly already know what it is FOR or the user wouldn't want to use
    it. That's a reasonable assumption to make to avoid expanding the manual
    to five or six volumes at least. A small amount of typographical and
    usage information on some characters is provided for the convenience of
    font makers.

    I would personally love to see an expanded version of the Unicode
    manual, a sort of multi-volume encyclopedia of characters and their
    history and uses.

    Meanwhile Unicode tells us that a particular glyph is a normal glyph for
    MINUS SIGN. That really should be enough. Most people know that math
    symbols are generally not (yet?) implemented to actually DO their
    function on computers. And it is hardly necessary for the purpose of the
    manual that, for example, under % we should be told about its use for
    modulus or for introducing a comment in some computer languages.

    You don't complain that the charts don't tell you what U+00D7
    MULTIPLICATION SIGN is for, or U+00F7 DIVISION SIGN, or U+0026 AMPERSAND.

    As to supporting all of Unicode, see 2.12 in
    http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf.

    Must a cell phone, for example, support all of Unicode?

    Must every font contain every Unicode character?

    Partial support is quite conformant provided that what is supported is
    supported according to the standard and data is not corrupted.
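
    A rough sketch of what such partial support might look like, in
    Python: a hypothetical implementation that can display only a small
    repertoire, shows a placeholder for everything else, and still passes
    all text through unaltered. The repertoire and the function names are
    assumptions made up for this example, not anything prescribed by the
    standard.

```python
# Hypothetical repertoire: printable ASCII plus U+2212 MINUS SIGN.
SUPPORTED = {chr(cp) for cp in range(0x20, 0x7F)} | {"\u2212"}

REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER, used for display only

def display_form(text):
    """Return what would be shown on screen; the input text is not modified."""
    return "".join(ch if ch in SUPPORTED else REPLACEMENT for ch in text)

def pass_through(text):
    """Store or forward text: round-trip through UTF-8 without altering it."""
    return text.encode("utf-8").decode("utf-8")
```

    Here only the on-screen form substitutes a placeholder for
    characters outside the repertoire; the stored or forwarded data keeps
    every character exactly as received.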

    That doesn't mean that full support and impeccable rendering are not
    desirable. They are in the long run. But a laptop user who generally uses
    only English may not wish to have disk space taken up by East Asian fonts
    or top-of-the-line publishing software that handles East Indian scripts
    impeccably.

    Government software for various governments may purposely support only a
    particular subset of the Unicode character set.

    Jim Allan



    This archive was generated by hypermail 2.1.5 : Tue Aug 19 2003 - 12:39:13 EDT