Re: Case mappings

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Wed Feb 02 2011 - 05:37:39 CST


    QSJN 4 UKR wrote:

    > Uppercase(MicroSign)=MegaSign.

    Not really. The uppercase letter corresponding to MICRO SIGN is GREEK
    CAPITAL LETTER MU. Even though capital mu is probably identical with Latin
    capital m in all fonts where they coexist, they are distinct characters with
    no mapping defined between them in Unicode. They are regarded as distinct
    because they belong to different scripts and have different lowercase
    mappings. It would be incorrect to use capital mu as the symbol of the SI
    prefix “mega-”.
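
    The mapping is easy to check in a language that implements the Unicode
    case mappings. A minimal sketch in Python, assuming its built-in
    str.upper() and the standard unicodedata module (the variable names are
    just for illustration):

```python
import unicodedata

MICRO = "\u00b5"       # MICRO SIGN
upper = MICRO.upper()  # str.upper() applies the Unicode uppercase mapping

# Show which character the micro sign maps to.
print(f"U+{ord(upper):04X}", unicodedata.name(upper))

# Capital mu and Latin capital M are distinct code points,
# however similar they look in a given font.
LATIN_M = "M"
print(upper == LATIN_M)

# Their lowercase mappings differ as well.
print(unicodedata.name(upper.lower()))
print(unicodedata.name(LATIN_M.lower()))
```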

    The reason why the mapping goes that way is that MICRO SIGN is defined as
    compatibility equivalent to GREEK SMALL LETTER MU. It would cause problems
    to let compatibility-equivalent characters have different uppercase
    mappings.
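
    The compatibility equivalence can be inspected directly. A small sketch,
    again assuming Python's unicodedata module: canonical normalization (NFC)
    leaves MICRO SIGN alone, compatibility normalization (NFKC) replaces it
    with the Greek letter, and both characters have the same uppercase.

```python
import unicodedata

MICRO = "\u00b5"     # MICRO SIGN
SMALL_MU = "\u03bc"  # GREEK SMALL LETTER MU

# The decomposition field of MICRO SIGN is tagged <compat>.
decomp = unicodedata.decomposition(MICRO)
print(decomp)

# NFC keeps the micro sign; NFKC folds it into the Greek letter.
nfc = unicodedata.normalize("NFC", MICRO)
nfkc = unicodedata.normalize("NFKC", MICRO)
print(nfc == MICRO, nfkc == SMALL_MU)

# Both characters uppercase to the same character.
same_upper = MICRO.upper() == SMALL_MU.upper()
print(same_upper)
```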

    Besides, there is really no suitable uppercase presentation for the “micro-”
    prefix. The only way to render an expression like “µs” in uppercase-only is
    to spell it out, “MICROSECOND”, and such expansions do not belong to
    character-level definitions. So, basically, MICRO SIGN should not be
    converted to uppercase at all, but the Unicode Standard does not express
    such limitations. It only defines what the mapping is. And there’s really no
    other feasible option than to formally define it as the same as for the
    Greek letter.

    > Am I the only human being in the world
    > who can't understand it?

    Hardly. It’s a messy business.

    > Am I an idiot?

    No, idiots cannot write.

    > I think µ is not μ because and only because they have different case
    > mappings. Actually µ(micro) unlike μ(greek) does not CWU. Why i am
    > wrong???

    I don’t quite follow, but as I wrote, the crucial decision was to treat
    MICRO SIGN as compatibility equivalent to GREEK SMALL LETTER MU. The case
    mapping is based on this. The decision can be criticized on the grounds that
    the characters often have distinctively different glyphs, so that the MICRO
    SIGN could be treated as a derivative of the Greek letter, rather than a
    compatibility character. But the decision has been made and won’t be
    changed, since a change would break too many stability principles and
    assumptions.

    Converting text to uppercase is always a matter of judgment. You should not
    assume that such a conversion can always be made without changing or
    distorting the information content. In fact, uppercase-converting “ms” as an
    SI notation would be, in a sense, worse than uppercase-converting “µs”. The
    latter would produce “ΜS”, a mix of Greek and Latin letters, therefore
    suspicious, and definitely incorrect as an SI notation even at the character
    level – the SI does not use capital mu at all. But uppercase-converting “ms”
    produces “MS”, which looks innocent and is correct as an SI notation, though
    it means megasiemens and not millisecond.
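
    To illustrate the difference, here is a rough sketch in Python. The
    helper script_words is hypothetical, and it only approximates a
    character's script by taking the first word of its Unicode name; it does
    not consult the actual Script property. It is enough to show that the
    uppercased “µs” mixes scripts while the uppercased “ms” does not.

```python
import unicodedata

def script_words(text):
    """Rough script label per character: the first word of its Unicode name.

    This is a heuristic for illustration, not the Unicode Script property.
    """
    return [unicodedata.name(ch).split()[0] for ch in text]

micro_s = "\u00b5s".upper()  # micro sign + s, uppercased
milli_s = "ms".upper()       # plain Latin "ms", uppercased

# The first result mixes Greek and Latin, so it can at least be flagged.
print(script_words(micro_s))

# The second is all Latin and looks like a legitimate SI symbol.
print(milli_s, script_words(milli_s))
```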

    > I fell from a chair and hurt stuck, pay me compensation!

    I’m sure there’s a disclaimer in the Unicode Standard about such issues.

    -- 
    Yucca, http://www.cs.tut.fi/~jkorpela/ 
    

