Re: Case mappings

From: Doug Ewell (doug@ewellic.org)
Date: Fri Feb 11 2011 - 10:02:19 CST

    QSJN 4 UKR <qsjn4ukr at gmail dot com> wrote:

    > There are several different applications of the letter cases. They
    > are used stylistically, for example, the using a capital or title
    > letters in the headers, grammatically, when the capital letter
    > identifies the beginning of the sentence, the proper name, any name
    > in German, and semantically, for example, in SI units or chemical
    > symbols.

    This is exactly why it is inappropriate to apply case-change operations
    indiscriminately to arbitrary snippets of text. This is not unique to
    SI prefixes (or units) or Unicode compatibility characters; it's not
    even really a computer problem. It would be just as inappropriate, as
    Jukka pointed out, to uppercase a symbol like "ms" which consists of
    ordinary letters, whether in Unicode or in handwriting.
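    As a rough illustration of the "ms" point, here is a minimal Python
    sketch. The SI_SYMBOLS table and the meaning_after_upper helper are
    made up for this example and are not part of any library; the only
    real call is the built-in str.upper().

```python
# Hypothetical lookup table, for illustration only: a few SI symbols
# whose meaning depends on letter case.
SI_SYMBOLS = {
    "ms": "millisecond",
    "MS": "megasiemens",
    "mm": "millimetre",
    "Mm": "megametre",
}


def meaning_after_upper(symbol):
    """Return the symbol's meaning before and after a blind str.upper()."""
    before = SI_SYMBOLS.get(symbol, "unknown")
    after = SI_SYMBOLS.get(symbol.upper(), "unknown")
    return before, after


# Uppercasing plain ASCII letters silently changes the unit.
print(meaning_after_upper("ms"))  # ('millisecond', 'megasiemens')
```

    No compatibility characters are involved here: the damage comes from
    applying the operation to text whose case carries meaning.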

    > To support all these cases, it would be nice to use special control
    > characters in the text, which would indicate where the change in the
    > case is admissible and where is not. Or to use for the SI, chemical
    > and mathematical notation and - for capitalization of proper names
    > (???) - those characters who have no case mapping, U+1D400 etc.

    Modifying all existing electronic text to include such an invisible
    control character, and requiring all users and processes to enter it
    reliably, and modifying all keyboards to include a key for this new
    character, doesn't seem particularly likely at this time. Better to
    teach users to use common sense when applying text-transformation
    operations like uppercasing.
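
    For what it's worth, the behavior of U+1D400 mentioned in the quoted
    suggestion can be checked with Python's standard unicodedata module.
    This is only a sketch; the describe helper is a name invented for
    this example.

```python
import unicodedata


def describe(ch):
    """Collect the name, case mappings, and NFKC form of one character."""
    return {
        "name": unicodedata.name(ch),
        "lower": ch.lower(),
        "upper": ch.upper(),
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


bold_a = "\U0001D400"
info = describe(bold_a)

print(info["name"])             # MATHEMATICAL BOLD CAPITAL A
print(info["lower"] == bold_a)  # True: the character has no case mapping
print(info["nfkc"])             # A
```

    The character does survive lowercasing untouched, but compatibility
    normalization (NFKC) turns it into a plain "A", which a later
    case-change operation will happily convert. So the protection lasts
    only as long as no process normalizes the text.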

    > What the hell good on the stability of the Unicode standard, if it
    > excludes the possibility of using it.

    Using a character encoding standard does require a modicum of knowledge
    about how plain text works.

    --
    Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
    RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s
    

