RE: alpha, print, graph, blank, etc.

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Tue Apr 22 2003 - 07:33:23 EDT

    Mark Davis wrote:
    > The POSIX/C-style property names (punct, alpha, lower, upper,
    > digit, xdigit, alnum, cntrl, graph, print, space, blank) are
    > not well specified, and don't really map well to the broader
    > types of characters available in Unicode/10646. For example,
    > there is no provision for titlecase, [...]

    My 0.2 euros: IMHO, title-case letters should be treated as *both*
    upper-case and lower-case. I.e., my suggestion is that:

            - is[w]lower() returns TRUE for both lower-case and title-case
              letters;
            - is[w]upper() returns TRUE for both upper-case and title-case
              letters;
            - is[w]alpha() returns TRUE for any Unicode letter (general
              category L*).

    For applications unaware of the existence of "title-case" letters, this
    preserves the basic semantics of is[w]alpha() (namely, "Is it a letter?"),
    and one of the most basic semantics of is[w]lower() and is[w]upper()
    (namely, "Can this character be converted to lower/upper-case?").
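
    The three rules above can be sketched in C over a toy category lookup.
    Everything named here (gen_cat, toy_category, and the prop_isw*
    functions) is hypothetical and only for illustration; a real
    implementation would take general categories from the Unicode Character
    Database rather than from a hand-written switch.

```c
#include <wchar.h>

/* Hypothetical general-category codes (assumption, not a real API). */
typedef enum { GC_OTHER, GC_LU, GC_LL, GC_LT, GC_LM, GC_LO } gen_cat;

/* Toy lookup covering only a handful of code points. */
gen_cat toy_category(wint_t c)
{
    switch (c) {
    case 0x0041: return GC_LU;  /* LATIN CAPITAL LETTER A */
    case 0x0061: return GC_LL;  /* LATIN SMALL LETTER A */
    case 0x01C4: return GC_LU;  /* LATIN CAPITAL LETTER DZ WITH CARON */
    case 0x01C5: return GC_LT;  /* LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON */
    case 0x01C6: return GC_LL;  /* LATIN SMALL LETTER DZ WITH CARON */
    case 0x02B0: return GC_LM;  /* MODIFIER LETTER SMALL H */
    case 0x05D0: return GC_LO;  /* HEBREW LETTER ALEF */
    default:     return GC_OTHER;
    }
}

/* Any letter: general category L*. */
int prop_iswalpha(wint_t c)
{
    return toy_category(c) != GC_OTHER;
}

/* Upper-case OR title-case. */
int prop_iswupper(wint_t c)
{
    gen_cat g = toy_category(c);
    return g == GC_LU || g == GC_LT;
}

/* Lower-case OR title-case. */
int prop_iswlower(wint_t c)
{
    gen_cat g = toy_category(c);
    return g == GC_LL || g == GC_LT;
}
```

    With this sketch, U+01C5 satisfies both prop_iswupper() and
    prop_iswlower(), while U+05D0 is a letter that satisfies neither.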

    For applications aware of the existence of "title-case" letters,
    is[w]upper(), is[w]lower(), and is[w]alpha() can be used in combination to
    determine the exact "case type" of any letter:

            /* Assumes the semantics proposed above: a title-case letter
               satisfies both iswupper() and iswlower(). */
            if (iswalpha(c))
            {
                    if (iswupper(c) && iswlower(c))
                    {
                            printf("This is a title-case letter (Lt).\n");
                    }
                    else if (iswupper(c) && !iswlower(c))
                    {
                            printf("This is an upper-case letter (Lu).\n");
                    }
                    else if (!iswupper(c) && iswlower(c))
                    {
                            printf("This is a lower-case letter (Ll).\n");
                    }
                    else /* if (!iswupper(c) && !iswlower(c)) */
                    {
                            printf("This is a letter with no case distinctions "
                                   "(Lo or Lm).\n");
                    }
            }
            else
            {
                    printf("This is not a letter.\n");
            }

    Unfortunately, there is no corresponding trick to obtain a "to-title-case"
    functionality, apart from a non-portable construct such as:

            c1 = towctrans(c2, wctrans("Title-case"));
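
    Because "Title-case" is not one of the mapping names that standard C
    guarantees (only "tolower" and "toupper" are), wctrans() may return zero
    for it, and the result should be checked before it is passed to
    towctrans(). A minimal sketch, assuming that falling back to upper case is
    acceptable; the function name to_title_or_upper is hypothetical:

```c
#include <wchar.h>
#include <wctype.h>

/* Map c to title case if the current locale offers a "Title-case"
   mapping; otherwise fall back to the standard upper-case mapping. */
wint_t to_title_or_upper(wint_t c)
{
    /* "Title-case" is the name used in the message above, not a
       standard one, so this lookup may well fail and yield zero. */
    wctrans_t title = wctrans("Title-case");

    if (title != (wctrans_t)0)
        return towctrans(c, title);

    return towupper(c);
}
```

    On an implementation that does not know the name, this behaves exactly
    like towupper().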

    Anyway, converting to title case is something less fundamental than
    upper/lower-casing, and it only makes sense at the string level.
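
    To illustrate the string-level view, here is a rough sketch that
    title-cases each word of a wide string in place. It makes two loud
    simplifications: a "word" is just a run of letters as reported by
    iswalpha(), and towupper() stands in for a true title-case mapping, which
    portable C does not provide. The function name titlecase_words is
    hypothetical.

```c
#include <wchar.h>
#include <wctype.h>

/* Title-case a wide string in place: the first letter of each run of
   letters is upper-cased, every following letter is lower-cased.
   Non-letters are left untouched and end the current word. */
void titlecase_words(wchar_t *s)
{
    int in_word = 0;  /* nonzero while inside a run of letters */

    for (; *s != L'\0'; ++s) {
        wint_t c = (wint_t)*s;

        if (iswalpha(c)) {
            *s = (wchar_t)(in_word ? towlower(c) : towupper(c));
            in_word = 1;
        } else {
            in_word = 0;
        }
    }
}
```

    The decision for each character depends on what precedes it in the
    string, which is why a per-character function cannot do this job alone.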

    _ Marco



    This archive was generated by hypermail 2.1.5 : Tue Apr 22 2003 - 08:24:23 EDT