From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Tue Apr 22 2003 - 07:33:23 EDT
Mark Davis wrote:
> The POSIX/C-style property names (punct, alpha, lower, upper,
> digit, xdigit, alnum, cntrl, graph, print, space, blank) are
> not well specified, and don't really map well to the broader
> types of characters available in Unicode/10646. For example,
> there is no provision for titlecase, [...]
My 0.2 euros: IMHO, title-case letters should be treated as *both*
upper-case and lower-case. I.e., my suggestion is that:
- is[w]lower() returns TRUE for both lower-case and title-case
letters;
- is[w]upper() returns TRUE for both upper-case and title-case
letters;
- is[w]alpha() returns TRUE for any Unicode letter (general category
L*).
For applications unaware of the existence of "title-case" letters, this
preserves the basic semantics of is[w]alpha() (namely, "Is it a letter?"), and
one of the most basic semantics of is[w]lower() and is[w]upper() (namely,
"Can this character be converted to lower/upper-case?").
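To make the three rules above concrete, here is a minimal sketch in C. Everything in it is hypothetical: the gen_cat enum, the toy_category() lookup (which knows only ASCII letters and four other code points), and the sketch_is*() names are invented for illustration. A real library would consult the Unicode Character Database rather than a hand-written table.

```c
#include <stdbool.h>

/* Hypothetical general-category tags; only the letter categories matter here. */
typedef enum { GC_OTHER, GC_LU, GC_LL, GC_LT, GC_LM, GC_LO } gen_cat;

/* Toy lookup covering ASCII letters and a few other code points.
   This table is an assumption made for the example, not real library data. */
static gen_cat toy_category(unsigned long c)
{
    if (c >= 0x41 && c <= 0x5A) return GC_LU;   /* A..Z */
    if (c >= 0x61 && c <= 0x7A) return GC_LL;   /* a..z */
    switch (c) {
    case 0x01C4: return GC_LU;  /* LATIN CAPITAL LETTER DZ WITH CARON */
    case 0x01C5: return GC_LT;  /* LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON */
    case 0x01C6: return GC_LL;  /* LATIN SMALL LETTER DZ WITH CARON */
    case 0x05D0: return GC_LO;  /* HEBREW LETTER ALEF */
    default:     return GC_OTHER;
    }
}

/* Any letter (general category L*) counts as alphabetic. */
static bool sketch_isalpha(unsigned long c)
{
    return toy_category(c) != GC_OTHER;
}

/* Lower-case and title-case letters both count as "lower". */
static bool sketch_islower(unsigned long c)
{
    gen_cat g = toy_category(c);
    return g == GC_LL || g == GC_LT;
}

/* Upper-case and title-case letters both count as "upper". */
static bool sketch_isupper(unsigned long c)
{
    gen_cat g = toy_category(c);
    return g == GC_LU || g == GC_LT;
}
```

With this table, U+01C5 reports TRUE from both sketch_islower() and sketch_isupper(), while U+05D0 reports TRUE only from sketch_isalpha().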
For applications aware of the existence of "title-case" letters, the
is[w]upper(), is[w]lower(), and is[w]alpha() functions can be used in
combination to determine the exact "case type" of any letter:
    /* Needs <stdio.h> and <wctype.h>; c is a wint_t. */
    if (iswalpha(c))
    {
        if (iswupper(c) && iswlower(c))
        {
            printf("This is a title-case letter (Lt).\n");
        }
        else if (iswupper(c) && !iswlower(c))
        {
            printf("This is an upper-case letter (Lu).\n");
        }
        else if (!iswupper(c) && iswlower(c))
        {
            printf("This is a lower-case letter (Ll).\n");
        }
        else /* if (!iswupper(c) && !iswlower(c)) */
        {
            printf("This is a letter with no case distinctions (Lo or Lm).\n");
        }
    }
    else
    {
        printf("This is not a letter.\n");
    }
Unfortunately, there is no corresponding trick to obtain a "to-title-case"
functionality, apart from a non-portable construct such as:
    c1 = towctrans(c2, wctrans("Title-case"));
Anyway, converting to title case is something less fundamental than
upper/lower-casing, and it only makes sense at the string level.
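As a rough illustration of title-casing at the string level, here is a sketch that uses only <wctype.h>. The names sketch_totitle() and sketch_title_string() are made up for this example, and treating each run of letters as a "word" is a simplifying assumption. The "Title-case" mapping name is the non-portable one from above: the C standard only guarantees "toupper" and "tolower", so the code falls back to towupper() when wctrans() returns zero.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Title-case one character. "Title-case" is a locale-specific mapping name
   (an assumption, not guaranteed to exist); fall back to towupper(). */
static wint_t sketch_totitle(wint_t c)
{
    wctrans_t t = wctrans("Title-case");
    return (t != (wctrans_t)0) ? towctrans(c, t) : towupper(c);
}

/* Title-case a wide string in place: the first letter of each run of
   letters is title-cased, the remaining letters are lower-cased, and
   non-letters are left alone. */
static void sketch_title_string(wchar_t *s)
{
    bool in_word = false;

    for (; *s != L'\0'; ++s) {
        wint_t c = (wint_t)*s;

        if (iswalpha(c)) {
            *s = (wchar_t)(in_word ? towlower(c) : sketch_totitle(c));
            in_word = true;
        } else {
            in_word = false;
        }
    }
}
```

Called on a modifiable buffer such as wchar_t s[] = L"hello WORLD", the function rewrites the letters in place. It does not attempt language-specific rules or proper word-boundary detection.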
_ Marco