From: Mark Davis (mark.davis@jtcsv.com)
Date: Wed Apr 23 2003 - 20:01:54 EDT
I updated the page based on some feedback and comparisons against TR 14652,
plus some more delving into the POSIX standard.
http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/posix_classes.html
I took Mario's suggestion about titlecase letters, which the TR also
follows. I also expanded xdigit, since POSIX requires that it be a superset
of digit, and added Symbols to punct, since that is how the POSIX locale
handles them.
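
As a rough illustration of the xdigit and punct changes, here is a minimal
Python sketch. The concrete definitions are assumptions made for the example
and are not taken from the page: digit is modeled as general category Nd,
xdigit as digit plus the ASCII letters A-F and a-f, and punct as any
Punctuation (P*) or Symbol (S*) general category. The function names are
hypothetical.

```python
import unicodedata

# Assumption: the extra hex digits are just the ASCII letters A-F / a-f.
HEX_LETTERS = set("abcdefABCDEF")

def is_digit(ch):
    # Assumption: digit is modeled as general category Nd (decimal number).
    return unicodedata.category(ch) == "Nd"

def is_xdigit(ch):
    # Built as digit plus the hex letters, so it is a superset of digit
    # by construction, which is what POSIX requires.
    return is_digit(ch) or ch in HEX_LETTERS

def is_punct(ch):
    # Punctuation (P*) plus Symbols (S*), mirroring "added Symbols to punct".
    return unicodedata.category(ch)[0] in "PS"
```

With these definitions, a non-ASCII decimal digit such as U+0663 counts as
xdigit, and "+" (a math symbol) counts as punct.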
The main open issue is:
POSIX requires that alpha include upper and lower, but in Unicode, Alphabetic
does not include all of Lowercase and Uppercase. The differences are that
Uppercase includes U+24B6..U+24CF CIRCLED LATIN CAPITAL LETTER A..Z, and
Lowercase includes U+24D0..U+24E9 CIRCLED LATIN SMALL LETTER A..Z. None of
these are in Alphabetic. The three choices are:
1. Ignore the POSIX requirement.
2. Add the missing characters to alpha.
3. Subtract the excess characters from lower and upper, respectively.
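
The three choices can be sketched with plain sets. This is a toy model only:
the property sets below are stand-ins that contain just the ASCII letters plus
the two circled ranges named above, not the real Unicode property data, and
all names are hypothetical.

```python
# Code point ranges named in the message.
CIRCLED_CAPITALS = set(range(0x24B6, 0x24CF + 1))  # CIRCLED LATIN CAPITAL LETTER A..Z
CIRCLED_SMALLS = set(range(0x24D0, 0x24E9 + 1))    # CIRCLED LATIN SMALL LETTER A..Z

# Toy stand-ins for the Unicode properties (assumption: ASCII letters only,
# plus the circled letters in Uppercase/Lowercase but not in Alphabetic).
ASCII_UPPER = set(range(ord("A"), ord("Z") + 1))
ASCII_LOWER = set(range(ord("a"), ord("z") + 1))
alphabetic = ASCII_UPPER | ASCII_LOWER
uppercase = ASCII_UPPER | CIRCLED_CAPITALS
lowercase = ASCII_LOWER | CIRCLED_SMALLS

def meets_posix(alpha, upper, lower):
    # POSIX requirement: alpha must include both upper and lower.
    return upper <= alpha and lower <= alpha

# Choice 1: use the Unicode properties unchanged (requirement is violated).
choice1 = (alphabetic, uppercase, lowercase)

# Choice 2: add the missing characters to alpha.
choice2 = (alphabetic | uppercase | lowercase, uppercase, lowercase)

# Choice 3: subtract the excess characters from upper and lower.
choice3 = (alphabetic, uppercase & alphabetic, lowercase & alphabetic)
```

In this model, meets_posix holds for choices 2 and 3 but not for choice 1.
Choice 2 keeps the circled letters in upper and lower, while choice 3 drops
them from both classes.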
There are a couple of other issues listed on the page.
Märk Dāvĭs