alpha, print, graph, blank, etc.

From: Mark Davis (mark.davis@jtcsv.com)
Date: Mon Apr 21 2003 - 21:30:58 EDT


The POSIX/C-style property names (punct, alpha, lower, upper, digit, xdigit,
alnum, cntrl, graph, print, space, blank) are not well specified, and don't
really map well to the broader types of characters available in
Unicode/10646. For example, there is no provision for titlecase, nor for a
distinction between symbols and punctuation. Nor are these categories set up
to distinguish combining marks, or to reflect many of the other Unicode
properties.
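
To make the two examples above concrete, here is a small sketch in Python. It is only an illustration and not taken from the comparison document: it uses the standard unicodedata module to read the Unicode General_Category, and string.punctuation as a stand-in for what C's ispunct accepts in the "C" locale (that stand-in is my assumption). The helper name general_category is made up for this example.

```python
import string
import unicodedata


def general_category(ch: str) -> str:
    """Return the two-letter Unicode General_Category of one character."""
    return unicodedata.category(ch)


# U+01C5 LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON is a
# titlecase letter (Lt). It is neither upper nor lower, so a scheme that
# only offers "upper" and "lower" has no natural place for it.
titlecase_dz = "\u01C5"
print(general_category(titlecase_dz))
print(titlecase_dz.isupper(), titlecase_dz.islower(), titlecase_dz.istitle())

# "!", "+" and "$" all count as punctuation in the ASCII/POSIX sense
# (assumption: string.punctuation mirrors ispunct in the "C" locale),
# but Unicode separates them into punctuation (P*) and symbols (S*).
for ch in "!+$":
    print(repr(ch), ch in string.punctuation, general_category(ch))
```

The point is only that a single POSIX class such as punct can span several Unicode general categories, and that some Unicode categories have no POSIX class at all.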

However, many programs use the POSIX-style properties, so for compatibility
it is best to come up with a uniform set of recommendations for how they
should be interpreted in a Unicode context. This also relates to Java, since
many of the methods on Character ultimately derive from trying to match some
of the POSIX categories.

The following compares current Perl, ICU, Java, Windows, and the POSIX spec,
and tries to derive a recommendation for the best definition, given the way
people use the properties in practice. Note that these are only current
snapshots, since those environments may change their definitions, especially
as they upgrade beyond Unicode 3.x.

http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/posix_classes.html

Feedback is welcome.

Mark

