>POSIX doesn't include information about any specific encoding,
>UTF-8 or otherwise. . .
> Yes, yes, I know UTF-8 and Unicode/UCS are universal
>encodings, but from POSIX's point of view, that's irrelevant.
That's just what's wrong with POSIX from the perspective of an implementer
of the Unicode Standard.
If you want to write code set DEpendent software, POSIX
definitely won't give you any help. Its design philosophy is
completely different from that of Unicode-specific
software. There are pros and cons to each.
Unicode has well-defined character semantics that
are considered a property of the character itself and therefore not
locale-dependent. A shorthand notation to kick the standard library into supporting
these is indeed called for. . .
I like the fact that Unicode defines character semantics and
that it considers such semantics to be properties of the
character regardless of locale. IMO, there isn't that much
advantage to POSIX's ability to make character semantics
locale-dependent.
However, POSIX's defined behavior has been in place for a
long time, and is based at least in part on what users and
companies thought was the correct behavior. I remember arguing...
I mean, debating :-)...with non-i18n engineers five or six
years ago about what should be defined as "alpha" characters
in the en_US locale. The locale was built with ISO 8859-1, and
I thought "alpha" should include some or all of the characters
with diacritics in the Latin-1 repertoire. The reaction was
mass consternation and hysteria. No, everyone "knew" the
American locale only included English A-Z and a-z; code
depended on that behavior.
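
To make the "alpha" disagreement concrete, here is a minimal sketch in Python. It is not POSIX locale machinery: it assumes Python's standard `unicodedata` module as a stand-in for Unicode's locale-independent character properties, and the helper names `is_unicode_letter` and `is_ascii_letter` are hypothetical, chosen only for illustration. It lists the characters in the upper half of ISO 8859-1 that Unicode classifies as letters but that an "A-Z and a-z only" definition leaves out.

```python
import unicodedata


def is_unicode_letter(ch: str) -> bool:
    """True if Unicode's General_Category for ch is a letter (Lu, Ll, Lt, Lm, Lo).

    This is a property of the character itself; no locale is consulted.
    """
    return unicodedata.category(ch).startswith("L")


def is_ascii_letter(ch: str) -> bool:
    """True only for English A-Z and a-z, the narrow definition of "alpha"."""
    return ("A" <= ch <= "Z") or ("a" <= ch <= "z")


# Upper half of ISO 8859-1 (Latin-1), which maps directly to U+00A0..U+00FF.
latin1_upper = [chr(cp) for cp in range(0xA0, 0x100)]

# Characters Unicode treats as letters that the narrow definition excludes.
extra_letters = [
    ch for ch in latin1_upper
    if is_unicode_letter(ch) and not is_ascii_letter(ch)
]

if __name__ == "__main__":
    print("".join(extra_letters))
```

Under the Unicode view, characters such as "é" and "Ñ" are letters everywhere, while "×" and "÷" are not; under the narrow view, none of the upper-half characters count as "alpha" at all.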
Standards are asked to support different behavior and philosophies.
You want something that makes Unicode pre-eminent. This being the
Unicode mailing list, there probably are lots of others who agree.
But there are others out there who need/want to support other
encodings, and a code set independent design like POSIX meets
their needs.
Sandra Martin O'Donnell
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:36 EDT