Re: UTF-8, ISO C Am.1, and POSIX

Date: Tue Aug 12 1997 - 15:27:24 EDT

> We have in the ISO POSIX WG been through all POSIX standards to see
> what changes we should make to the standards to accommodate UCS.

   Markus Kuhn wrote:
   I guess, pretty much the only thing required in the POSIX standard for UTF-8
   is a standardized way to tell the locale mechanism that the character encoding
   used is UTF-8. UTF-8 is a little bit more than yet another character
   table, so there should be some locale flag or something like this that
   allows me to tell libc that UTF-8 is the encoding in use.

The original question was what changes, if any, are needed in
POSIX to accommodate UCS. There aren't any that I can think of,
if we assume an implementation is using UTF-8 as the multibyte
external code and UCS as an internal wide character format.
Given that, there's no reason POSIX needs a flag or anything
else to make it aware it's using UTF-8. POSIX is designed to
be code set independent.

   . . .
   What's the state of the standardization with regard to specifying in a
   locale that we use UTF-8? What does enUS.UTF-8 look like?

Different from what most other implementations are using. Using
the values in your example, most would write this as en_US.UTF-8.
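
As a rough sketch of how a program might select such a locale and ask
which code set it uses: the code below assumes the X/Open nl_langinfo()
interface and its CODESET item from <langinfo.h> are available. The
helper name locale_codeset is hypothetical, and which locale names are
installed is implementation-defined.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: switch LC_CTYPE to the named locale and return
   the name of its code set, or NULL if the locale is not available.
   The returned string may be overwritten by later locale calls. */
const char *locale_codeset(const char *name)
{
    assert(name != NULL);
    if (setlocale(LC_CTYPE, name) == NULL)
        return NULL;                /* locale not installed or not recognized */
    return nl_langinfo(CODESET);    /* code set name as the system reports it */
}
```

On a system that provides an en_US.UTF-8 locale, calling
locale_codeset("en_US.UTF-8") would be expected to report a UTF-8 code
set name; the exact spelling of that name is up to the implementation.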

   It might also be useful if POSIX would clarify how all the new
   ISO C Am. 1 functions for wide streams and multi-byte strings work in
   detail if we have selected the UTF-8 encoding in the locale. . .

POSIX doesn't include information about any specific encoding,
UTF-8 or otherwise. It is designed to work with a variety of
encodings, so it doesn't make sense for it to include specific
details of how it might work with a UTF-8-based locale any more
than it would make sense for it to include details of how it
might work with an ISO 8859-1-based locale or a Japanese EUC-based
locale. Yes, yes, I know UTF-8 and Unicode/UCS are universal
encodings, but from POSIX's point of view, that's irrelevant.
They're just encodings.
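
The same holds for the Am. 1 wide stream functions. As a minimal sketch
(the helper name count_stream_chars is hypothetical), the loop below
reads characters with fgetwc() and never needs to know which multibyte
encoding the stream carries; the conversion is governed by the locale.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: count the characters read from fp until end of
   file. fgetwc() converts the stream's multibyte input to wide
   characters according to the locale, whatever its encoding.
   Returns -1 if a read or conversion error occurred. */
long count_stream_chars(FILE *fp)
{
    long count = 0;

    assert(fp != NULL);
    while (fgetwc(fp) != WEOF)
        count++;
    return ferror(fp) ? -1L : count;
}
```

The stream passed in should not already be byte-oriented, since mixing
byte and wide operations on one stream is not supported by Am. 1.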

Sandra Martin O'Donnell
