From: Deborah Goldsmith (goldsmit@apple.com)
Date: Mon Nov 08 2004 - 20:57:04 CST
I think he's saying he wants to convert to NFC *from* Mac OS X data, in
which case the fact that Mac OS X's file system normalization is not
strict NFD doesn't really matter. Also, he says he's running on
Solaris, which would make it a tad difficult to call a Mac OS X API.
ICU should do the trick.
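
Why the partial decomposition doesn't matter can be sketched roughly as follows. This uses Python's standard unicodedata module only as a self-contained stand-in for ICU, and the sample strings are made up for illustration. Canonically equivalent inputs all normalize to the same NFC string, however fully they were decomposed to begin with.

```python
import unicodedata

# Three canonically equivalent spellings of the same text.
fully_composed = "\u00c5ngstr\u00f6m"
fully_decomposed = unicodedata.normalize("NFD", fully_composed)
# Partially decomposed: only the first letter uses a combining mark.
mixed = "A\u030angstr\u00f6m"

# Normalizing each spelling to NFC collapses them to a single string.
nfc_results = {
    unicodedata.normalize("NFC", s)
    for s in (fully_composed, fully_decomposed, mixed)
}
```

So a converter only needs a conforming NFC implementation; it does not need to know exactly which characters the source system chose to decompose.
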
It's worth pointing out that there is no such thing as "precomposed
Unicode". Normalization form C (NFC) could be called "as precomposed as
possible": some Unicode sequences have no precomposed equivalent and can
only be expressed using combining marks.
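
A minimal sketch of "as precomposed as possible", again assuming Python's standard unicodedata module rather than ICU; the helper name to_nfc_utf8 is hypothetical. It converts a UTF-8 byte stream to NFC and shows one sequence that composes and one that cannot.

```python
import unicodedata

def to_nfc_utf8(data: bytes) -> bytes:
    """Decode UTF-8 bytes, normalize to NFC, and re-encode as UTF-8."""
    text = data.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("utf-8")

# "e" + COMBINING ACUTE ACCENT composes to the single code point U+00E9.
composed = to_nfc_utf8("e\u0301".encode("utf-8"))

# "q" + COMBINING ACUTE ACCENT has no precomposed form, so NFC leaves
# the combining mark in place.
uncomposed = to_nfc_utf8("q\u0301".encode("utf-8"))
```

Code that consumes the NFC output therefore still has to cope with combining marks.
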
Deborah Goldsmith
Internationalization, Unicode liaison
Apple Computer, Inc.
goldsmit@apple.com
On Nov 8, 2004, at 5:17 PM, Markus Scherer wrote:
> Tay, William wrote:
>> Is there any C library available that converts the decomposed UTF-8 byte
>> streams into the pre-composed equivalent?
>
> MacOS X does decompose filenames, but it does not use standard Unicode
> normalization (because it was designed before Unicode's normalization was
> finalized). I suggest you search the mailing list archive for this list for
> more details. You probably need to use a MacOS system function.
>
> ICU has options for normalization (some defined with internal constants
> only) which may or may not match, or get close to, MacOS filename
> normalization:
> http://oss.software.ibm.com/cgi-bin/icu/nbrowser
>
> markus
>