Re: UTF-8 on NT

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Sep 04 2001 - 16:24:51 EDT


No. On Windows NT/2000/XP/CE, everything is UTF-16 Unicode, for all locales.
Locales and codepages are separate, as they should be.

You should compile your programs with UNICODE and _UNICODE defined to use the native Unicode kernel functions.

As far as I know, UTF-8 cannot be used as the char* system encoding, but you can convert to and from it using codepage number 65001. (Before Windows 2000, this conversion had a bug for Unicode code points >=0x10000, but those have only recently come into real use.)
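
To illustrate what that conversion has to do, here is a minimal portable sketch in C of UTF-8 to UTF-16 conversion, including the surrogate-pair step for code points >=0x10000. This is not the Windows implementation; on Windows you would call MultiByteToWideChar with CP_UTF8 (65001) instead. The function names utf8_decode and utf8_to_utf16 are made up for this example, and the error handling (reject malformed input, return (size_t)-1) is just one possible choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (at most n bytes).
   Stores the code point in *cp and returns the number of bytes consumed,
   or 0 if the input is malformed, truncated, overlong, or a surrogate. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;

    if (n < len) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *cp = c;
    return len;
}

/* Hypothetical helper: convert src_len bytes of UTF-8 into UTF-16 code units.
   Returns the number of units written to dst, or (size_t)-1 if the input
   is malformed or dst (capacity dst_cap units) is too small. */
size_t utf8_to_utf16(const unsigned char *src, size_t src_len,
                     uint16_t *dst, size_t dst_cap)
{
    size_t in = 0, out = 0;

    while (in < src_len) {
        uint32_t cp;
        size_t used = utf8_decode(src + in, src_len - in, &cp);
        if (used == 0) return (size_t)-1;
        in += used;

        if (cp < 0x10000) {
            /* BMP code point: one UTF-16 unit */
            if (out + 1 > dst_cap) return (size_t)-1;
            dst[out++] = (uint16_t)cp;
        } else {
            /* Supplementary code point: high and low surrogate */
            if (out + 2 > dst_cap) return (size_t)-1;
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return out;
}
```

For example, the UTF-8 bytes F0 90 90 80 (U+10400) come out as the two UTF-16 units D801 DC00. That supplementary range is the case the pre-Windows 2000 bug mentioned above applied to.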

Please see MSDN and the Microsoft documentation for details.

If you want to use the same functions on Windows, Linux, etc., then you need a cross-platform Unicode library.
See http://www.unicode.org/unicode/onlinedat/products.html#3
and http://oss.software.ibm.com/icu/

Best regards,
markus


