From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Oct 27 2005 - 11:43:58 CST
From: "Peter Constable" <petercon@microsoft.com>
> One interesting quirk: in Windows, UTF-8 (code page 65001) and UTF-7
> (code page 65000) can be considered "ANSI" code pages.
Interesting. It's true that UTF-8 and UTF-7 are compatible with the Win32 A
interface, because both encode strings as sequences of 8-bit CHAR units.
But is there now a locale supported by Windows in which the "ANSI" codepage
is 65001 (UTF-8)? I doubt that any locale uses codepage 65000, because it
would break compatibility with lots of Win32 APIs: it cannot be used to
pass 7-bit ASCII-encoded strings unchanged.
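The point about codepage 65000 can be checked with a short sketch. It
assumes that Python's "utf-8" and "utf-7" codecs behave like Windows
codepages 65001 and 65000 for this input (an assumption, not verified
against Windows itself); the helper name passes_ascii_unchanged is made up
for the illustration.

```python
def passes_ascii_unchanged(text: str, codec: str) -> bool:
    """Return True if encoding `text` with `codec` yields the plain ASCII bytes."""
    return text.encode(codec) == text.encode("ascii")


# A plain 7-bit ASCII string containing "+", the UTF-7 shift character.
sample = "1+1"

# Assumption: Python's codecs stand in for code pages 65001 and 65000.
utf8_ok = passes_ascii_unchanged(sample, "utf-8")
utf7_ok = passes_ascii_unchanged(sample, "utf-7")

print(sample.encode("utf-7"))
print(utf8_ok, utf7_ok)
```

UTF-8 leaves the ASCII bytes as they are, while UTF-7 has to escape the
"+" sign, so an API that receives a raw ASCII string and interprets it as
UTF-7 would not see the same text.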
Note that the "ANSI" codepage (ACP) cannot be changed in Windows: it is
fixed for the built localization of the system. This is not the case for
the "OEM" codepage (OCP), which can be used in the Windows console and is
also used in the basic FAT filesystem (and by the VFAT extension found in
"FAT32", which adds support for UTF-16LE, long filenames, and other
extensions to the format and size of the allocation map and of clusters).
There is also the "Boot" OEM codepage, which may differ from the OCP and
cannot be changed either (the Boot codepage is typically distinct from the
default OEM codepage on Asian versions of Windows, and is used in kernel
drivers and for communication with the kernel debug console).
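To see why it matters which of these codepages applies, here is a minimal
sketch that decodes the same 8-bit byte under two different codepages.
Codepages 1252 and 437 are picked only as a typical Western "ANSI"/"OEM"
pair; this pairing is an assumption for illustration, and Python's
"cp1252" and "cp437" codecs stand in for the Windows tables.

```python
# The same byte maps to different characters under the "ANSI" and "OEM"
# code pages. Assumption: cp1252 / cp437 as an illustrative ACP / OCP pair.
raw = b"\xe9"

as_ansi = raw.decode("cp1252")  # LATIN SMALL LETTER E WITH ACUTE
as_oem = raw.decode("cp437")    # GREEK CAPITAL LETTER THETA

print(f"ACP view: U+{ord(as_ansi):04X}, OCP view: U+{ord(as_oem):04X}")
```

A filename written through one codepage and read back through the other
would therefore change whenever it contains bytes above 0x7F.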
Anyway, I also think that 65001 cannot be used safely as the OEMCP for the
filesystem, only because the backslash character "\" cannot be correctly
encoded the way it is recognized by the FAT filesystem. The same would
apply to the Win32-A registry API, which also requires this character
(unless all these Win32-A APIs are always implemented by a prior string
conversion from the current OCP to UTF-16 before calling the corresponding
Win32-U API). It could still be used for display in a console window, but
the DOS and BIOS emulation layer will not work correctly with this
codepage, and this would severely affect the input of characters from the
keyboard, for example in the "EDIT.COM" program.
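The conversion mentioned in the parenthesis (decode the 8-bit string from
the current codepage, then hand UTF-16 to the Win32-U API) can be sketched
as follows. This is only a model of the idea in Python, not the actual
Windows implementation; narrow_to_wide is a hypothetical helper, and
Python's "utf-8" codec is assumed to stand in for codepage 65001.

```python
def narrow_to_wide(data: bytes, codepage: str) -> bytes:
    """Decode an 8-bit string from `codepage` and re-encode it as UTF-16LE."""
    return data.decode(codepage).encode("utf-16-le")


# A registry-style path with one non-ASCII character, as UTF-8 bytes.
# Assumption: "utf-8" stands in for code page 65001.
text = "Software\\Caf\u00e9"
narrow = text.encode("utf-8")
wide = narrow_to_wide(narrow, "utf-8")

# Number of 8-bit units before and 16-bit units after the conversion.
print(len(narrow), len(wide) // 2)
```

If the A interface always works this way, the path separator is only
examined after conversion, as a UTF-16 code unit, and the 8-bit encoding
of the input no longer matters to the code that parses the path.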