From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Nov 15 2003 - 11:45:15 EST
From: "Michael (michka) Kaplan" <michka@trigeminal.com>
> If you install the "Chinese (Traditional) - Unicode" IME as an input method,
> then any program that is prepared to accept Unicode input will handle the
> input of this interesting IME that is expecting UTF-16 code units. Although
> obviously intended for CJK, it can be used for any UTF-16 code point.
In Windows, there are two sets of APIs: the legacy ANSI or OEM Win32 APIs,
which support the platform's native multibyte character set, and the more
recent Unicode APIs, introduced with the NT kernels and only partly supported
by Windows 95/98/98SE/ME.
I was told that Chinese applications run with an "ANSI" code page that is at
least similar to GBK, or GB18030 (in recent versions of Windows, after
2000), with MBCS support in both cases, and that Unicode is supported only
through the UTF-16 APIs, with a built-in conversion table to transcode
between GB* and UTF-16.
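
To make the transcoding step concrete, here is a minimal sketch in Python (not Windows code). It assumes that Python's "gbk" codec is a reasonable stand-in for the Windows conversion table; the two are not guaranteed to be byte-identical for every character. The function name transcode is mine, chosen for illustration. On Windows itself the corresponding conversions are done by MultiByteToWideChar and WideCharToMultiByte.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode `data` from the `source` encoding and re-encode it as `target`."""
    return data.decode(source).encode(target)


# Two common Han characters as an "ANSI" (MBCS) application would store them.
# Assumption: Python's "gbk" codec stands in for the Windows code page.
gbk_bytes = "\u4e2d\u6587".encode("gbk")

# The same text as a Unicode (UTF-16) application would store it.
utf16_bytes = transcode(gbk_bytes, "gbk", "utf-16-le")

# Converting back recovers the original multibyte form.
round_trip = transcode(utf16_bytes, "utf-16-le", "gbk")

print(gbk_bytes.hex(" "))    # two bytes per character in GBK
print(utf16_bytes.hex(" "))  # one 16-bit code unit per character here
```

Both forms happen to use two bytes per character for this text, but the byte values are unrelated, which is why a table lookup is needed rather than a simple arithmetic mapping.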
For East-Asian systems, the ANSI and OEM codepages are identical (unlike
European systems, where there's a distinction between the OEM codepage used in
console apps and the ANSI codepage used in GUI apps).
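
A small Python sketch of that ANSI/OEM distinction follows. The specific code page numbers are my assumptions for illustration, not something stated above: 1252 (ANSI) and 850 (OEM) for a Western European system, and 936 for both on a Simplified Chinese system. Python's codecs are used here as stand-ins for the Windows tables.

```python
from typing import Tuple


def encode_both(text: str, ansi: str, oem: str) -> Tuple[bytes, bytes]:
    """Encode `text` with the ANSI code page and with the OEM code page."""
    return text.encode(ansi), text.encode(oem)


# Assumed Western European pairing: the same letter gets different bytes.
west_ansi, west_oem = encode_both("\u00e9", "cp1252", "cp850")

# Assumed Simplified Chinese pairing: one code page serves both roles.
cn_ansi, cn_oem = encode_both("\u4e2d", "cp936", "cp936")

# The reverse problem: one byte read under the two Western code pages.
raw = b"\xe9"
seen_by_gui = raw.decode("cp1252")      # what a GUI (ANSI) app would show
seen_by_console = raw.decode("cp850")   # what a console (OEM) app would show

print(west_ansi != west_oem)  # the Western encodings disagree
print(cn_ansi == cn_oem)      # the East-Asian encodings agree
```

This is the reason text written by a GUI program can look wrong in a console window on a European system, while the same round trip is harmless on an East-Asian one.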
So, depending on the API on which the application is built (ANSI/OEM with
_MBCS, or _UNICODE), the input capabilities of programs differ. This also
affects areas like filesystem naming capabilities (limited in FAT12 for
floppies and FAT16 for Windows 3.x and NT4, extended with Unicode on FAT32
in Windows 9x/ME and NTFS for NT4/2K/XP/2003). Windows in fact provides
two simultaneous input systems: the OEM charset and encoding in console
apps, and the ANSI charset for GUI apps handling WM_CHAR events. But where is
the input system for Unicode code points?
Basically, there does not seem to be any such input system; instead, there is
support for Unicode between IMEs and GUI components like the RTF input box.
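
The difference between a Unicode code point and the UTF-16 code units that the quoted message says the IME works with can be sketched in Python as follows. The function names are hypothetical and only illustrate the arithmetic; nothing here calls a Windows or IME API.

```python
from typing import List, Tuple


def utf16_code_units(text: str) -> List[int]:
    """Return the UTF-16 code units that encode `text`, as integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# A BMP character is a single code unit equal to its code point.
bmp_units = utf16_code_units("\u4e2d")

# A supplementary character (here U+20000) needs two code units.
supplementary_units = utf16_code_units("\U00020000")

print([hex(u) for u in bmp_units])
print([hex(u) for u in supplementary_units])
```

So an input method that delivers 16-bit units has to deliver two of them, in order, for any code point beyond the BMP.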
For output, the solution is no simpler: components display UTF-16 strings
through the _UNICODE APIs, and native ANSI or OEM encodings through
the non-Unicode Win32 APIs.
Aren't you oversimplifying the question by considering only the most modern
versions of Windows? I don't think that these versions have deprecated the
ANSI charset; at the least, it is needed on P.R. China systems to support
GB18030 (or one of its subsets, like GBK or the legacy Microsoft codepage for
simplified Chinese).
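
The subset relationship between GBK and GB18030 can be checked with a short Python sketch. It relies on Python's own "gbk" and "gb18030" codecs, which I am assuming behave like the Windows code pages for the characters shown; the helper name encodable is mine.

```python
def encodable(text: str, codec: str) -> bool:
    """Report whether `text` can be represented in the given encoding."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


common = "\u4e2d"        # a character present in GBK
beyond_bmp = "\U00020000"  # a supplementary-plane Han character

# For a character GBK already covers, GB18030 produces the same bytes.
same_bytes = common.encode("gbk") == common.encode("gb18030")

# A supplementary character is out of reach for GBK but not for GB18030,
# which represents it as a four-byte sequence.
in_gbk = encodable(beyond_bmp, "gbk")
in_gb18030 = encodable(beyond_bmp, "gb18030")
four_byte_form = beyond_bmp.encode("gb18030")

print(same_bytes, in_gbk, in_gb18030, len(four_byte_form))
```

An application limited to the GBK-like code page therefore cannot hold such a character at all, whatever the input method delivers.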
This archive was generated by hypermail 2.1.5 : Sat Nov 15 2003 - 12:29:15 EST