I coded a very simple test of the level of support for
wide character strings in a C/C++ compiler. Unfortunately,
it shows that there are still compilers out there that
have trouble with even entry-level Unicode functionality.
wchrtest.cpp:
//============================
#include <stdio.h>
#include <stddef.h>

const wchar_t *foo = L"\x4e00 is jit7 in Hakka";

int main(void)
{
    int i = 0;
    while (foo[i])
    {
        /* Cast explicitly: wchar_t need not match what %X and %c expect. */
        printf("%04X ", (unsigned)foo[i]);
        if (foo[i] < 128)
            printf("%c", (int)foo[i]);
        else if (foo[i] >= 0x4e00 && foo[i] <= 0x9fff)
            printf("<Han character>");
        printf("\n");
        i++;
    }
    return 0;
}
//======================= END
Compiling it with the Borland 5.01 compiler under Win 95 gives:
wchar_t *foo = L"\x4e00 is jit7 in Hakka";
(error) Numeric constant too large
(warning) Hexadecimal value contains more than 3 digits
The program compiles and executes just fine under
Watcom 10.6.
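
For a compiler that rejects the \x4e00 escape, one possible workaround is to spell the string out as an array of numeric values, so the compiler never has to parse a long hex escape. This is only a sketch and has not been tried against Borland 5.01. It helps only if wchar_t on that compiler really is at least 16 bits wide. The names foo_alt, wide_length and check_foo_alt are made up for illustration and are not part of wchrtest.cpp.

```c
#include <assert.h>
#include <stddef.h>

/* Same text as foo in wchrtest.cpp, but U+4E00 is written as a plain
   integer constant instead of a \x escape inside a string literal.
   foo_alt is a hypothetical name used only in this sketch. */
static const wchar_t foo_alt[] = {
    0x4E00, L' ', L'i', L's', L' ', L'j', L'i', L't', L'7', L' ',
    L'i', L'n', L' ', L'H', L'a', L'k', L'k', L'a', 0
};

/* Number of wide characters before the terminating zero. */
static size_t wide_length(const wchar_t *s)
{
    size_t n = 0;
    while (s[n])
        n++;
    return n;
}

/* Aborts if the first element was truncated (as it would be with an
   8-bit wchar_t) or if the array does not hold the 18 characters. */
static void check_foo_alt(void)
{
    assert(foo_alt[0] == 0x4E00);
    assert(wide_length(foo_alt) == 18);
}
```

Calling check_foo_alt() at the start of main and then looping over foo_alt instead of foo should give the same 18 lines of output, assuming the compiler accepts wide character constants such as L'i'.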
Next, I'll be trying it under Microsoft C++ 4.0.
Does anyone want to try other compilers and platforms? If the program
compiles, try running the executable. The output should look
like this:
===========================
4E00 <Han character>
0020
0069 i
0073 s
0020
006A j
0069 i
0074 t
0037 7
0020
0069 i
006E n
0020
0048 H
0061 a
006B k
006B k
0061 a
=================================================
It has to both compile and execute correctly, which requires wchar_t
to be at least 16 bits wide!
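
The 16-bit requirement can also be checked directly by asking the compiler how wide its wchar_t is. The following is a minimal sketch; wchar_bits and require_16bit_wchar are hypothetical helper names, not something from wchrtest.cpp.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width of wchar_t in bits, as this compiler defines it. */
static unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Aborts early if wchar_t is too narrow to hold a value like 0x4E00.
   Note that assert is disabled when NDEBUG is defined. */
static void require_16bit_wchar(void)
{
    assert(wchar_bits() >= 16);
}
```

Passing this check says only that wchar_t is wide enough. It does not show that the compiler parses the \x4e00 escape correctly, so the full test program is still worth running.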
Let's get these compilers going for the future!