From: Steve Summit (scs@eskimo.com)
Date: Wed Sep 20 2006 - 21:15:16 CDT
William Poser wrote:
> I'm confused as to the sense in which C and C++
> "don't support the Unicode character model". It is
> very easy to manipulate objects of type wchar_t,
> arrays thereof, linked lists thereof, and so forth.
Indeed (or, as others have pointed out, to manipulate objects
of type int16_t or int32_t if you want that extra degree of
explicitness).
What Standard C doesn't give you (I don't know as much about C++)
is the full-featured set of Unicode-compatible library routines
you might expect to have provided for you up-front. Yes, there
are wcstombs and mbstowcs, but you can't be sure they convert to
and from UTF-8. Yes, there are iswupper and towlower and the
others in <wctype.h>, but you can't be sure they'll exactly
implement the relevant Unicode character classes. And so on.
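To make the conversion point concrete, here is a minimal sketch of a
wrapper around mbstowcs. The helper name to_wide is hypothetical (it is
not part of any standard), and the key assumption is that the encoding
mbstowcs expects is whatever the current LC_CTYPE locale says it is,
which need not be UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string to wide characters
 * using the encoding of the current LC_CTYPE locale (whatever that is).
 * Returns the number of wide characters stored, not counting the
 * terminator, or (size_t)-1 if src is not valid in that encoding. */
size_t to_wide(const char *src, wchar_t *dst, size_t dst_len)
{
    assert(src != NULL && dst != NULL && dst_len > 0);

    /* Leave room for the terminator: mbstowcs does not write one
     * if it fills all the slots it was given. */
    size_t n = mbstowcs(dst, src, dst_len - 1);
    if (n == (size_t)-1)
        return n;          /* invalid sequence for this locale */

    dst[n] = L'\0';
    return n;
}
```

Whether the two bytes "\xC3\xA9" (UTF-8 for U+00E9) come back as one
wide character, as two, or as an error depends on which locale was
selected with setlocale, and whether a UTF-8 locale is even installed
is up to the system.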
Of course, you can always either roll your own routines, or use a
third-party library like ICU, so C's lack of "built-in" support
may not be a serious problem for you in practice. (Or it might be.)
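As a rough idea of what "rolling your own" might look like, here is a
sketch of a decoder for a single UTF-8 sequence. The name utf8_decode
and its interface are my own invention for illustration, and it is only
a sketch under the usual UTF-8 rules (no overlong forms, no surrogates,
nothing above U+10FFFF), not a replacement for a tested library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (at most len
 * bytes) into *cp. Returns the number of bytes consumed, or 0 if the
 * input is empty, truncated, or malformed. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;

    uint32_t c = s[0];
    size_t need;        /* continuation bytes expected after the lead */
    uint32_t min_cp;    /* smallest code point valid at this length */

    if (c < 0x80) {                    /* single-byte (ASCII) */
        *cp = c;
        return 1;
    } else if ((c & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte sequence */
        c &= 0x1F; need = 1; min_cp = 0x80;
    } else if ((c & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte sequence */
        c &= 0x0F; need = 2; min_cp = 0x800;
    } else if ((c & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte sequence */
        c &= 0x07; need = 3; min_cp = 0x10000;
    } else {
        return 0;                      /* stray continuation or bad lead */
    }

    if (len < need + 1)
        return 0;                      /* truncated sequence */

    for (size_t i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                  /* not a 10xxxxxx continuation */
        c = (c << 6) | (s[i] & 0x3F);
    }

    if (c < min_cp)                    /* overlong encoding */
        return 0;
    if (c > 0x10FFFF)                  /* beyond the Unicode range */
        return 0;
    if (c >= 0xD800 && c <= 0xDFFF)    /* surrogate code point */
        return 0;

    *cp = c;
    return need + 1;
}
```

Calling it on the bytes 0xC3 0xA9 stores 0xE9 in *cp and returns 2.
Everything beyond decoding (case mapping, character classes,
normalization) is where a library like ICU starts to earn its keep.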