Conversion of DBCS / MBCS characters to UTF8

From: Doug Ewell (dewell@compuserve.com)
Date: Wed Jan 19 2000 - 10:06:29 EST


Kedar Moghe <kmoghe@quark.com.sg> wrote (apparently to individual list
members rather than to the list itself):

> Can you help me to get the direct mapping between Two Byte and UTF8
> so that we can develop our own function or otherwise any free source/
> library which can be used for commercial purpose.

The official mapping tables between legacy byte-oriented character sets
(including DBCS) and Unicode are available at ftp.unicode.org. This
Question is Asked quite Frequently on this list, and it probably should
be included in the FAQ on the Unicode web site.
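
For anyone wanting to load such a table, here is a minimal sketch in
Python. It assumes the common layout of those files: one mapping per
line, with the legacy code in hex, then the Unicode code point in hex,
then a "#" comment. The function name parse_mapping_table and the sample
rows are illustrative only and are not copied from any particular table,
so check the header of the file you actually download.

```python
def parse_mapping_table(lines):
    """Build {legacy_code: unicode_code_point} from mapping-table lines.

    Assumed line format (an assumption, verify against the real file):
        0xXXXX <whitespace> 0xYYYY <whitespace> # comment
    """
    table = {}
    for line in lines:
        # Discard the trailing comment, then surrounding whitespace.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue  # blank line or comment-only line
        fields = data.split()
        if len(fields) < 2:
            continue  # entry with no Unicode value (e.g. a lead byte)
        try:
            legacy = int(fields[0], 16)      # int() accepts the 0x prefix
            code_point = int(fields[1], 16)
        except ValueError:
            continue  # not a hex pair; skip rather than guess
        table[legacy] = code_point
    return table


# Illustrative rows only, written in the assumed format.
SAMPLE = [
    "# legacy code, Unicode code point, name",
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A",
    "0x8140\t0x3000\t# IDEOGRAPHIC SPACE",
    "0x81\t\t# DBCS LEAD BYTE",
]

table = parse_mapping_table(SAMPLE)
```

The result is a plain dictionary from legacy code to Unicode code point,
which is what a table-driven DBCS-to-Unicode step would look up.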

The tables map to Unicode code points, not to UTF-8 directly. Conversion
from 16-bit Unicode (UCS-2 or UTF-16) to UTF-8 is a simple algorithm
that needs only a few shifts and masks per character, so it is very
efficient. You are unlikely to encounter significant performance
problems by treating this conversion as a separate step, although you
can certainly convert directly from DBCS to UTF-8 if you write your own
function to do so.
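
To show how little work the 16-bit-to-UTF-8 step involves, here is a
minimal sketch in Python. The function name utf16_to_utf8 and the choice
to raise ValueError on an unpaired surrogate are assumptions made for
illustration. Input is a sequence of 16-bit code units as integers.

```python
def utf16_to_utf8(units):
    """Encode a sequence of 16-bit code units (ints) as UTF-8 bytes."""
    out = bytearray()
    i = 0
    while i < len(units):
        cp = units[i]
        i += 1
        if 0xD800 <= cp <= 0xDBFF:
            # High surrogate: combine with the following low surrogate.
            if i >= len(units) or not (0xDC00 <= units[i] <= 0xDFFF):
                raise ValueError("unpaired high surrogate")
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        elif 0xDC00 <= cp <= 0xDFFF:
            raise ValueError("unpaired low surrogate")

        if cp < 0x80:
            # 1 byte: 0xxxxxxx
            out.append(cp)
        elif cp < 0x800:
            # 2 bytes: 110xxxxx 10xxxxxx
            out.append(0xC0 | (cp >> 6))
            out.append(0x80 | (cp & 0x3F))
        elif cp < 0x10000:
            # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.append(0xE0 | (cp >> 12))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
        else:
            # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.append(0xF0 | (cp >> 18))
            out.append(0x80 | ((cp >> 12) & 0x3F))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
    return bytes(out)


# "A", U+3000 IDEOGRAPHIC SPACE, and one surrogate pair (U+1F600).
encoded = utf16_to_utf8([0x0041, 0x3000, 0xD83D, 0xDE00])
```

A direct DBCS-to-UTF-8 function would amount to a table lookup followed
by the same branch on the size of the code point.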

-Doug Ewell
 Fullerton, California


