Eric Ray wrote:
> 1. The library does not really evaluate the Japanese characters
> to make logical decisions. We believe base64 encode the
> character array to avoid any "bad things happening in the code"
> (such as hitting a null value or other values that could
> potential cause problems).
Hint: consider revising your project in light of the fact that both
Unicode (ISO 10646) and the Japanese character set (JIS X 0208) have
ASCII-compatible "multibyte" formats.
Unicode's ASCII-compatible format is called UTF-8. The most popular JIS
ASCII-compatible format is called EUC.
ASCII-compatible means that all bytes in the ASCII range (0-127) are only
used for ASCII characters. So, among other things, no "bad things" happen
with null terminators or control characters.
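To make this concrete, here is a minimal Python sketch of that property. The sample string and the helper name is_ascii_safe are my own illustrations, not anything from your library, and I am assuming Python's standard "utf-8", "euc_jp" and "utf-16-le" codecs. The check is simply that ASCII characters encode to their own single byte and that no other character produces a byte below 0x80.

```python
def is_ascii_safe(text: str, codec: str) -> bool:
    """Check the 'ASCII-compatible' property for the characters in text.

    Hypothetical helper for illustration: returns True when every ASCII
    character encodes to its own single byte and every non-ASCII
    character encodes to bytes that are all >= 0x80.
    """
    for ch in text:
        encoded = ch.encode(codec)
        if ord(ch) < 0x80:
            # An ASCII character must stay exactly one unchanged byte.
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 0x80 for b in encoded):
            # A non-ASCII character must never reuse an ASCII-range byte.
            return False
    return True


# Illustrative sample: ASCII text mixed with Japanese characters.
sample = "abc 日本語"

for codec in ("utf-8", "euc_jp", "utf-16-le"):
    data = sample.encode(codec)
    print(codec, data.hex(" "), "ASCII-safe:", is_ascii_safe(sample, codec))
```

UTF-16 is included only as a contrast: there even plain ASCII letters are followed by a zero byte, which is exactly the kind of "bad thing" a null-terminated string routine trips over.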
For UTF-8, see Unicode's FAQ
<http://www.unicode.org/unicode/faq/utf_bom.html> or read the historical RFC
which proposed it <http://www.faqs.org/rfcs/rfc2279.html>.
BTW, base64 was also the basis of an obsolete Unicode format called UTF-7.
Searching for UTF-7 on the web, you'll find some information and lots of bitter
comments about why this approach is obsolete.
_ Marco