Re: EUC-UTF8 is possible!

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Mar 17 2007 - 15:15:22 CST

Next message: Alexej Kryukov: "Re: Vista Fonts"

Previous message: Dan Kogai: "EUC-UTF8 is possible!"
In reply to: Dan Kogai: "EUC-UTF8 is possible!"
Next in thread: Rick McGowan: "Re: EUC-UTF8 is possible!"

Dan Kogai <dankogai at dan dot co dot jp> wrote:

> I am really surprised to find that EUC and UTF-8 can be mashed up
> easily.
>
> The secret is \xFF. This byte NEVER appears in EUC or UTF-8. So you
> can define the combo character as follows:
>
> EUC_UTF8_CHAR = EUC_CHAR | \xFF + UTF8_CHAR
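
For readers who want to see what the quoted grammar implies in practice, here is a minimal Python sketch. It assumes "EUC" means EUC-JP (the proposal does not say which EUC) and uses Python's built-in "euc_jp" codec. The names encode_euc_utf8, decode_euc_utf8, euc_jp_length and utf8_length are hypothetical, made up for this illustration; they are not from the proposal or from any library.

```python
def utf8_length(lead: int) -> int:
    """Byte length of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#04x}")


def euc_jp_length(lead: int) -> int:
    """Byte length of an EUC-JP character, judged from its lead byte."""
    if lead < 0x80:
        return 1              # ASCII range
    if lead == 0x8E:
        return 2              # single shift 2: half-width katakana
    if lead == 0x8F:
        return 3              # single shift 3: JIS X 0212
    if 0xA1 <= lead <= 0xFE:
        return 2              # JIS X 0208
    raise ValueError(f"invalid EUC-JP lead byte: {lead:#04x}")


def encode_euc_utf8(text: str) -> bytes:
    """Emit EUC-JP where possible, otherwise 0xFF followed by UTF-8."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("euc_jp")
        except UnicodeEncodeError:
            out += b"\xff" + ch.encode("utf-8")
    return bytes(out)


def decode_euc_utf8(data: bytes) -> str:
    """Decode bytes laid out as EUC_CHAR | 0xFF + UTF8_CHAR."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == 0xFF:
            # Escape byte: the character that follows is UTF-8.
            i += 1
            if i >= len(data):
                raise ValueError("truncated after 0xFF escape")
            n = utf8_length(data[i])
            out.append(data[i:i + n].decode("utf-8"))
        else:
            n = euc_jp_length(data[i])
            out.append(data[i:i + n].decode("euc_jp"))
        i += n
    return "".join(out)
```

A round trip such as decode_euc_utf8(encode_euc_utf8(text)) should return the original text, but only this pair of functions understands the result; any other tool sees bytes that are neither valid EUC-JP nor valid UTF-8.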

No no no no. Please don't do this. Nobody else will implement it and
you will be effectively limited to using it internally within your own
programs.

Just use UTF-8, or if saving bytes is that important to you, use SCSU or
a general-purpose compression technique. See UTN #14 for more on
Unicode text compression.
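
As a rough illustration of the byte-count argument, the following Python sketch compares UTF-8, EUC-JP and zlib-compressed UTF-8 for one sample string. The sample text is an assumption chosen for the example, and zlib merely stands in for "a general-purpose compression technique"; SCSU is not shown here.

```python
import zlib

# Assumed sample: kanji, hiragana and katakana only, repeated so that
# the compressor has something to work with.
sample = "日本語のテキスト" * 50

utf8_bytes = sample.encode("utf-8")      # 3 bytes per character here
euc_bytes = sample.encode("euc_jp")      # 2 bytes per character here
deflated = zlib.compress(utf8_bytes)     # general-purpose compression

print("UTF-8:       ", len(utf8_bytes))
print("EUC-JP:      ", len(euc_bytes))
print("zlib(UTF-8): ", len(deflated))
```

Heavily repeated text like this flatters the compressor, so the numbers are not representative of real documents; the point is only that compressing plain UTF-8 is an alternative to inventing a hybrid encoding.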

As someone who has created a number of alternative encoding schemes, I
assure you that a scheme that "looks like" EUC or "looks like" UTF-8
will cause you much more trouble than a completely new scheme that can't
be confused for anything else.
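
One way to see the risk of look-alike schemes is that the same bytes can already be valid in both encodings. The sketch below, which assumes EUC-JP as the EUC in question and uses Python's built-in codecs, decodes one two-byte sequence both ways.

```python
# The same two bytes are well-formed UTF-8 and well-formed EUC-JP.
data = b"\xc2\xa1"

as_utf8 = data.decode("utf-8")    # U+00A1 INVERTED EXCLAMATION MARK
as_euc = data.decode("euc_jp")    # a single JIS X 0208 kanji

print(repr(as_utf8), repr(as_euc))
```

A receiver that has to guess the encoding cannot tell these readings apart from the bytes alone, and a stream that mixes both conventions gives it even less to go on.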

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

This archive was generated by hypermail 2.1.5 : Sat Mar 17 2007 - 15:17:48 CST