Mark Davis 🍍 <mark at macchiato dot com> wrote:
> Actually, if the goal is to get as many characters in as possible,
> Punycode might be the best solution. That is the encoding used for
> internationalized domains. In that form, it uses a smaller number of
> bytes per character, but a parameterization allows use of all byte
> values.
That might work well if the goal is to find a compact encoding to 7-bit
code units, then express 8 such code units in 7 bytes. It would
certainly be more economical than UTF-7-over-7, which is fine for ASCII
and awful for anything else.
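
For what it's worth, the "8 code units in 7 bytes" step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names pack7 and unpack7 are hypothetical, the bits are packed most-significant-first, and the decoder is told the unit count out of band, since trailing padding bits would otherwise be ambiguous.

```python
def pack7(units):
    """Pack 7-bit code units (ints 0..127) into bytes, 8 units per 7 bytes."""
    acc = 0        # bit accumulator
    nbits = 0      # number of valid bits currently held in acc
    out = bytearray()
    for u in units:
        if not 0 <= u < 128:
            raise ValueError("code unit does not fit in 7 bits: %r" % (u,))
        acc = (acc << 7) | u
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1          # drop the bits already written
    if nbits:
        # Left-align the leftover bits in a final byte, padded with zeros.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack7(data, count):
    """Recover `count` 7-bit code units from bytes produced by pack7."""
    acc = 0
    nbits = 0
    units = []
    for b in data:
        acc = (acc << 8) | b
        nbits += 8
        while nbits >= 7 and len(units) < count:
            nbits -= 7
            units.append((acc >> nbits) & 0x7F)
        acc &= (1 << nbits) - 1
    return units


packed = pack7(b"ABCDEFGH")              # eight ASCII code units
print(len(packed))                       # 7
print(bytes(unpack7(packed, 8)))
```

The saving is a fixed 12.5%, whatever the 7-bit encoding underneath happens to be.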
I don't usually think of Punycode as an ideal general-purpose
compression encoding, especially with lines of arbitrary length or
consisting primarily of non-ASCII content (Cristian's example), but it's
certainly worth experimenting with. One advantage might be that encoders and
decoders for Punycode already exist, probably in greater numbers than
for SCSU.
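
As one example of an existing implementation, Python ships a "punycode" codec that does the raw RFC 3492 transformation (no "xn--" prefix and no IDNA preprocessing). The sketch below is just a way to start experimenting: the helper name punycode_sizes and the sample strings are my own, and the "packed" figure assumes the 7-bit output is subsequently squeezed into 8-bit bytes at 8 units per 7 bytes.

```python
def punycode_sizes(text):
    """Compare Punycode and UTF-8 sizes for one string (illustrative only)."""
    encoded = text.encode("punycode")            # ASCII-only bytes, one per 7-bit unit
    assert encoded.decode("punycode") == text    # sanity check: round trip
    n = len(encoded)
    return {
        "punycode_units": n,
        "packed_bytes": (7 * n + 7) // 8,        # ceil(7n / 8) after 7-bit packing
        "utf8_bytes": len(text.encode("utf-8")),
    }


# Hypothetical sample inputs: one mostly-ASCII, one mostly non-ASCII.
for sample in ["bücher", "Καλημέρα κόσμε"]:
    print(sample, sample.encode("punycode"), punycode_sizes(sample))
```

Running this over longer, realistic lines would show quickly whether the delta coding still pays off once the text is no longer domain-label sized.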
--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
Received on Fri Apr 27 2012 - 15:29:41 CDT