Re: A UTF-8 based News Service

From: Keld Jørn Simonsen (keld@dkuug.dk)
Date: Fri Jul 13 2001 - 06:25:26 EDT


On Fri, Jul 13, 2001 at 02:14:25AM +0100, David Starner wrote:
> > As someone involved in the service I often wish there was some
> > form of "compressed" Unicode encoding. The 3-byte penalty that
> > Ethiopic bears under UTF-8 turns into higher bandwidth that web
> > hosting services meter and charge for by the megabyte. For a
> > popular site this soon makes UTF-8 a costly option to support.
> >
> > A system analogous to iso-8859-x whereby Ethiopic and other scripts
> > in the 3 byte range could be shifted back into the 2 byte range
> > might help (generally only English and Ethiopic are desired together).
> >
> > Fortunately there is mod_gzip for Apache. I would appreciate any
> > information about other options.
>
> What about UTF-16? Encode all characters as 2 bytes, and your problem is
> solved, and UTF-16 should be supported by all recent Unicode-supporting web
> browsers.

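UTF-16 is not always 2 bytes per character: characters in the Basic
Multilingual Plane take 2 bytes, and characters outside it take 4 bytes
(a surrogate pair). Also, the IETF recommends UTF-8 as the preferred
charset for all Internet protocols.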
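
To illustrate the size difference, here is a minimal Python sketch that
counts the encoded bytes for a few characters. The sample characters and
the helper name encoded_sizes are chosen purely for illustration; they
are not taken from the news service under discussion.

```python
# Compare per-character byte counts under UTF-8 and UTF-16.
# The sample characters below are illustrative assumptions.
samples = {
    "LATIN SMALL LETTER A": "a",         # U+0061, ASCII range
    "ETHIOPIC SYLLABLE HA": "\u1200",    # U+1200, inside the BMP
    "GOTHIC LETTER AHSA": "\U00010330",  # U+10330, outside the BMP
}


def encoded_sizes(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    # "utf-16-be" is used so that no byte order mark is counted.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


for name, ch in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X} {name}: UTF-8 {utf8_len}, UTF-16 {utf16_len}")
```

The Ethiopic syllable takes 3 bytes in UTF-8 and 2 in UTF-16, the ASCII
letter takes 1 and 2, and the character outside the BMP takes 4 bytes in
both. So for mixed English and Ethiopic text the saving from UTF-16
depends on how much ASCII (including markup) the pages contain.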

Kind regards
Keld


