Re: UTF-8 codification

From: Doug Ewell (dewell@compuserve.com)
Date: Wed May 24 2000 - 09:49:40 EDT


"Daniel CAUNE" <d.caune@citb.bull.net> wrote:

> Where can I find a white paper about UTF-8 encoding? Is there
> such a document on the Unicode Organisation Web site?

You know, I've been meaning to mention that. The Unicode Web site has
no definition of UTF-8, only incidental references in Technical Reports
16, 17, 18, and 22. The Unicode 3.0 book has none either, only a
pointer to the sample implementation on the CD-ROM.

If UTF-8-encoded Unicode is going to become the worldwide standard we
want it to be, it should really be easier to find the UTF-8 algorithm
on the Unicode Web site. Questions like Daniel's are going to come up
again and again until it is.

The best UTF-8 reference I know of on the Web (other than RFC 2279,
which defines it) is Markus Kuhn's page at:

    http://www.cl.cam.ac.uk/~mgk25/unicode.html

Markus's descriptions of Unicode and UTF-8 are very accurate, easy to
understand, and applicable to all systems (despite the title and
introductory matter, which leave you with the initial impression that
the material will only be useful to Unix/Linux implementors).
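
For anyone who just wants the gist of the algorithm, here is a minimal
Python sketch of the encoding direction. It assumes the one-to-six-byte
bit layout given in RFC 2279; the function name utf8_encode is my own
invention, and the sketch does nothing beyond a range check (it does
not, for example, reject surrogate code points).

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point as UTF-8, assuming the RFC 2279 bit layout."""
    if code_point < 0 or code_point > 0x7FFFFFFF:
        raise ValueError("code point out of range for RFC 2279 UTF-8")
    if code_point < 0x80:
        # ASCII range: a single byte, unchanged.
        return bytes([code_point])
    # Each row: (exclusive upper limit, lead-byte marker, continuation bytes)
    for limit, marker, trailing in (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),
        (0x80000000, 0xFC, 5),
    ):
        if code_point < limit:
            break
    # The lead byte carries the marker plus the highest-order bits.
    out = [marker | (code_point >> (6 * trailing))]
    # Each continuation byte is 10xxxxxx with the next six bits.
    for shift in range(6 * (trailing - 1), -1, -6):
        out.append(0x80 | ((code_point >> shift) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    # U+20AC EURO SIGN should come out as three bytes.
    print(utf8_encode(0x20AC).hex())
```

The table-driven loop is only one way to write it; a chain of range
tests works just as well and may be easier to read.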

-Doug Ewell
 Fullerton, California


