Re: UCS-4, UCS-2, UTF-16, UTF-8

From: Doug Ewell (dewell@compuserve.com)
Date: Thu Feb 17 2000 - 12:36:37 EST


Joerg Knappen <KNAPPEN@ALPHA.NTP.SPRINGER.DE> wrote:

> what's the point behind UTF-32? There is no such thing as a
> transformation involved, not even cutting off the fourth octet (The
> UTF-32 range fits very well in three octets; and you can use even
> fewer bits internally). So it boils down to yet another label for
> character sets.

UTF-32 is UCS-4 with additional semantics, namely that values beyond
U-0010FFFF are excluded. The point is to enforce the limited range,
and possibly to allow the sort of internal optimization Jörg alluded
to, which relies on knowing that the range is limited.
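
As a rough illustration of both points, here is a minimal Python sketch,
not taken from any specification. The names MAX_UTF32, is_valid_utf32,
pack3 and unpack3 are hypothetical, made up for this example. It checks
only the upper bound mentioned above (it does not deal with surrogates
or any other restriction), and the three-octet packing is just one
possible internal representation, assumed here for illustration.

```python
# Hypothetical sketch: UTF-32 as "UCS-4 restricted to 0..0x10FFFF".
MAX_UTF32 = 0x10FFFF  # upper bound stated in the message above


def is_valid_utf32(value: int) -> bool:
    """Return True if value lies within the limited UTF-32 range."""
    return 0 <= value <= MAX_UTF32


def pack3(value: int) -> bytes:
    """Store an in-range value in three octets (big-endian).

    This only works because the range is known to be limited;
    an arbitrary 31-bit UCS-4 value would not fit.
    """
    if not is_valid_utf32(value):
        raise ValueError(f"value {value:#x} is outside the UTF-32 range")
    return value.to_bytes(3, "big")


def unpack3(data: bytes) -> int:
    """Recover the value from its three-octet internal form."""
    if len(data) != 3:
        raise ValueError("expected exactly three octets")
    return int.from_bytes(data, "big")
```

The largest permitted value needs 21 bits, so three octets are enough,
and a bit-packed internal form could use less still.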

-Doug