Re: UCS-4, UCS-2, UTF-16, UTF-8

From: Doug Ewell ([email protected])
Date: Thu Feb 17 2000 - 12:36:37 EST


Joerg Knappen <[email protected]> wrote:

> what's the point behind UTF-32? There is no such thing as a
> transformation involved, not even cutting off the fourth octett (The
> UTF-32 range fits very well in three octetts; and you can use even
> less bits internally). So it boils down to yet another label for
> character sets.

UTF-32 is UCS-4 with additional semantics, namely that values beyond
U-0010FFFF are excluded. The point is to enforce the limited range,
and possibly to allow internal optimizations of the kind Jörg alluded
to, based on the knowledge that the range is limited.
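
To make the range argument concrete, here is a minimal Python sketch.
It is not taken from any standard; the names UTF32_MAX,
is_in_utf32_range, pack_3_octets and unpack_3_octets are made up for
illustration. It assumes the only restriction is the upper limit
mentioned above, and shows that every permitted value needs at most
21 bits and therefore fits in three octets.

```python
# Hypothetical helper names; the only fact taken from the message is
# the upper limit of the UTF-32 range, U-0010FFFF.
UTF32_MAX = 0x10FFFF


def is_in_utf32_range(value: int) -> bool:
    """Return True if value lies within 0..0x10FFFF inclusive."""
    return 0 <= value <= UTF32_MAX


def pack_3_octets(value: int) -> bytes:
    """Store an in-range value in three big-endian octets."""
    if not is_in_utf32_range(value):
        raise ValueError(f"value {value:#x} is outside the UTF-32 range")
    return value.to_bytes(3, "big")


def unpack_3_octets(data: bytes) -> int:
    """Recover a value from three big-endian octets, re-checking the range."""
    if len(data) != 3:
        raise ValueError("expected exactly three octets")
    value = int.from_bytes(data, "big")
    if not is_in_utf32_range(value):
        raise ValueError(f"value {value:#x} is outside the UTF-32 range")
    return value
```

A plain UCS-4 value such as 0x110000 would be rejected by
is_in_utf32_range, while 0x10FFFF round-trips through the three-octet
form unchanged. The range check on unpacking is needed because three
octets can hold values up to 0xFFFFFF, which is more than the limit.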

-Doug


