From: David Starner (dvdeug@fullnet.net)
Date: Thu Apr 03 2003 - 15:01:13 EST
On Thu, Apr 03, 2003 at 09:05:23PM +0200, Pim Blokland wrote:
> Why is there no UTF-24?
Why? UTF-24 will almost invariably be larger than UTF-16, unless you are
talking about a document in Old Italic or Gothic. The math alphanumeric
characters will almost always be combined with enough ASCII to make
UTF-8 a win, and if not, enough BMP characters to make UTF-16 a win.
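A rough way to check the size argument above is to count bytes for a few sample strings. This is only a sketch: "UTF-24" is not a defined encoding form, so it is modelled here as a fixed 3 bytes per code point, and the sample strings and the name encoded_sizes are made up for illustration.

```python
def encoded_sizes(text):
    """Return the byte count of `text` under each encoding form."""
    code_points = len(text)  # a Python str is a sequence of code points
    return {
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16": len(text.encode("utf-16-le")),  # -le variant: no BOM added
        "UTF-24": 3 * code_points,  # hypothetical: fixed 3 bytes per code point
        "UTF-32": len(text.encode("utf-32-le")),  # -le variant: no BOM added
    }

# Illustrative samples only; real documents will mix these differently.
samples = {
    "ASCII": "Why is there no UTF-24?",
    "BMP (Greek)": "\u0391\u03b2\u03b3\u03b4",
    "Gothic": "\U00010330\U00010331\U00010332",
    "Math italic x mixed with ASCII": "f(\U0001d465) = \U0001d465 + 1",
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

Only the all-Gothic sample comes out smaller in the 3-byte form; once ASCII or other BMP characters are mixed in, UTF-8 or UTF-16 is smaller.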
Modern computers don't deal with 24-bit chunks well; in memory, they'd
take up 32 bits apiece, unless you declared them packed, and then
they'd be a lot slower than UTF-16 or UTF-32. And if you're storing to
disk, you may as well use BOCU or SCSU (you're already going
non-standard), or use standard compression with UTF-8, UTF-16, BOCU or
SCSU. SCSU or BOCU compressed should take up half the space of UTF-24,
if that.
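To make the "packed" case concrete, here is a minimal sketch of what a packed 24-bit form would involve. The names pack_utf24 and unpack_utf24 and the big-endian byte order are assumptions for illustration; no such encoding is specified anywhere. Every code point has to be assembled from, or split into, three separate bytes, since there is no native 24-bit integer to load or store.

```python
def pack_utf24(text):
    """Pack each code point into 3 big-endian bytes (hypothetical UTF-24)."""
    return b"".join(ord(ch).to_bytes(3, "big") for ch in text)

def unpack_utf24(data):
    """Reverse pack_utf24: rebuild each code point from its 3 bytes."""
    if len(data) % 3 != 0:
        raise ValueError("length must be a multiple of 3 bytes")
    return "".join(
        chr(int.from_bytes(data[i:i + 3], "big"))
        for i in range(0, len(data), 3)
    )

packed = pack_utf24("A\U00010330")  # one ASCII letter, one Gothic letter
print(packed.hex(), len(packed))
print(unpack_utf24(packed) == "A\U00010330")
```

The unpacked alternative is simply UTF-32, one 32-bit unit per code point, which is the "32 bits apiece" case mentioned above.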
-- 
David Starner - dvdeug@email.ro
It's the terror of knowing/What this world is about
Watching some good friends/Screaming 'Let me out'
-- Queen, "Under Pressure"