RE: Latin w/ diacritics (was Re: benefits of unicode)

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Apr 18 2001 - 15:49:06 EDT


Carl Brown said, in support of Michka cringing about segments:

> I agree.
>
> If these folks really want Unicode everywhere I will write Unicode for the
> IBM 1401 if they are willing to foot the bill. Seriously I would never
> agree to such a ludicrous idea.

Exactly. How about an Apple II or a PDP-8 while you're at it?

>
> Can you imagine a Unicode 3.1 character properties table that uses 16bit
> addressing?

But this is not actually the problem. Anybody who approaches Unicode
character property tables with tries, for example, already has
a segmented table architecture that works just fine within a
16-bit addressing model, as long as none of the individual pieces
exceeds the 64K limit. I've written such tables, and they worked just
fine on Windows 3.1 -- even a Unicode 3.1 version of such
a table would work fine.
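
As a rough illustration of the point above, here is a minimal sketch of a two-stage lookup table, the simplest form of a trie, in which every individual piece stays far below 64K. It is not the actual table design described in this message. The block size of 256, the names `build_two_stage`, `lookup`, and `category_id`, and the use of the general category as the property are all assumptions made for the example. The property data comes from whatever Unicode version Python's `unicodedata` module ships with, not Unicode 3.1.

```python
import unicodedata

BLOCK = 256          # code points per second-stage block (an assumed size)
LIMIT = 64 * 1024    # bytes reachable through a 16-bit offset

CATEGORY_IDS = {}    # general category string -> small integer id


def category_id(cp):
    """Map a code point to a one-byte id for its general category."""
    cat = unicodedata.category(chr(cp))
    return CATEGORY_IDS.setdefault(cat, len(CATEGORY_IDS))


def build_two_stage(prop, max_cp=0x10FFFF):
    """Build (index, blocks); identical blocks are stored only once."""
    index, blocks, seen = [], [], {}
    for start in range(0, max_cp + 1, BLOCK):
        block = bytes(prop(cp) for cp in range(start, start + BLOCK))
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, cp):
    """Two array reads: pick the block, then the entry inside it."""
    return blocks[index[cp >> 8]][cp & 0xFF]


index, blocks = build_two_stage(category_id)

index_bytes = 2 * len(index)                  # stored as 16-bit block numbers
largest_block = max(len(b) for b in blocks)   # one byte per entry
print(index_bytes, len(blocks), largest_block)
```

Each second-stage block can live in its own segment, and the first-stage index is a single small array, so no piece needs more than a 16-bit offset.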

The table problems tend to arise when dealing with character set
conversion for Asian character sets. For those, table designs that
work well with 32-bit addressing tend to break down for 16-bit
addressing, since it is so easy to exceed 64K data chunks.
Collation tables and fonts would be other areas likely to cause
difficulties.
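
To make the 64K problem concrete, here is a small sketch under stated assumptions. It supposes a flat reverse-conversion table holding one 16-bit legacy code for every BMP code point, which is a hypothetical design chosen for the arithmetic and not one described in this message. The segment size, the names `split_segments` and `get`, and the sample mapping value are likewise invented for illustration.

```python
from array import array

LIMIT = 64 * 1024      # bytes reachable through a 16-bit offset
SEG_SHIFT = 14         # assumed: 16384 entries per segment
SEG_ENTRIES = 1 << SEG_SHIFT

# Hypothetical flat table: one 16-bit legacy code per BMP code point.
flat = array("H", [0]) * 0x10000
flat[0x4E00] = 0x1234  # made-up value, not a real mapping

flat_bytes = len(flat) * flat.itemsize  # too large for one 64K chunk


def split_segments(table):
    """Cut the flat table into equal pieces that each fit in 64K."""
    return [table[i:i + SEG_ENTRIES] for i in range(0, len(table), SEG_ENTRIES)]


def get(segments, cp):
    """Select the segment with a shift, then the entry with a mask."""
    return segments[cp >> SEG_SHIFT][cp & (SEG_ENTRIES - 1)]


segments = split_segments(flat)
print(flat_bytes, len(segments), get(segments, 0x4E00))
```

The flat array is natural with 32-bit addressing but has to be cut up, with an extra selection step on every lookup, once no single chunk may exceed 64K.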

>
> Unicode takes lots of memory.

Well, sort of. There are lots of tricks to programming Unicode support
that limit the memory usage for the various tables and algorithms
needed to support a large character set. Let's not start down the
road of overemphasizing the system requirements of Unicode.

Compared to the memory requirements for video, sound, and for data
caching on servers, the memory requirements for Unicode per se
tend to be down in the noise -- with the exception of those big
CJK fonts.

--Ken

>
> Carl
>


