Re: unidata is big

From: Geoffrey Waigh (gpw@uniserve.com)
Date: Sun Apr 21 2002 - 06:28:32 EDT


> I would just like to know if someone could give me a tip on how to
> structure all the unicode-information in memory?
>
> All the UNIDATA does contain quite a bit of information and I can't see
> any obvious method that is memory-efficient and gives fast access.

a) you see if there is a Unicode-friendly library you can use that already
does this for you.

b) you write a program to parse the file and extract what your application
needs. With clever data encoding you can pack most of the fields of
UNIDATA into a very tight space. Long ago in the Unicode conference
proceedings somebody illustrated how they used trie structures to
efficiently build the lookup tables - the boring parts of the encoding
space have shorter branches than the areas where every codepoint is
different from its neighbour.
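
To make the trie idea concrete, here is a rough sketch of a two-stage
table, which is about the simplest fixed-depth form of a trie. It is not
necessarily the scheme from the conference proceedings. The 256-codepoint
block size and the names build_two_stage and lookup are my own assumptions,
and Python's standard unicodedata module stands in for a parsed UNIDATA
file. Identical blocks (the "boring" ranges) are stored only once.

```python
import unicodedata

BLOCK_SHIFT = 8                  # assumed block size: 256 code points
BLOCK_SIZE = 1 << BLOCK_SHIFT
MAX_CP = 0x110000                # one past the last Unicode code point


def build_two_stage(prop):
    """Return (index, blocks) for a per-character property function.

    index[i] is the position in `blocks` of the block that covers code
    points i*BLOCK_SIZE .. i*BLOCK_SIZE + BLOCK_SIZE - 1. Blocks with
    identical contents share a single entry in `blocks`.
    """
    index = []    # stage 1: one small integer per block of code points
    blocks = []   # stage 2: the distinct blocks of property values
    seen = {}     # block contents -> position in `blocks`
    for start in range(0, MAX_CP, BLOCK_SIZE):
        block = tuple(prop(chr(cp)) for cp in range(start, start + BLOCK_SIZE))
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, cp):
    """Two array accesses: the high bits pick the block, the low bits the slot."""
    return blocks[index[cp >> BLOCK_SHIFT]][cp & (BLOCK_SIZE - 1)]


# Hypothetical usage: compress the General_Category property.
index, blocks = build_two_stage(unicodedata.category)
print(len(index), "index entries share", len(blocks), "distinct blocks")
print(lookup(index, blocks, ord("A")))
```

How many distinct blocks survive depends on the Unicode version your
Python ships with, but large unassigned or uniform ranges collapse to a
handful of shared blocks. A real implementation would also pack the values
into small integers rather than strings, and might add a third stage.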

Geoffrey


