Re: unidata is big

From: Mark Davis ([email protected])
Date: Tue Apr 23 2002 - 16:59:10 EDT

Previous message: Mark Davis: "Re: How many printable characters in 3.2.0?"
In reply to: Geoffrey Waigh: "Re: unidata is big"
Next in thread: Theo Veenker: "Re: unidata is big"

One of the Dublin papers talks about how this is done in ICU:
http://www.unicode.org/iuc/iuc21/a347.html

Mark
—————

Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Geoffrey Waigh" <[email protected]>
To: <[email protected]>
Sent: Sunday, April 21, 2002 03:28
Subject: Re: unidata is big

> > I would just like to know if someone could give me a tip on how to
> > structure all the Unicode information in memory.
> >
> > UNIDATA contains quite a bit of information, and I can't see any
> > obvious method that is both memory-efficient and gives fast access.
>
> a) You see if there is a Unicode-friendly library you can use that
> already does this for you.
>
> b) You write a program to parse the file and extract what your
> application needs. With clever data encoding you can pack most of the
> fields of UNIDATA into a very tight space. Long ago in the Unicode
> conference proceedings somebody illustrated how they used trie
> structures to build the lookup tables efficiently: the boring parts of
> the encoding space have shorter branches than the areas where every
> code point is different from its neighbour.
>
> Geoffrey
>
>
>
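
For illustration only, the trie idea Geoffrey describes can be sketched
as a two-stage lookup table. This is not ICU's actual implementation or
the one in the paper linked above; it is a minimal sketch that assumes
Python's standard unicodedata module as the data source (instead of
parsing the UNIDATA files directly) and a block size of 256 code points.
The names build_two_stage_table and lookup are hypothetical.

```python
import unicodedata

BLOCK_BITS = 8                 # assumed block size: 2**8 = 256 code points
BLOCK_SIZE = 1 << BLOCK_BITS
MAX_CP = 0x110000              # one past the last Unicode code point


def build_two_stage_table(prop):
    """Return (index, blocks) such that
    blocks[index[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1)] == prop(chr(cp)).

    Identical blocks are stored only once, so long uniform ranges
    (the "boring parts") all point at the same shared block.
    """
    index = []    # stage 1: block number -> position in `blocks`
    blocks = []   # stage 2: the distinct blocks of property values
    seen = {}     # block contents -> position in `blocks`
    for start in range(0, MAX_CP, BLOCK_SIZE):
        block = tuple(prop(chr(cp)) for cp in range(start, start + BLOCK_SIZE))
        block_id = seen.get(block)
        if block_id is None:
            block_id = len(blocks)
            seen[block] = block_id
            blocks.append(block)
        index.append(block_id)
    return index, blocks


def lookup(index, blocks, cp):
    """Look up the property of code point `cp` with two array accesses."""
    return blocks[index[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1)]


# Example: a table for the General_Category property.
index, blocks = build_two_stage_table(unicodedata.category)
```

A lookup is then lookup(index, blocks, ord("A")). Because unassigned and
uniform ranges collapse into shared blocks, len(blocks) comes out much
smaller than len(index). A real implementation would typically also
store small integer codes in packed arrays rather than Python tuples of
strings.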

This archive was generated by hypermail 2.1.2 : Tue Apr 23 2002 - 17:40:22 EDT