From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue May 06 2003 - 12:28:07 EDT
Theodore H. Smith wrote:
> I'm unfamiliar with "trie".
Here are some pointers into the ICU source code. You can find these files in the source download or
via WebCVS; see
http://oss.software.ibm.com/icu/download/ and http://oss.software.ibm.com/icu/develop/cvs.html
For WebCVS, just append the pathnames below to http://oss.software.ibm.com/cvs/icu/~checkout~/icu/
ICU uses an internal API, "UTrie", to store several of its data structures. See source/common/utrie.h
and utrie.c. Note that this API is _internal_ because it is not easy to use. The hard part is
understanding the "folding function" that you need to provide; we have an RFE to add default folding
functions. If you want to use it, the best approach is to look at how it is used across ICU. The
presentations that others pointed to also explain it a bit.
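
To give a rough idea of what such a "trie" does, here is a minimal sketch in C of a generic two-stage
lookup table over code points. This is _not_ the UTrie format or API: all names (TwoStageTable,
table_build, table_get, toy_property) are made up for illustration, the block size is an arbitrary
choice, and it indexes directly by code point, so it sidesteps the folding function entirely. The
idea it shows is that code points are grouped into small blocks, identical blocks are stored only
once, and a lookup is two array accesses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes here are illustrative assumptions, not ICU's. */
#define SHIFT      5u                        /* 32 code points per block */
#define BLOCK_LEN  (1u << SHIFT)
#define MASK       (BLOCK_LEN - 1u)
#define MAX_CP     0x10FFFFu
#define INDEX_LEN  ((MAX_CP + 1u) >> SHIFT)  /* one index entry per block */
#define MAX_BLOCKS 256u                      /* capacity for distinct blocks */

typedef struct {
    uint16_t index[INDEX_LEN];               /* stage 1: block number -> offset into data */
    uint8_t  data[MAX_BLOCKS * BLOCK_LEN];   /* stage 2: the distinct blocks */
    uint32_t data_len;                       /* number of data entries in use */
} TwoStageTable;

/* Toy property for the demo: 1 for ASCII letters, 2 for ASCII digits, else 0. */
uint8_t toy_property(uint32_t cp) {
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) return 1;
    if (cp >= '0' && cp <= '9') return 2;
    return 0;
}

/* Fill the table from a per-code-point function, sharing identical blocks.
 * Returns 0 on success, -1 if more than MAX_BLOCKS distinct blocks are needed. */
int table_build(TwoStageTable *t, uint8_t (*value_of)(uint32_t)) {
    uint8_t block[BLOCK_LEN];
    t->data_len = 0;
    for (uint32_t i = 0; i < INDEX_LEN; ++i) {
        for (uint32_t j = 0; j < BLOCK_LEN; ++j) {
            block[j] = value_of((i << SHIFT) | j);
        }
        /* Linear search for an identical block that is already stored. */
        uint32_t off = 0;
        while (off < t->data_len && memcmp(t->data + off, block, BLOCK_LEN) != 0) {
            off += BLOCK_LEN;
        }
        if (off == t->data_len) {            /* not found: append a new block */
            if (t->data_len + BLOCK_LEN > MAX_BLOCKS * BLOCK_LEN) return -1;
            memcpy(t->data + off, block, BLOCK_LEN);
            t->data_len += BLOCK_LEN;
        }
        t->index[i] = (uint16_t)off;
    }
    return 0;
}

/* Lookup: the high bits select the block, the low bits select the entry in it. */
uint8_t table_get(const TwoStageTable *t, uint32_t cp) {
    assert(cp <= MAX_CP);
    return t->data[t->index[cp >> SHIFT] + (cp & MASK)];
}
```

You would call table_build(&t, toy_property) once and then table_get(&t, c) for each lookup. With
this toy property only three distinct blocks end up stored for the whole code point range, which is
where the space saving comes from. Real data is less uniform, so the block size is a tradeoff
between index size and how many blocks can be shared.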
The "ICU Data" chapter of our User Guide has a table at the bottom that points to the descriptions
of the binary data formats we use to store character properties, normalization data, etc.
See http://oss.software.ibm.com/icu/userguide/icudata.html
The character properties APIs are implemented in source/common/uchar.c and uprops.c.
Note that you can build ICU with many features turned off to reduce the library size, and build the
data library with many or most items omitted. It will still be larger than 54kB though...
Hope this helps,
markus