From: Daniel Ehrenberg (microdan@gmail.com)
Date: Wed Aug 22 2007 - 11:21:27 CDT
Hi,
I'm reading UAX 29 in order to implement grapheme boundaries (and
later word and sentence boundaries) for a Unicode library for the
Factor programming language. So far, for grapheme boundary detection,
I have a fairly direct implementation of the boundary rules listed in
the UAX: I iterate through the string, checking each connectedness
condition at each position, and if none of them holds, I report a
grapheme break. This implementation works, but I'm wondering about a
table-based implementation, which could be faster and allow tailoring
(mine can't be tailored without rewriting it). The UAX frequently
references table-based implementations, but it
never describes what they are exactly or how I might go about
implementing them. I tried finding the code in ICU for it, but I'm
somewhat new at C++ and could not locate where the tables were
generated.
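
For illustration, here is a minimal sketch of one thing "table-based"
could mean for grapheme boundaries: a pair table indexed by the break
classes of two adjacent characters. It is written in Python rather than
Factor and is not how ICU does it (ICU's generated tables are not shown
here). The names classify, NO_BREAK_PAIRS, BREAK and graphemes are
hypothetical, and classify only approximates the real
Grapheme_Cluster_Break property, which a real implementation would load
from the Unicode data files.

```python
import unicodedata

# Break classes (names follow UAX #29); each character maps to one of them.
CLASSES = ["Other", "CR", "LF", "Control", "Extend", "L", "V", "T", "LV", "LVT"]
INDEX = {name: i for i, name in enumerate(CLASSES)}

# Pairs (left, right) with NO boundary between them; every other pair breaks.
NO_BREAK_PAIRS = (
    {("CR", "LF")}
    | {("L", r) for r in ("L", "V", "LV", "LVT")}
    | {(l, r) for l in ("LV", "V") for r in ("V", "T")}
    | {(l, "T") for l in ("LVT", "T")}
    # "x Extend" applies unless the left side is CR, LF, or Control.
    | {(l, "Extend") for l in CLASSES if l not in ("CR", "LF", "Control")}
)

# The table itself: BREAK[left][right] is True when a boundary falls between.
BREAK = [[(l, r) not in NO_BREAK_PAIRS for r in CLASSES] for l in CLASSES]


def classify(ch):
    """Approximate break class of ch (an assumption, not the real property)."""
    cp = ord(ch)
    if ch == "\r":
        return INDEX["CR"]
    if ch == "\n":
        return INDEX["LF"]
    cat = unicodedata.category(ch)
    if cat in ("Cc", "Zl", "Zp"):
        return INDEX["Control"]
    if cat in ("Mn", "Me"):
        return INDEX["Extend"]
    if 0x1100 <= cp <= 0x115F:
        return INDEX["L"]
    if 0x1160 <= cp <= 0x11A7:
        return INDEX["V"]
    if 0x11A8 <= cp <= 0x11FF:
        return INDEX["T"]
    if 0xAC00 <= cp <= 0xD7A3:
        # Precomposed Hangul: every 28th syllable has no trailing consonant.
        return INDEX["LV"] if (cp - 0xAC00) % 28 == 0 else INDEX["LVT"]
    return INDEX["Other"]


def graphemes(text):
    """Split text into clusters with one table lookup per adjacent pair."""
    clusters = []
    start = 0
    for i in range(1, len(text)):
        if BREAK[classify(text[i - 1])][classify(text[i])]:
            clusters.append(text[start:i])
            start = i
    if text:
        clusters.append(text[start:])
    return clusters
```

In this sketch, tailoring would amount to adding or removing entries in
NO_BREAK_PAIRS and rebuilding BREAK; the scanning loop stays the same. A
pair table is enough here only because each rule in the sketch looks at
two adjacent characters. Rules that need more context (as the word and
sentence rules do) would call for a state machine instead.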
If someone could help me with this, that would be great.
Daniel Ehrenberg