You can use an ICU RuleBasedBreakIterator with custom rules. If you would
like to try that, it would be best to join the icu-support mailing list.
http://userguide.icu-project.org/boundaryanalysis
http://icu-project.org/apiref/icu4j/com/ibm/icu/text/RuleBasedBreakIterator.html
(There is also a C++ version, and a C wrapper.)
http://site.icu-project.org/contacts
ICU's grapheme cluster break rules:
http://bugs.icu-project.org/trac/browser/icu/trunk/source/data/brkitr/char.txt
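As a rough illustration of the custom-rules approach mentioned above, here is a minimal ICU4J sketch. The rule set is a toy one made up for this example (it is not ICU's char.txt): it keeps runs of letters together, keeps runs of decimal digits together, and lets any other character stand alone. The class name CustomBreaks and the helper boundaries() are hypothetical, and the code assumes the ICU4J jar is on the classpath.

```java
import com.ibm.icu.text.BreakIterator;
import com.ibm.icu.text.RuleBasedBreakIterator;

import java.util.ArrayList;
import java.util.List;

public class CustomBreaks {
    // Toy rules for illustration only (an assumption, not ICU's own rule file).
    // Each rule ends with ';'. The longest match at the current position wins.
    static final String RULES =
          "$Letter = [[:Letter:]];\n"   // variable: any Unicode letter
        + "$Digit  = [[:Nd:]];\n"       // variable: any decimal digit
        + "$Letter+;\n"                 // a run of letters is one segment
        + "$Digit+;\n"                  // a run of digits is one segment
        + ".;\n";                       // anything else: one character per segment

    // Compiles the rules and returns every boundary offset in the text,
    // starting with 0 and ending with text.length().
    static List<Integer> boundaries(String rules, String text) {
        RuleBasedBreakIterator it = new RuleBasedBreakIterator(rules);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the boundary offsets found in a small sample string.
        System.out.println(boundaries(RULES, "abc123 xy"));
    }
}
```

For the sample string, the letter run, the digit run, the space, and the final letter run should each come out as a separate segment. Real grapheme-cluster rules are considerably more involved, so char.txt is the better starting point for anything beyond experimentation.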
Best regards,
markus
Received on Thu Mar 14 2013 - 10:40:36 CDT