From: Mark Davis (mark.edward.davis@gmail.com)
Date: Wed Jan 07 2009 - 13:48:11 CST
Even if it was an error on your end, if you have good additional test cases, we'd welcome them.
Mark
On Wed, Jan 7, 2009 at 11:03, Daniel Ehrenberg <microdan@gmail.com> wrote:
> I'm sorry, this was an error on my end. Ignore that message.
>
> On Wed, Jan 7, 2009 at 12:38 PM, Daniel Ehrenberg <microdan@gmail.com> wrote:
> > I'm implementing UAX #29 word breaking (without tailoring). Right now,
> > I've implemented the algorithm except that I treat rules like
> >
> > Numeric (MidNum | MidNumLet) × Numeric
> >
> > as
> >
> > (MidNum | MidNumLet) × Numeric
> >
> > The funny thing is, though, that all unit tests in WordBreakTest.txt
> > pass. But a string like "foo: bar" segments as /foo:/ /bar/. By my
> > reading of the UAX, this is incorrect, and the correct word
> > segmentation would be /foo/:/ /bar/. For my own project, I'll add some
> > additional unit tests, unless I've misread the standard. It seems to
> > me like these tests should be added to the WordBreakTest.txt file, and
> > I'd be glad to supply them. Is this possible?
> >
> > Dan
> >
>
>
>
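For illustration only, here is a minimal Python sketch of the difference between the rule as quoted above, Numeric (MidNum | MidNumLet) × Numeric, and the shortened form that drops the leading Numeric. The function names and the list-of-property-names input are assumptions made for this sketch; they are not taken from UAX #29 or from any implementation, and the handling of Extend and Format characters is left out entirely.

```python
# Hypothetical sketch: each character is represented only by the name of its
# Word_Break property value, and i is the index of the character that a
# candidate break would come before.

MID = {"MidNum", "MidNumLet"}


def strict_no_break(props, i):
    """True if 'Numeric (MidNum | MidNumLet) x Numeric' forbids a break before props[i]."""
    return (
        i >= 2
        and props[i - 2] == "Numeric"
        and props[i - 1] in MID
        and props[i] == "Numeric"
    )


def simplified_no_break(props, i):
    """True if the shortened '(MidNum | MidNumLet) x Numeric' forbids a break before props[i]."""
    return i >= 1 and props[i - 1] in MID and props[i] == "Numeric"


# A Numeric on the left: both readings forbid the break.
with_context = ["Numeric", "MidNum", "Numeric"]
# No Numeric on the left: only the shortened reading forbids the break.
without_context = ["Other", "MidNum", "Numeric"]

for props in (with_context, without_context):
    print(props, strict_no_break(props, 2), simplified_no_break(props, 2))
```

The two readings agree whenever a Numeric precedes the middle character, so a test file would need a case like the second sequence to tell them apart.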