From: Mark Davis ⌛ (mark@macchiato.com)
Date: Thu Aug 27 2009 - 16:50:41 CDT
Because SB5 doesn't match at the start, you keep going through the rules
and end up not breaking. I agree that the wording is not as clear as it
should be.
I added an example to show that. Hover over the characters and breaks in #15
in:
http://unicode.org/~mdavis/SentenceBreakTest-5.2.0d16.html#samples
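
For readers who want to trace the rule order by hand, here is a minimal
sketch in Python. It is not the reference algorithm: it works on
Sentence_Break class names rather than real characters, it covers only SB3,
SB4, SB5, SB9, SB10, SB11 and SB12 (SB6 through SB8a are left out), and the
names effective_left, after_terminator, classify and explain are made up for
this illustration.

```python
# Simplified sketch of a few UAX #29 sentence-break rules, operating on
# Sentence_Break class names instead of real characters.
# Assumption: only SB3, SB4, SB5, SB9, SB10, SB11, SB12 are modelled;
# SB6 through SB8a are omitted, so inputs that need them are out of scope.

IGNORED = {"Extend", "Format"}
PARA_SEP = {"Sep", "CR", "LF"}
TERM = {"STerm", "ATerm"}


def effective_left(classes, i):
    """Classes before boundary i with SB5 applied.

    An Extend/Format is dropped only when it attaches to a preceding
    character that is not Sep, CR or LF. At the start of the text or
    right after a separator it stays as an item of its own.
    """
    kept = []
    for k in range(i):
        attached = (classes[k] in IGNORED and k > 0
                    and classes[k - 1] not in PARA_SEP)
        if not attached:
            kept.append(classes[k])
    return kept


def after_terminator(left, allow_sp):
    """True if left ends with (STerm | ATerm) Close*, plus Sp* if allow_sp."""
    j = len(left)
    if allow_sp:
        while j > 0 and left[j - 1] == "Sp":
            j -= 1
    while j > 0 and left[j - 1] == "Close":
        j -= 1
    return j > 0 and left[j - 1] in TERM


def classify(classes, i):
    """Return (is_break, rule) for the boundary before classes[i]."""
    prev, nxt = classes[i - 1], classes[i]
    if prev == "CR" and nxt == "LF":
        return (False, "SB3")
    if prev in PARA_SEP:
        return (True, "SB4")
    if nxt in IGNORED:
        # prev is not a separator here, so nxt attaches to what precedes it.
        return (False, "SB5")
    left = effective_left(classes, i)
    if after_terminator(left, False) and nxt in {"Close", "Sp"} | PARA_SEP:
        return (False, "SB9")
    if after_terminator(left, True) and nxt in {"Sp"} | PARA_SEP:
        return (False, "SB10")
    if after_terminator(left, True):
        return (True, "SB11")
    return (False, "SB12")


def explain(classes):
    """Classify every interior boundary of the class sequence."""
    return [classify(classes, i) for i in range(1, len(classes))]


if __name__ == "__main__":
    for result in explain(["STerm", "Sep", "Extend", "Numeric"]):
        print(result)
```

Run on <STerm Sep Extend Numeric>, the sketch reports no break before Sep
(SB9), a break after Sep (SB4), and no break between Extend and Numeric: the
Extend directly follows Sep, so it is not absorbed, SB11 finds no terminator
immediately to its left, and SB12 applies.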
Note to all: we can add additional examples that illustrate tricky cases;
Peter Edberg and Laurentiu Iancu have actions to do so, but we can take
additional ones if they are sent in very soon. Here are the other current
samples:
http://unicode.org/~mdavis/GraphemeBreakTest-5.2.0d16.html#samples (none currently)
http://unicode.org/~mdavis/LineBreakTest-5.2.0d16.html#samples
http://unicode.org/~mdavis/WordBreakTest-5.2.0d16.html#samples
Mark
On Thu, Aug 27, 2009 at 13:32, Eric Muller <emuller@adobe.com> wrote:
> Eric Muller wrote:
>
>> When considering whether there is a sentence break according to SB12 (but not
>
> -> SB11
>
>> SB4), is it correct that the Sep, CR and LF have to be understood as
>> "possibly followed by any number of Extend or Format"?
>>
>> In other words, in a string <Sep EX | Numeric>, is the position | a
>> boundary or not?
>
> -> <STerm Sep Extend | Numeric>
>
> Eric.
>
> (PS: yes I am having a hard time with sentence break)
This archive was generated by hypermail 2.1.5 : Thu Aug 27 2009 - 16:53:21 CDT