From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Jul 25 2007 - 15:11:16 CDT
Philippe Verdy wrote:
> The line breaking opportunities do not seem to handle some special cases
> related to undesirable line breaks that are currently allowed.
> This comes up, for example, with parentheses, which currently always allow
> line breaks before or after them and the text they surround.
That's not how I read UAX #14. I could be wrong, of course, but
reading the Example Pair Table, it seems clear that the table
specifies that such junctures are *indirect* line break opportunities
(a break is allowed only when spaces intervene), and that is the
same treatment you get for any pair of alphabetic characters in
sequence.
And in particular, the relevant rules are:
LB28 Do not break between alphabetics.
AL × AL
LB30 Do not break between letters, numbers, or ordinary symbols and
opening or closing punctuation.
(AL | NU) × OP
CL × (AL | NU)
Those rules seem *already* to be doing exactly what you seem to
be asking for.
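To make that concrete, here is a minimal Python sketch of the rules quoted
above, applied to text of this kind. It is not the UAX #14 algorithm: the
classifier line_break_class() is a hypothetical stand-in for the real Line_Break
property data, space handling is reduced to "break after a space, never
before one", and any pair not covered by the quoted rules is assumed to be
non-breaking so that the sketch stays conservative.

```python
# Simplified sketch of LB28 and LB30 as quoted above. Not a full UAX #14
# implementation; the classifier and the defaults are assumptions.

OPENING = "([{"
CLOSING = ")]}"


def line_break_class(ch):
    """Hypothetical, heavily simplified stand-in for the Line_Break property."""
    if ch == " ":
        return "SP"
    if ch in OPENING:
        return "OP"
    if ch in CLOSING:
        return "CL"
    if ch.isdigit():
        return "NU"
    return "AL"  # assumption: everything else is treated as alphabetic


def break_allowed(before, after):
    """Return True if a break is allowed between two adjacent classes."""
    if after == "SP":
        return False  # simplification: never break before a space
    if before == "SP":
        return True   # simplification: a space always opens an opportunity
    if before == "AL" and after == "AL":
        return False  # LB28: AL x AL
    if before in ("AL", "NU") and after == "OP":
        return False  # LB30: (AL | NU) x OP
    if before == "CL" and after in ("AL", "NU"):
        return False  # LB30: CL x (AL | NU)
    return False      # assumption: uncovered pairs are treated as non-breaking


def break_opportunities(text):
    """Return every index i such that a break is allowed just before text[i]."""
    classes = [line_break_class(ch) for ch in text]
    return [i for i in range(1, len(text))
            if break_allowed(classes[i - 1], classes[i])]


def show(text):
    """Mark each break opportunity in the text with a '|' character."""
    positions = set(break_opportunities(text))
    return "".join(("|" if i in positions else "") + ch
                   for i, ch in enumerate(text))


if __name__ == "__main__":
    print(show("un (ou plusieurs) mot(s)"))
```

Under these assumptions the only opportunities reported are the ones that
follow a space; nothing is offered between a letter and an adjacent opening
parenthesis.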
Skipping over a fascinating excursion into French toponymy...
> I can give another more common example where such linebreaks are
> undesirable:
> "un (ou plusieurs) mot(s)"
> Note how the "s" plural mark in "mots" is marked as an alternative; it is
> not separable from the word it normally completes. Inserting a linebreak
> between "mot" and "(s)" would be wrong.
And UAX #14 does not suggest that one do so. See LB30 cited above.
> I propose disallowing line breaks around ***BOTH*** sides of:
> * (parentheses), or parenthesis-like characters like
> * [square brackets],
etc., etc.
This is already handled correctly in UAX #14.
--Ken