On Sat, 26 Jan 2019 15:45:54 +0000
James Kass via Unicode <unicode_at_unicode.org> wrote:
> Perhaps I'm not understanding, but if the desired behavior is to
> prohibit both line and word breaks in the example string, then...
>
> In Notepad, replacing U+0020 with U+00A0 removes the line-break.
I believe the problem is that "δ’ αρχαια" should have 2 non-blank
*words*. With U+2019, one gets 3. Line-break suppressing spaces don't
help with word-breaking, because they are not treated as letters.
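
To make the count concrete, here is a rough sketch in Python. It is not the UAX #29 algorithm, and it is not what any particular editor or spell-checker does; naive_words is a hypothetical helper that only approximates "letters form words, other non-space runs are separate tokens". The point is that U+2019 is punctuation (general category Pf) and U+00A0 is a space (Zs), so swapping U+0020 for U+00A0 leaves the token count unchanged.

```python
import re
import unicodedata

def naive_words(text):
    """Hypothetical, simplified tokenizer (NOT UAX #29): runs of
    letters/digits are words; any other run of non-space characters
    is a separate non-blank token."""
    return re.findall(r"\w+|[^\w\s]+", text)

with_space = "δ\u2019 αρχαια"       # U+0020 between the two words
with_nbsp = "δ\u2019\u00A0αρχαια"   # U+00A0 instead, as tried in Notepad

print(naive_words(with_space))  # ['δ', '’', 'αρχαια'] : 3 tokens, not 2
print(naive_words(with_nbsp))   # ['δ', '’', 'αρχαια'] : still 3

# Neither character is a letter as far as the character database goes.
print(unicodedata.category("\u2019"))  # Pf (final quote punctuation)
print(unicodedata.category("\u00A0"))  # Zs (space separator)
```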
A clunky solution would be to have a sequence <delta,
control-joining-words, U+2019>. However, there is no such
thing as a 'control-joining-words' if one complies with the TUS
injunction in Section 23.3, "The word joiner should be ignored in
contexts other than line breaking". A robust, trainable spell-checker
will treat this institutionally racist injunction with the contempt it
deserves.
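
As an illustration only, a tokenizer that chose to honour such a sequence might look like the sketch below. It assumes U+2060 WORD JOINER plays the part of 'control-joining-words', which is exactly the use the TUS text discourages; the pattern and the function name words_honouring_joiner are hypothetical, and the regular expression is again a crude stand-in for real word segmentation.

```python
import re

WJ = "\u2060"    # WORD JOINER, assumed here to act as 'control-joining-words'
APOS = "\u2019"  # RIGHT SINGLE QUOTATION MARK used as an elision apostrophe

# A word is a run of letters/digits, optionally followed by <WJ, U+2019>;
# any other run of non-space characters is a separate token.
PATTERN = re.compile(rf"\w+(?:{WJ}{APOS})?|[^\w\s]+")

def words_honouring_joiner(text):
    """Hypothetical tokenizer that keeps <letter, WJ, U+2019> in one word."""
    return PATTERN.findall(text)

# With the joiner, the elided word keeps its apostrophe: 2 tokens.
print(len(words_honouring_joiner(f"δ{WJ}{APOS} αρχαια")))  # 2

# Without it, the apostrophe is split off as before: 3 tokens.
print(len(words_honouring_joiner(f"δ{APOS} αρχαια")))      # 3
```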
It's interesting that the spellings "'bus" and "'phone" have died.
They would once have hit the word-boundary problems when "bus" and
"phone" were rejected.
Richard.
Received on Sat Jan 26 2019 - 19:15:32 CST