From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Sat Jan 17 2009 - 12:22:07 CST
On 1/17/2009 2:54 AM, Jukka K. Korpela wrote:
>
> I don’t think any software should implement UAX #14 as such, except
> programs specifically designed to test the effects of UAX #14. It is
> absurd, for example, to break the expression “-1” after the
> HYPHEN-MINUS character.
Nobody argues that point. UAX#14 is conceived of as a baseline from
which to customize to get better results.
That doesn't explain why Microsoft chose to disregard the very clear
semantics of HYPHEN. That seems very much like a bug, and unfortunately,
has the effect of making text where HYPHEN-MINUS is replaced by the more
appropriate characters MINUS and HYPHEN work less well than retaining
the undifferentiated HYPHEN-MINUS. That's definitely contrary to the
expectations of those who asked for the encoding of HYPHEN early in the
history of Unicode.
HYPHEN and MINUS (and EN DASH) were introduced to allow authors to
unambiguously encode whether "-123" is a negative number, or something
that can be wrapped to a new line before the 123.
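
[Illustration: a minimal Python sketch of the distinction described above. Python's standard library does not expose the Line_Break property, so the classes below are hardcoded as an assumption, taken from my reading of the Unicode LineBreak.txt data file; the table and the describe() helper are hypothetical names used only for this example.]

```python
import unicodedata

# Line_Break classes for the four characters under discussion.
# Hardcoded assumption (from LineBreak.txt), not queried from a library:
#   HY = hyphen, context-dependent break after
#   BA = break opportunity after
#   PR = numeric prefix, stays attached to the number that follows
LINE_BREAK_CLASS = {
    "\u002D": "HY",  # HYPHEN-MINUS, the undifferentiated ASCII character
    "\u2010": "BA",  # HYPHEN
    "\u2013": "BA",  # EN DASH
    "\u2212": "PR",  # MINUS SIGN
}

def describe(ch):
    """Return code point, character name and assumed Line_Break class."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {LINE_BREAK_CLASS[ch]}"

for ch in LINE_BREAK_CLASS:
    print(describe(ch))
```

[Under these classes, "-123" written with U+2212 cannot be split from its digits, while the same string written with U+2010 offers a break opportunity after the hyphen.]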
It is because of the random and occasionally spotty nature of some of
the early support of Unicode that specifications such as UAX#14 are even
necessary. If everyone got it right, there would be no need for it.
A./
PS: "-1" is very short. A smart line-layout algorithm would not accept
such a line break, even if it is a formal line-break opportunity. That's
true whether the example is "-1" or "-a". Wrapping a single character to
a new line after a hyphen does not look good, but Word (2003) does it
anyway (after HYPHEN-MINUS).
According to UAX#14, a line-layout algorithm gets to decide which line
break opportunities to use - UAX#14 is designed to provide the
candidates, but not the algorithm for selection of the actual line
breaks. That is often confused in discussing UAX#14.
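
[Illustration: a rough Python sketch of that separation between candidates and selection. The candidate finder here is a toy stand-in, not UAX#14: it only offers breaks after a space or after U+2010 HYPHEN. The function names and the min_tail parameter are assumptions made up for this example. The selection step is a simple greedy one that refuses any candidate leaving fewer than min_tail characters at the very end of the text, which is one way to avoid stranding a lone character after a hyphen.]

```python
def break_candidates(text):
    """Toy candidate finder (NOT UAX#14): offsets after a space or U+2010."""
    return [i + 1 for i, ch in enumerate(text[:-1]) if ch in " \u2010"]

def choose_breaks(text, candidates, width, min_tail=2):
    """Greedy selection: on each line take the last candidate that fits,
    ignoring candidates that would leave fewer than min_tail characters
    as the final remainder of the text."""
    lines, start = [], 0
    while len(text) - start > width:
        usable = [c for c in candidates
                  if start < c <= start + width and len(text) - c >= min_tail]
        if not usable:
            break  # no acceptable candidate: let this line overflow
        cut = max(usable)
        lines.append(text[start:cut].rstrip())  # drop the trailing space
        start = cut
    lines.append(text[start:])
    return lines

text = "alpha beta\u2010a"
cands = break_candidates(text)

# The candidate after the hyphen exists, but the selector declines it.
print(choose_breaks(text, cands, width=11))
# With min_tail=1 the same candidates yield the break that looks bad.
print(choose_breaks(text, cands, width=11, min_tail=1))
```

[Both calls see the same list of candidates; only the selection policy differs, which is the part UAX#14 leaves to the line-layout algorithm.]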