From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Sep 05 2005 - 01:04:17 CDT
Doug Ewell wrote:
>Anto'nio Martins-Tuva'lkin <antonio at tuvalkin dot web dot pt> wrote:
>
>>Microsoft Internet Explorer 6 has rendered the sequence U+0021 :
>>EXCLAMATION MARK and U+2026 : HORIZONTAL ELLIPSIS with a soft line
>>break in between. Is this the expected behaviour? At least it doesn't
>>happen that way with U+0021 U+002E U+002E U+002E...
>
>Well, obviously that's not the expected behavior,
>
It is - as far as Unicode is concerned. The line breaking classes of
U+0021 and U+2026 are EX and IN, respectively. Although IN is short for
"inseparable", this means only that characters in this class are
inseparable from each other and, by special rules, from certain other
characters. The line breaking rules in UAX #14 involving IN are LB 16
(preventing a break between AL, ID, IN, or NU and IN) and LB 18 c
(preventing a line break between a Korean syllable block and IN).

A quick check of Table 2 of UAX #14 also shows that between EX and IN
there is "_", a direct break opportunity. (Whether this is a good thing
or not is a different issue.)
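
As a rough illustration, the two IN rules cited above can be sketched in
Python. This is not an implementation of UAX #14: only LB 16 and LB 18 c
are encoded, the function name is made up for this example, and the
Hangul class names (JL, JV, JT, H2, H3) are my reading of what "Korean
syllable block" covers.

```python
# Sketch only: encodes just the two IN-related rules cited in this message.
# All other UAX #14 rules are deliberately left out.

# Line breaking classes of the two characters under discussion.
LINE_BREAK_CLASS = {
    "\u0021": "EX",  # EXCLAMATION MARK
    "\u2026": "IN",  # HORIZONTAL ELLIPSIS
}

# LB 16: no break between AL, ID, IN, or NU and a following IN.
LB16_BEFORE_IN = {"AL", "ID", "IN", "NU"}

# LB 18 c: no break between a Korean syllable block and a following IN.
# Assumption: these are the classes that make up a syllable block.
LB18C_BEFORE_IN = {"JL", "JV", "JT", "H2", "H3"}


def in_rules_forbid_break(first, second):
    """Return True if LB 16 or LB 18 c forbids a break between the two
    characters. Characters missing from LINE_BREAK_CLASS are treated as
    having no class, so neither rule applies to them (hypothetical helper)."""
    before = LINE_BREAK_CLASS.get(first)
    after = LINE_BREAK_CLASS.get(second)
    if after != "IN":
        return False
    return before in LB16_BEFORE_IN or before in LB18C_BEFORE_IN


if __name__ == "__main__":
    # EX followed by IN: neither rule applies, so the break is not forbidden.
    print(in_rules_forbid_break("\u0021", "\u2026"))
    # IN followed by IN: LB 16 applies.
    print(in_rules_forbid_break("\u2026", "\u2026"))
```

For the exclamation mark followed by the ellipsis, neither rule applies,
which is consistent with the "_" entry in the pair table.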

HTML specifications and browsers do not claim conformance to the Unicode
Standard, though. (The so-called document character set of HTML is 
defined in terms of Unicode, but this really means only that character 
references of the form &#decimal; and &#xhexadecimal; are interpreted by 
mapping the numbers to Unicode code points. There is no requirement that 
processing of characters take place by Unicode rules.)
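
To illustrate what that mapping amounts to, here is a minimal Python
sketch. The function name is hypothetical, and the sketch handles only
numeric character references, not named entities such as &amp;amp;. It
says nothing about how a browser then processes the resulting characters.

```python
import re

# Matches &#decimal; and &#xhexadecimal; (the "x" may be upper or lower case).
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_references(text):
    """Replace each numeric character reference with the character at
    that Unicode code point. Raises ValueError for numbers above U+10FFFF,
    because chr() rejects them."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        number = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(number)

    return _NUMERIC_REF.sub(replace, text)


if __name__ == "__main__":
    # U+0021 written as a decimal reference, U+2026 as a hexadecimal one.
    decoded = decode_numeric_references("&#33;&#x2026;")
    print([f"U+{ord(ch):04X}" for ch in decoded])
```

The number in the reference is used as a code point and nothing more; no
line breaking property comes along with it.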

IE and some other browsers have started applying some of the Unicode
line breaking rules, but this is regarded by many as a problem rather
than a useful thing, especially since browsers apply the rules
indiscriminately. Such behavior is described on my page
http://www.cs.tut.fi/~jkorpela/html/nobr.html

>and in fact the
>following page looks just fine to me on Internet Explorer 6.0.2800.1106
>under Windows Me:
>
>http://users.adelphia.net/~dewell/bang-ellipsis.html
>
It exhibits the behavior described. When you make the browser window 
narrow enough, a line break will appear between the exclamation mark and 
the horizontal ellipsis (e.g., in the heading).

The original message from Anto'nio Martins-Tuva'lkin says that it is a
forwarded message, quoting a message posted to the list on October 22, 
2004. I was unable to find such a message in the archives. Does someone 
know what is going on? (There are also other "forwarded messages" posted 
recently.)