Re: How is NBH (U0083) Implemented?

From: Jukka K. Korpela <jkorpela_at_cs.tut.fi>
Date: Fri, 5 Aug 2011 06:50:00 +0300

Doug Ewell wrote:

> Sorry, make that:
>
> "For many years, there was hardly any system that did not implement
> ISO 8859-1 but implemented Unicode, but there were systems that did
> the opposite."

So? It was, and it still often is, better to use ISO 8859-1 rather than
Unicode, in situations where there is no tangible benefit, or just a small
benefit, from using Unicode. For example, many people are still conservative
about encodings in e-mail, for good reasons, so they use ISO 8859-1 or, as
you did in your message, windows-1252.

On the other hand, this isn’t comparable to ZWNBSP vs. WJ. These control
characters do the same job in text, as per the standard, so the practical
question is simply which one is better supported. ISO 8859-1 and Unicode
do very different jobs: by using ISO 8859-1, you limit your character
repertoire (at least as regards directly representable characters, as
opposed to various “escape notations”). If you don’t need anything outside
ISO 8859-1, the choice used to be very simple, though nowadays it has become
a little more complicated (as e.g. Google Groups seems to munge ISO 8859-1
data in quotations but processes UTF-8 properly).
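
To illustrate the repertoire limit, here is a minimal Python sketch. The
function name encode_latin1 is made up for this example, and numeric
character references are just one possible “escape notation”, assumed here
because Python's codec machinery offers it out of the box.

```python
def encode_latin1(text: str) -> bytes:
    """Encode text as ISO 8859-1.

    Characters outside the ISO 8859-1 repertoire cannot be represented
    directly, so they are written as numeric character references
    (one kind of escape notation) instead of raising an error.
    """
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# Directly representable: every character is in ISO 8859-1.
print(encode_latin1("café"))

# EURO SIGN (U+20AC) is not in ISO 8859-1, so it is escaped.
print(encode_latin1("price: 5 €"))
```

Without the errors argument, the second call would raise UnicodeEncodeError,
which is the limit in its plainest form.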

> I'd be interested in seeing a partial list of systems or applications
> that implement U+FEFF as ZWNBSP, with all of the (non-BOM) semantics
> that implies, or existing texts that use U+FEFF for that purpose. I'd
> be surprised if there were many.

I won’t make any statements about full compliance, but in Microsoft Office
Word 2007, U+FEFF alias ZWNBSP does its basic job (inside text) in most
situations, whereas U+2060 alias WJ seems not to be recognized at all and
appears as some sort of a visible box. So to have a job done, there is not
much of a choice. (Word 2007 fails to honor ZWNBSP semantics after EN DASH,
which is bad, but it does not make it useless in other situations.)
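
If text containing WJ has to be handed to software that only recognizes
U+FEFF, a substitution along the following lines might do. This is only a
sketch: the function name wj_to_zwnbsp is invented for the example, and
dropping WJ at the very start of the text is my assumption, made because
U+FEFF in that position would be taken as a byte order mark.

```python
ZWNBSP = "\uFEFF"  # ZERO WIDTH NO-BREAK SPACE (also used as BOM)
WJ = "\u2060"      # WORD JOINER


def wj_to_zwnbsp(text: str) -> str:
    """Replace WORD JOINER with U+FEFF inside text.

    Assumption: any WJ at the very start is dropped rather than
    converted, since a leading U+FEFF would be read as a BOM.
    """
    without_leading = text.lstrip(WJ)
    return without_leading.replace(WJ, ZWNBSP)


# Example: WJ between two characters becomes U+FEFF.
converted = wj_to_zwnbsp("a" + WJ + "b")
print([f"U+{ord(ch):04X}" for ch in converted])
```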

Yucca
Received on Thu Aug 04 2011 - 22:54:11 CDT
