Re: U+2011 and U+2010

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Jun 12 2001 - 22:14:00 EDT


Patrick Andries asked:

> The Unicode Standard 3.0 (page 150) says that "U+2011 NON-BREAKING HYPHEN is
> present for compatibility with existing standards" as if it shouldn't really
> be encoded. But isn't its relation to U+2010 the same as the one between
> SPACE and NO-BREAK SPACE, i.e. a semantic (behavioural) one?

It went in, initially at least, for compatibility with XCCS (Xerox Character
Code Standard):

Unicode                        XCCS

U+002D HYPHEN-MINUS            000/055 Neutral dash
U+00AD SOFT HYPHEN             357/043 Discretionary hyphen
U+2010 HYPHEN                  041/076 Hyphen
U+2011 NON-BREAKING HYPHEN     357/042 Nonbreaking hyphen
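
As a rough illustration of the table above, here is a minimal Python sketch
that turns the XCCS codes into 16-bit values. It assumes the "set/code"
notation is two octal fields, each one byte wide (character set, then
character code). The helper name xccs_to_int and the HYPHENS mapping are
hypothetical, made up for this example.

```python
# Hypothetical helper: interpret an XCCS code written as "set/code",
# ASSUMING both fields are octal and each fits in one byte.
def xccs_to_int(code: str) -> int:
    charset, char = (int(part, 8) for part in code.split("/"))
    return (charset << 8) | char

# Mapping copied from the table above (Unicode code point -> XCCS code).
HYPHENS = {
    0x002D: "000/055",  # HYPHEN-MINUS / Neutral dash
    0x00AD: "357/043",  # SOFT HYPHEN / Discretionary hyphen
    0x2010: "041/076",  # HYPHEN / Hyphen
    0x2011: "357/042",  # NON-BREAKING HYPHEN / Nonbreaking hyphen
}

for cp, xccs in HYPHENS.items():
    print(f"U+{cp:04X} -> XCCS {xccs} = 0x{xccs_to_int(xccs):04X}")
```

Under that assumption, 000/055 comes out as 0x002D, the same value as
U+002D, while the other three hyphens fall in other XCCS character sets.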

"Compatibility" in this sense doesn't necessarily mean "shouldn't have
been encoded".

In fact, in this particular case, if I recall, the distinctions were
probably considered to be good practice, and not something to be mapped
away. XCCS was often a *model* for early Unicode, rather than a character
encoding that forced the grudging inclusion of many icky "characters"
that we would have preferred didn't have to be there.

Keep in mind that U+00A0 NO-BREAK SPACE is *also* a compatibility
character -- for compatibility with ISO 8859-1, among other character
sets.
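
The parallel between the two pairs can also be checked against the Unicode
Character Database. The following is a small sketch using Python's standard
unicodedata module, which ships data from a later Unicode version than 3.0;
it prints the compatibility decomposition recorded for each character.

```python
import unicodedata

# Both characters carry a <noBreak> compatibility decomposition that points
# back to their ordinary, breakable counterpart.
for ch in ("\u00A0", "\u2011"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{unicodedata.decomposition(ch)}")
```

Both lines show a <noBreak> tag: U+00A0 maps to 0020 (SPACE) and U+2011
maps to 2010 (HYPHEN).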

--Ken

>
> Patrick Andries
> Saint-Hubert (Québec)


