RE: Latin ligatures and Unicode

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Dec 22 1999 - 15:21:33 EST


Gregg Reynolds asked:

> > -----Original Message-----
> > From: John Cowan [mailto:jcowan@reutershealth.com]
> > Sent: Wednesday, December 22, 1999 11:44 AM
> >
> > "Reynolds, Gregg" wrote:
> >
> > > But with ZWNBSP, we have no semantics with respect to joining
> > > behavior, or if we do it's well-hidden.
> >
> > ZWNBSP has no effect on joining behavior, correct. You were saying
>
> Not "no effect", but "no semantics". "No semantics" means (to me) that in
> practice implementers get to do as they please. If the standard says, as
> Mark just noted in a message, that they are to be ignored for the purposes
> of join analysis, then I stand corrected; but I haven't been able to find
> anything (admittedly I'm looking at v. 2) that says this. Wouldn't be
> surprised to find it's in there somewhere, but I would like to know where.

The Unicode Standard, Version 3.0, page 314:

"The zero-width spaces are not to be confused with zero-width joiner characters.
U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER have no effect
on word boundaries, and ZERO WIDTH NO-BREAK SPACE and ZERO WIDTH SPACE have
no effect on joining or linking behavior. In other words, the zero-width
joiner characters should be ignored when determining word boundaries; ZERO
WIDTH SPACE should be ignored when determining cursive joining behavior."

This is a clarification of the text currently found in the Unicode
Standard, Version 2.0, page 6-68:

"... These properties are mutually orthogonal: U+200B ZERO WIDTH SPACE does
not affect joining or direction; the joiners neither cause a word-break nor
have a direction; ..."
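
As a rough illustration of the orthogonality described in the two passages
quoted above, here is a minimal Python sketch. It assumes an implementation
that filters out the irrelevant zero-width characters before each kind of
analysis. The function names and the filtering approach are hypothetical,
chosen only for illustration; neither version of the standard prescribes
them.

```python
# Code points named in the quoted passages.
ZWSP = "\u200B"    # ZERO WIDTH SPACE
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER
ZWNBSP = "\uFEFF"  # ZERO WIDTH NO-BREAK SPACE

JOINER_CONTROLS = {ZWNJ, ZWJ}
ZERO_WIDTH_SPACES = {ZWSP, ZWNBSP}


def text_for_word_boundaries(text):
    """Hypothetical helper: drop the joiner controls, which have no
    effect on word boundaries. Zero-width spaces are kept."""
    return "".join(ch for ch in text if ch not in JOINER_CONTROLS)


def text_for_joining_analysis(text):
    """Hypothetical helper: drop the zero-width spaces, which have no
    effect on joining or linking behavior. Joiner controls are kept."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH_SPACES)


if __name__ == "__main__":
    sample = "a" + ZWJ + "b" + ZWSP + "c" + ZWNBSP + "d"
    # Escaped forms make the invisible characters visible when printed.
    print(ascii(text_for_word_boundaries(sample)))
    print(ascii(text_for_joining_analysis(sample)))
```

Each filter leaves the other class of characters in place, so a ZWJ still
reaches the joining analysis and a ZWSP still reaches the word-boundary
analysis.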

--Ken


