From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Nov 19 2003 - 14:02:50 EST
On 19/11/2003 01:49, Pim Blokland wrote:
>In the online 4.0 book, chapter 15
>
>http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf
>
>the definition for Word Joiner says:
>
>>Until Unicode 3.1.1, U+FEFF was the only code point with word
>>joining semantics, but because it is more commonly used as
>>byte order mark, the use of U+2060 [word joiner] to indicate
>>word joining is strongly preferred for any new text.
>
Perhaps this depends on what is meant by "word joining semantics". I would
presume this to imply that a word boundary is not permitted at this
point, but in fact, under the current definitions in UAX #29
(http://www.unicode.org/reports/tr29/tr29-5.html), ZWNBS, WJ and NBSP are
all treated as word boundary characters.
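
For anyone who wants to look at the basic character data for these three code points, here is a minimal sketch using Python's standard unicodedata module. It reports only the name, general category and decomposition mapping from whatever Unicode version the interpreter ships with; as far as I know the module does not expose the UAX #29 word-break property, so it does not settle the word boundary question. The names CHARS and describe are my own, chosen for illustration.

```python
import unicodedata

# The three characters discussed in this thread.
CHARS = {"NBSP": "\u00a0", "ZWNBS": "\ufeff", "WJ": "\u2060"}

def describe(ch):
    """Return (name, general category, decomposition mapping) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),
    )

for label, ch in CHARS.items():
    print(label, "U+%04X" % ord(ch), describe(ch))
```

Note that the decomposition mapping recorded for U+00A0 is a compatibility mapping to U+0020 alone; it does not mention U+FEFF.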
>However, a couple of paragraphs up, the definition for No-Break
>Space says:
>
>>U+00A0 [No-Break Space] behaves like the following coded
>>character sequence: U+FEFF [Zero Width No-Break Space] +
>>U+0020 [Space] + U+FEFF [Zero Width No-Break Space].
>
>Is this something that has slipped by the editors? Or am I missing
>something?
>
>Pim Blokland
>
Does this equivalence hold when combining characters are applied to the
NBSP? Is the sequence <NBSP, CC> (recommended for spacing diacritics,
where CC is any sequence of combining characters) equivalent to <ZWNBS,
SP, ZWNBS, CC>? Or should the equivalence be to <ZWNBS, SP, CC, ZWNBS>?
Is it legal to apply combining characters to ZWNBS or WJ, and if so, how
should the result be rendered?
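
To make the three candidate sequences concrete, here is a small sketch in Python that builds them and runs each through the four normalization forms. U+0301 COMBINING ACUTE ACCENT stands in for CC purely as an example, and the names candidates and normalized_forms are my own. Normalization only tests canonical and compatibility equivalence; it says nothing about the "behaves like" wording in the book, so this is a way to inspect the sequences, not an answer to the question.

```python
import unicodedata

NBSP, ZWNBS, SP = "\u00a0", "\ufeff", "\u0020"
CC = "\u0301"  # COMBINING ACUTE ACCENT, chosen only as an example

# The three sequences asked about above.
candidates = {
    "<NBSP, CC>": NBSP + CC,
    "<ZWNBS, SP, ZWNBS, CC>": ZWNBS + SP + ZWNBS + CC,
    "<ZWNBS, SP, CC, ZWNBS>": ZWNBS + SP + CC + ZWNBS,
}

def normalized_forms(s):
    """Map each normalization form name to the normalized string."""
    return {
        form: unicodedata.normalize(form, s)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }

for label, seq in candidates.items():
    for form, result in normalized_forms(seq).items():
        # Show the result as a list of code points for easy comparison.
        print(label, form, ["U+%04X" % ord(c) for c in result])
```

Under compatibility normalization <NBSP, CC> becomes <SP, CC>, while the two sequences containing ZWNBS are left unchanged, so none of the three is equivalent to another in the normalization sense.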
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/