L2/09-053

Public Review Issue #130 - Word Break Property for ZWSP

The Unicode Technical Committee is considering changing the Word_Break property value of U+200B ZERO WIDTH SPACE (ZWSP) from WB=Format to WB=Other (WB=XX). The effect would be that ZWSP acts as a word separator in the default word break algorithm, which is consistent with its usage in Thai, Lao, and other scripts that do not use spaces to separate words. The entry would change from

200B ; Format # Cf ZERO WIDTH SPACE

to

200B ; Other # Cf ZERO WIDTH SPACE

This proposed change would affect behavior for word breaks (UAX #29: Text Segmentation), but would not affect the behavior of the Unicode Line Break algorithm (UAX #14).
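
As a rough illustration of the effect on word breaks, the following Python sketch models only a toy subset of the UAX #29 word rules: WB4 (Format characters are skipped and stay attached to the preceding character), WB5 (no break between letters), and the final "otherwise, break" rule. The function names word_break_class and segments are hypothetical, the letter test is a stand-in for the real ALetter property, and this is not the full default algorithm.

```python
ZWSP = "\u200b"

def word_break_class(ch, zwsp_class):
    """Toy classification: ZWSP gets the class under test; letters are ALetter."""
    if ch == ZWSP:
        return zwsp_class
    if ch.isalpha():  # assumption: stand-in for the real ALetter property
        return "ALetter"
    return "Other"

def segments(text, zwsp_class):
    """Split text using a simplified subset of the UAX #29 word rules."""
    if not text:
        return []
    pieces = [text[0]]
    # Class of the last non-Format character seen (WB4 skips Format).
    prev = word_break_class(text[0], zwsp_class)
    for ch in text[1:]:
        cls = word_break_class(ch, zwsp_class)
        if cls == "Format":
            pieces[-1] += ch   # WB4: no break before a Format character
            continue
        if prev == "ALetter" and cls == "ALetter":
            pieces[-1] += ch   # WB5: no break between letters
        else:
            pieces.append(ch)  # otherwise: break
        prev = cls
    return pieces

sample = "ab" + ZWSP + "cd"
as_format = segments(sample, "Format")  # ZWSP is invisible to WB5
as_other = segments(sample, "Other")    # ZWSP separates "ab" and "cd"
```

Under these assumptions, treating ZWSP as Format leaves the sample as a single segment, while treating it as Other yields three segments: "ab", the ZWSP itself, and "cd".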

Background: in 2003, the behavior of ZWSP was made more consistent with its general semantics as a Format character, but no corresponding change was made to the Word_Break property that would have allowed it to continue to function as a word separator.

For more information, see UAX #29, Text Segmentation: http://unicode.org/reports/tr29/#Default_Word_Boundaries