From: Javier SOLA (lists@khmeros.info)
Date: Sun Nov 27 2005 - 19:16:24 CST
>>> How appropriate would ZWSP be in the middle of words like 'Myanma(r)'
>>> and 'Yangon'?
>>
ZWSP indicates a breaking opportunity. This would be inappropriate if
the word should not be broken at the end of a line (which is probably
the case), as in
Myan
mar.
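To make the point about breaking opportunities concrete, here is a minimal sketch in Python. The helper wrap_at_zwsp is hypothetical and is only a toy model: it treats ZWSP as the sole place where a line may break, which is far simpler than a real line-breaking algorithm, and it uses the Latin transliteration from the question rather than Myanmar script.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but permits a line break


def wrap_at_zwsp(text, width):
    """Greedy wrap that breaks only at ZWSP (toy model, not a real line breaker)."""
    lines, current = [], ""
    for chunk in text.split(ZWSP):
        # Start a new line when the next chunk would overflow the width.
        if current and len(current) + len(chunk) > width:
            lines.append(current)
            current = chunk
        else:
            current += chunk
    if current:
        lines.append(current)
    return lines


# With a ZWSP inside the word, a narrow column is allowed to split it.
with_zwsp = wrap_at_zwsp("Myan" + ZWSP + "mar", 5)
# Without it, the word stays whole even though it overflows the column.
without_zwsp = wrap_at_zwsp("Myanmar", 5)
```

Under these assumptions the first call splits the word in two and the second keeps it on one line, which is why a ZWSP inside a word that must stay together would be wrong.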
I am not an expert in Myanmar (even if I am trying to make it render in
ICU). I would tend to see ZWNJ and ZWJ as part of a cluster, and not as
word separators. A ZWNJ could be the last character of a cluster... and
this signals that the cluster is finished... but it is not a word
separator. A ZWNJ at the end of the first cluster of a two-cluster word
would not be a separator (if the word should not be divided).
ZWNJ is an element used in the standard order of components; ZWSP could
never be.
I would assume that two different renderings (with and without ZWNJ)
would lead to different IDNs. IDNs are first expanded (character by
character) and then compared byte-by-byte, so two strings would not
match if one of them has an extra character (the ZWNJ). I do not think
that the BIND program used for DNS resolution can do any type of
normalisation... and I agree that - as it is contemplated in the
standard order of components - ZWNJ should be usable in IDNs.
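The byte-by-byte mismatch can be sketched as follows, in Python. This only illustrates the comparison as described above; it is not actual IDN processing, which applies its own mapping steps that this sketch ignores. The two Myanmar letters are arbitrary placeholders, not a real word.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Two arbitrary Myanmar letters (U+1000, U+1019), used only as placeholders.
plain = "\u1000\u1019"
with_zwnj = "\u1000" + ZWNJ + "\u1019"

# Compare the UTF-8 byte sequences, with no normalisation of any kind.
plain_bytes = plain.encode("utf-8")
zwnj_bytes = with_zwnj.encode("utf-8")

labels_match = plain_bytes == zwnj_bytes
extra_bytes = len(zwnj_bytes) - len(plain_bytes)
```

Here labels_match is False and extra_bytes is 3, the UTF-8 length of the ZWNJ, so a resolver that compares raw bytes would see two different names.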
In Khmer this would be more problematic, as the ZWNJ is mostly used to
break font ligatures (such as LETTER UO + VOWEL I in moul style fonts),
but the word is exactly the same.
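For the Khmer case, where the ZWNJ changes only the rendering, one conceivable approach is to drop the ZWNJ before comparing. The sketch below, in Python, assumes a hypothetical helper strip_zwnj; it is not something BIND or any IDN standard is claimed to do, and the Khmer code points (U+1780, U+17B7) are arbitrary placeholders.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def strip_zwnj(label):
    """Hypothetical helper: remove every ZWNJ from a label before comparing."""
    return label.replace(ZWNJ, "")


# Arbitrary Khmer consonant + vowel sign, with and without a ZWNJ between them.
ligated = "\u1780\u17b7"
unligated = "\u1780" + ZWNJ + "\u17b7"

raw_equal = ligated == unligated
stripped_equal = strip_zwnj(ligated) == strip_zwnj(unligated)
```

The raw strings differ, but after stripping they compare equal. That suits Khmer, where the word is the same, and would be wrong for any script in which the ZWNJ carries a real distinction.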
Javier