From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Nov 19 2005 - 06:56:55 CST
From: "Richard Wordingham" <richard.wordingham@ntlworld.com>
> Well, that rules out about half the words in Burmese! I suppose there's
> the work around of replacing the virama - U+1039 U+200C ('VIRAMA' ZWNJ) -
> by U+1039 U+005F ( 'VIRAMA' LOW LINE) - extremely unnatural for a
> language that doesn't have spaces between words.
Is the space separation really a problem for IDN usage, where explicit word
separation is arguably needed anyway, at least to avoid collisions between
name spaces?
After all, the normal space is also forbidden in Latin domain names, so we
use a hyphen: this hyphen does not have the traditional semantics found in
normal language (where it is used for compound words), but it is a syntactic
feature that decomposes labels into lists of non-compound word tokens to be
used in domain names.
What I mean is: does Burmese need ZWNJ in the *middle* of a word, or only
between words to avoid collisions with the next word? If it occurs in the
middle of a word, does it create a sort of compound word which would be
interpreted differently if the word were broken into two tokens separated by
a space? If this does not change the semantics, then even that ZWNJ can be
excluded from IDN: you can use the syntactic ASCII hyphen to separate the two
tokens, instead of using ZWNJ.
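
To make the suggestion concrete, here is a minimal Python sketch of the
substitution. It assumes that replacing ZWNJ with the hyphen really is
semantically harmless, which is exactly the open question above. The helper
names (replace_zwnj, to_ace_label) and the sample string are hypothetical, and
the conversion deliberately skips nameprep and every other IDNA validation
step: it only applies the raw Punycode codec and adds the "xn--" prefix.

```python
ZWNJ = "\u200c"    # U+200C ZERO WIDTH NON-JOINER
VIRAMA = "\u1039"  # U+1039 MYANMAR SIGN VIRAMA
HYPHEN = "-"       # U+002D HYPHEN-MINUS, already allowed in domain labels


def replace_zwnj(label: str) -> str:
    """Replace every ZWNJ in a label with the ASCII hyphen.

    Hypothetical helper: it assumes the hyphen can stand in for ZWNJ
    without changing how the label is read.
    """
    return label.replace(ZWNJ, HYPHEN)


def to_ace_label(label: str) -> str:
    """Encode one label with Python's built-in Punycode codec.

    Simplification: no nameprep, no length or character checks.
    """
    return "xn--" + label.encode("punycode").decode("ascii")


# Made-up sample: KA, VIRAMA, ZWNJ, KA (not claimed to be a real word).
sample = "\u1000" + VIRAMA + ZWNJ + "\u1000"
hyphenated = replace_zwnj(sample)
ace = to_ace_label(hyphenated)
```

After the substitution the label no longer contains U+200C, so nothing in it
depends on how the IDN rules treat that character. The hyphen survives
encoding because it is an ordinary ASCII letter-digit-hyphen character.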
This archive was generated by hypermail 2.1.5 : Sat Nov 19 2005 - 06:59:02 CST