From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Jun 26 2003 - 16:41:02 EDT
Peter replied to Karljürgen:
> Karljürgen Feuerherm wrote on 06/25/2003 08:31:41 PM:
>
> > I was going to suggest something very similar, a ZW-pseudo-consonant of
> > some kind, which would force each vowel to be associated with one consonant.
>
> An invisible *consonant* doesn't make sense because the problem involves
> more than just multiple written vowels on one consonant;
I agree that we don't want to go inventing invisible consonants for
this.
BTW, there's already an invisible vowel (in fact a pair of them)
that is unwanted by the stakeholders of the script it was
originally invented for:
U+17B4 KHMER VOWEL INHERENT AQ
It also has combining class zero (cc=0), so it would serve to block canonical reordering
if placed between two Hebrew vowel points. But I'm sure that if
Peter thought the suggestion of the ZWJ for this was a "groanable
kludge", Biblical Hebraicists would probably not take lightly
to the importation of an invisible Khmer character into their
text representations. ;-)
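
As a quick check of the combining classes involved, here is a minimal
sketch using Python's standard unicodedata module. The choice of HIRIQ
and PATAH as the contrasting Hebrew points is mine, for illustration;
the results depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

# U+17B4 and its companion U+17B5, plus two Hebrew points for contrast.
# The two Hebrew points are an illustrative choice, not from the thread.
CODE_POINTS = (0x17B4, 0x17B5, 0x05B4, 0x05B7)

def combining_class(cp):
    """Return the canonical combining class (cc) of a code point."""
    return unicodedata.combining(chr(cp))

for cp in CODE_POINTS:
    name = unicodedata.name(chr(cp), "<unnamed>")
    print(f"U+{cp:04X} {name} cc={combining_class(cp)}")
```

The Khmer characters should report cc=0, while the Hebrew points report
non-zero classes (14 for HIRIQ, 17 for PATAH), which is what makes them
subject to reordering in the first place.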
> in fact, that is
> a small portion of the general problem. If we want such a character, it
> would notionally be a zero-width-canonical-ordering-inhibitor, and nothing
> more.
The fact is that any of the zero-width format controls has the
side-effect of inhibiting (or rather interrupting) canonical reordering
if inserted in the middle of a target sequence, because of their
own combining class (cc=0).
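
The interruption effect can be observed directly. The following is a
sketch, assuming Python's unicodedata module and an illustrative
sequence of my own choosing (LAMED followed by PATAH and then HIRIQ);
it tries three of the cc=0 format controls in turn.

```python
import unicodedata

LAMED, PATAH, HIRIQ = "\u05DC", "\u05B7", "\u05B4"

# Three cc=0 zero-width format controls to try as separators.
CONTROLS = {"ZWJ": "\u200D", "ZWNJ": "\u200C", "ZWNBSP": "\uFEFF"}

def nfd(text):
    """Apply canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)

# Without a separator, HIRIQ (cc=14) is sorted ahead of PATAH (cc=17).
plain = LAMED + PATAH + HIRIQ
print("reordered without control:", nfd(plain) != plain)

# With a cc=0 character in between, the two points are in separate
# reordering runs, so the written order survives normalization.
survives = {}
for name, control in CONTROLS.items():
    guarded = LAMED + PATAH + control + HIRIQ
    survives[name] = nfd(guarded) == guarded
    print(f"{name}: order preserved = {survives[name]}")
```

Note that this only shows the normalization behavior; it says nothing
about how a renderer would treat the inserted control.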
I'm not particularly campaigning for ZWJ, by the way. ZWNJ or even
U+FEFF ZWNBSP would accomplish the same. I just suggested ZWJ because
it seemed in the ballpark. ZWNBSP would likely have fewer possible
other consequences, since notionally it means just "don't break here",
which you wouldn't do in the middle of a Hebrew combining character
sequence, anyway.
> And I don't particularly want to think about what happens when people start
> sticking this thing into sequences other than Biblical Hebrew ("in
> unicode, any sequence is legal").
But don't forget that these cc=0 zero width format controls already
can be stuck into sequences other than Biblical Hebrew. In some
instances they have defined semantics there (as for Arabic and
Indic scripts), but in all cases they would *already* have the
effect of interrupting canonical reordering of combining character
sequences if inserted there.
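
To illustrate that this is not specific to Hebrew, here is a small
sketch with a Latin base letter. The example sequence (a + COMBINING
ACUTE ACCENT + COMBINING DOT BELOW) is an assumption of mine, chosen
only because the two marks have different non-zero combining classes.

```python
import unicodedata

ACUTE = "\u0301"      # cc=230
DOT_BELOW = "\u0323"  # cc=220
ZWNJ = "\u200C"       # cc=0

def nfd(text):
    """Apply canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)

bare = "a" + ACUTE + DOT_BELOW
with_control = "a" + ACUTE + ZWNJ + DOT_BELOW

# The bare sequence is reordered so the lower class (220) comes first.
print("bare sequence reordered:", nfd(bare) == "a" + DOT_BELOW + ACUTE)

# The cc=0 control splits the marks into separate runs; nothing moves.
print("sequence with ZWNJ unchanged:", nfd(with_control) == with_control)
```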
--Ken