From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Nov 19 2003 - 20:44:00 EST
From: "Peter Kirk" <peterkirk@qaya.org>
> >Of course this is not a standard normalization form, but using this
> >pseudo combining class may help render the last two coded strings (in my
> >quote above) equivalently in renderers.
> >This works even in the case where there are multiple diacritics (noted
> >CC1 and CC2 below):
> > <NBSP,CC1,CC2>
> >is then treated as if it was:
> > <WJ,SP,WJ,CC1,CC2>
> >and the pseudo-normalization then gives:
> > <WJ,SP,CC1,CC2,WJ>
> >or:
> > <WJ,SP,CC2,CC1,WJ>
> >(depending on the canonical reordering of CC1 and CC2, i.e. of their
> >relative combining class)
>
> This trick doesn't work if any of the CC's are in combining class zero.
Of course, but which combining character of combining class 0 needs to
combine with NBSP in a way that affects renderers?
Are you thinking of sequences like <NBSP,CGJ>?
Or of issues when rendering <07A6;THAANA ABAFILI;Mn;0;NSM;;;;;N;;;;;>
after <NBSP>,
which of course would be handled only as <WJ,SP,WJ,THAANA ABAFILI>?
Or of <0901;DEVANAGARI SIGN CANDRABINDU;Mn;0;NSM;;;;;N;;;;;> after
<NBSP>,
rendered as if it were <WJ,SP,WJ,CANDRABINDU>?
Or of <0903;DEVANAGARI SIGN VISARGA;Mc;0;L;;;;;N;;;;;> after <NBSP>,
which this time is an "Mc" character?
Or of all the Indic vowels, which do not seem to really combine with
NBSP but would be rendered as a space followed by a defective isolated form
of the vowel (so without reordering of the vowel glyphs around the space)?
Just curious...
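
For anyone who wants to check the properties of the characters I list above,
here is a small Python sketch. It only assumes the standard unicodedata
module; the helper name describe() is mine, and the values come from whatever
Unicode version your Python ships, so treat it as an illustration rather than
a reference.

```python
import unicodedata

def describe(cp):
    """Return (name, general category, canonical combining class)."""
    ch = chr(cp)
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch))

# The characters discussed above: CGJ, THAANA ABAFILI,
# DEVANAGARI SIGN CANDRABINDU and DEVANAGARI SIGN VISARGA.
for cp in (0x034F, 0x07A6, 0x0901, 0x0903):
    name, category, ccc = describe(cp)
    print(f"U+{cp:04X} {name}: gc={category}, ccc={ccc}")
```

All four should report combining class 0, which is exactly the case where
the reordering trick has nothing to reorder.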
If we just say that <NBSP> behaves in all cases in renderers as if it were
<WJ,SP,WJ>, where WJ is reordered with a pseudo-combining class of 256, it
solves many problems with the interpretation of NBSP, and NBSP then looks
like a space letter. However, NBSP is not a "Lo" character but really a
"Zs" whitespace, and thus justifiable out of the end margin; also, NBSP does
not prohibit word breaks but only line breaks. So it is in fact more like
<LJ,SP,LJ>, where LJ is a line joiner, distinct also from ZWJ
(zero-width joiner), which is used to hint ligatures but does not prohibit
any break.
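
To make the <WJ,SP,WJ> idea concrete, here is a minimal Python sketch of the
pseudo-normalization. It is not a standard normalization form. The class
value 256, the use of U+2060 for WJ, and the function names expand_nbsp() and
pseudo_reorder() are assumptions made for this illustration only. Runs of
characters with a non-zero (pseudo) class are stably sorted by class, as in
canonical reordering.

```python
import unicodedata

NBSP = "\u00A0"
SP = "\u0020"
WJ = "\u2060"        # WORD JOINER, standing in for the WJ of the proposal
PSEUDO_CLASS = 256   # hypothetical: larger than any real combining class

def pseudo_class(ch):
    """Real combining class, except that WJ gets the pseudo class."""
    return PSEUDO_CLASS if ch == WJ else unicodedata.combining(ch)

def expand_nbsp(text):
    """Treat every NBSP as if it were <WJ,SP,WJ>."""
    return text.replace(NBSP, WJ + SP + WJ)

def pseudo_reorder(text):
    """Stable-sort each run of non-zero-class characters by class."""
    out, i = [], 0
    while i < len(text):
        if pseudo_class(text[i]) == 0:
            out.append(text[i])      # starters never move
            i += 1
            continue
        j = i
        while j < len(text) and pseudo_class(text[j]) != 0:
            j += 1
        out.extend(sorted(text[i:j], key=pseudo_class))  # sorted() is stable
        i = j
    return "".join(out)

CC1, CC2 = "\u0301", "\u0323"   # combining acute (230), dot below (220)
result = pseudo_reorder(expand_nbsp(NBSP + CC1 + CC2))
print([f"U+{ord(c):04X}" for c in result])
```

With these two marks the result is <WJ,SP,CC2,CC1,WJ>, since class 220 sorts
before 230. A class 0 mark such as U+0901 stays after the second WJ, which is
the limitation Peter points out.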