From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Mar 10 2003 - 21:47:48 EST
Antonio asked:
> On 2003.02.25, 19:36, Asmus Freytag <asmusf@ix.netcom.com> wrote:
>
> > At 12:55 PM 2/25/03 +0000, Anto'nio Martins-Tuva'lkin wrote:
> >
> > > Most (all?) of them are composable, either by means of letter +
> > > slash (OSLI) or by ZWJ (for things like "Pta" or "Pts", if
> > > anything),
> >
> > Using ZWJ for such things is frowned upon. The ZWJ [is] not a general
> > purpose compositor.
>
> Sorry. I mean an invisible character that would keep those letters
> together, even when the inter-character space is expanded, as if
> they were on the same piece of "lead type". (The same thing I'd use to
> decompose U+0133 into i+THING+j.)
>
> What Unicode character should be used for this, then?
>
> > The ZWJ may be used to request a ligature between two characters,
>
> Isn't this the role of CGJ (combining grapheme joiner)? «Indicates that
> the adjoining characters are to be treated as a graphemic unit.»

While the wording has been confusing, the intent is as follows.

ZWJ/ZWNJ are used for control of cursive connection (as for Arabic),
to affect exact glyph shaping in various Indic scripts,
and to request ligation or non-ligation in various scripts.

Think of them as a non-displaying "joining context" which is
used by a rendering engine (or font) to influence the exact
display of glyphs -- and in particular their visible connection
to one another.
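
To make the "joining context" idea concrete, here is a minimal Python
sketch. The helper names request_ligature and request_no_ligature are
made up for this illustration; only the code points U+200D and U+200C
and the standard unicodedata module are real. Whether a ligature
actually appears is up to the font and rendering engine, so the code
only shows what goes into the text stream.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def request_ligature(a: str, b: str) -> str:
    """Hypothetical helper: ask the renderer to ligate a and b if it can."""
    return a + ZWJ + b


def request_no_ligature(a: str, b: str) -> str:
    """Hypothetical helper: ask the renderer NOT to ligate a and b."""
    return a + ZWNJ + b


joined = request_ligature("f", "i")
split = request_no_ligature("f", "i")

# Both are format characters (General_Category Cf): they carry no glyph
# of their own and only supply context to the rendering process.
for ch in joined:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

Removing the joiner gives back the plain letters "fi"; the request is
a display hint only.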

CGJ (COMBINING GRAPHEME JOINER) is used to connect two (or more)
characters together into a *logical* unit for the purposes
of some processing. It is intended to create exceptional
units (only if required) for processes such as boundary
determination or sorting.

Think of it as a character "gluer" that has no impact on
display, per se.
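
As a rough illustration of the "gluer" idea, the Python sketch below
runs a toy boundary finder over a string. The functions glue and
toy_units are invented for this example and are not an implementation
of any Unicode segmentation algorithm; the sketch simply assumes that
a process which honors CGJ would decline to break on either side of it.

```python
import unicodedata

CGJ = "\u034f"  # COMBINING GRAPHEME JOINER


def glue(*parts: str) -> str:
    """Hypothetical helper: join parts with CGJ to mark one logical unit."""
    return CGJ.join(parts)


def toy_units(text: str) -> list[str]:
    """Toy boundary finder: one unit per character, except that
    characters connected by CGJ stay in the same unit."""
    units: list[str] = []
    glue_next = False
    for ch in text:
        if ch == CGJ and units:
            units[-1] += ch      # attach the CGJ to the preceding unit
            glue_next = True
        elif glue_next and units:
            units[-1] += ch      # attach the character following the CGJ
            glue_next = False
        else:
            units.append(ch)     # ordinary character starts a new unit
    return units


plain = toy_units("ijk")
glued = toy_units(glue("i", "j") + "k")

# CGJ is a nonspacing mark (Mn) with canonical combining class 0.
print(unicodedata.name(CGJ), unicodedata.category(CGJ), unicodedata.combining(CGJ))
print(len(plain), len(glued))
```

Here "ijk" yields three units, while i+CGJ+j followed by k yields two;
nothing about how the text is drawn has changed.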
--Ken