ZW(N)J in Indic scripts (was Re: Re: What constitutes "character" ? New Problem)

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Fri Nov 23 2001 - 09:10:56 EST


John Hudson wrote on Nov 10 2001:
> At 13:19 11/10/2001, Dhrubajyoti Banerjee wrote:
>
> >>Would somebody tell me what a ZWJ control is and how to include it in
> >>documents I create for Unicode-compliant software?
> >>
> >>Please comment on the above and give a possible solution.
> >
> >ZWJ is the Zero Width Joiner and prevents the joining of consecutive
> >characters on output.
>
> I think you mean ZWNJ (Zero Width Non Joiner) if you wish to prevent the
> joining of consecutive characters on output.
>
> Regarding the data sorting issue you raised, I believe the ZWNJ should be
> ignored in all data sorting: it is a formatting control character.

Sorry for coming back on this issue so late, and sorry if the issue has
already been dealt with.

In Unicode Indic scripts, both ZWJ and ZWNJ have special meanings:

Encoding: consonant + virama + ZWJ
Displays: half_consonant

Encoding: consonant1 + virama + ZWJ + consonant2
Displays: half_consonant1 + full_consonant2

Encoding: consonant1 + virama + ZWNJ + consonant2
Displays: full_consonant1 + virama + full_consonant2

So, ZWJ mandates a half form (no ligature allowed), while ZWNJ mandates a
visible virama glyph.
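
To make the three encodings concrete, here is a minimal Python sketch that
builds each sequence and lists its code points. It assumes Devanagari (KA
U+0915, SSA U+0937, virama U+094D) purely as an example script; the helper
names half_form, half_plus_full, explicit_virama and describe are made up
for illustration and are not part of any library.

```python
import unicodedata

# Assumption: Devanagari is used as the example script; other Indic
# scripts have their own virama code point.
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def half_form(consonant):
    """consonant + virama + ZWJ: requests the half form on its own."""
    return consonant + VIRAMA + ZWJ


def half_plus_full(consonant1, consonant2):
    """consonant1 + virama + ZWJ + consonant2: half form, then full form."""
    return consonant1 + VIRAMA + ZWJ + consonant2


def explicit_virama(consonant1, consonant2):
    """consonant1 + virama + ZWNJ + consonant2: visible virama, no conjunct."""
    return consonant1 + VIRAMA + ZWNJ + consonant2


def describe(text):
    """List each code point of text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


KA, SSA = "\u0915", "\u0937"  # example consonants (hypothetical choice)

for label, seq in [
    ("half form", half_form(KA)),
    ("half + full", half_plus_full(KA, SSA)),
    ("explicit virama", explicit_virama(KA, SSA)),
]:
    print(label)
    for line in describe(seq):
        print("   ", line)
```

The script only shows what is stored in the text. Whether the half form or
the visible virama actually appears on screen depends on the font and the
rendering engine in use.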

_ Marco


