Re: Difference between Bidi_Class 'R' and 'AL'

From: Mark Davis ☕ <mark_at_macchiato.com>
Date: Wed, 24 Aug 2011 09:03:54 -0700

The difference between them is subtle (and I've long been convinced that
having the distinction was a mistake, but that's water under the bridge).

The difference lies in their effect on European numbers that occur after
them; see http://www.unicode.org/reports/tr9/#W2 (and following).
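
To make that concrete, here is a rough Python sketch of rule W2 in isolation.
As I read it, W2 searches backward from each European number (EN) for the
nearest strong type, and if that type is AL, the EN becomes an Arabic
number (AN). The helper names resolve_w2 and bidi_classes are made up for
this illustration. The sketch also simplifies: it treats the whole string
as a single run, skips W1 and explicit formatting characters, and assumes
the start of the run is not AL. It is not a full UAX #9 implementation.

```python
import unicodedata


def bidi_classes(text):
    """Return the Bidi_Class of each character, per Python's Unicode data."""
    return [unicodedata.bidirectional(ch) for ch in text]


def resolve_w2(classes):
    """Apply a simplified form of UAX #9 rule W2 to a list of Bidi_Class values.

    Assumption: the whole list is one run, and the start of the run
    counts as a non-AL strong type.
    """
    resolved = list(classes)
    last_strong = None  # nearest preceding strong type seen so far
    for i, cls in enumerate(classes):
        if cls in ("L", "R", "AL"):
            last_strong = cls
        elif cls == "EN" and last_strong == "AL":
            resolved[i] = "AN"  # European number after an Arabic letter
    return resolved


hebrew = "\u05D0 1"  # HEBREW LETTER ALEF, space, digit one
arabic = "\u0627 1"  # ARABIC LETTER ALEF, space, digit one

print(bidi_classes(hebrew), "->", resolve_w2(bidi_classes(hebrew)))
print(bidi_classes(arabic), "->", resolve_w2(bidi_classes(arabic)))
```

After the Hebrew letter (R) the digit stays EN; after the Arabic letter (AL)
it becomes AN, and the later rules treat EN and AN differently.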

Mark
*— Il meglio è l’inimico del bene —*

On Wed, Aug 24, 2011 at 08:35, Doug Ewell <doug_at_ewellic.org> wrote:

> UAX #44, Table 13 ("Bidi_Class Values") includes the following
> descriptions:
>
> R - Right_To_Left - any strong right-to-left (non-Arabic-type) character
> AL - Arabic_Letter - any strong right-to-left (Arabic-type) character
>
> But I can't find any definition, here or elsewhere, of what constitutes
> an "Arabic-type" or a "non-Arabic-type" letter.
>
> Looking in UnicodeData.txt, I see that Arabic, Syriac, and Thaana
> letters are assigned a value of 'AL', while other RTL letters, including
> Hebrew, N'Ko, Samaritan, Mandaic, and some archaic scripts in plane 1
> are 'R'. Clearly, shaping behavior and ligation aren't what make a
> letter or script "Arabic-type" or "non-Arabic-type."
>
> How would I make this distinction for an arbitrary letter or script,
> other than by association with an existing letter or script already
> designated as one or the other? (Yes, I admit, I am thinking about RTL
> scripts in CSUR.)
>
> --
> Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
> www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
Received on Wed Aug 24 2011 - 11:07:42 CDT