Default bidi ranges

From: Martin J. Dürst <duerst_at_it.aoyama.ac.jp>
Date: Wed, 09 Nov 2011 18:18:16 +0900

I tried to find something like a normative description of the default
bidi class of unassigned code points.

UAX #9 says
(http://www.unicode.org/reports/tr9/tr9-23.html#Bidirectional_Character_Types):

Unassigned characters are given strong types in the algorithm. This is
an explicit exception to the general Unicode conformance requirements
with respect to unassigned characters. As characters become assigned in
the future, these bidirectional types may change. For assignments to
character types, see DerivedBidiClass.txt [DerivedBIDI] in the [UCD].

The DerivedBidiClass.txt file, as far as I understand, is mainly a
condensation of the bidi classes into character ranges (rather than
listing them for each code point individually, as UnicodeData.txt
does). That is, it can be derived automatically from UnicodeData.txt at
any time, and as such is not normative.
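
To illustrate what "condensation into ranges" means here, a minimal
sketch in Python. It assumes the semicolon-separated layout of
UnicodeData.txt, with the code point in field 0 and the bidi class in
field 4. The function name condense_bidi_runs is made up for this
example, the sketch ignores the "<..., First>" / "<..., Last>" range
entries, and it is not the procedure actually used to generate
DerivedBidiClass.txt.

```python
def condense_bidi_runs(unicode_data_lines):
    """Collapse per-code-point bidi classes into (first, last, class) runs.

    Hypothetical helper: assumes lines are in ascending code point order
    and skips anything that does not have at least five fields.
    """
    runs = []
    for line in unicode_data_lines:
        fields = line.strip().split(";")
        if len(fields) < 5:
            continue
        cp, bidi_class = int(fields[0], 16), fields[4]
        # Extend the previous run if this code point is adjacent and has
        # the same class; otherwise start a new run.
        if runs and runs[-1][1] == cp - 1 and runs[-1][2] == bidi_class:
            runs[-1] = (runs[-1][0], cp, bidi_class)
        else:
            runs.append((cp, cp, bidi_class))
    return runs


UNICODE_DATA_SAMPLE = [
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
    "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
]

for first, last, bidi_class in condense_bidi_runs(UNICODE_DATA_SAMPLE):
    print(f"{first:04X}..{last:04X} ; {bidi_class}")
```

Unassigned code points never appear in UnicodeData.txt, so a derivation
like this cannot produce their default classes.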

Why, then, are the default class assignments given only in this file
(unless I have overlooked something)? And why are they given only in
comments? I'm trying to write a program that takes all the bidi
assignments (including the default ones) and generates the data part of
a bidi algorithm implementation, but I don't feel confident coding
against data that appears only in comments. Any advice? Could this be
fixed, by making the defaults normative and putting them in a form
that's easier to process automatically?
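
For concreteness, a rough sketch in Python of the kind of program I
mean. It assumes data lines of the form "XXXX..YYYY ; Class # comment"
and default values on comment lines of the form
"# @missing: XXXX..YYYY; Class". The function names, the rule that a
later default overrides an earlier one, and the sample lines are all
made up for illustration and are not copied from the real file.

```python
import re

# Data line, e.g. "0041..005A    ; L # ..." or a single code point.
DATA_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")
# Default given inside a comment (assumed form).
MISSING_RE = re.compile(
    r"^#\s*@missing:\s*([0-9A-F]{4,6})\.\.([0-9A-F]{4,6})\s*;\s*(\w+)"
)


def parse_bidi_classes(lines):
    """Return (explicit, defaults), each a list of (first, last, class)."""
    explicit, defaults = [], []
    for line in lines:
        line = line.strip()
        m = MISSING_RE.match(line)
        if m:
            defaults.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
            continue
        m = DATA_RE.match(line)
        if m:
            first = int(m.group(1), 16)
            last = int(m.group(2), 16) if m.group(2) else first
            explicit.append((first, last, m.group(3)))
    return explicit, defaults


def lookup_bidi_class(cp, explicit, defaults):
    """Explicit assignments win; otherwise fall back to the defaults."""
    for first, last, bidi_class in explicit:
        if first <= cp <= last:
            return bidi_class
    # Assumption: a default listed later overrides one listed earlier.
    for first, last, bidi_class in reversed(defaults):
        if first <= cp <= last:
            return bidi_class
    return None


# Illustrative input only, not an excerpt from DerivedBidiClass.txt.
SAMPLE = """\
# @missing: 0000..10FFFF; Left_To_Right
# @missing: 0590..05FF; Right_To_Left
0041..005A    ; L # LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
05D0..05EA    ; R # HEBREW LETTER ALEF..HEBREW LETTER TAV
""".splitlines()

explicit, defaults = parse_bidi_classes(SAMPLE)
print(lookup_bidi_class(0x05FF, explicit, defaults))
```

Note that in this sketch the defaults come back under long names
(Right_To_Left) while the data lines use short ones (R), so a real
implementation would also need a mapping between the two.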

Regards, Martin.
Received on Wed Nov 09 2011 - 03:25:46 CST
