JTC1/SC2/WG2 N2875
L2/04-352
Saurashtra Linebreaking and Other Properties
Source: Rick McGowan on behalf of UTC
Date: September 15, 2004
This document accompanies ISO/IEC JTC1/SC2/WG2 N2549 (L2/03-098), "Proposal to encode the Saurashtra script in the UCS", by Michael Everson and Jeyakumar Chinnakkonda Krishnamoorty.
The Saurashtra characters have properties similar to Devanagari and the other Indic scripts. These properties for UnicodeData.txt are covered in L2/03-225.
Linebreaking Property Values
The table below shows the values for the Linebreaking property, as they would be defined in the LineBreak.txt file of the UCD.
1x0000;CM # SAURASHTRA SIGN ANUSVARA
1x0001;CM # SAURASHTRA SIGN VISARGA
1x0002;AL # SAURASHTRA LETTER A
1x0003;AL # SAURASHTRA LETTER AA
1x0004;AL # SAURASHTRA LETTER I
1x0005;AL # SAURASHTRA LETTER II
1x0006;AL # SAURASHTRA LETTER U
1x0007;AL # SAURASHTRA LETTER UU
1x0008;AL # SAURASHTRA LETTER VOCALIC R
1x0009;AL # SAURASHTRA LETTER VOCALIC RR
1x000A;AL # SAURASHTRA LETTER VOCALIC L
1x000B;AL # SAURASHTRA LETTER VOCALIC LL
1x000C;AL # SAURASHTRA LETTER E
1x000D;AL # SAURASHTRA LETTER EE
1x000E;AL # SAURASHTRA LETTER AI
1x000F;AL # SAURASHTRA LETTER O
1x0010;AL # SAURASHTRA LETTER OO
1x0011;AL # SAURASHTRA LETTER AU
1x0012;AL # SAURASHTRA LETTER KA
1x0013;AL # SAURASHTRA LETTER KHA
1x0014;AL # SAURASHTRA LETTER GA
1x0015;AL # SAURASHTRA LETTER GHA
1x0016;AL # SAURASHTRA LETTER NGA
1x0017;AL # SAURASHTRA LETTER CA
1x0018;AL # SAURASHTRA LETTER CHA
1x0019;AL # SAURASHTRA LETTER JA
1x001A;AL # SAURASHTRA LETTER JHA
1x001B;AL # SAURASHTRA LETTER NYA
1x001C;AL # SAURASHTRA LETTER TTA
1x001D;AL # SAURASHTRA LETTER TTHA
1x001E;AL # SAURASHTRA LETTER DDA
1x001F;AL # SAURASHTRA LETTER DDHA
1x0020;AL # SAURASHTRA LETTER NNA
1x0021;AL # SAURASHTRA LETTER TA
1x0022;AL # SAURASHTRA LETTER THA
1x0023;AL # SAURASHTRA LETTER DA
1x0024;AL # SAURASHTRA LETTER DHA
1x0025;AL # SAURASHTRA LETTER NA
1x0026;AL # SAURASHTRA LETTER PA
1x0027;AL # SAURASHTRA LETTER PHA
1x0028;AL # SAURASHTRA LETTER BA
1x0029;AL # SAURASHTRA LETTER BHA
1x002A;AL # SAURASHTRA LETTER MA
1x002B;AL # SAURASHTRA LETTER YA
1x002C;AL # SAURASHTRA LETTER RA
1x002D;AL # SAURASHTRA LETTER LA
1x002E;AL # SAURASHTRA LETTER VA
1x002F;AL # SAURASHTRA LETTER SHA
1x0030;AL # SAURASHTRA LETTER SSA
1x0031;AL # SAURASHTRA LETTER SA
1x0032;AL # SAURASHTRA LETTER HA
1x0033;AL # SAURASHTRA LETTER LLA
1x0035;CM # SAURASHTRA VOWEL SIGN AA
1x0036;CM # SAURASHTRA VOWEL SIGN I
1x0037;CM # SAURASHTRA VOWEL SIGN II
1x0038;CM # SAURASHTRA VOWEL SIGN U
1x0039;CM # SAURASHTRA VOWEL SIGN UU
1x003A;CM # SAURASHTRA VOWEL SIGN VOCALIC R
1x003B;CM # SAURASHTRA VOWEL SIGN VOCALIC RR
1x003C;CM # SAURASHTRA VOWEL SIGN VOCALIC L
1x003D;CM # SAURASHTRA VOWEL SIGN VOCALIC LL
1x003F;CM # SAURASHTRA VOWEL SIGN E
1x0040;CM # SAURASHTRA VOWEL SIGN EE
1x0041;CM # SAURASHTRA VOWEL SIGN AI
1x0042;CM # SAURASHTRA VOWEL SIGN O
1x0043;CM # SAURASHTRA VOWEL SIGN OO
1x0044;CM # SAURASHTRA VOWEL SIGN AU
1x0045;CM # SAURASHTRA SIGN VIRAMA
1x0046;NU # SAURASHTRA DIGIT ZERO
1x0047;NU # SAURASHTRA DIGIT ONE
1x0048;NU # SAURASHTRA DIGIT TWO
1x0049;NU # SAURASHTRA DIGIT THREE
1x004A;NU # SAURASHTRA DIGIT FOUR
1x004B;NU # SAURASHTRA DIGIT FIVE
1x004C;NU # SAURASHTRA DIGIT SIX
1x004D;NU # SAURASHTRA DIGIT SEVEN
1x004E;NU # SAURASHTRA DIGIT EIGHT
1x004F;NU # SAURASHTRA DIGIT NINE
If the character proposed for 1x0050 in the Saurashtra proposal (L2/03-098) were to be encoded, it would have the linebreaking property shown below:
1x0050;AL # unidentified Saurashtra letter
Other Property Values (Proplist.txt)
The Script property value for all of the characters in the proposal (1x0000 through 1x004F)
should be "Saurashtra", and that script value should be added to UAX #24.
1x0000..1x0001 should have the Other_Alphabetic property (Proplist.txt), like
U+0903.
1x0045 should have the Diacritic property (Proplist.txt), like U+094D.
No punctuation characters are specified in the proposal.
Derived Property Values
1x0002..1x0033 are all Alphabetic (Letter Other, Lo), and should end up with the
derived property of Grapheme_Base.
1x0035..1x0044 are all CM and should end up with the derived property Grapheme_Extend.
There are 10 decimal digits 1x0046..1x004F, which should be checked to make
sure they end up with the right numeric and/or math properties.
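The digit check suggested above could be sketched roughly as follows. This is an illustrative Python sketch only, not part of the UCD tooling: the helper names parse_entries and check_digits are hypothetical, and the "1x" prefix is assumed to be a placeholder for a block that has not yet been assigned, so only the four hex digits after it are parsed as an offset. The sketch verifies that the NU entries form one contiguous run of ten code points named ZERO through NINE. It does not verify the numeric values in UnicodeData.txt.

```python
import re

# Assumption: entries follow the "code;class # name" layout used in this
# document, and "1x" is a placeholder prefix for an unassigned block.
ENTRY = re.compile(r"^1[xX]([0-9A-Fa-f]{4});([A-Z]{2})\s*#\s*(.+)$")

DIGIT_WORDS = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
               "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]


def parse_entries(text):
    """Return a list of (offset, line_break_class, name) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        m = ENTRY.match(line)
        if m is None:
            raise ValueError(f"unrecognized entry: {line!r}")
        entries.append((int(m.group(1), 16), m.group(2), m.group(3).strip()))
    return entries


def check_digits(entries):
    """True if the NU entries are ten contiguous digits, ZERO through NINE."""
    digits = sorted(e for e in entries if e[1] == "NU")
    if len(digits) != 10:
        return False
    first = digits[0][0]
    return all(
        offset == first + i and name.endswith("DIGIT " + DIGIT_WORDS[i])
        for i, (offset, _cls, name) in enumerate(digits)
    )


# Sample taken from the linebreaking table in this document.
SAMPLE = """\
1x0045;CM # SAURASHTRA SIGN VIRAMA
1x0046;NU # SAURASHTRA DIGIT ZERO
1x0047;NU # SAURASHTRA DIGIT ONE
1x0048;NU # SAURASHTRA DIGIT TWO
1x0049;NU # SAURASHTRA DIGIT THREE
1x004A;NU # SAURASHTRA DIGIT FOUR
1x004B;NU # SAURASHTRA DIGIT FIVE
1x004C;NU # SAURASHTRA DIGIT SIX
1x004D;NU # SAURASHTRA DIGIT SEVEN
1x004E;NU # SAURASHTRA DIGIT EIGHT
1x004F;NU # SAURASHTRA DIGIT NINE
"""

entries = parse_entries(SAMPLE)
print(check_digits(entries))
```

The same parse_entries helper could be pointed at the full table to cross-check the ranges quoted in the prose against the entries actually listed.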