L2/04-352
Saurashtra Linebreaking and Other Properties
Source: Rick McGowan on behalf of UTC
Date: September 15, 2004

This is to accompany document ISO/IEC JTC1/SC2/WG2 N2549 (L2/03-098), Proposal to encode the Saurashtra script in the UCS, by Michael Everson and Jeyakumar Chinnakkonda Krishnamoorty.
The Saurashtra characters have properties similar to Devanagari and the other Indic scripts. These properties for UnicodeData.txt are covered in L2/03-225.
Linebreaking Property Values
The table below shows the values for the Linebreaking property, as they would be defined in the LineBreak.txt file of the UCD.
1x0000;CM # SAURASHTRA SIGN ANUSVARA
1x0001;CM # SAURASHTRA SIGN VISARGA
1x0002;AL # SAURASHTRA LETTER A
1x0003;AL # SAURASHTRA LETTER AA
1x0004;AL # SAURASHTRA LETTER I
1x0005;AL # SAURASHTRA LETTER II
1x0006;AL # SAURASHTRA LETTER U
1x0007;AL # SAURASHTRA LETTER UU
1x0008;AL # SAURASHTRA LETTER VOCALIC R
1x0009;AL # SAURASHTRA LETTER VOCALIC RR
1x000A;AL # SAURASHTRA LETTER VOCALIC L
1x000B;AL # SAURASHTRA LETTER VOCALIC LL
1x000C;AL # SAURASHTRA LETTER E
1x000D;AL # SAURASHTRA LETTER EE
1x000E;AL # SAURASHTRA LETTER AI
1x000F;AL # SAURASHTRA LETTER O
1x0010;AL # SAURASHTRA LETTER OO
1x0011;AL # SAURASHTRA LETTER AU
1x0012;AL # SAURASHTRA LETTER KA
1x0013;AL # SAURASHTRA LETTER KHA
1x0014;AL # SAURASHTRA LETTER GA
1x0015;AL # SAURASHTRA LETTER GHA
1x0016;AL # SAURASHTRA LETTER NGA
1x0017;AL # SAURASHTRA LETTER CA
1x0018;AL # SAURASHTRA LETTER CHA
1x0019;AL # SAURASHTRA LETTER JA
1x001A;AL # SAURASHTRA LETTER JHA
1x001B;AL # SAURASHTRA LETTER NYA
1x001C;AL # SAURASHTRA LETTER TTA
1x001D;AL # SAURASHTRA LETTER TTHA
1x001E;AL # SAURASHTRA LETTER DDA
1x001F;AL # SAURASHTRA LETTER DDHA
1x0020;AL # SAURASHTRA LETTER NNA
1x0021;AL # SAURASHTRA LETTER TA
1x0022;AL # SAURASHTRA LETTER THA
1x0023;AL # SAURASHTRA LETTER DA
1x0024;AL # SAURASHTRA LETTER DHA
1x0025;AL # SAURASHTRA LETTER NA
1x0026;AL # SAURASHTRA LETTER PA
1x0027;AL # SAURASHTRA LETTER PHA
1x0028;AL # SAURASHTRA LETTER BA
1x0029;AL # SAURASHTRA LETTER BHA
1x002A;AL # SAURASHTRA LETTER MA
1x002B;AL # SAURASHTRA LETTER YA
1x002C;AL # SAURASHTRA LETTER RA
1x002D;AL # SAURASHTRA LETTER LA
1x002E;AL # SAURASHTRA LETTER VA
1x002F;AL # SAURASHTRA LETTER SHA
1x0030;AL # SAURASHTRA LETTER SSA
1x0031;AL # SAURASHTRA LETTER SA
1x0032;AL # SAURASHTRA LETTER HA
1x0033;AL # SAURASHTRA LETTER LLA
1x0035;CM # SAURASHTRA VOWEL SIGN AA
1x0036;CM # SAURASHTRA VOWEL SIGN I
1x0037;CM # SAURASHTRA VOWEL SIGN II
1x0038;CM # SAURASHTRA VOWEL SIGN U
1x0039;CM # SAURASHTRA VOWEL SIGN UU
1x003A;CM # SAURASHTRA VOWEL SIGN VOCALIC R
1x003B;CM # SAURASHTRA VOWEL SIGN VOCALIC RR
1x003C;CM # SAURASHTRA VOWEL SIGN VOCALIC L
1x003D;CM # SAURASHTRA VOWEL SIGN VOCALIC LL
1x003F;CM # SAURASHTRA VOWEL SIGN E
1x0040;CM # SAURASHTRA VOWEL SIGN EE
1x0041;CM # SAURASHTRA VOWEL SIGN AI
1x0042;CM # SAURASHTRA VOWEL SIGN O
1x0043;CM # SAURASHTRA VOWEL SIGN OO
1x0044;CM # SAURASHTRA VOWEL SIGN AU
1x0045;CM # SAURASHTRA SIGN VIRAMA
1x0046;NU # SAURASHTRA DIGIT ZERO
1x0047;NU # SAURASHTRA DIGIT ONE
1x0048;NU # SAURASHTRA DIGIT TWO
1x0049;NU # SAURASHTRA DIGIT THREE
1x004A;NU # SAURASHTRA DIGIT FOUR
1x004B;NU # SAURASHTRA DIGIT FIVE
1x004C;NU # SAURASHTRA DIGIT SIX
1x004D;NU # SAURASHTRA DIGIT SEVEN
1x004E;NU # SAURASHTRA DIGIT EIGHT
1x004F;NU # SAURASHTRA DIGIT NINE

If the character shown as proposed for 1x0050 in the Saurashtra proposal L2/03-098 were to be encoded, it would have the linebreaking property shown below:

1x0050;AL # unidentified Saurashtra letter

Other Property Values (PropList.txt)
The Script property value for all of the characters in the proposal (1x0000 through 1x004F) should be "Saurashtra", and that script value should be added to UAX #24.
1x0000..1x0001 should have the Other_Alphabetic property (PropList.txt), like U+0903.
1x0045 (the virama) should have the Diacritic property (PropList.txt), like U+094D.
No punctuation characters are specified in the proposal.
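As an illustration only, the two range-valued assignments above (the Script value and Other_Alphabetic) could be written out as data lines in the general style of the UCD files. The sketch below is hypothetical: the "1x" prefix is this document's placeholder for a block that has not yet been allocated, the function name format_entry is invented for the example, and the spacing of the output is simplified compared with the real Scripts.txt and PropList.txt layouts.

```python
# Offsets are relative to the placeholder block written "1x" in this document.
PREFIX = "1x"  # placeholder only; not a real code point prefix

# (first offset, last offset, value), taken from the statements in the text.
ASSIGNMENTS = [
    (0x0000, 0x004F, "Saurashtra"),        # Script value (Scripts.txt style)
    (0x0000, 0x0001, "Other_Alphabetic"),  # PropList.txt style
]


def format_entry(first, last, value):
    """Return a 'range ; value' line; a single code point has no '..' part."""
    start = f"{PREFIX}{first:04X}"
    if first == last:
        return f"{start} ; {value}"
    return f"{start}..{PREFIX}{last:04X} ; {value}"


lines = [format_entry(*entry) for entry in ASSIGNMENTS]
for line in lines:
    print(line)
```

Once real code points are allocated, only the prefix handling would need to change; the ranges themselves come directly from the text.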
Derived Property Values
1x0002..1x0033 are all Alphabetic (Letter Other, Lo), and should end up with the derived property of Grapheme_Base.
1x0035..1x0044 are all CM and should end up with the derived property Grapheme_Extend.
There are 10 decimal digits 1x0046..1x004F, which should be checked to make sure they end up with the right numeric and/or math properties.
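The expectations in this section can be captured as a small checklist to compare against generated derived data. The following is a minimal sketch under stated assumptions: offsets are relative to the placeholder "1x" block, the function names are hypothetical, and the digit values assume the digits are allocated in ascending order from ZERO to NINE, as the linebreaking table lists them. It records only what this document expects; it does not compute the derived properties from the UCD rules.

```python
# Offsets within the placeholder block ("1x" in this document).
LETTERS = range(0x02, 0x34)  # 1x0002..1x0033, Lo
MARKS = range(0x35, 0x45)    # 1x0035..1x0044, CM in the table
DIGITS = range(0x46, 0x50)   # 1x0046..1x004F, NU in the table


def expected_derived(offset):
    """Derived property this document expects for an offset, or None if unstated."""
    if offset in LETTERS:
        return "Grapheme_Base"
    if offset in MARKS:
        return "Grapheme_Extend"
    return None


def expected_digit_value(offset):
    """Decimal value expected for a digit offset (assumes ZERO..NINE in order)."""
    if offset in DIGITS:
        return offset - DIGITS.start
    return None


for off in (0x02, 0x35, 0x46, 0x4F):
    print(f"1x{off:04X}", expected_derived(off), expected_digit_value(off))
```

Offsets the text does not mention, such as the unused 1x0034, deliberately return None so that gaps are not silently assigned a property.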