L2/11-286

Date: Thu, 21 Jul 2011 14:15:50 -0400
From: Behdad Esfahbod
Subject: Syriac shaping fixes and clarifications

A few months ago I implemented Syriac shaping in HarfBuzz, and in the process I identified a few items that could be improved in the Unicode Syriac Shaping text:

1) The Syriac Shaping section talks about "dalath and rish" without defining exactly which characters are meant. I suggest qualifying this as Joining_Group=Dalath_Rish. This would match what Uniscribe does. In the same vein, I suggest qualifying references to Alaph with Joining_Group=Alaph.

2) Add a note that when applying the Syriac shaping rules (R1, R2, R3), Joining_Type=Transparent characters are skipped. That is, note that rule R1 from Arabic Shaping takes precedence over the Syriac shaping rules.

3) Perhaps rename the Syriac rules R1, R2, R3 to something that does not conflict with the Arabic Shaping rules. Maybe call them R1.1, R1.2, R1.3 to make it clear that they fit right after Arabic R1.

4) The Unicode Syriac Shaping text has rules that depend on "word breaking characters":

    An alaph that has a non-left-joining character to its right, except for a dalath or rish, and a word breaking character to its left will take the form of A_fn.

That can be hard to implement since a) there's no such thing as a "word breaking character" in Unicode, there is only the Unicode Text Segmentation Algorithm, and b) shaping engines typically don't have access to the word boundaries as determined by the Text Segmentation Algorithm. I suggest replacing that with something like this:

    An alaph that has a non-left-joining character to its right, except for a dalath or rish, and a non-joining character or the end of the text to its left will take the form of A_fn.

I.e., rely on the fact that in this context, roughly any "word breaking character" is non-joining and vice versa.

[The full writeup and sample fonts are at: http://behdad.org/syriac/]

behdad
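
As a rough illustration of the replacement wording suggested in item 4, combined with the Transparent skipping from item 2 and the Joining_Group qualification from item 1, the check could be sketched as below. This is not HarfBuzz code. The function names, the tuple representation of characters, the reading of "non-left-joining" as "Joining_Type is not D, L or C", the reading of "non-joining" as Joining_Type=U, and the treatment of the start of the text as non-left-joining are all assumptions made for the sketch.

```python
from typing import List, Optional, Tuple

# Each character is modelled as (Joining_Type, Joining_Group), using the
# short Joining_Type codes U, L, R, D, C, T. Text is in logical order, so
# for right-to-left Syriac "to its right" means the previous character
# and "to its left" means the next one.
Char = Tuple[str, str]

# Assumption: a "left-joining" character is one that can join on its left side.
LEFT_JOINING = {"D", "L", "C"}


def neighbour(chars: List[Char], i: int, step: int) -> Optional[Char]:
    """Nearest non-Transparent character from index i in direction step.

    Returns None when the edge of the text is reached (item 2: skip
    Joining_Type=Transparent characters).
    """
    j = i + step
    while 0 <= j < len(chars):
        if chars[j][0] != "T":
            return chars[j]
        j += step
    return None


def alaph_takes_a_fn(chars: List[Char], i: int) -> bool:
    """True if chars[i] is an alaph that takes the A_fn form under the
    wording proposed in item 4."""
    if chars[i][1] != "Alaph":  # item 1: Joining_Group=Alaph
        return False
    right = neighbour(chars, i, -1)
    left = neighbour(chars, i, +1)
    # Non-left-joining character to the right, except Joining_Group=Dalath_Rish.
    # Assumption: the start of the text counts as non-left-joining.
    right_ok = right is None or (
        right[0] not in LEFT_JOINING and right[1] != "Dalath_Rish"
    )
    # Non-joining character or the end of the text to the left.
    left_ok = left is None or left[0] == "U"
    return right_ok and left_ok


# Waw, a combining mark, alaph, then a space-like non-joining character.
example = [
    ("R", "Syriac_Waw"),
    ("T", "No_Joining_Group"),
    ("R", "Alaph"),
    ("U", "No_Joining_Group"),
]
print(alaph_takes_a_fn(example, 2))  # True
```

Note that the check needs only the Joining_Type and Joining_Group of the neighbouring characters, which a shaping engine already has for Arabic-style joining. No word-boundary information is consulted.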