This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Tue Mar 5 22:17:41 CST 2013
Contact: john@tiro.ca
Name: John Hudson
Report Type: Public Review Issue
Opt Subject: PRI 250 – Malayalam conjuncts
While I think it might be useful to have a mechanism to signal the desire for particular conjunct forms in plain text, using control characters in some manner as suggested, it should be noted that the OpenType 'language system' tag mechanism already makes possible the creation of fonts that support both traditional and reformed Malayalam orthography, without the need of ZWJ or ZWNJ. The OpenType Layout language system tag registry contains two separate Malayalam-related tags (in addition to the {dflt} tag, which different fonts may interpret differently):
Malayalam Traditional: MAL
Malayalam Reformed: MLR
http://www.microsoft.com/typography/otspec/languagetags.htm
This mechanism relies on appropriately built fonts (as does the proposed ZWJ/ZWNJ mechanism, of course), and on software providing for orthographic tagging of text. It should be noted that CSS3 font support specifically enables this kind of tagging and font layout behaviour.
The advantage of this mechanism, of course, is that a single font can be used to display either orthography simply by the user appropriately tagging the text, rather than having to insert control characters in sequences wherever a particular orthographic conjunct form is desired.
That said, I favour the addition of an explicit ligature request/block mechanism using control characters, as this will no doubt be as useful for Malayalam as it has proven for other scripts. I see such a mechanism as a secondary means to override, at the cluster level, the results of higher level mechanisms such as OpenType Layout language system tagging.
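As a rough illustration of the tagging approach described in this comment, the sketch below maps an orthography label on a text run to the language system tag a layout engine might request from the font. The labels "traditional" and "reformed", and the function names, are assumptions made for this example; only the tags MAL, MLR and dflt come from the comment above. OpenType tags are four bytes long, so three-letter tags are padded with a trailing space.

```python
from typing import List, Optional, Tuple

# Language system tags named in the comment, padded to four characters.
LANGSYS_TAGS = {
    "traditional": "MAL ",  # Malayalam Traditional
    "reformed": "MLR ",     # Malayalam Reformed
}


def langsys_tag(orthography: Optional[str] = None) -> str:
    """Return the tag for a run, falling back to 'dflt' when it is untagged."""
    return LANGSYS_TAGS.get(orthography, "dflt")


def tag_runs(runs: List[Tuple[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """Pair each (text, orthography) run with the tag to use when shaping it."""
    return [(text, langsys_tag(orthography)) for text, orthography in runs]
```

The text of each run is left untouched: the orthography choice travels with the run as an attribute, not as control characters in the character sequence.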
Date/Time: Wed Mar 6 00:59:23 CST 2013
Contact: verdy_p@wanadoo.fr
Name: Philippe Verdy
Report Type: Public Review Issue
Opt Subject: PR250: Optional Conjuncts in Malayalam
Would it be possible instead to encode a "modern virama" to facilitate input, i.e. a virama that would always be visible and never part of any conjunct? This would allow the same fonts to be used for both orthographies: the normal virama would default to creating conjuncts, the other would default to separation, and neither virama would need any ZWJ/ZWNJ. It would mean that the sequences:
- <traditional virama, ZWNJ> would be deprecated in favor of <new virama>
- <traditional virama, ZWJ> would be deprecated in favor of just <traditional virama>
- <new virama, ZWNJ> would not be needed and would be treated like <new virama>
- <new virama, ZWJ> would not be needed and would be treated like <traditional virama>
Keyboards built for modern Malayalam would preferably map only the <new virama> to a simple key; the <traditional virama> would be accessible with some modifier key (AltGr?) if needed. Or keyboards could have a working mode that switches between traditional and new orthography, using a state mode key (similar to Caps Lock) that changes the way the typed virama key is mapped. Possibly, typing the <virama> key twice could automatically switch to the other mode (meaning that the virama key would behave like a dead key, and would not return anything to the application before we type another letter, or the virama key a second time to switch input mode).
Note that the same technique used in keyboard mappings could also be used to automatically generate ZWNJ in the modern orthography when you type the <virama> key before another letter. In which case, defining a <new virama> character would not be needed.
In that case the <virama> key would also be a dead key, and the <traditional virama> character would not be output before you press another key:
- if you press the <virama> key a second time, it would switch mode between traditional and modern orthography
- if you press the BACKSPACE key, the dead key is canceled, and nothing is output to the application
- if you press the SPACEBAR key, the alternate virama is emitted without needing to change the virama input mode, i.e. a virama is emitted without ZWNJ when currently working in the modern mode, but with ZWNJ when currently working in traditional virama input mode
- if you press a letter key, the <traditional virama> character would be emitted, followed automatically by ZWNJ only if working in modern orthography mode, before outputting the letter character. In that case, there would be no conjuncts displayed in modern orthography, but traditional conjuncts would be enabled when entering text in the traditional input mode.
- if you press any other function key, that function key is emitted without change, but the current virama dead key state should be canceled (many European keyboard drivers forget to cancel their dead key state; this is a defect in my opinion, but it may still be acceptable not to cancel this state in the Malayalam keyboard driver too, which is why I wrote "should be" and not "must be").
There are certainly other similar variants to adjust input modes to best match the new orthography with the easiest input mode. But in my opinion, requiring users to know when they need to enter ZWNJ and ZWJ is neither friendly nor easy. Many things can be enhanced with keyboard drivers to allow fast input with correct encoding output and the expected rendering, without also having to depend on specific fonts (all Malayalam fonts should work with both orthographies, given the appropriate text input).
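The dead-key behaviour listed above can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name, the key names ("VIRAMA", "BACKSPACE", "SPACE", "F5") and the event-list return value are invented for the example, and it assumes that a second press of the virama key clears the dead-key state and emits nothing, which the comment does not spell out.

```python
from dataclasses import dataclass
from typing import List

VIRAMA = "\u0D4D"  # MALAYALAM SIGN VIRAMA (the existing, "traditional" virama)
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


@dataclass
class ViramaDeadKey:
    """Hypothetical keyboard-driver state for the <virama> dead key."""
    modern: bool = True    # current input mode: modern (True) or traditional
    pending: bool = False  # True once <virama> was pressed and is unresolved

    def press(self, key: str) -> List[str]:
        """Return the events sent to the application for one key press."""
        if not self.pending:
            if key == "VIRAMA":
                self.pending = True  # dead key: emit nothing yet
                return []
            return [key]
        self.pending = False         # every branch below resolves the dead key
        if key == "VIRAMA":          # second press: switch input mode (assumed silent)
            self.modern = not self.modern
            return []
        if key == "BACKSPACE":       # cancel, nothing reaches the application
            return []
        if key == "SPACE":           # emit the alternate form for the current mode
            return [VIRAMA] if self.modern else [VIRAMA, ZWNJ]
        if len(key) == 1:            # a letter: ZWNJ is added only in modern mode
            return [VIRAMA, ZWNJ, key] if self.modern else [VIRAMA, key]
        return [key]                 # other function key passes through unchanged


def type_keys(keyboard: ViramaDeadKey, keys: List[str]) -> List[str]:
    """Feed a sequence of key presses and collect everything emitted."""
    emitted: List[str] = []
    for key in keys:
        emitted.extend(keyboard.press(key))
    return emitted
```

Typing sa, virama, ra in modern mode yields the four code points U+0D38 U+0D4D U+200C U+0D30, while the same keystrokes in traditional mode yield U+0D38 U+0D4D U+0D30, so the user never types ZWNJ directly.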
In both approaches, the general keyboard layout would remain basically the same; only the behavior of the existing <virama> key would change, to work as a dead key. And both encoding options (using ZWNJ automatically, or using a <new virama> character if it were encoded) would be possible. However, the intent of this review is certainly to discuss the practical consequences of the need to use ZWNJ everywhere in the modern orthography. In my opinion, encoding a <new virama> with the same combining class as the existing <traditional virama>, and collating nearly the same as it (without the complications of ZWJ/ZWNJ), would simplify things a lot:
- the traditional virama would now always be used without ZWJ/ZWNJ: it would create conjuncts everywhere it is possible to create them (with just a fallback to a visible virama when using simpler fonts or renderers that can't process the ligatures)
- the new virama would now always be used without ZWJ/ZWNJ: it would never create conjuncts (it should not be used with ZWJ to change this: you should use the traditional virama instead)
- fewer characters to enter/edit: ZWJ/ZWNJ is a Unicode-only artefact, foreign to the Malayalam script itself. Two distinct viramas are coherent with the expected use and modeling of the script as understood by users.
- for most uses in modern Malayalam, a simple keyboard layout would need to map ONLY the <new virama>, producing no conjuncts. Those layouts do not need to map keys for ZWJ/ZWNJ, and applications then do not need to handle sequences containing them.
- Caveat: fonts or renderers would need to be developed to support the <new virama> encoding as an alternative to the <traditional virama, ZWNJ> encoding. But this development would be worth the trouble, as they could then manage and correctly render all texts written in modern as well as traditional orthographies.
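To make the two-virama proposal concrete, here is a minimal sketch of how existing text might be rewritten under it. No <new virama> character exists, so the Private Use code point U+E000 stands in for it purely as a placeholder; the function name and the mapping table are likewise assumptions built from the equivalences stated in this comment.

```python
import re

VIRAMA = "\u0D4D"      # MALAYALAM SIGN VIRAMA (the "traditional" virama)
NEW_VIRAMA = "\uE000"  # placeholder only: the proposed <new virama> is not encoded
ZWJ = "\u200D"         # ZERO WIDTH JOINER
ZWNJ = "\u200C"        # ZERO WIDTH NON-JOINER

# Each virama-plus-control pair collapses to a single virama character.
_EQUIVALENTS = {
    VIRAMA + ZWNJ: NEW_VIRAMA,      # conjunct blocked: always-visible virama
    VIRAMA + ZWJ: VIRAMA,           # conjunct requested: plain traditional virama
    NEW_VIRAMA + ZWNJ: NEW_VIRAMA,  # ZWNJ is redundant after the new virama
    NEW_VIRAMA + ZWJ: VIRAMA,       # use the traditional virama instead
}
_PAIR = re.compile("[" + VIRAMA + NEW_VIRAMA + "][" + ZWJ + ZWNJ + "]")


def to_two_virama_model(text: str) -> str:
    """Rewrite virama + ZWJ/ZWNJ pairs into the hypothetical two-virama encoding."""
    return _PAIR.sub(lambda match: _EQUIVALENTS[match.group(0)], text)
```

After such a rewrite no ZWJ or ZWNJ remains adjacent to a virama, which is the simplification the comment argues for.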
Date/Time: Wed Mar 6 22:30:04 CST 2013
Contact: behdad@google.com
Name: Behdad Esfahbod
Report Type: Public Review Issue
Opt Subject: PRI250 Malayalam conjunct sequences
Case 1 & 2
The idea of a ZWJ before a VIRAMA pulling C2 to conjoin to C1 is prevalent in other Indic scripts, Bengali for example. To have ZWNJ cancel any such effect is logical. In fact, I believe fonts can already be made to work this way with HarfBuzz.
Case 3 & 4
If one reads the Malayalam section of the Unicode Standard very carefully, there are these two lines at the end of the "Special Cases Involving Ra" section: "The sequence <0D7B, 0D31> is rendered as {chillu followed by ra}, regardless of the reading of that text. The sequence <0D7B, 0D4D, 0D31> is rendered as {chillu-ra conjunct}." So, to me it appears that a chillu followed by a virama encourages conjunct formation, whereas a chillu not followed by a virama ends the syllable and hence does not form any conjuncts. I don't see how that is different from what Case 3 & 4 in the proposal try to achieve.
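For readers who want to inspect the two sequences quoted from the standard, the sketch below builds them and lists their code points using Python's standard unicodedata module. The helper names are assumptions for this example, and requests_conjunct only encodes the reading given in this comment (a virama after the chillu asks for the conjunct); it is not a rule taken from the standard itself.

```python
import unicodedata
from typing import List

VIRAMA = "\u0D4D"

# The two sequences quoted from the "Special Cases Involving Ra" section.
CHILLU_THEN_RA = "\u0D7B\u0D31"          # <0D7B, 0D31>
CHILLU_VIRAMA_RA = "\u0D7B\u0D4D\u0D31"  # <0D7B, 0D4D, 0D31>


def describe(sequence: str) -> List[str]:
    """List each code point of a sequence with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


def requests_conjunct(sequence: str) -> bool:
    """Comment's reading: a virama in the sequence encourages conjunct formation."""
    return VIRAMA in sequence


for seq in (CHILLU_THEN_RA, CHILLU_VIRAMA_RA):
    print(describe(seq), "conjunct requested:", requests_conjunct(seq))
```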
Date/Time: Sat Apr 6 04:10:22 CDT 2013
Contact: jfkthame@gmail.com
Name: Jonathan Kew
Report Type: Public Review Issue
Opt Subject: PRI#250: Proposal to Specify Optional Conjuncts in Malayalam
While I can appreciate the desire to be able to access both traditional and reformed renderings... "However, there is a definite need for the ability in a reformed orthography font to display the traditional full conjuncts on demand. As of now there is no mechanism specified in the standard to suggest a full conjunct of a cluster. The reverse of the above scenario is also needed - a traditional orthography font might want to display reformed orthography grapheme clusters optionally." ...it is not clear that this is something that should be represented at the level of plain-text encoding. The difference is a stylistic one that would be better handled by having separate fonts, or by having optional features that may be applied to select different glyph combinations within the font, or by distinct language systems that in turn have distinct collections of features. From a user's point of view, attempting to control this level of rendering via invisible control characters in the text stream would be extremely cumbersome and difficult to use, especially as the effects, if any, of those control characters will be dependent on the particular font being used. Even a careful user is likely to insert the controls only in contexts where the particular font being used during data entry happens to support a conjunct that the user wishes to override (in either direction). But if the data is later viewed with a font that supports a slightly different repertoire of conjuncts, the attempt to enforce "traditional" or "reformed" style will fail, as the exact set of character sequences that need such control may be different. A solution that treats this as a style difference, encoded in the styling and/or language attributes of rich-text data, will be much more workable than a plain-text representation of all the possible stylistic variations. 
With text runs marked appropriately as "malayalam-traditional" or "malayalam-reformed", and font shaping technology (such as OpenType) that is sensitive to this distinction, the desired result will be achieved even across multiple fonts with varying glyph repertoires. I believe it would be a mistake to attempt to encode this stylistic distinction as part of the plain-text Unicode data.
Date/Time: Tue Apr 9 08:21:02 CDT 2013
Contact: naa.ganesan@gmail.com
Name: Naga Ganesan
Report Type: Public Review Issue
Opt Subject: Proposal to Specify Optional Conjuncts in Malayalam
Tamil has one case of a style difference in display like this. Right now, in web page display and inside MS Office products such as MS Word, when ZWNJ is used to mark the split case of the u & uu vowel signs, a dotted circle appears. That dotted circle should not be displayed for text portions in web pages or when printing from MS Word documents. Here are newspaper samples from Chennai, India that use both styles, "Tamil-traditional" and "Tamil-reformed".
February 2, 2004: Look at two paragraphs of the same text in the Viduthalai newspaper. In the top paragraph, the u & uu vowel signs ligate, while at the bottom they do not ligate. http://2.bp.blogspot.com/-LJpjRbdtNbw/UWOHxitxm-I/AAAAAAAACew/5xjch91D96o/s1600/2004.jpg
August 3, 2006: Note that the u & uu vowel signs split, avoiding ligation for u or uu vowel-consonants, in the paragraph at the bottom of the page. http://2.bp.blogspot.com/-UO9MSh3z_Ns/UWOIzaxRODI/AAAAAAAACe8/Uj376-St1sM/s1600/Viduthalai_3-8-2006.jpg
Date/Time: Mon Apr 15 02:19:33 CDT 2013
Contact: pravin.d.s@gmail.com
Name: Pravin Satpute
Report Type: Public Review Issue
Opt Subject: PR250: Proposal to Specify Optional Conjuncts in Malayalam
For this PR 250, I think we again need to revisit the difference between GLYPH and CHARACTER. One character can take a number of shapes/glyphs depending on calligraphic or script requirements, but this should not affect the storage.
Example: in the Devanagari script, the character U+0932 DEVANAGARI LETTER LA ल has a different representation in the Marathi language: ल. This specific need is handled in the OpenType specification with the <locl> feature tag. In the same way, there are a number of examples where a single character takes different forms depending on script, language and calligraphic requirements.
Now, considering the proposed changes to Unicode along the same lines:
U+0D38 U+0D4D U+0D30
സ്ര – Meera fonts
സ്ര – Lohit Malayalam
In the above example, though the representation is different, the syllable is the same. It is pronounced the same way in both cases.
As per the proposal, if we add ZWJ/ZWNJ to request a specific type of representation, it can create the following problems:
1. NLP applications: need to handle both forms, even though they are the same.
2. Backward compatibility: an enormous amount of data has already been created for the Malayalam language; fixing it for the newly introduced storage convention will be problematic.
Ideal solution:
A. Handling in fonts
1. As already mentioned by John Hudson: Malayalam Traditional MAL, Malayalam Reformed MLR http://www.microsoft.com/typography/otspec/languagetags.htm
2. Options for the user to disable and enable particular GSUB lookups in fonts.
How it will help: if the user wants the ligatures mostly used in the traditional script, he can enable that lookup. Otherwise he will get only reformed script output.
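The NLP concern raised in this comment can be shown with a short sketch: once a joiner is stored in the text, two spellings of the same syllable no longer compare equal, and every consumer has to fold them together. The helper name strip_joiners is an assumption for this example, and removing every ZWJ/ZWNJ is a deliberately crude choice that would be too lossy wherever a joiner carries meaning.

```python
ZWJ = "\u200D"   # ZERO WIDTH JOINER
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

# The cluster sa + virama + ra, stored with and without a control character.
PLAIN = "\u0D38\u0D4D\u0D30"
WITH_ZWNJ = "\u0D38\u0D4D\u200C\u0D30"


def strip_joiners(text: str) -> str:
    """Remove ZWJ and ZWNJ so that equivalent clusters compare equal."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


# Without folding, search and comparison treat the two spellings as different.
print(PLAIN == WITH_ZWNJ)
print(strip_joiners(PLAIN) == strip_joiners(WITH_ZWNJ))
```

Handling the distinction in the font, as the comment recommends, avoids this extra step because the stored character sequence stays the same for both orthographies.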