PRI 359 Disposition of feedback
Bob Hallissy,
Lorna Evans
2018-03-23
Date/Time: Mon Sep 25 15:12:22 CDT 2017 Name: Thomas Milo Report Type: Public Review Issue Opt Subject: Proposed Draft UTR #53, Unicode Arabic Mark Ordering Algorithm Now Available for Public Review Please consider taking into
account the established solutions for these sequences as already
implemented in www.mushafmuscat.om, which is now available
world-wide as the authoritative, Azhar-recommended electronic
reference Qur’ān. |
No action taken. Although many of the examples cited in the UTR are from the Quran, AMTRA is not intended purely for Quranic uses but also for practical orthographies that use Arabic script. |
Date/Time: Mon Sep 25 19:54:28 CDT 2017 Name: A./ Report Type: Public Review Issue Opt Subject: PRI 359 1.) Better guidance should be given
when to apply this algorithm. From reading the draft, it is
usefully applied as a standard preparatory step before handing text
off to a rendering engine, or perhaps also as a standard
transformation on input to a rendering engine. This should be
explicitly |
1) To explicitly identify the scope as the rendering pipeline and not storage, the initial Summary section was changed from: "The Unicode Arabic Mark Ordering Algorithm (UAOA) describes an algorithm for determining correct rendering of Arabic combining mark sequences" to the following: "This technical report specifies an algorithm that can be utilized during rendering for determining correct display of Arabic combining mark sequences. This UTR makes no change to Unicode normalization forms, and does not propose a new normalization form. Instead, this is similar to the processing used in https://docs.microsoft.com/en-us/typography/script-development/use: a transient process which is used to reorder text for display in an internal rendering pipeline. This reordering is not intended for modifying original text, nor for open interchange." |
2.) If there are other situations, operations or processes where transforming Arabic text using this algorithm is seen as useful, these should be stated explicitly. |
2) The only section of the UTR that suggests other possible uses of the algorithm was enhanced to be more explicit: 5.6 Other uses for AMTRA |
3.) There are situations and protocols that demand text in a given normalization form. Care should be taken in presenting the new algorithm so that it does not lead users to expect that all Arabic text "ought to be" always in the transformed format. |
3) As mentioned above, the Summary statement was enhanced to state: "This UTR makes no change to Unicode normalization forms, and does not propose a new normalization form." Section 5.6 was also enhanced to state explicitly: "There is no intention or expectation that AMTRA would be applied to stored text." |
4.) The stability note before 3.2
could be improved. The word "existing" will change meaning.
Therefore: |
4) The original text, "For stability reasons, existing Unicode characters will not be added to the list of MCM.", was replaced with: "The set of MCM characters is stable. Characters from Unicode Version 10.0 or earlier will not be added or removed from this set in future updates to this UTR. Characters added after version 10.0 may be added to MCM at the time they are incorporated into the standard but not after." |
5.) In step 2, the
specification does not address keeping multiple instances, e.g.
multiple MCM, in relative order when moved "to the beginning". The
current text could be interpreted as requiring multiple instances
of such character to be inverted in relative order as each is moved
"to the beginning". (The issue theoretically exists for shadda as
it is defined by CCC value, which on the face of it allows the
possibility of multiple distinct shadda code points where again,
internal ordering could be observable). |
5) The original text: b. If a sequence of ccc=230 characters begins with any MCM characters, move those MCM to the beginning of S (before any characters with ccc=33). c. If a sequence of ccc=220 characters begins with any MCM characters, move those MCM to the beginning of S (before any MCM with ccc=230). was replaced with text clarifying that it is the sequence of MCM that is to be moved: b. If a sequence of ccc=230
characters begins with any MCM characters, move the sequence of
such MCM characters to the beginning of S (before any characters
with ccc=33). |
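The clarified wording of step 2b can be illustrated with a minimal Python sketch. It is not the full AMTRA and not the authors' implementation: the function name `move_leading_mcm_run` is hypothetical, and `DEMO_MCM` is a small set assumed here for illustration only (the UTR defines the actual MCM list). The point it shows is that the leading MCM are moved as one block, so their relative order is preserved rather than inverted.

```python
import unicodedata

# Assumed demo set for illustration only; consult the UTR for the real MCM list.
DEMO_MCM = {"\u0654", "\u06DC"}

def move_leading_mcm_run(marks, ccc, mcm):
    """Sketch of the clarified step 2b/2c wording: if the run of marks with
    the given ccc begins with MCM characters, move that leading MCM
    sequence, as one block, to the beginning of the mark sequence."""
    # Find where the run of characters with the requested ccc starts.
    start = next((i for i, ch in enumerate(marks)
                  if unicodedata.combining(ch) == ccc), None)
    if start is None:
        return list(marks)
    # Extend over the MCM characters that open that run.
    end = start
    while (end < len(marks)
           and unicodedata.combining(marks[end]) == ccc
           and marks[end] in mcm):
        end += 1
    block = list(marks[start:end])
    # Moving the block whole keeps the MCM in their original relative order.
    return block + list(marks[:start]) + list(marks[end:])
```

Called with a shadda followed by three ccc=230 marks, of which the first two are in the demo set, the function places those two marks before the shadda in their original order and leaves the third mark where it was.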
Date/Time: Fri Oct 6 05:59:06 CDT 2017 Name: r12a Report Type: Public Review Issue Opt Subject: When should UAOA be used? I'm sending this on behalf of
the W3C i18n WG. It relates to UTR#53.
|
r12a correctly identifies the intended use of the algorithm: transient reordering used during rendering and not a new form of normalization. We hope the changes mentioned above make that more explicit. In contrast to r12a’s statement, the authors are unaware of any fonts that “generally produce the behaviour described” in the draft UTR. |
Btw, the understanding of the intended use of UAOA is not helped by the way the document mentions canonically equivalent character sequences, nor by the vague descriptions of when CGJ should be used. |
The Unicode Standard 10.0, in section 5.13, states: "Canonical equivalence must be taken into account in rendering multiple accents, so that any two canonically equivalent sequences display as the same." A corollary of this is: if the text author wants two sequences to display differently, those sequences must not be canonically equivalent. As further stated in The Unicode Standard 10.0, section 23.2: "[The CGJ] is also used to distinguish sequences that would otherwise be canonically equivalent." In that the intent of this UTR is to provide a mechanism to support rendering, the authors consider it to be within the scope of this UTR to address issues related to canonical equivalence of texts being rendered. |
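The effect of canonical equivalence and of CGJ described above can be checked with Python's standard `unicodedata` module. This is only a sketch: the helper name `canonically_equivalent` is hypothetical, and comparing NFD forms is used here as a simple test of canonical equivalence.

```python
import unicodedata

BEH = "\u0628"     # ARABIC LETTER BEH
FATHA = "\u064E"   # ARABIC FATHA, ccc=30
SHADDA = "\u0651"  # ARABIC SHADDA, ccc=33
CGJ = "\u034F"     # COMBINING GRAPHEME JOINER, ccc=0

def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Same marks in either order: canonically equivalent, so they must render alike.
same = canonically_equivalent(BEH + FATHA + SHADDA, BEH + SHADDA + FATHA)

# CGJ between the marks blocks canonical reordering, so the order is preserved.
distinct = not canonically_equivalent(BEH + SHADDA + CGJ + FATHA,
                                      BEH + FATHA + SHADDA)
```

Both `same` and `distinct` evaluate to `True`: without CGJ the two orders collapse to one normalized form, while the sequence containing CGJ keeps its authored order.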
Date/Time: Fri Oct 6 06:05:21 CDT 2017 Name: r12a Report Type: Public Review Issue Opt Subject: AMOA rather than UAOA ? http://www.unicode.org/reports/tr53/ |
The name of the algorithm was changed to “Arabic Mark Transient Reordering Algorithm”, which has the more pronounceable acronym “AMTRA”.
Date/Time: Tue Oct 10 09:40:48 CDT 2017 Name: David Corbett Report Type: Public Review Issue Opt Subject: PRI #359: U+08D9 ARABIC SMALL LOW NOON WITH KASRA U+08D9 ARABIC SMALL LOW NOON WITH
KASRA has Canonical_Combining_Class=Above |
No action taken. As defined, AMTRA will always order U+08D9 after all Below (ccc=220) marks. While it is technically possible to alter AMTRA such that U+08D9 is treated throughout the algorithm as if it were ccc=220 so that it maintains its position relative to other ccc=220 marks, doing so would not guarantee consistent rendering, since text processes (prior to rendering) are free to reorder U+08D9 relative to any ccc=220 marks, thus resulting in different rendering for canonically equivalent texts. |
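The reordering freedom mentioned above can be observed with Python's `unicodedata` module. The sketch below uses a hypothetical helper, `canonical_order`, and demonstrates the effect with U+0654 ARABIC HAMZA ABOVE (ccc=230) and U+0655 ARABIC HAMZA BELOW (ccc=220) rather than U+08D9 itself, so that it does not depend on how recent the interpreter's Unicode data is. The same behaviour would apply to any ccc=230 mark typed before a ccc=220 mark.

```python
import unicodedata

def canonical_order(text):
    """Return (code point, ccc) pairs for the NFD form of `text`."""
    nfd = unicodedata.normalize("NFD", text)
    return [(f"{ord(ch):04X}", unicodedata.combining(ch)) for ch in nfd]

# Typed as: BEH, then an Above mark (ccc=230), then a Below mark (ccc=220).
typed = "\u0628\u0654\u0655"

# Normalization sorts marks by ccc, so the Below mark ends up first.
ordered = canonical_order(typed)
```

Here `ordered` lists U+0655 before U+0654 even though the text was typed in the opposite order, which is why a renderer cannot rely on the stored position of an Above mark relative to Below marks.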
Date/Time: Fri Oct 13 16:48:21 CDT 2017 Name: Behnam Esfahbod Report Type: Public Review Issue Opt Subject: Feedback on Proposed Draft UTR #53 — Revision 1 Status: Liaison Contribution - W3C
i18n WG |
As mentioned above, Section 5.6 was rewritten. In particular, the rewrite states: "There is no intention or expectation that AMTRA would be applied to stored text." The rewrite also provides a more detailed discussion of why a text editor may want to utilize AMTRA within backspace processing, but the UTR does not require it. |
From the language and examples of the document, it looks like the usage of the algorithm is too focused on one application, Quranic text, and the claims are related only to that specific application of the script. |
While Quranic texts were the easiest examples to find, the algorithm is not specific to them. |
Date/Time: Fri Oct 13 16:59:35 CDT 2017 Name: Behnam Esfahbod Report Type: Public Review Issue Opt Subject: Feedback on Proposed Draft UTR #53 — Revision 1 Status: Individual Contribution |
|
# 1. Scope of the PDUTR |
#1. UTC has recognized that other scripts may need similar special attention. |
# 2. Scope of the algorithm |
#2. As mentioned above, the initial Summary statements were enhanced to clarify intended use of the algorithm. |
# 3. Consequences of the Algorithm:
Normalization |
#3 and #4. In separate correspondence, items #3 and #4 were withdrawn by their author as they stemmed from a misunderstanding of the algorithm. The text that caused that misunderstanding (steps 2b and 2c of AMTRA) has been clarified as noted above. |
# 5. Not enough details in the
examples |
#5. Example 4a was removed as it did not contribute to the document. After renumbering examples 4b and 4c, additional details — similar to those originally provided for examples 2a and 2b — were added to examples 1, 3, 4a and 4b. |
Date/Time: Wed Jan 10 08:29:55 CST 2018 Name: r12a Report Type: Error Report Opt Subject: Use HTML rather than PDF This is a comment from the W3C i18n
WG. |
The next draft is being prepared in HTML-based format rather than PDF. |