This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Thu Jan 5 10:29:43 CST 2017
Name: Alastair Houghton
Report Type: Error Report (UTS #46)
Opt Subject: IdnaTest.txt contains incorrect test cases
The test vectors for UTS #46, which can be found at http://www.unicode.org/Public/idna/9.0.0/IdnaTest.txt, appear to have a few errors. For instance, line 74:
    B; 0à.\u05D0; ; xn--0-sfa.xn--4db # 0à.א
which should fail [B1] because the first character has Bidi property EN, not L, R or AL, and line 93:
    B; àˇ.\u05D0; ; xn--0ca88g.xn--4db # àˇ.א
which should fail [B6] because “ˇ” has Bidi property ON, not L, EN or NSM.
This is quite a common problem in the file. (I've already mentioned this on the Unicode mailing list and was asked by Mark Davis to report it here.)
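The Bidi properties cited in this report can be checked against the Unicode Character Database, for example with Python's standard `unicodedata` module. The following is a minimal sketch, not the UTS #46 reference implementation. The helper names (`fails_b1`, `fails_b6`, `is_bidi_domain`) are hypothetical, and the code covers only the two conditions quoted above, as commonly stated in the Bidi rule of RFC 5893. Results depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def bidi_class(ch: str) -> str:
    """Return the Bidi_Class of a single character, e.g. 'L', 'EN', 'ON'."""
    return unicodedata.bidirectional(ch)


def is_bidi_domain(labels: list[str]) -> bool:
    """Assumption: the Bidi rules apply once any label contains an R, AL or AN character."""
    return any(bidi_class(c) in {"R", "AL", "AN"} for label in labels for c in label)


def fails_b1(label: str) -> bool:
    """B1: the first character must have Bidi property L, R or AL."""
    return bidi_class(label[0]) not in {"L", "R", "AL"}


def fails_b6(label: str) -> bool:
    """B6 (LTR labels only): the label must end in L or EN, optionally followed by NSM."""
    if bidi_class(label[0]) != "L":
        return False  # not an LTR label, so this condition does not apply
    i = len(label) - 1
    while i > 0 and bidi_class(label[i]) == "NSM":
        i -= 1  # skip trailing nonspacing marks
    return bidi_class(label[i]) not in {"L", "EN"}


# Labels from the two test lines quoted in the report.
line_74 = ["0\u00e0", "\u05d0"]        # 0à.א
line_93 = ["\u00e0\u02c7", "\u05d0"]   # àˇ.א

for labels in (line_74, line_93):
    if is_bidi_domain(labels):
        for label in labels:
            print([bidi_class(c) for c in label], fails_b1(label), fails_b6(label))
```

Under these assumptions, the first label of line 74 trips the B1 check and the first label of line 93 trips the B6 check, which matches the report.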
From: Patrik Fältström
Subject: Re: Prep for Unicode 10.0, liaison contact
Date: Wed, 29 Mar 2017 10:54:33 +0200
... I have mechanically checked the 10.0.0 derived attribute values, compared them with the 9.0.0 defined attribute values according to the IDNA2008 algorithm, and have not found any issues.
What I am concerned about, though, is the continued communication that UTS#46 is something that can be used in applications, when in reality that creates confusion regarding what code points can be used in identifiers like domain names. Specifically, normal users do not understand the various flags that one must define (to give the same and predictable result), and UTS#46 does not only recommend a certain mapping step (which IDNA2003 includes, but IDNA2008 does not). Finally, according to my reading, UTS#46 and UAX#31 have different sets of allowed characters, which creates further confusion, for example when one looks at what normal people believe are "emojis".
I would like to encourage the Unicode Consortium to be clearer about its intentions for the future recommended use of UTS#46 and UAX#31 in the context of the IDNA2008 algorithm.
Patrik Fältström
IETF Liaison to Unicode Consortium
From: Mark Davis
Date: Thu, 6 Apr 2017 17:09:01 +0200
Subject: Re: Prep for Unicode 10.0, liaison contact
I agree with the suggestion to clarify the meaning and "default" values of the different flags used in http://www.unicode.org/reports/tr46/proposed.html#ToASCII and http://www.unicode.org/reports/tr46/proposed.html#ToUnicode.
As to UTS#46 and UAX#31, it was never a goal to make them align and they never have aligned. The primary goal for UAX#31 is to extend identifiers such as used in programming languages to Unicode (and UAX#31 defines several different kinds of identifiers). The primary goal for UTS#46 is to provide a solution for implementations that want to maintain backwards compatibility with IDNA2003, while extending the repertoire to modern Unicode versions based on the IDNA2003 principles.
Of course, any implementation can always apply additional filters on top of UTS#46, including restricting to UAX#31 default identifiers, restricting to the IDNA2008 repertoire, applying tests such as in UTS#39 for mixed scripts, or applying ICANN rules.
For IDNA2008, the data files in fact provide information about what IDNA2008 would allow, and also reference certain conditions in IDNA2008, such as ContextJ. (UTS#46 does project forward to the current Unicode release, based on the IDNA2008 principles, since the version of Unicode supported by IDNA2008 is old.)
Mark