The structure of the feedback below includes a citation of text from the
document, suggested replacement text or other changes to remedy the
problem, and a rationale for the change.
2.1.1.3.2 String Requirements
The label must be a valid internationalized domain name, as specified
in the technical standard Internationalizing Domain Names in
Applications (RFC 3490). This includes the following nonexhaustive
list of limitations:
=>
The label must be a valid internationalized
domain name, as specified in the latest version of the IDNA
specifications (see XXX). This includes, but is not limited to, the
following constraints. Note that these are in no way a complete
statement of the requirements of the IDNA specifications.
Rationale. Clearer wording, and you *really* don't
want the reader to think that what is listed here is in any way
complete.
- Must consist entirely of characters with the same directional
property.
[DELETE]
Rationale. This is completely
false. It would disallow many IDNs that are needed, and allowed by
idna-bis-bidi. Note: it is questionable how much of IDNA2008 this
text should repeat, especially in the case of complex provisions
like BIDI. Moreover, "directional property" is undefined.
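To illustrate why the cited requirement is too strict, here is a minimal Python sketch. It assumes that "directional property" is meant to be the Unicode Bidi_Class property, which the cited text does not say. It uses the standard unicodedata module; the helper name bidi_classes and the sample label are ours, chosen only for illustration.

```python
import unicodedata


def bidi_classes(label):
    """Return the set of Unicode Bidi_Class values found in a label."""
    return {unicodedata.bidirectional(ch) for ch in label}


# Hypothetical label: two Arabic letters, a hyphen, and a European digit.
sample = "\u0645\u0635-1"
print(sorted(bidi_classes(sample)))
```

Under that assumption, even this short label mixes three classes (AL for the Arabic letters, ES for the hyphen, EN for the digit), so a rule requiring a single directional property would reject it.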
All code points in a single label must be taken from the same script
as determined by the Unicode Standard Annex #24: Unicode Script
Property.
=>
Labels are subject to a constraint based on the script value of
their characters. All characters in the label that do not have the
Common script value or the Inherited script value must share a
single script value. Script values are determined as specified in
the Unicode Standard: see Unicode Standard Annex #24: Unicode
Script Property.
Rationale. The constraint
to single scripts is far too narrow. The script values Common and
Inherited are given to characters that are used with multiple
scripts, such as "-" or "2", or Arabic vowels. Forcing such obvious
characters to go through the exception process is needless overhead,
and obscures the exceptional cases.
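The proposed constraint can be sketched in Python as follows. This is only an illustration: the Python standard library does not expose the Script property, so the table below is a small hand-picked subset of script values (an assumption made for this example), and a real check would load the full data referenced by Unicode Standard Annex #24. The names SCRIPT_RANGES, script_of, and satisfies_script_constraint are ours.

```python
# Illustrative subset only; not the complete Unicode script data.
SCRIPT_RANGES = [
    (0x002D, 0x002D, "Common"),     # HYPHEN-MINUS
    (0x0030, 0x0039, "Common"),     # digits 0-9
    (0x0041, 0x005A, "Latin"),      # A-Z
    (0x0061, 0x007A, "Latin"),      # a-z
    (0x00DF, 0x00F6, "Latin"),      # sharp s through o with diaeresis
    (0x0300, 0x0341, "Inherited"),  # some combining marks
    (0x03B1, 0x03C9, "Greek"),      # lowercase alpha through omega
    (0x0430, 0x044F, "Cyrillic"),   # lowercase a through ya
]


def script_of(ch):
    """Look up the script value of one character in the sample table."""
    cp = ord(ch)
    for lo, hi, script in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return script
    raise ValueError(f"U+{cp:04X} is not covered by the sample table")


def satisfies_script_constraint(label):
    """True if all characters other than Common/Inherited share one script."""
    scripts = {script_of(ch) for ch in label} - {"Common", "Inherited"}
    return len(scripts) <= 1


print(satisfies_script_constraint("abc-2"))    # Latin plus Common
print(satisfies_script_constraint("a\u03b1"))  # Latin plus Greek
```

With this formulation, a hyphen or digit never forces a label into the exception process, while a label mixing Latin and Greek letters is still caught.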
2.1.1.4.1 Requirements for Strings Intended to Represent Geographical Entities
This includes a representation of the country or territory name in
any of the six official United Nations languages (French, Spanish,
Chinese, Arabic, Russian and English) and the country or territory’s
local language.
=>
This includes a representation of the country or territory name
in any of the six official United Nations languages (French,
Spanish, Chinese, Arabic, Russian and English) and any of the
country or territory’s local languages.
Rationale. It is quite common for a country or territory
to have more than one language, so that needs to be accounted for.
Applications for any string that represents a subnational place
name, such as a county, province, or state, listed in the ISO 3166-2
standard.
=>
Applications for any string that represents a subnational place
name, such as a county, province, or state. These could be, for
example, as listed in the ISO 3166-2 standard.
Rationale. The ISO 3166-2 standard is not complete, and
is not freely available. The comma after "state" may also lead the
reader to conclude that being listed in the standard is required,
that is, that the sentence is to be read as:
"Applications for any string that represents a subnational place
name (such as a county, province, or state) listed in the ISO 3166-2
standard."
Applications for a city name, where the applicant clearly intends to
use the gTLD to leverage from the city name.
Issue.
City names are *very* ambiguous - look at the number of "Paris"
cities that exist. If Paris, Texas gets there first, what happens?
Should there be some qualification necessary to disambiguate city
names instead?
1.3 Information for Internationalized Domain Name Applicants
If an applicant applies for such a string, it must provide
accompanying information indicating compliance with the IDNA
protocol and other requirements. The IDNA protocol is currently
under revision and its documentation can be found at
http://www.icann.org/en/topics/idn/rfcs.htm.
[ADD AFTERWARDS]
This document presumes that the IDNA protocol has been revised
in accordance with the description at
http://www.icann.org/en/topics/idn/rfcs.htm, and makes use of
terminology defined in the draft revisions. That revision may change
before approval, and such changes could require corresponding
modifications of the following text.
Rationale.
It must be made clear to the reader that while we expect the
revision to succeed, the text following this in the document is
subject to change.
2. Language of label (ISO 639-1). The applicant will specify the
language of the applied-for TLD string, both
Module 1
Introduction to the gTLD Application Process Draft – For Discussion
Only
1-17 according to the ISO’s codes for the representation of
names of languages, and in English.
=>
Language tag of label (according to IETF BCP 47 Tags for
Identifying Languages). The applicant will specify the language tag
of the applied-for TLD string, both according to the IETF BCP 47
Tags for Identifying Languages, and in English.
Rationale: ISO 639-1 only covers a small
fraction of the world's languages. The correct reference, used in
HTML, XML, and all modern software, is BCP 47.
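As a rough illustration of what an applicant would supply, the following Python sketch checks whether a string has the shape of a simple BCP 47 language tag. It is deliberately simplified and rests on assumptions: it accepts only a primary language subtag with optional script and region subtags, it ignores the other subtag kinds that BCP 47 defines (extended language, variant, extension, private use), and it does not check subtags against the IANA registry. The function name looks_like_language_tag is ours.

```python
import re

# Simplified shape: language[-script][-region]. Tags are case-insensitive.
SIMPLE_TAG = re.compile(
    r"[a-z]{2,3}"             # primary language subtag, e.g. "en", "haw"
    r"(-[a-z]{4})?"           # optional script subtag, e.g. "Hant"
    r"(-([a-z]{2}|[0-9]{3}))?",  # optional region subtag, e.g. "RS", "419"
    re.IGNORECASE,
)


def looks_like_language_tag(tag):
    """True if tag matches the simplified language[-script][-region] shape."""
    return SIMPLE_TAG.fullmatch(tag) is not None


for candidate in ["en", "haw", "zh-Hant", "sr-Cyrl-RS", "es-419", "english"]:
    print(candidate, looks_like_language_tag(candidate))
```

The tag "haw" (Hawaiian) is an example of a language that has no two-letter ISO 639-1 code and so could not be specified under the original wording.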
3. Script of label (ISO 15924). The applicant will specify the
script of the applied-for gTLD string, both according to the ISO
code for the presentation of names of scripts, and in English.
=>
Main script of label (see 2.1.1.3.2 String Requirements). The
applicant will specify the scripts of the applied-for gTLD string,
both according to the Unicode Script property, and in English.
Rationale.
This brings the text in line with the use of script in 2.1.1.3.2
String Requirements. It also prevents bogus information such as
script variants (Latin Fraktur), which are not properties of
characters. The term "scripts" takes account of the fact that some
cases of multiple scripts are allowed. (Note that this information
is completely derivable from the U-Label.)
4. Unicode code points. The applicant will list all the code points
contained in the U-label according to its Unicode form.
=>
4. Unicode code points. The applicant will list
all the code points contained in the U-label using the
U+ notation. For example, for the label "öbb", the list would
be: "U+00F6 U+0062 U+0062".
Rationale. This makes the intent
clear.
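The requested list can be produced mechanically. The following Python sketch shows one way to do so; the function name code_point_list is ours, and the sketch assumes the U-label is passed in exactly as it would be registered (for instance, with "ö" as the single code point U+00F6 rather than "o" followed by a combining diaeresis).

```python
def code_point_list(u_label):
    """Format each code point of a U-label in U+ notation, space-separated."""
    # At least four hexadecimal digits, uppercase, as is conventional.
    return " ".join(f"U+{ord(ch):04X}" for ch in u_label)


print(code_point_list("\u00f6bb"))  # U+00F6 U+0062 U+0062
```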
5. Representation of label in phonetic alphabet. The
applicant will provide its applied-for gTLD string notated
according to the International Phonetic Alphabet
(http://www.arts.gla.ac.uk/IPA/ipachart.html ).
[DELETE]
Rationale. First, it is
questionable what the purpose of this is -- how is it to be
used? How would it make a difference in the registration
what the IPA was? Secondly, the same word could have many
different IPA readings, narrow vs broad, or vary greatly by
speaker (the same word spoken by a Scot vs a Chicagoan).
Third, very few registrants will be able to supply correct
IPA representations.
6. Its IDN table. This table provides the list of characters
eligible for registration in domain names according to registry
policy. It will contain any multiple characters that can be
considered “the same” for the purposes of registrations at the
second level. For examples, see
http://iana.org/domains/idn-tables/.
Question: we think this means a reference to a table
rather than a complete copy. If so, what format should such
a reference take? Is a link sufficient? It should be clear
exactly what a registrant needs to supply.
7. Applicants must further demonstrate that they have made
reasonable efforts to ensure that the encoded IDN string does not
cause any rendering or operational problems. For example, problems
have been identified in strings with characters of mixed
right-to-left and left-to-right directionality when numerals are
adjacent to the path separator. If an applicant were applying for a
string with known issues, it should document steps that will be
taken to mitigate these issues in applications.
Question. It
sounds like this is asking the applicant to change all the
program applications that use the domain name, which is
clearly impossible. What would be an example of "reasonable
efforts"?
2.1.1.1 String Confusion Review
...
The similarity review will be conducted by a panel of String
Similarity Examiners. This examination will be informed by an
algorithmic score for the visual similarity between each
applied-for string and each of other existing and applied-for TLDs.
The score will provide one objective measure for consideration by
the panel.
...
The algorithm uses proprietary software to perform a
series of mathematical calculations to assess the visual
similarity between strings based upon the following
parameters:
...
Issue. It is
inappropriate for ICANN to use an algorithm which is not
public, and not based on public data.
If the evaluators determine that a string poses stability issues
that require further investigation, the applicant must either
confirm that it intends to move forward with the application
process or withdraw its application.
Issue.
What is an example of "stability issues" in a string? Should
this be "technical issue"? How is an applicant supposed to
know what "stability issue" means? All terms need a
definition, either before usage or in a glossary.
Currently, a definition of the stability of a "registry
service" appears later, at the end of 2.1.3, but there is no
definition or indication of what "stability issues" are for a string.