Accumulated Feedback on PRI #196
This page is a compilation of formal public feedback received so far. See
Feedback for further information on this issue, how to discuss it, and how
to provide feedback.
Source: Mark Davis
Date: 2011/10/24
-
One of the links in UAX #38 is broken:
http://www.coffeesigns.com/Resources/romanization/korean.asp
-
The regex in the entry "kXerox; N/A; ^[0-9]{3}:[0-9]{3}" is inconsistent
with the others: it is missing the final $.
-
However, I’d recommend that all of the regex patterns in UAX #38 remove
the leading ^ and trailing $.
-
They are superfluous given that you need to match against
the whole string, and just make the expressions even less
readable.
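The point about anchors can be illustrated with a minimal Python sketch. It assumes the validator matches each field value against the whole string (here with re.fullmatch); the helper name is_valid and the sample values are hypothetical and not taken from UAX #38.

```python
import re

# The kXerox pattern as quoted above (no trailing $), plus two variants.
START_ONLY = r"^[0-9]{3}:[0-9]{3}"
ANCHORED = r"^[0-9]{3}:[0-9]{3}$"
BARE = r"[0-9]{3}:[0-9]{3}"


def is_valid(pattern, value):
    """Whole-string match: the anchors make no difference here."""
    return re.fullmatch(pattern, value) is not None


# With prefix matching, the missing $ lets trailing characters through.
prefix_hit = re.match(START_ONLY, "123:4567") is not None

# With whole-string matching, all three patterns agree on every sample.
samples = ["123:456", "123:4567", "x123:456", "12:345"]
agree = all(
    is_valid(START_ONLY, s) == is_valid(ANCHORED, s) == is_valid(BARE, s)
    for s in samples
)
```

Under whole-string matching the bare pattern is the shortest and arguably the easiest to read, which is the case made above for dropping the anchors.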
-
There needs to be a clear statement of whether the ordering of
multivalued properties is significant, and if so, what that
significance is. “Arbitrary” means you could store the values in
a hashset and you wouldn’t lose any information.
-
There should be a statement at the top of UAX #38 that the
ordering is arbitrary unless documented, and then clearly
document those cases where it is not arbitrary:
-
kMandarin, kTotalStrokes, kHanyuPinyin.
-
kCantonese has a documented ordering, but it is alphabetical,
which is effectively arbitrary.
-
kHanyuPinlu also has a documented ordering, but because the number
on each item gives the frequency, the ordering adds no information
and is also effectively arbitrary.
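The distinction between significant and arbitrary ordering can be sketched in Python as follows. This is only an illustration under stated assumptions: values are assumed to be space-separated, the set ORDER_SIGNIFICANT simply copies the properties named in this feedback, the function names are hypothetical, and the sample values are made up for the example rather than quoted from the Unihan database.

```python
import re

# Properties this feedback proposes as having significant ordering.
ORDER_SIGNIFICANT = {"kMandarin", "kTotalStrokes", "kHanyuPinyin"}

# Assumed shape of a kHanyuPinlu item: reading followed by "(frequency)".
PINLU_ITEM = re.compile(r"(.+)\(([0-9]+)\)")


def same_value(prop, a, b):
    """Compare two field values, ignoring order when it is arbitrary."""
    va, vb = a.split(), b.split()
    if prop in ORDER_SIGNIFICANT:
        return va == vb          # order carries information
    return set(va) == set(vb)    # a hash set loses nothing


def by_frequency(items):
    """Rebuild a frequency ordering from the numbers on each item."""
    return sorted(items, key=lambda v: -int(PINLU_ITEM.fullmatch(v).group(2)))
```

For example, same_value("kCantonese", "jat1 jat6", "jat6 jat1") is true under this sketch, while the same swap on a kMandarin value is treated as a difference. by_frequency shows why the kHanyuPinlu ordering is recoverable from the data itself.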
-
Unihan (these might be longer term,...)
-
Should remove kCompatibilityVariant. It is just a subset of
the data in UnicodeData.txt, so it is just an error waiting
to happen.
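As a rough illustration of the redundancy, the same mapping can be derived from the decomposition data that originates in UnicodeData.txt. The sketch below uses Python's standard unicodedata module; the function name compatibility_variant is hypothetical, and restricting the result to characters named "CJK COMPATIBILITY IDEOGRAPH-..." with a single-code-point canonical decomposition is an assumption about what the field covers.

```python
import unicodedata


def compatibility_variant(ch):
    """Return 'U+XXXX' for the unified ideograph a compatibility
    ideograph decomposes to, or None if there is no such mapping."""
    name = unicodedata.name(ch, "")
    if not name.startswith("CJK COMPATIBILITY IDEOGRAPH"):
        return None
    decomp = unicodedata.decomposition(ch)
    # Skip empty decompositions and tagged (compatibility) ones.
    if not decomp or decomp.startswith("<"):
        return None
    parts = decomp.split()
    if len(parts) != 1:
        return None
    return f"U+{int(parts[0], 16):04X}"


variant_of_f900 = compatibility_variant("\uF900")
```

Any consumer that already loads UnicodeData.txt (or a library built from it) can compute the value this way, so a separate Unihan field can only ever agree with it or be wrong.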
-
Should change kHanyuPinlu to accented pinyin instead of
numeric, like the other pinyin fields.
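The conversion from numeric to accented pinyin is mechanical, as the following Python sketch suggests. It assumes the usual tone-mark placement rule (a or e takes the mark, then the o of "ou", otherwise the last vowel), treats tone 5 as unmarked, and does not handle vowelless syllables such as "ng". The names to_accented and convert_pinlu_item are hypothetical, and the item format "reading(frequency)" is an assumption about the field.

```python
import re
import unicodedata

# Combining marks for tones 1-4; tone 5 (neutral) is left unmarked.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030C", "4": "\u0300", "5": ""}
VOWELS = "aeiou\u00fc"


def to_accented(syllable):
    """Convert one numeric-tone syllable, e.g. 'shang4', to accented form."""
    syllable = unicodedata.normalize("NFC", syllable)
    m = re.fullmatch(r"([a-z\u00fc]+)([1-5])", syllable)
    if m is None:
        raise ValueError(f"not a numeric pinyin syllable: {syllable!r}")
    letters, tone = m.groups()
    if "a" in letters:
        idx = letters.index("a")
    elif "e" in letters:
        idx = letters.index("e")
    elif "ou" in letters:
        idx = letters.index("o")
    else:
        positions = [i for i, c in enumerate(letters) if c in VOWELS]
        if not positions:
            raise ValueError(f"no vowel to carry the tone: {syllable!r}")
        idx = positions[-1]
    marked = letters[: idx + 1] + TONE_MARKS[tone] + letters[idx + 1:]
    return unicodedata.normalize("NFC", marked)


def convert_pinlu_item(item):
    """Convert 'reading(frequency)' while keeping the frequency as is."""
    m = re.fullmatch(r"(.+)\(([0-9]+)\)", item)
    if m is None:
        raise ValueError(f"unexpected item: {item!r}")
    return f"{to_accented(m.group(1))}({m.group(2)})"
```

The frequency in parentheses is untouched, so the change would affect only the spelling of the reading.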