L2/13-134

Comments on Public Review Issues
(May 1 - July 30, 2013)

The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of May 1, 2013, since the previous cumulative document was issued prior to UTC #135 (May 2013). This document does not include feedback on moderated Public Review Issues from the forum that have been digested by the forum moderators; those are in separate documents for each of the PRIs. Grayed-out items in the Table of Contents do not have feedback here.

Contents:

The links below go directly to open PRIs and to feedback documents for them, as of May 1, 2013.

Issue  Name (+ feedback links)
249 Unicode 6.3 Beta Review
253 Draft UTR #50, Unicode Vertical Text Layout
254 Testing the Unicode Bidirectional Algorithm for Unicode 6.3 (no feedback)
 
Closed Public Review Issues for Unicode 6.3 with feedback post-UTC #135:
240 Proposed Update UAX #29, Unicode Text Segmentation (feedback post UTC 135)
241 Proposed Update UAX #31, Unicode Identifier and Pattern Syntax (feedback post UTC 135)
252 Proposed Update UTS #18, Unicode Regular Expressions (feedback post UTC 135)

The links below go to locations in this document for feedback.

Feedback on Encoding Proposals
Error Reports
Other Reports

 


Feedback on Encoding Proposals

Date/Time: Tue Jun 25 23:48:08 CDT 2013
Contact: cowan@ccil.org
Name: John Cowan
Report Type: Feedback on an Encoding Proposal
Opt Subject: L2/13-116 Revised Proposal to add the Leke Script in the SMP of the UCS

I suggest that the characters named LEKE VOWEL be renamed LEKE VOWEL SIGN,
since they have General Category Mn.  The only Mn vowels that are named 
simply VOWEL rather than VOWEL SIGN are these 12:

17B4;KHMER VOWEL INHERENT AQ;Mn;0;NSM;;;;;N;;;;;
17B5;KHMER VOWEL INHERENT AA;Mn;0;NSM;;;;;N;;;;;
A926;KAYAH LI VOWEL UE;Mn;0;NSM;;;;;N;;;;;
A927;KAYAH LI VOWEL E;Mn;0;NSM;;;;;N;;;;;
A928;KAYAH LI VOWEL U;Mn;0;NSM;;;;;N;;;;;
A929;KAYAH LI VOWEL EE;Mn;0;NSM;;;;;N;;;;;
A92A;KAYAH LI VOWEL O;Mn;0;NSM;;;;;N;;;;;
AAB2;TAI VIET VOWEL I;Mn;230;NSM;;;;;N;;;;;
AAB3;TAI VIET VOWEL UE;Mn;230;NSM;;;;;N;;;;;
AAB4;TAI VIET VOWEL U;Mn;220;NSM;;;;;N;;;;;
AAB8;TAI VIET VOWEL IA;Mn;230;NSM;;;;;N;;;;;
AABE;TAI VIET VOWEL AM;Mn;230;NSM;;;;;N;;;;;

Per contra, there are 518 VOWEL SIGN characters, of which the 17 Thai and Lao
vowels are Lo and the rest either Mn or Mc. (For the record, the other Lo
VOWEL characters are GUJARATI VOWEL CANDRA E and O, the 17 KHMER INDEPENDENT
VOWELs, and 8 more TAI VIET VOWELs.)  While there does not seem to be a
principled distinction between VOWEL SIGNs and VOWELs, the majority for VOWEL
SIGN is overwhelming.
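
The tally above can be re-derived from the character database. The following is a minimal sketch, assuming Python's standard unicodedata module; it reflects whatever Unicode version the local Python build ships, so the list it returns may be longer than the 12 characters quoted here. The helper name mn_vowels_without_sign is illustrative only, and the match is a plain substring test on character names.

```python
import sys
import unicodedata


def mn_vowels_without_sign():
    """Return (code point, name) pairs for Mn characters whose name
    contains the word VOWEL but not the phrase VOWEL SIGN."""
    hits = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, "")  # "" for unnamed/unassigned code points
        if (
            " VOWEL " in name
            and "VOWEL SIGN" not in name
            and unicodedata.category(ch) == "Mn"
        ):
            hits.append((cp, name))
    return hits


if __name__ == "__main__":
    for cp, name in mn_vowels_without_sign():
        print(f"{cp:04X};{name}")
```

Changing the category test to "Lo" gives a rough way to check the parenthetical list of Lo VOWEL characters under the same assumptions.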

Date/Time: Tue Jul 9 23:12:52 CDT 2013
Contact: cowan@ccil.org
Name: John Cowan
Report Type: Feedback on an Encoding Proposal
Opt Subject: L2/13-128 Latvian and Marshallese Ad Hoc Report


I propose an alternative to encoding 4 Marshallese undecomposable letters
(plus 8 more if anyone turns up evidence for D/d, G/g, K/k, R/r).  What about
a new combining mark, COMBINING CHARACTER INVARIANT CEDILLA?  This works like
the regular combining cedilla, except that it always looks like a cedilla no
matter what letter it is attached to.  This sixth alternative was apparently
not considered by the ad hoc, but has the same stability advantages and allows
any letter to be encoded with a cedilla.

It does mean that there will be three spellings for C/c, E/e, H/h, S/s, and
the useless T/t with cedilla, but we already have that situation for acute and
grave thanks to U+0340 and U+0341.  People who don't need INVARIANT CEDILLA
should obviously stay away from it.
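
The existing "three spellings" situation for grave and acute can be observed directly. This is a minimal sketch, assuming Python's standard unicodedata module: it shows that the precomposed letter, the letter plus the regular combining mark, and the letter plus U+0340 or U+0341 all normalize to the same NFC string. It covers only characters that are already encoded; how the proposed INVARIANT CEDILLA would behave under normalization is not stated in the feedback and is not modeled here.

```python
import unicodedata


def nfc(s: str) -> str:
    """Normalize a string to NFC."""
    return unicodedata.normalize("NFC", s)


# Three spellings each of a-grave and e-acute, two of c-cedilla.
grave_spellings = ["\u00E0", "a\u0300", "a\u0340"]
acute_spellings = ["\u00E9", "e\u0301", "e\u0341"]
cedilla_spellings = ["\u00E7", "c\u0327"]

if __name__ == "__main__":
    for label, group in [
        ("grave", grave_spellings),
        ("acute", acute_spellings),
        ("cedilla", cedilla_spellings),
    ]:
        distinct = {nfc(s) for s in group}
        print(label, len(group), "spellings ->", len(distinct), "NFC form(s)")
```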

Date/Time: Tue Jul 30 12:38:42 CDT 2013
Contact: larson@towncommons.com
Name: Tim Larson
Report Type: Feedback on an Encoding Proposal
Opt Subject: naming error in n4387

RE:  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4387.pdf

Given the size sequence tiny, very small, slightly small, small, medium small,
medium, -, large;

Given the corresponding sequence of black circles 22c5 (dot operator), 2219
(bullet operator), 1f784, 2022 (bullet), 2981 (z notation spot), 26ab, 25cf,
2b24;

Given the (incomplete) corresponding sequence of white circles containing
black circles 2299 (circled dot operator), 1f78a, 29bf (circled bullet);

Given the assumption that 2299 (circled dot operator) corresponds to size
"tiny" as 22c5 (dot operator) does;

Given the assumption that new proposal 1f78a corresponds to size "slightly
small" as new proposal 1f784 does;

Given the assumption that 29bf (circled bullet) corresponds to size "small" as
2022 (bullet) does;

Then said new proposal 1f78a (white circle containing black small circle) is
misnamed, as it corresponds to a white circle containing a black SLIGHTLY
SMALL circle. Please correct this naming before it is rendered unchangeable by
inclusion in the standard.
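
For readers who want to line up the two sequences side by side, the following is a small sketch assuming Python's standard unicodedata module. It prints the name on record for each code point cited above. The two code points that were only proposed at the time of this report (1f784 and 1f78a) are looked up with a fallback, since whether they resolve, and to what name, depends on the Unicode version of the local Python build. The list and function names are illustrative.

```python
import unicodedata

# Code points as cited in the report, in the order given there.
BLACK_CIRCLES = [0x22C5, 0x2219, 0x1F784, 0x2022, 0x2981, 0x26AB, 0x25CF, 0x2B24]
WHITE_CONTAINING_BLACK = [0x2299, 0x1F78A, 0x29BF]


def describe(cp: int) -> str:
    """Format a code point with its character name, or a placeholder."""
    name = unicodedata.name(chr(cp), "<no name in this Unicode version>")
    return f"U+{cp:04X} {name}"


if __name__ == "__main__":
    for title, seq in [
        ("black circles", BLACK_CIRCLES),
        ("white circles containing black circles", WHITE_CONTAINING_BLACK),
    ]:
        print(title)
        for cp in seq:
            print("  " + describe(cp))
```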

Date/Time: Tue Jul 30 13:00:24 CDT 2013
Contact: larson@towncommons.com
Name: Tim Larson
Report Type: Feedback on an Encoding Proposal
Opt Subject: inconsistent naming sequence in n4387

RE:  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4387.pdf

Given the white circle weight sequence white, heavy, medium bold white, bold
white, heavy white, very heavy white, extremely heavy white;

Given that the white square weight sequence (white, light white, medium white,
bold white, heavy white, very heavy white, extremely heavy white) does not
correspond;

Given that 25cb (white circle), 2b58 (heavy circle), and 25a1 (white square)
are already encoded and cannot be changed;

Wouldn't it make more sense to change the names for proposed characters 1f78e
and 1f78f to "heavy square" and "medium bold white square", respectively? The
existing inconsistency of going to "heavy" before going to "bold" and then
back to "heavy" again, while unfortunate, does not in any way justify introducing
another inconsistency, especially one which implies the "normal" weight being
positioned even lighter than the one denoted "light".

Date/Time: Tue Jul 30 13:20:44 CDT 2013
Contact: larson@towncommons.com
Name: Tim Larson
Report Type: Feedback on an Encoding Proposal
Opt Subject: inconsistent sequences in n4387


RE:  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4387.pdf

Given the size sequence tiny, very small, slightly small, small, medium small,
medium, (normal), large;

Given that the (incomplete) sequence of white circles containing black circles
corresponds to sizes tiny, slightly small, and small;

Given that the (incomplete) sequence of white squares containing black squares
corresponds to sizes very small, small, medium;

Given that the "normal" size of the containing square is only one step larger
than the medium size of the contained square, which would surely render almost
indistinguishably from each other at regular text sizes;

Shouldn't the proposed sequence of white squares containing black squares be
renamed to match with the proposed sequence of white circles containing black
circles, i.e. using sizes tiny, slightly small, and small?

Date/Time: Tue Jul 30 13:45:25 CDT 2013
Contact: larson@towncommons.com
Name: Tim Larson
Report Type: Feedback on an Encoding Proposal
Opt Subject: inconsistent sequences in n4387 (more)


RE:  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4387.pdf

Given the size sequence tiny, very small, slightly small, small, medium small, 
medium, (normal), large;

Given that the (incomplete) sequence of white circles containing black circles
corresponds to sizes tiny, slightly small, and small;

Given that the (incomplete) sequence of white diamonds containing black diamonds
corresponds to sizes very small, small, medium;

Given that the "normal" size of the containing diamond is only one step larger 
than the medium size of the contained diamond, and would surely render almost 
indistinguishably from each other at regular text sizes;

Given that 25c8 (white diamond containing black small diamond) is already 
encoded and cannot be changed;

Shouldn't the proposed sequence of white diamonds containing black diamonds 
be renamed to match with the proposed sequence of white circles containing 
black circles, i.e. using sizes tiny (1f79a), slightly small (1f79b), and 
small (25c8)?



Error Reports

Date/Time: Thu Jul 11 21:33:55 CDT 2013
Contact: dthompson@prinpay.com
Name: dave thompson
Report Type: Error Report
Opt Subject: tr36(-11) example appears damaged


I'm not sure "etc" includes TR, but "Corrigenda" 
on that page linked here. Please change if wrong.

tr36(-11) 3.1.1 #Ill-Formed_Subsequences 
para 3 "Consuming any subsequent" is ambiguous
and I suspect this may be automated damage.

Served 2013-07-12 00:23:18Z as text/html with meta 
further specifying utf-8, to Firefox requesting 
type html xhtml then xml *, lang en-us then en *, 
and "charset" 8859-1 then utf-8 *, if relevant.

It says >>... where " " indicates a bare C2 byte:<<
where " are " around SPACE and C2 is tagged bold,
and then has an HTML snippet containing three SPACE 
of which clearly only the second is intended to be 
the gobble-next C2-byte. I think this not-quite-char 
should be something visible and distinctive, or 
(not my preference) described more completely.
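
The "gobble-next" concern can be made concrete with a short sketch, assuming Python's built-in UTF-8 decoder with errors="replace". The markup in the sample is hypothetical and is not the snippet from UTR #36; the bare C2 byte is written as an explicit \xc2 escape so that it stays visible. A decoder that handles the ill-formed byte safely replaces only the C2 byte and leaves the following quote mark in place.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for ill-formed bytes."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical markup: a bare C2 lead byte sits just before the closing quote.
sample = b'<span title="\xc2">text</span>'
decoded = decode_lenient(sample)

if __name__ == "__main__":
    # The closing quote survives; only the C2 byte became U+FFFD.
    print(decoded.encode("unicode_escape").decode("ascii"))
```

A decoder that instead consumed the byte after C2 would swallow the closing quote and change how the rest of the markup parses.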

Perhaps related on checking I see that /reports/
points to proposed update /reports/tr36/tr36-12.html
(but /reports/tr36 or /reports/tr36/tr36-11.html 
points to /reports/tr36/proposed.html
which does not know about this update)
and aside from editorial overhead, this update 
appears to only fix HTML markup errors most of which 
appear likely to be from an automated process,
which might also have lost something better that 
was intended in the example I report on.

Finally I note the two HTML snippets use <ul> which 
(typically) renders with bullets that aren't exactly 
appropriate for this content, and neither explicit style 
nor class/id that would allow them to be styled without 
hurting other <ul>s. I understand for documents that 
need to be automatically maintained or at least controlled 
it can be difficult to get optimal markup/rendering, and 
the main point should be to get the substance correct, but 
I point this out in case you can do anything about it.

Cheers.

Other Reports

None at this time.