L2/20-107

Editorial Committee Report and Recommendations for UTC #163 Meeting

Source: Editorial Committee

Date: April 20, 2020

A. Unicode Release Topics

A1. Unicode 14.0 Schedule and Planning

FYI: The Unicode 14.0 release has been pushed out by 6 months to mid-September, 2021, from the originally scheduled date of mid-March, 2021. The original reason for this change of schedule was the impact of the SARS-CoV-2 virus on all the meetings and operations of the UTC, the Editorial Committee, and other committees and groups, as well as all our member companies. This was reinforced by the unplanned consequences of a recent major failure of infrastructure (see B1 below), which has taken another considerable bite out of schedules. The new significant milestones for the Unicode 14.0 release are:

The change of schedule for Unicode 14.0 was publicly announced in the Unicode blog on April 8.

EC-UTC163-R1: Because of significant dependencies that cascade into development work for CLDR and ICU, it is advisable to decide on the exact repertoire for a Unicode release as soon as feasible. We are not asking for an immediate decision on repertoire yet, but the Editorial Committee recommends deciding on most of the repertoire by this fall, rather than waiting until just prior to the start of beta review. The Editorial Committee also recommends designating most of the already approved characters listed in the Pipeline for publication in Unicode 14.0, instead of letting too many languish "gathering dust" for a subsequent release.

EC-UTC163-R2: The Editorial Committee recommends getting early commitments from the various UAX and UTS authors as to whether they will be able to complete their action items targeted for Unicode 14.0, and getting actual drafts available for review this year. Early posting of PRIs for proposed updates for these documents helps both with thorough document review and with the overall release burden.


A2. Unicode 13.0 POD

FYI: The print-on-demand (POD) version of Unicode 13.0 is in preparation. Most work has been completed, but the last steps have been delayed because of the repercussions of the infrastructure failure (see B1). The anticipated publication date is now June, 2020 (instead of May).


A3. "Alpha" Pipeline Charts

FYI: The Editorial Committee has decided to augment the Pipeline with "alpha" code charts, as proposed by Michel Suignard. Michel has systematically tracked all new character approvals by the UTC, and has the tooling (with Unibook) to enable generation of code charts showing all the new approvals. These will provide a more complete picture of the new approvals, showing all code points, names, and glyphs, instead of just the schematic listing shown currently in the Pipeline.

The Editorial Committee suggests that these alpha charts reflect the actual repertoire slated for the next release as much as possible (see A1 above), rather than including any characters approved for a subsequent release. And we suggest that the charts be prominently watermarked as alpha draft charts, so as not to be confused with actual beta review charts or published final code charts.

EC-UTC163-R3: The Editorial Committee recommends assigning a couple of action items related to alpha Pipeline charts:

AI: Michel. Generate an alpha Pipeline chart for posting, once the repertoire for Unicode 14.0 has at least tentatively been decided, with prominent watermarks to identify the charts as alpha drafts.

AI: Ken Whistler, edcom. Link Michel's alpha Pipeline chart from the 14.0 section of the Pipeline page, with some contextualizing explanation.


B. Website Topics

B1. Website Status

FYI: The Unicode server infrastructure suffered a catastrophic failure in our ISP's data center on April 7. This took down the VM that contained our web server, all our mail services, two major repositories, and a number of other functions. That VM was not salvageable.

The infrastructure failure did not impact the new web server, home.unicode.org (a WordPress site), the Unicode blog, the Google Sites subsites for CLDR and for ICU, or any of our repositories operating out of GitHub. However, anything pointing at the technical site was failing, and the failure of email also made ongoing committee and consortium work quite difficult for a while.

Recovery work is well under way. As of April 20, the web server, recovered from backup, is operating on temporary hardware, itself recovered from a storage closet. A new VM has been spun up and the migration to that VM is in progress. A more robust backup scheme is in the works, to minimize the impact of any future failures.

As part of the recovery effort, some of the functionality formerly associated with www.unicode.org has been moved off the server running the main technical site, and onto a secondary, non-public VM for internal work. In particular, both the Consortium mail services and all of the Editorial Committee work area have already migrated to that secondary VM.

Recovery of the two major repositories that crashed and burned as part of the infrastructure failure has proven problematic, because of a major hole in the provision for their backup. We anticipate being able to recover almost all of the content eventually, but may lose some of the recent checkin history. As part of this recovery, the plan is to migrate both repositories to GitHub, for greater reliability. In the interim, however, the recovery of these repositories is heavily impacting staff time, and their unavailability has a serious impact both on Editorial Committee process and on tooling used in all our release processes. In particular, the SVN "draft" repository is used for editorial work on about two-thirds of all our technical specifications.


B2. Press Page Issues

FYI: The Editorial Committee has updated the old press page to mark it as obsolete, now that a press page exists on the new website. (This was an oversight in the original rollout of the new website.) In related updates, the emoji articles and other articles pages have been overhauled to make it clear they are historical and are no longer maintained.


B3. Digit Terminology

FYI: The Editorial Committee has added a page to the terminology section on the site, clarifying the ambiguity in usage of terms like "Arabic digits" and "Hindi digits". This content was provided mostly by Markus Scherer, with editorial tweaking by Ken Whistler. See Digit Terminology.


C. Process Issues

C1. Editorial Committee Input on UTC and Ad Hoc Process

EC-UTC163-R4: The Editorial Committee in general recommends that the UTC and other committees and ad hoc groups document on their websites or process and procedure pages any changes to their meeting process occasioned by the current COVID-19 pandemic crisis. In particular, the switchover from face-to-face meetings to virtual meetings should be prominently noted, as appropriate, on pages documenting logistics or committee process. The Editorial Committee has added such a note to the Editorial Committee page.


D. UTR Topics

D1. UTR #23, Character Property Model

The Editorial Committee has prepared a new working draft of UTR #23. This draft has seen significant review among the document editors. Its content updates are complementary to the changes underway for UTS #18, Unicode Regular Expressions, regarding the extension of the character property model to encompass properties of Unicode strings. See draft 6 [note: temporary link only for UTC review as a working draft].

EC-UTC163-R5: The Editorial Committee recommends that the UTC authorize preparation of a proposed update of UTR #23, Unicode Character Property Model, based on draft 6 of the working draft, as reviewed by the Editorial Committee.

Suggested associated action items:

AI, Ken Whistler, Edcom. Prepare a PRI for a proposed update to UTR #23, based on draft 6 of the working draft.

AI, Rick McGowan. Post the PRI for the proposed update of UTR #23.


E. PRI Topics

E1. PRI #404: UTS #18

FYI: The Editorial Committee reviewed the latest text of the proposed update for UTS #18, Unicode Regular Expressions. Feedback has been provided to the UCD & Algorithms ad hoc for incorporation in its report to the UTC. In particular, Ken Whistler provided a draft of a substantial rewrite of Section 1.2 of UTS #18 for clarity.


F. Responses to Public Feedback

FYI: The Editorial Committee has reviewed the feedback from Bogdan, dated April 16, noted in L2/20-104. The Editorial Committee concurs that the defects noted in the Thai section of the core specification are typos. The editor will correct the typos in the next revision of the core specification.

Suggested action item:

AI, Rick McGowan. Respond to Bogdan with the disposition from the Editorial Committee.


G. Miscellaneous Topics

G1. References to RFC's

EC-UTC163-R6: After investigation, the Editorial Committee recommends updating various references to RFC's (and IETF Std's) to use rfc-editor.org/info links instead of older ones. This reflects best practice that is also recommended for RFC's themselves. This will have an impact on the References pages on the Unicode website and on the references lists for UTS #46, UTS #39, UAX #41 for 14.0, etc. No action is asked of the UTC; the Editorial Committee will tackle these changes as various specifications are updated for Unicode 14.0.
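As an illustration only, the kind of link rewriting described in G1 could be sketched as follows. This is a minimal sketch, not the Editorial Committee's actual tooling: the function name to_info_link is hypothetical, and the older URL forms shown in the usage lines are assumed examples, since this report does not list which older link styles are currently in use. Only RFC numbers are handled here; IETF Std references would need a similar rule.

```python
import re

# Assumption: an older-style link contains "rfc" followed directly by the
# RFC number somewhere in the URL (for example ".../rfc3454.txt").
_RFC_NUMBER = re.compile(r"rfc(\d+)", re.IGNORECASE)


def to_info_link(old_url: str) -> str:
    """Return the rfc-editor.org/info link for an RFC URL.

    URLs with no recognizable RFC number are returned unchanged, so the
    function can be applied to a mixed list of references.
    """
    match = _RFC_NUMBER.search(old_url)
    if match is None:
        return old_url
    number = int(match.group(1))  # int() drops any leading zeros
    return f"https://www.rfc-editor.org/info/rfc{number}"


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the function.
    for url in (
        "https://tools.ietf.org/html/rfc3454",
        "http://www.ietf.org/rfc/rfc0791.txt",
        "https://www.unicode.org/reports/tr46/",
    ):
        print(url, "->", to_info_link(url))
```

Returning non-RFC URLs unchanged keeps the sketch safe to run over an entire references list; any real update would still need editorial review of each reference.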