X3L2/96-123
Preliminary
Minutes - UTC #71 & X3L2 #168 ad hoc meeting
San Diego - December 5-6, 1996
December 18, 1996
NeXT is represented by Ken Whistler
Present: Apple, Digital, HP, IBM, Justsystem, Microsoft, NCR, NeXT (proxy), Novell, Oracle, Reuters, RLG, SGI, Spyglass, Sybase, Unisys
Not present: Gamma Productions, MGI
Quorum is 9, we have quorum
Name | Company | X3L2 | UTC | UTC | X3L2
Jenkins, John | Apple | P | CM | CM | P
Rannenberg, Wendy | Digital | O | CM | CM | O
 | Gamma Production | | | |
Carroll, Don | Hewlett-Packard | A | CM | CM | A
Ksar, Mike | Hewlett-Packard | P | CM | CM | P
Umamaheswaran, V.S. | IBM | P | CM | CM | P
Kobayashi, Tatsuo | Justsystem | | CM | CM |
Kondo, Hiroaki | Justsystem | | CM | CM |
Gotoda, Koji | Justsystem | | CM | CM |
Batutis, Ed | IBM - Lotus | x | AM | AM | x
 | MGI | | | |
Suignard, Michel | Microsoft | P | CM | CM | P
Sargent, Murray | Microsoft | A | CM | CM | A
Roberts, Gary | NCR | x | CM | CM | x
proxy - Ken Whistler | NeXT | | | |
Honomichl, Lloyd | Novell | O | CM | CM | O
Kung, Michael | Oracle | O | CM | CM | O
Texin, Tex | Progress Software | P | AM | AM | P
Wolf, Misha | Reuters | | CM | CM | x
Aliprand, Joan | RLG | P | CM | CM | P
Hart, Edwin | SHARE | P | AM | AM | P
Mariani, Gianni | Silicon Graphics | | CM | CM |
Adams, Glenn | Spyglass | P | CM | CM | P
Hiura, Hidek | Sun Microsystems | | AM | AM |
Whistler, Ken | Sybase | O | CM | CM | O
Freytag, Asmus | Unicode | P | VP, AM | VP, AM | P
Winkler, Arnold | Unisys | P | CM | CM | P
Gilbert, Judith | vivid studios | O | x | x | O
Legend: P - primary, A - alternate, O - observer, L - liaison, X - ex-officio
x - present, CM - corporate member, AM - associate member, VP - Vice-President, Unicode Consortium
X3L2/96-107 | Report - IAB character set workshop | Chris Weider | 96-10-15 |
X3L2/96-108 | Apostrophe clarification | Mark Davis | | 396-D
X3L2/96-109 | Romanian request | | | 396-D
X3L2/96-110 | Proposal for a standard compression schema for Unicode | Wolf, Whistler, Wicksteed, Davis | 96-11-27 | 396-D
X3L2/96-111 | BMP and supplementary planes allocation roadmap | Moore, McGowan, Becker, Whistler | | 396-D
X3L2/96-112 | Supplement Arabic with Uighur, Kazakh and Kirghiz | China (Mao) | 96-10-18 | 396-D
X3L2/96-113 | About the function of identifiers of Mongolian Proposal | China (Mao) | 96-11-10 | 396-D
X3L2/96-114 | Concerns about UTF-8 conversion algorithm differences between Unicode and AMD-2 | Ed Hart | 96-11-27 | 396-D
X3L2/96-115 | “Pipeline” draft list, summary of proposals | Joe Becker | 96-11-23 | 396-D
X3L2/96-116 | Final Agenda, X3L2 #168 | Winkler | 96-12-05 | 396-D
X3L2/96-117 | Final Agenda, UTC and X3L2 ad hoc meeting | Aliprand/Winkler | 96-12-05 | 396-D
X3L2/96-118 | Responses to WG2 N2767, support for projects of WG2 | WG2 N2770 | 96-12-03 | 396-D
X3L2/96-119 | Proposed draft amendment #10 to 10646 | Glenn Adams | 96-12-01 | 396-D
X3L2/96-120 | BIG5, User defined Chinese characters | Jenkins | 96-12-05 | 396-D
X3L2/96-121 | WG2 meeting - Action items | WG2, Uma | 96-12-05 | admin
The agenda as amended was approved. Added items: 6.2.3, 6.2.4, 11.5, 11.6, 11.7
Motion to approve the minutes, moved by Adams, seconded by Uma.
Motion approved: 11 for, 4 abstentions.
Action for Greenfield: Distribute list of attendees at UTC #70 to members (corporate representative and associates who attended that meeting).
Action for Greenfield: Correct spelling of Uma’s name in the minutes. Make sure that spell checker has correct spelling.
Action for Aliprand: Notify Greenfield of all action items for him from this meeting.
Reviewed, based on UTC 71 #38. The next action item list will combine X3L2 and UTC items in one document (X3L2/SD-2), with AIs in continuous chronological order. Each AI will be numbered as either a UTC AI or an X3L2 AI; these numbers will be in separate columns.
Action for Adams: Contact Everson re WG2 procedures for Cherokee proposal.
Action for Adams re UTC#70-A32: Document for next UTC meeting (fonts for public use).
UTC, UIC, X3L2, WG2, IRG, WG20 …
Include W3C Conference in Santa Clara, CA, April 7-11, 1997.
Action for Aliprand: Have Greenfield check and revise meeting calendar and distribute to “unicore”.
UTC agreed to cancel March meeting. Winkler proposal: have 3 meetings only, all of them together with X3L2. This would eliminate the problems with incomplete information of people who cannot participate in UTC meetings.
Future dates for the three joint X3L2 & UTC meetings in 1997 are:
May 29-30, 1997.
Aug. 7-8, 1997.
Early December (exact date to be set after dates for Tokyo IUC are finalized).
John Jenkins offered that Apple host the May meeting if Taligent cannot.
Action item for Aliprand: Check with Davis re Taligent as host for rescheduled meeting in May.
January 13 - 17, 1997 | SC2/WG2 IRG | Singapore
January 20 - 24, 1997 | SC2/WG2 | Singapore
March 2 - 7, 1997 | SHARE meeting | San Francisco
March 10 - 12, 1997 | IUC #10 | Mainz, Germany
April 7 - 11, 1997 | W3C conference | Santa Clara, CA
May 12 - 16, 1997 | SC22/WG20 | Québec
May 29 - 30, 1997 | UTC #72 & X3L2 #169 | Cupertino
June 23 - 27, 1997 | SC2/WG2 and SC2 | Cyprus
August 7 - 8, 1997 | UTC #73 & X3L2 #170 | TBD
September 3 - 5, 1997 | IUC #11 | San Jose
Sept. 29 - Oct. 3, 1997 | CEN TC304 | Reykjavik
November 1997 | SC22/WG20 | Egypt
December 1997 | IUC #12 | Tokyo, Japan
December 11 - 12, 1997 | UTC #74 & X3L2 #171 | TBD, tentative
Errata should be reported to errata@unicode.org. Editorial corrections will be posted at the Web site without review by UTC. Errata affecting technical content will be reviewed by the UTC and posted if approved. Presently there is only a glyph error (J with hook).
Action for McGowan: Bring dump of errata to UTC #72 for review.
Action for Aliprand: Notify McGowan of all action items for him from this meeting.
Ksar expressed concern about errors on the Unicode Web site, specifically in the section on allocation.
Whistler reported for Becker. The paper:
1) describes allocation and coding per se, i.e., the proposals that are accepted, rejected, or coming down the pipeline;
2) attempts an overall assessment of what should be encoded on the BMP and what might be encoded on supplementary planes.
Working group included Becker, Jenkins, Ksar, McGowan, Moore, Whistler.
Becker proposed making it a standing document that shows the progress for all proposals.
Suggested: BMP to be filled with selected scripts, based on the expertise of the authors. All contemporary scripts and extinct scripts with large collections of literature. Intention is to cover all living minority scripts on BMP. The proposal takes into account the proposal from Everson in 96-101. It leaves room for the vertical extensions from the IRG (target area was left empty).
Plane 1: all extinct non Han scripts of the world
Plane 2: all additional Han characters
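For readers unfamiliar with the plane arithmetic behind this suggestion, the following is a minimal Python sketch, not part of the meeting record. The plane of a code point is its value divided by 0x10000. The names `plane_of`, `roadmap_label`, and `ROADMAP` are illustrative, and the labels only paraphrase the suggestion in X3L2/96-111; they are not approved allocations.

```python
# Labels paraphrase the roadmap suggestion in X3L2/96-111 (assumption:
# wording is ours); they do not represent approved allocations.
ROADMAP = {
    0: "BMP: contemporary scripts, extinct scripts with large literature",
    1: "extinct non-Han scripts",
    2: "additional Han characters",
}


def plane_of(code_point):
    """Return the plane number (0-16) of a code point reachable via UTF-16."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the range addressable by UTF-16")
    # Each plane holds 0x10000 (65,536) code positions.
    return code_point >> 16


def roadmap_label(code_point):
    """Look up the roadmap suggestion for the plane holding code_point."""
    return ROADMAP.get(plane_of(code_point), "no suggestion in the roadmap")
```

For example, `plane_of(0x0041)` is 0 (the BMP), while any code point from 0x20000 through 0x2FFFF falls in Plane 2.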
Freytag argued for inclusion of ideographic components within the BMP, to allow representation of characters that have not been standardized. Adams gave examples from Chu Nom, but said we should not make a final judgment so soon.
The document is meant for UTC and should not go any further without some editing of personal opinions.
Motion, moved by Freytag, seconded by Jenkins:
UTC approves the high level policy decisions in guidelines document X3L2/96-111:
· the 3 plane allocation as to number and nature (BMP, non Han, Han)
· placement of the right-to-left scripts between Arabic and Devanagari.
· the proposed placement of Yi script between ideographs and hangul.
Motion approved: 15 for, 1 abstention.
UTC should define the progression of the document.
Uma: just a guideline document, not cast in concrete. Eliminate differences with Everson, if possible, before submitting.
Justsystem abstains as there is not enough room for kanji in Plane 3. A Chinese dictionary has over 80,000 characters, so kanji need more space than just one plane. Jenkins said that any additional ideographic characters can spill over into another plane. Kobayashi said that Han characters need to be organized within their assigned plane.
Uma said that the document should highlight possible use of other planes. Ksar pointed out that WG2 does not reserve any planes for anything, so planes cannot be reserved for a specific use.
Motion, moved by Freytag, seconded by Jenkins:
The UTC accepts the technical content of X3L2/96-111 omitting the comparison section with the Everson proposal and the immediately preceding sentence starting with “X These …” and will use this document as a starting point leading to a document that meets the action item to develop a NP like scope for the 10646-x project in WG2.
Motion approved: 15 for, 1 abstention.
Edit document based on discussions in the meeting - number of additional Han characters seems to be a bone of contention. Leaving “holes” for efficient allocation of Kanji needs significant study.
Action for Whistler: Write first cut edited document within 2 weeks.
The document to be developed will take into account the e-mail from Sato with the guidelines.
Action for Adams and Suignard: In conjunction with Japanese NSB, draft work item scope statement on WG2 request for meta-level architecture of ISO 10646. Targeted date: WG2 meeting in Singapore. Uma suggested that relevant parts for the work item should be extracted from document X3L2/96-111.
Covered by discussions and decisions in 3.1.
Should go out of the BMP. Jenkins developed this proposal to allow encoding of a script on a supplementary plane to encourage implementation of UTF-16. Needs 4 columns.
Motion, moved by Sargent, seconded by Mariani:
The UTC accepts the Deseret script repertoire, and recommends that it be encoded off the BMP.
Motion passed: 14 for, 1 abstention, 1 absent from room.
Action for Jenkins: Forward proposal to WG2 as a contribution from the Unicode Consortium. Give the proposal to Ksar for WG2 mailing.
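Encoding a script off the BMP exercises the UTF-16 surrogate mechanism mentioned above. As a minimal sketch (not part of the meeting record), the Python function below converts a supplementary code point into its high and low surrogates. The function name `to_surrogate_pair` is illustrative, and no particular code positions for Deseret are assumed.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    offset = code_point - 0x10000     # 20-bit value
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low
```

The first position of Plane 1, 0x10000, maps to the pair (0xD800, 0xDC00), which matches what Python's own `utf-16-be` codec produces for that code point.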
Whistler has studied both proposals. Everson proposes a combining tone mark. Everson also proposes spelling of the syllables in pronounceable form. Ken suggests that the UTC support the Everson proposal.
Adams: Other syllables of Yi have tone inherent; an explicit, separate tone mark would be variant treatment. China is researching a revision to their proposal. Mao is making a trip and will come back with his findings, but the Chinese probably like their current one. Names are less likely to be an issue than the tone mark.
Sargent: Implementation question is important. Combining marks are being worked on, technology will be available some time soon.
Jenkins: preference for combining way, but no big fight.
Asmus: don’t let us be boxed in on combining method.
Motion, moved by Freytag, seconded by Jenkins:
Unicode representatives at the WG2 meeting are instructed to push for principle of use of combining marks, and in all other respects support Everson’s analysis, that is:
1) use of combining marks for tone mark,
2) naming of syllables more closely corresponding to Lolo phonetics,
3) addition of Yi radical set.
Motion approved unanimously.
Jenkins: The proposal looks plausible on the surface, but Becker needs to provide input. Discuss on “Unicore” list.
Glenn: The proposal introduces an identifier character to define the presentation form. We might not be able to avoid the introduction of this character, as the presentation is not machine decided. Could this character then be used with other scripts, like Arabic? More work will be needed in the definition of architectural impact of the identifier character. The numbers for Mongolian in the roadmap document assume use of the identifier character, not presentation forms. Ksar said that we do not expect a final vote in Singapore.
Guidance for Singapore: get out of the e-mail discussion - contribute!!!
Action for Adams: Summarize unicore discussion for Singapore.
Glenn: Presentation forms should not be allowed. They can be predicted by a presentation engine. The language and the font need to be known for correct prediction. Another font might need yet different presentation forms.
Motion: Moved by Whistler, seconded by Sargent:
The UTC is not in favor of addition of the Arabic presentation forms as in 96-112 that are renderable by algorithm in accordance with the character/glyph model.
Approved unanimously.
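To illustrate the character/glyph distinction behind this motion, here is a small Python sketch using today's `unicodedata` module; it is an illustration only and does not describe anything available to the meeting. It uses an Arabic presentation form that is already encoded (ARABIC LETTER ALEF FINAL FORM, U+FE8E) and shows that it carries a compatibility mapping to the nominal letter, which is why a rendering engine can derive the form from context. The helper name `fold_presentation_forms` is hypothetical.

```python
import unicodedata

ALEF = "\u0627"             # ARABIC LETTER ALEF (nominal character)
ALEF_FINAL_FORM = "\uFE8E"  # ARABIC LETTER ALEF FINAL FORM (glyph variant)


def fold_presentation_forms(text):
    """Map compatibility presentation forms back to nominal characters.

    Assumption: NFKC normalization is used here as a convenient stand-in
    for "undoing" presentation forms; it also folds other compatibility
    characters.
    """
    return unicodedata.normalize("NFKC", text)
```

Calling `fold_presentation_forms(ALEF_FINAL_FORM)` yields the nominal `ALEF`, so text stored with nominal letters loses nothing that a shaping algorithm cannot restore.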
Adams: Alternative names for runes were moved to Annex P. The Swedish NSB now wishes to remove them.
Freytag: Alternate names in Unicode should also be in 10646, if requested. Publishing alternative names is easier in Unicode.
Ksar: Annex P should not be used as repository for alternate names. Ken supports that.
Suignard: Annex P as last resort for agreements - we need sensible consensus first.
Uma: WG2 asks for contributions on what goes into Annex P; Unicode should contribute.
Motion, moved by Ksar, seconded by Whistler:
The UTC does not support adding the Runic alternate names to annex P.
Approved by consensus.
Consensus of UTC (including proxy) that Runic can go into the BMP. Use of Annex P is an exceptional procedure if no other compromise can be found.
This is more a font issue than a coding issue. Romania wants to add characters to Latin-2, just for the sake of presentation. Addition of the characters would create a migration nightmare.
Carroll: the type industry wants to make characters look good. Comma and cedillas often look similar. Language and script code allows selection of different font, another possibility would be to use combining characters with cedilla and/or comma below.
The motion from UTC #69 opposing addition of these proposed characters was endorsed.
No input
CD or DIS ballots? European synchronization issue. Wait for JTC1 proposal.
UTC and X3L2 support the inclusion in the BMP. The Japanese NSB is reported to have said that some characters might not be legitimate and that only legitimate characters should get into the BMP; 6585 characters should be re-checked for legitimacy.
Kobayashi said that the aim is to make the standard perfect and NOT to add incorrect characters. This is the position of the Japanese TAG to SC2.
Motion, moved by Adams, seconded by Jenkins:
The UTC instructs its representatives for WG2 that the UTC position is to prefer encoding of Vertical Extension A on the BMP, but the liaisons should remain flexible.
Approved unanimously.
Jenkins: We want them encoded NOW in 10646. Rather off the BMP now than on BMP in 1-2 years.
Dealt with in September UTC meeting. U.S. position deferred to X3L2 meeting.
IRG positions:
1) Position on Vertical Extension A (see above)
2) New composition method proposal from Prof. Shih in Taiwan, who will be attending IRG. Will be discussed in Singapore. UTC favors a standardized composition method, no preference for a specific one. Open issues are the levels and the position.
Adams asked about the Berkeley meeting with Prof. Lancaster and his colleagues in September. Whistler said that the attendees were aware of the issues, and have seen Vertical Extension A. Have a repertoire of 15K characters that are not in the URO. The 15 K have not been unified, nor have they been checked against Vertical Extension A.
Adams said that members have been concerned about Hong Kong characters, and asked whether the UTC should be collecting a US contribution.
Action for Jenkins: Contact point Prof. Lancaster re working on U.S. contribution to the IRG.
IRG should give UTC the fonts for distribution of information. Glenn has other source. The liaisons are entitled to ask for the fonts in Singapore.
Action for Jenkins: Draft letter from Consortium (for Davis to sign) to IRG re getting fonts for Vertical Extension characters.
Jenkins said that lack of characters from CNS 11643 was hindering acceptance of the Unicode Standard in Taiwan. Only planes 1-3 are in the URO.
Motion: moved by Jenkins, seconded by Uma:
The UTC favors addition of all unique characters from CNS 11643:1992 (all planes) to ISO 10646.
Approved unanimously.
Action item: Jenkins: Have Cora begin work on checking for unique characters in CNS 11643:1992.
Motion: Moved by Adams, seconded by Winkler:
That John Jenkins be the Unicode Consortium’s representative to the IRG.
Amendment by Freytag: That Jenkins be the Consortium’s primary representative, with Glenn Adams and Michael Kung as alternates.
Amended motion passed unanimously.
Glenn Adams and Asmus Freytag were appointed Unicode representatives to WG2 for the meeting in Singapore (by consensus).
Action item (Adams): Revise pDAM to conform to WG2 instructions.
Motion: moved by Adams, seconded by Jenkins:
The proposed character Ethiopic Space which was present in WG2 N1420 should be removed upon further consideration, and should not appear in attachment A or B. Additional study indicated that this character should have been unified with SPACE, and that no “intrinsically Ethiopic” space is required.
Motion approved unanimously.
[ Mike Ksar later informed the Chair that he had made a motion to ensure that the pDAM text prepared for WG2 was exactly in accord with the original proposal to WG2 (i.e., including the Ethiopic space). Neither the Chair nor the Vice-Chair had recorded this motion. The need for it was eliminated when Glenn Adams agreed to revise the pDAM text to remove the additions that were not in accordance with WG2 instructions.]
Action item (Unicode representative on X3L2): When pDAM comes up for ballot, Ethiopic space should be requested for removal.
Action item (Unicode liaisons to WG2): Communicate this position informally in Singapore to other representatives.
Action item: Joan and Mike Ksar to clarify liaison between TC46 and WG2 (Joan).
Discussion about “collection identifiers” after Hangul changes in 10646. Are they needed in Unicode, and if so, how? We need to be responsive to requests for collection identifiers.
Freytag proposed an ad hoc committee to come up with the Consortium’s position for the WG2 meeting. Ad Hoc Committee members to be Uma, Whistler, Suignard, Hiura.
Quad symbol:
Motion: moved by Adams, seconded by Jenkins:
The UTC accepts the change of APL quad symbol from 237B to 2395.
Motion approved: 15 for, 1 abstention.
Action item for Winkler: Distribute WG2 N1396 to X3L2 and UTC
Action item (Freytag and Adams): What should or should not go into annex P of 10646? Specify a proposal as Unicode reps to WG2.
Action item: Winkler to send WG2 N1416 to Ken Whistler
Action item: Winkler to distribute WG2 N1385 to X3L2 and UTC.
Action item for Adams: Response to proposal for adding special letters for Nigerian Yoruba (WG2 AI 31-2)
Whistler said the issue of script encoding was premature. Is a standard characterization needed? Postponed to next meeting.
Mark was unable to prepare anything.
ISO 639 is being revised. Mnemonics are language dependent. Will be discussed further in the next meeting.
We need a position that squares with reality. UTC (at meeting #70) voted to distinguish between versions. Vendors are not distinguishing between versions. Misha recommends a version independent Unicode tag.
Adams pointed out that there are two factors: identification of the encoding system and identification of the repertoire. “charset” in the MIME context = character encoding scheme.
Motion moved by Adams, seconded by Wolf:
UTC to register “UTF-8” and “UTF-7” with IANA and undo its September 1996 decision to register version 2.0 designators (action item 70-A39).
Motion passed by 2/3: 13 for, 2 opposed, 1 abstention
Coding system definition (UTF-8) needs no version as char-set-name (IETF term for character encoding schema). Version label is a different story. Too specific labeling can lead to rejection of simple English text, due to a change in Korean.
Uma: In the context of Internet traffic, using the latest version, “UTF-8” is not ambiguous.
Freytag: Need to work out a general policy on this. Reserve generic names for most up-to-date version, with distinctions for previous versions. Mariani pointed out the problem of using “UTF-8” as a generic when it is embedded in data.
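As an illustration of a version-independent label in the MIME context, the sketch below (not part of the meeting record) uses Python's standard `email` package to build a message whose Content-Type carries a plain UTF-8 charset parameter with no Unicode version attached. The function name `build_message` is hypothetical. Charset names are case-insensitive, and this library reports the name in lower case.

```python
from email.message import EmailMessage


def build_message(body):
    """Build a text/plain message labeled with a version-independent charset."""
    msg = EmailMessage()
    # The label names the character encoding scheme only; it says nothing
    # about which version of the repertoire the text draws on.
    msg.set_content(body, charset="utf-8")
    return msg
```

A receiver can read the label back with `build_message("Grüße").get_content_charset()` and decode the body without knowing which version of the standard the sender had in mind.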
For information
Hard work of a small bunch of volunteers. Misha asks that member companies put links to web pages into their networks.
San Jose, September 3-5, 1997. Misha is chairing the editorial board of the IUC.
Tokyo, early December. Exact date still to be finalized.
Unicode definition of equivalence will be used in WG20’s sorting standard.
To be handled by the Officers.
All we can do is to define the process of how to progress. Wolf recommends leaving it on the Unicore discussion list a little bit longer. Empower Wolf to post the revised draft documents according to the W3C method (old, current, editors...).
Action item for Wolf: act as editor for a draft document with feedback from today and to 96-110, by January 31, 1997. Discussion on Unicore, also disposition of comments.
Action item for Adams: Create a “Working Paper” section on the Web site. Install the document on the Web site and create a section for comments.
Action for Aliprand/Winkler: Withdraw 70-A14 action item.
UNIX does not support Unicode to a great extent. Why is this, and how do we alleviate the problem? A UNIX SIG would be of great value for all interested parties. Freytag suggested that Roberts form a SIG to discuss the issue and come up with recommendations.
Rannenberg: X/Open has strong recommendations for APIs.
Action for Roberts: Collect names of people interested in working in the SIG. (Mariani, Texin, Rannenberg, Kung, Hiura expressed interest.)
Action item for Roberts: create list of issues, report via “unicore” list or else at May UTC meeting.
No discussion document provided.
Hart: What to do with undefined codes?
Freytag: standard does not define what should happen. For illegal input, the Unicode sample implementation will react in a compatible manner, the generic algorithm allows anything.
Ksar: Implementation guidelines can be followed, other implementations might do different things.
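If this exchange concerns UTF-8 conversion, as the title of X3L2/96-114 suggests, the point that implementations may treat illegal input differently can be sketched with Python's built-in codec. This is an illustration with a present-day library, not the sample implementation discussed at the meeting. The names `ILLEGAL`, `decode_strict`, and `decode_lenient` are hypothetical.

```python
ILLEGAL = b"A\xffB"  # the byte 0xFF never occurs in well-formed UTF-8


def decode_strict(data):
    """Reject illegal input by raising UnicodeDecodeError."""
    return data.decode("utf-8", errors="strict")


def decode_lenient(data):
    """Substitute U+FFFD REPLACEMENT CHARACTER for each illegal byte."""
    return data.decode("utf-8", errors="replace")
```

Both behaviors are defensible, which is the interoperability concern: `decode_strict(ILLEGAL)` fails outright, while `decode_lenient(ILLEGAL)` returns the text with a replacement character in place of the bad byte.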
Whistler: difficulty of keeping Unicode and 10646 standards in sync. Old model: UTC decides, lobbies WG2 for acceptance... New model of co-operation with X3L2 changes the situation somewhat, how should UTC work?
Freytag: Consensus through submitting to WG2 instead of accepting into Unicode. Tracking of status of proposals is a necessity. Pipeline ad-hoc group could possibly update the pipeline document in real time.
Adams: who drives whom? or cooperation.
Uma: both standards are quite stable, less confrontation occurs.
Ksar: things have changed for the better over the last few years. WG2 tracking mechanism is available; check if we can enhance it.
Hart: going through WG2 has the advantage of worldwide acceptance.
Asmus: Do we have the documented approval of our membership? We need our own tracking, especially if our feedback is not accepted by WG2. Ksar agrees.
Jenkins: what is the process and flow of proposals.
Action item Ksar: send out URL of Thygessen document.
Action item Uma, Ken, Jenkins: Draft UTC process flow additions to this document.
No comments
Ed Hart reports about input from NB. Al Griffee and Ed Hart will meet on the weekend to resolve comments. A disposition of comments will be distributed to UTC and X3L2.
The Chair thanked NCR for hosting the meeting, and Gary Roberts for making the arrangements.