=========================================================================
Date: Wed, 29 May 91 11:01:15 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Comments: Resent-From: Edwin Hart
Comments: Originally-From: microsoft!michelsu@uunet.uu.net
From: Edwin Hart
Subject: Re: Current Draft of Ad Hoc Meeting

----------------------------Original message----------------------------
I agree with Asmus's and Isai's comments. I already sent an answer to Ed,
but the message bounced. Trying again.

Michel Suignard

| To: uunet!APLVM.BITNET!HART
| Subject: Re: Your Endorsement and JTC1 mailing
| Date: Tue May 28 20:41:09 1991
|
| | Question 1: So far, I have only received one response to the current draft.
| | Please send me E-mail that states either 1) you endorse the statement as
| | written or 2) you have concerns and you do not endorse the paper.
|
| I endorse the statement as stated. I really feel that if we start now
| arguing about the sentences we will never stop. I fully agree with Isai
| on this topic.
|
| The only remark I have about the document concerns the incomplete
| references to 2 annexes:
|   1) Willy Bohn proposal,
|   2) Floating marks.
| The annexes should be there and their references set accordingly, or I could
| also survive with their removal as long as they are not referred to.
|
| | Question 2: Do you want to distribute the document INFORMALLY as agreed?
| | informally or formally through JTC1 (put my name as one of the experts)
|
| Again I agree with Isai. I don't care about formally or informally but for
| sure I want it to be distributed to JTC1/SC2 recipients.
| If you want to go with the formal way then I have no problem with your
| proposed wording.
|
| Finally to get a chance to be endorsed by the French body I had to start
| circulating the document in its current shape (with a note about possible
| change).
| As long as you change the date on the final document (from May 23rd,
| which is the version I have circulated) it will be fine.
|
| Michel Suignard
|
=========================================================================
Date: Wed, 29 May 91 11:01:30 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Comments: Resent-From: Edwin Hart
Comments: Originally-From: "F. Avery Bishop 28-May-1991 1602"
From: Edwin Hart
Subject: RE: Your Endorsement and JTC1 mailing

----------------------------Original message----------------------------
Q1: I endorse the statement as written.
Q2: I would prefer a formal mailing if it can be done within the bylaws of
    the relevant bodies.
Q3: Either is OK

Avery
=========================================================================
Date: Wed, 29 May 91 11:01:44 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Comments: Resent-From: Edwin Hart
Comments: Originally-From: microsoft!asmusf@uunet.uu.net
From: Edwin Hart
Subject: Re: Current Draft of Ad Hoc Meeting

----------------------------Original message----------------------------
I am substantially pleased with your minutes and don't think that I have
objections other than perhaps minutiae that would prevent it from being
distributed to the intended audience. I agree with Isai that speed is
important.

A.
========================================================================= Date: Wed, 29 May 91 11:02:16 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Comments: Resent-From: Edwin Hart Comments: Originally-From: whistler@zarasun.metaphor.com (Ken Whistler) From: Edwin Hart Subject: 10646M Minutes ----------------------------Original message---------------------------- Ed, I am reviewing the revised minutes right now and will send shortly any suggestions I note. Metaphor concurs with Isai's note that the overriding concern is to distribute the document quickly, as minutes to the meeting. We do not take it as a formal proposal yet, and do not much care exactly what channels the document is distributed in. I have in hand Olle's note, and have to agree with him about the ¬G and ¬Z codes. Don't use them in email which hits the Internet--I had to edit them all out of the document, though fortunately they did not result in any truncations. Note, though, that the ¬? in Olle J¬?rnefors (Ja"rnefors) name DID result in a critical truncation of the document, since it represented an assignment of responsibility to him! --Ken Whistler ========================================================================= Date: Wed, 29 May 91 11:02:35 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Comments: Resent-From: Edwin Hart Comments: Originally-From: Olle Jarnefors From: Edwin Hart Subject: Re: Current Draft of Ad Hoc Meeting ----------------------------Original message---------------------------- Just some small points: > >2. The list of electronic addressees > >==================================== > > > > "Olle Ja{rnefors" , > > > >The _international_ form of my name is "Olle Jarnefors". > I fixed your entry to J{rnefors. It is better to use the form "Jarnefors". 
("J{rnefors" is only understandable for people in the Scandinavian countries and in Germany using national 7-bit codes instead of the REAL Ascii.) > >5. List of participants > >======================= > > Did I make the correct changes? I think so, yes. > >6. C0-C1 restriction > >==================== > > I have not decided about the other point yet because I thought our intent was > > . . . careful review by experts (from the computer communications, systems, > and applications disciplines within our enterprises and from > ISO, ECMA, CCITT, etc.), we believe it desirable . . . That wording is OK. > >7. Non-spacing marks > >==================== > > > >According to my notes we also decided that "all sequences of > >codes should be allowed". (Joe Becker had argued that there is > >no practical way to enforce a legislation against certain > >sequences of codes.) Do you have any comment on this suggested addition? > >> I took the liberty of placing your name on the task list for the task of > >> determining the other compaction methods to be specified in 10646M beyond > >> a 2-octet form for the base multilingual plane and the 4-octet canonical > >> form. I thought we needed some balance by having coordinators outside of > >> North America, especially since the people from Unicode only want a 2-byte > >> and > >> 4-byte compaction method, and several of the European standards bodies have > >> previously stated that they need a 1-byte compaction method. > > I think the people from Unicode do not care if any other compaction > methods are used as long as they have their 2-byte mode and it is the > default. (This is my opinion and may not be true.) > > I believe the question is: What should the other compaction methods be? > Should 1-byte compaction be allowed? > I understand from Mike Ksar that this is very important to many > countries. > Should 3-byte compaction be allowed? > This may have value for the ideographic scripts. Check with C/J/K > countries. 
> Should compaction mode 5 (mixed number of bytes per character in the
> data stream) be allowed?
> I think these are the issues. Please review them and make a
> recommendation and state the reason for the recommendation.

So the "J" in Action Item 14 _did_ mean "Jarnefors" then! Of course I
accept this assignment. Shall I prepare the recommendation to the next
ad hoc meeting in Geneva and send it to the new distribution list around
5 Aug? I would like to contact those that took part in the discussion on
compaction forms. Do you remember which persons did that?

> Thank you for your clarifying remarks. I found them to be valuable to the
> document (and some additional e-mail problems I must work around).

And thank _you_ for your persistent effort to bring the two sides closer
together, for arranging the informal meeting in San Francisco and for
producing the excellent minutes. We seem to almost have found the type of
compromise that both SHARE and ITS (Swedish standards body) have asked for.
=========================================================================
Date: Wed, 29 May 91 11:03:51 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Comments: Resent-From: Edwin Hart
Comments: Originally-From: andersen@ralvmk.vnet.ibm.com
From: Edwin Hart
Subject: Ad-hoc paper

----------------------------Original message----------------------------
1) The paper is fine with me
2) Yes
3) Yes

Regards, Jerry
=========================================================================
Date: Wed, 29 May 91 11:19:28 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: Edwin Hart
Subject: Electronic Distribution is Ready

The electronic distribution is ready and your name has already been
activated.
To use it, simply send mail to 10646M@JHUVM.BITNET. It will then
distribute your item to everyone else on the list but you (since you sent
the information, it presumes you do not need a copy).

Please start sending mail directly to the list instead of to me.

Thanks for all of your comments. I will shortly make a decision on what
will go into the final version of the paper, mail it out, and upload it.
I heard two messages: GET IT OUT SOON, and IF POSSIBLE, USE THE FORMAL
ROUTE. Right now, I am thinking of sending it informally to some of the
JTC1 member bodies and formally to JTC1 and WG2 for them to distribute to
everyone.

Best regards,
Ed
=========================================================================
Date: Thu, 30 May 91 10:21:58 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: Edwin Hart
Subject: Futures

I just wanted to clarify a few of the ideas that I have to be sure we are
all thinking on the same track.

0. We all need to continue to be diplomatic and be less sensitive to any
   less-than-diplomatic criticism from our peers.

1. We are looking for a compromise that *ALL* of us can live with. That
   means that if we can agree on the major points, we *ALL* need to be
   flexible on some of the less-major points. In other words, please plan
   that 10646M will have some characteristics of DIS 10646 and some from
   Unicode, but do not plan to include every Unicode feature or every
   10646 feature into 10646M. Realistically, not everyone's pet features
   can be in a 10646M on which we can reach consensus.

2. Our work needs to be merged back into the ISO WG2 activities starting
   with the August meeting. After we give the proposal to ISO, it is
   ISO's decision what to do with it, what changes to make to DIS 10646,
   etc. After WG2 decides on the changes, then editing DIS 10646 may
   begin. I changed the editing action item with this in mind.

3.
   We are producing a proposal with specific recommendations. It is quite
   likely that both the Unicode Consortium and WG2 will suggest changes.
   Although I would hope that the changes are only ones to fine tune the
   merged standard, that may not be the case. Remember that ISO must
   evaluate each comment made from around the world (not just ours) and
   decide what actions to take.

4. I deleted the action item and references to another ad hoc meeting
   just before the WG2 meeting.

5. Concerning another ad hoc meeting, we need to decide whether to have
   it or not. To me the decision is whether to complete our current draft
   proposal and obtain consensus on the remaining issues, etc. or to
   simply fold that activity into WG2. If we need the time, Mike Ksar has
   offered to extend the WG2 meeting time to accommodate such a
   discussion as part of the WG2 meeting. What do you think? If you
   disagree, be diplomatic. I am not taking a position yet.

6. I changed the action item for the June 7 Unicode meeting to add that
   Unicode needs to issue a statement that they approve of the general
   direction to merge the two codes (and list any concerns that they may
   have). This was part of our agreement in San Francisco but not in the
   action item.

7. I believe that we agreed that the merged code would be a 4-byte code
   rather than a 2-byte code. We did not discuss the
   architecture/structure of the resulting code.

8. We may continue to need some type of 2-byte half plane switching to
   logically bring the Japanese, Korean, and other planes into the BMP.
   We did not agree to this but Japanese support of CJK-RLG would seem to
   require continued support of this feature.

Our secretary is typing the mail labels for selected JTC1 members now. I
am going to call the JTC1 Secretariat at ANSI to try to clear the way for
rapid distribution.
Ed ========================================================================= Date: Thu, 30 May 91 15:38:00 GMT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> From: "Ecological Linguistics,Anderson, PRT" Subject: Ad Hoc Draft, from EL Remarks from Ecological Linguistics, Lloyd Anderson on the draft minutes from the Ad Hoc meeting to merge ISO 10646 and Unicode as "10646M". It is of the utmost importance that we maintain trust and balance in continuing this effort, and that we maintain the spirit of respecting different views. This is fully supporting the work of Ed Hart in bringing us and this together, but noting that the feedback and opportunity for comments has not been equally distributed since that meeting. I fear Ed is under pressure more from the Unicode side in the past week. Since I believe the current draft Minutes favor some Unicode positions more than did the Ad Hoc meeting, mostly by omitting reference to some considerations discussed there, I feel that the urge to immediate distribution is also flavored partly by this. I also want quick distribution, but in a way which preserves the fairness and trust or increases them. Notice that most of the European participants have not even had an opportunity to examine the draft. This makes it even more essential that we safeguard the accuracy of representing their concerns. This is NOT TRIVIAL. Notice Olle Jarnefors statement that "I would have protested if that wording had been used in the meeting." Personally, without taking ANY position on the outcome, I think Unicode partisans would better make their case by bending over backwards to be fair on matters of such apparently great concern to others. The same holds of course for strong partisans of existing DIS 10646. For the second reason, I strongly support including the lists of advantages of each approach as seen by participants at the ad hoc meeting. 
This in no way implies that the advantages are of equal value (Ken Whistler's expressed concern). Participants from different background points of view will weight these differently, and no single view is the exclusively "right" one. Including the list does imply that there are valid arguments to be made on each side. It also makes more likely that those from opposing points of view will listen to each other. It treats our role as one of clarifying issues in such a way that OTHERS can draw their own conclusions and weigh advantages of different approaches with more information, not that we are presuming to take the decisions away from them. A correct solution should succeed by the virtue of its obvious correctness or on-balance maximizing of advantages. The trust and balance issue requires that the minutes express what happened at that meeting, without attempting to change it in spirit or substance. I will list some of these issues here. 1. Action items to promote the agreement. Action item number 8 is the most important, and is likely to be lost sight of. In current draft it says "Coordinate an investigation of the impact of coding in C0". This is fine wording, but if relegated to some appendix it gives a misleading picture of what was consensus at the meeting. The "impact" as discussed at that time included impacts in both directions (that is coding or not coding that C0 space). It included an attempt to get more precise information on the advantages of each approach, admitting that each approach is known to have some advantages the other does not, and that we do not yet know all the factual details about the kinds and degrees of these advantages. Olle Jarnefors was quite correct in saying he did not vote for something that said "extremely important to code C0 now". He did vote for something which included the action item in its context. Some of the specifics involved here are enclosed in the following digression. 
The proposal is:

PROPOSAL: Place the action item in context, with the discussion of the C0
and C1 space. Since the action item on the "impact of coding in C0"
affects the C0 but not the C1 space, and since consensus was in fact very
much clearer at our ad hoc meeting on the uses of the C1 space, these two
issues should be separated in the minutes, as they indeed were at the ad
hoc meeting. Thus:

   C1-space (easy)
   Next point.
   C0-space (useful to code *now or in future*. Group approved an Action
   item to assess impact (since at the very least not all details are yet
   clear to all participants)).

----------------------------------------------------------
A DIGRESSION TO REFER TO CONTENT, WHY THIS POINT IS SO IMPORTANT
(Notice that I am taking NO position on the outcome, only that the
consideration should be open and fully honest.)

It is very difficult to get either side to admit that its "proofs" are
not absolutely compelling, that there is any room for debate at all.
Perhaps I have seen the first (!!!) public admission of such from Ken
Whistler recently in the following paragraph (it is the task force duty
to explore and clarify issues such as these).

"What I am getting at is that all C code designed for 8-bit character
interfaces has to be rewritten to handle multiple-octet codes. Unicode or
DIS 10646 both have this problem. All multiple-byte character encodings
have this problem. And ALL computer languages have this problem
(Assembly, Cobol, Fortran, Forth, Pascal, Modula, C++, APL, Lisp, Snobol,
SmallTalk, Eiffel, Icon, ...) -- they are ALL broken if interfaces to
handle strings in 8-bit units get handed strings with characters encoded
in 16-bit or 32-bit units. They ALL need to be fixed."
One way of asking questions here is how much various programming
languages must be changed to achieve each of a variety of advantages,
including both 16-bit codes and null-pads for basic ASCII, and whether
the changes required to also achieve a major goal of the other point of
view (C0 space free via 032 = Hex 20 pads rather than null pads) are
substantially greater or offer sufficient gain to warrant the effort.
(These extra changes are understood to affect the revision of programming
languages.)

The same mode of asking questions can be used on the other side, namely
how much change to various programming languages is necessary to attain
each of a variety of advantages, including both 16-bit codes and Hex 20
pads for basic ASCII, and whether the changes required to also achieve a
major goal of the other point of view (null pads to make conversion of
programming languages easier) are substantially greater or offer
sufficient gain to warrant the effort. (These extra changes are
understood to affect some communication software and hardware.)

Notice the additional difficulty we will have in carrying out this
consideration: the person chairing the task group to investigate the
impacts has himself a very strong view on the outcome. While it is
possible to operate in this fashion, we do lose the very considerable
advantages conferred by having Ed Hart chair the Ad Hoc meeting. Isai can
do this and maintain trust, but to achieve it he will have to lean over
very far backwards to ensure that points of view other than his own are
considered, along with other people's estimation of the costs and
advantages of various approaches. On the other hand he has the record of
having engaged in a very civil discussion with Keld Simonsen which has
been better at least than other discussions I have been witness to.

No doubt others can formulate more questions to refine our knowledge of
various advantages of various approaches. That is the purpose of a study
of "impact".
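Editorial illustration, not part of the original message: the null-pad
versus Hex 20 pad trade-off described in the digression above can be
sketched in a few lines of Python. This is only a minimal sketch under
stated assumptions. The names encode_padded, c_strlen and uses_c0 are
invented here, and the 2-octet layout (one pad octet followed by the ASCII
octet) is a simplification based solely on the description in this
message, not on the text of DIS 10646 or Unicode.

```python
def encode_padded(text: str, pad: int) -> bytes:
    """Hypothetical 2-octet form: each ASCII octet is preceded by a pad octet."""
    out = bytearray()
    for octet in text.encode("ascii"):
        out.append(pad)
        out.append(octet)
    return bytes(out)


def c_strlen(buf: bytes) -> int:
    """Length as seen by an 8-bit interface that stops at the first zero octet."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


def uses_c0(buf: bytes) -> bool:
    """True if any octet falls in the C0 range (below Hex 20)."""
    return any(octet < 0x20 for octet in buf)


null_padded = encode_padded("AB", 0x00)   # octets 00 41 00 42
space_padded = encode_padded("AB", 0x20)  # octets 20 41 20 42

print(c_strlen(null_padded), uses_c0(null_padded))    # prints: 0 True
print(c_strlen(space_padded), uses_c0(space_padded))  # prints: 4 False
```

With null pads, an unmodified 8-bit interface that stops at the first zero
octet sees an empty string and the data contains C0 octets. With Hex 20
pads neither happens, although, as the quoted paragraph notes, 8-bit code
still has to be rewritten before it can interpret the characters.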
----------------------------------------------------------

1B. The last sentence of Hart's point 1 says "In addition, the 1-octet
compaction method must be adjusted to insure that the control characters
are correctly handled." This is a correct reflection of what happened at
our meeting, but will probably not be understood by most readers. Perhaps
it should say something like "Participants in the Ad Hoc meeting realized
that the current implementations of padded control codes in ISO DIS 10646
have significant interactions with the compaction methods. For example,
even though single padded Carriage Return and Line Feed controls may be
correctly interpreted by some existing software and hardware, the
sequence of padded CR+LF will not be interpreted by some existing devices
as equal to the usual unpadded CR+LF sequence because the two 8-bit
portions do not immediately follow each other."

Perhaps we will be able to add "Jerry Andersen has since the meeting
undertaken an action item to clarify similar issues." He did make clearer
statements on this than anyone else at the Ad Hoc meeting. JERRY HOW
ABOUT IT?

3a. "In addition to diacritics, non-spacing marks *do* include stress
marks, tone marks, and ..." ("do" or null here rather than "should")

7. Simplify the compaction methods. This item should definitely be
credited to Masami Hasegawa unless he does not want that. As stated at
the Ad Hoc meeting it was slightly clearer in separating compaction
methods. Additional clarifications can include the following (note
carefully the deletion of the word "other" from the Part 4 draft, since
that part contains ALL AND ONLY the compaction methods. There are no
compaction methods involved in parts 2 and 3. Each is a pure fixed-width
code.)

Part 4: Mechanisms for compaction methods are included in Part 4. (Parts
2 and 3 contain no compaction methods whatsoever and are respectively
pure 16-bit and 32-bit codes or 2-octet and 4-octet codes).
Part 3: This part may be implemented without implementing any of the
compaction methods in part 4.

Part 2: Reverse the last two sentences, and possibly add here or as a
separate comment at the end of point 7 the comment between the asterisks:
"This part may be implemented without announcers, *and is expected to be
the most commonly used*. Following parts (3 and 4) are not required for
conforming implementations of the basic 16-bit form of the standard."

Points 10 and 11 go back to action item 8 on the impact of C0 coding, in
which the clarification and balancing of advantages for the wchar_t data
type and for existing communications software and hardware are involved.
They should be linked to it, since as it stands the separate mention of
point 10 without parallel mention of a simple upgrade compatibility path
for important existing communication software and hardware does not
accurately reflect the concerns of those at the meeting. Exclusion of
this point of view is a repeat of what is excluded when the lists of
advantages seen for each approach are excluded. It tends to make the
minutes a bit one-sided.

------------------------------------------------
Some more general comments which may help to keep minds open to
compromise.

It was noted at the WG2 meeting and our Ad Hoc meeting that Korea is now
requesting 60,000 assigned for preformed Hangul blocks, perhaps not all
to be filled and rather that it be filled in a way algorithmically
related to the single Jamos elements? We do not know, but this impacts
our considerations on the amount of space needed within the BMP, both by
clearly exceeding its capacity, and paradoxically perhaps at the same
time by making the inclusion of only *SOME* preformed Hangul blocks in
that plane rather than as renderings seem less important, therefore
further reducing our needs in that plane.
I do not know how to word this exactly, but it should be mentioned,
together with any other new pieces of information which were made
available, such as the increased needs for unified Han expressed by the
Chinese representative and the fact that they could tailor their needs
downwards by a few characters to fit into the space of rows 128 through
255 of the BMP (excluding C0 areas). This willingness to tailor the
"needs" to effect a common solution is exactly what needs to become a bit
more contagious, so that we can balance advantages and needs as a
totality rather than asserting absolute needs which are not that.

I certainly do hope that the section on "Advantages of having only one
multi-octet code standard" will be included in our report. In addition,
Jerry Andersen clearly had a set of very detailed evaluations of cost in
mind in case we were faced with having two so-called standards, and there
was not sufficient time to explore these in the Ad Hoc meeting. It would
be wonderful if Jerry could be induced to provide these in a very
detailed form. (Jerry is going to hate me, this is the second time I have
noted he might contribute something more. All the rest of us need to
also.)

By the way, Friday's participation list should include (Jerry) Andersen
as well as (Lloyd) Anderson; Jerry was definitely there till the end.

Olle Jarnefors' remark that the conclusions of one of the task groups
should not be decided by a single individual needs to be generalized.
These task groups need to complete their work quickly and in an open and
inclusive way to provide what enlightenment they can for the larger
groups which will convene. They are not independently deciding questions
for everyone else (the MUCH larger community involved will just be
offended by any attempt to do that).
========================================================================= Date: Thu, 30 May 91 12:55:59 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> From: schein@TOROLAB5.VNET.IBM.COM Subject: Re: Ad-hoc meeting in Geneva I strongly favor having AD-HOC INFORMAL meeting in Geneva on Aug 19-20, BEFORE the formal WG2 meeting, which starts Aug 21. I feel that we will be able to accomplish more work that way and present it immediately to the WG2 meeting. I am also considering calling special C0 committee meeting in Geneva immediately preceding AD-HOC meeting (say Aug 15-16). The objective of this meeting will be to review the draft report prepared by C0 group (which I plan to create using E-mail discussions). Isai ========================================================================= Date: Thu, 30 May 91 16:53:00 GMT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> From: "Ecological Linguistics,Anderson, PRT" Subject: Re: Futures Response to Ed Hart's "Futures" I emphatically agree with the spirit of what Ed has written here, especially with points 1, 3, and 6. One small correction I think. We definitely did discuss the architecture and structure of the resulting code in the very limited sense of Masami Hasegawa's proposal that it be divided into parts, as noted in your draft minutes. That is, the standard is neither a 2-byte code nor a 4-byte code. Part 2 is a 2-byte code (or 16-bit code). Part 3 is a 4-byte code (or 32-bit code). Because the greater size is a 4-byte code, 10646ers can claim to have what they want. Because the normal default (not requiring declarations) is a pure 16-bit code, the Unicoders also have what they want. 
Notice that I used here the terms 4-byte and 16-bit rather than 32-bit and 2-byte, thus in each case conforming to what the participants most concerned want. We should recognize this for what it is, most of the time pure terminology, otherwise a religious schism, and in either case to be disregarded to the maximum extent possible. Only when it has some theoretical implications for substantive issues does it matter, and then we can perfectly well put in caveats to avoid the choice of terms biasing any substantive matter. Lloyd Anderson ========================================================================= Date: Thu, 30 May 91 15:17:14 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Comments: Resent-From: Edwin Hart Comments: Originally-From: schein@torolab5.vnet.ibm.com From: Edwin Hart Subject: AD-HOC meeting in Geneva ----------------------------Original message---------------------------- Ed, the importance of ad-hoc vs WG2 meeting is in who is going to control it. From experience, I am afraid that meeting controlled by Mr. Ksar will have much less chance to proceed and end in harmony. It will also allow other people (Klaus) easier participation. After we prepare the agreed document in AD-HOC meeting, it will be much more difficult to kill it later. 
Isai ========================================================================= Date: Thu, 30 May 91 15:27:48 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Comments: Resent-From: Edwin Hart Comments: Originally-From: schein@torolab5.vnet.ibm.com From: Edwin Hart Subject: Address for Belgium ----------------------------Original message---------------------------- I am attaching message from Willy Bohn: ----------------------------------------- 'MSG FROM: PAECH1 --GHQVM1 TO: SCHEIN --TOROLAB5 29.05.91 15:19:07a To: SCHEIN --TOROLAB5 *** Reply to note of 26/05/91 14:13 From: Wilhelm Friedrich Bohn (Willy), +49 711 785-3209 Dep. 3889, Bldg. 7000-01 IBM Deutschland, 7000 Stuttgart 80 Subject: Your Endorsement and JTC1 mailing Isai, thank you for sending me the information. I have no problem with the text as written. If you have an idea what I must do to be able to be reached from the outside please let me know. If you feel that it is necessary please inform Ed Hart of my endorsement of his report. I can agree to both forms of distribution but would prefer the official form. The date of the next meeting can then be communicated to those who want or need to know by other channels. . Auf Wiedersehen and Regards . Willy Bohn, GHQVM1(PAECH1) ========================================================================= Date: Thu, 30 May 91 15:28:54 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Comments: Resent-From: Edwin Hart Comments: Originally-From: Mike Ksar From: Edwin Hart Subject: Re: draft cover letter In-Reply-To: Message from "Edwin Hart" of May 29, 91 at 12:35 (noon) ----------------------------Original message---------------------------- Hello Ed, I have a few comments on your draft letter to JTC1. 1. 
Before you send it I recommend that you clear it with JTC1 Secretariat, ANSI (NY). The contact name is Fran Schrotter. It is still up to you to send it, but I think her support to you will be invaluable. 2. When you talk about the informal meeting, it is important to preface that paragraph with the fact that it was held outside WG2 and that WG2 did not take any decisions to affect the structure of DIS 10646. Right now you end the paragraph that it was not a WG2 meeting. Let me know what the result of your contacts with JTC1 Secretariat are? Best regards Mike > > > Johns Hopkins University > Applied Physics Laborato > Laurel, MD 20723-6099 > USA > 28 May, 1991 > > > > > To: Members of ISO-IEC JTC1 > From: Edwin Hart, USA > Subject: Personal Contribution on DIS 10646: Merging 10646 and > Unicode > > > Recently, we held an informal discussion between proponents > of ISO-IEC DIS 10646 (from JTC1/SC2/WG2) and Unicode (from > the Unicode Consortium) for the purpose of merging the two > incompatible codes into one code. We achieved a > breakthrough because the diverse group was able to achieve > consensus on a number of issues that divided 10646 and > Unicode. Although several issues remain to be resolved, it > is appropriate to share this good news with you and ask for > your support of this effort by communicating it to the > members of your national standards body. > > > A number of information users and developers are concerned about > the real possibility that we will need to support two incompatible > multi-octet codes, ISO 10646 and Unicode. Some may say quite > correctly that Unicode is not a standard and therefore deserves > neither support nor recognition. However, we live in an imperfect > world where regardless of whether Unicode is an international > standard or not, many of us will be forced to support it unless we > do something soon. 
For the reasons stated in the enclosed > document, I believe that the world is too small to have two > incompatible multi-octet codes with the same goal. I also believe > that both DIS 10646 and Unicode complement each other and have > features valuable to a multi-octet code. Therefore, an > international standard that merges the best features of DIS 10646 > and Unicode makes good sense to me, and I hope to you also. That > is my goal. > > In May, the 10646 Working Group, JTC1/SC2/WG2, met in San > Francisco, California, USA. This appeared to be the perfect time > and place to hold a discussion between the 10646 proponents and the > Unicode proponents. The results of such discussions could be > extremely useful in resolving issues if the DIS 10646 should fail > to obtain a majority of the ballots. Although we wanted to hold > these discussions at the WG2 meeting, JTC1 rules prevented > discussing any changes to DIS 10646 while it was out for ballot. > we did not discuss it there. Rather, after the WG2 meeting ended, > several of us met informally to discuss merging the two codes into > one. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The above paragraph could be modified per my note 2 above. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > I believe that we achieved a breakthrough because we were able > to achieve a consensus on several issues that divided 10646 and > Unicode. This was particularly encouraging because the > participants presented a diverse industry cross-section. We came > from eight countries, over a dozen (12) different enterprises, > included both product developers and users, and represented both > the 10646 and Unicode codes. If it was a breakthrough that we had > the discussions, it was a miracle to achieve consensus among such > a diverse group. The initial results are enclosed for you to read > and reach your own conclusions. 
> > While encouraging as a first step, the proposal needs additional > work. When the proposal to merge DIS 10646 and Unicode is completed, > I will submit it to JTC1/SC2/WG2 and JTC1/SC2 for consideration. > Meanwhile, I am making the draft available for your consideration, > your comments, and if you think it appropriate, your support. > > Thank you for your consideration. > > > Sincerely, > > > > Edwin Hart > ========================================================================= Date: Thu, 30 May 91 15:50:35 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> From: Edwin Hart Subject: Final Cover Letter Enclosed is the "final" cover letter. It is being reproduced now. ________________________________________________________________________ Johns Hopkins University Applied Physics Laborato Laurel, MD 20723-6099 USA 30 May, 1991 To: Members of ISO-IEC JTC1 From: Edwin Hart, USA Subject: Personal Contribution on DIS 10646: Merging 10646 and Unicode Recently, we held an informal discussion between proponents of ISO-IEC DIS 10646 (from JTC1/SC2/WG2) and Unicode (from the Unicode Consortium) for the purpose of exploring the possibility of merging the two incompatible codes into one code. We achieved a breakthrough because the diverse group achieved consensus on several issues that divided 10646 and Unicode. Although several issues remain to be resolved, and our proposal needs to be accepted by the formal organizations involved, it is appropriate to share this good news with you. We also ask for your support of this effort by communicating it to the members of your national standards body and commenting on it in your ballot on DIS 10646. Many information users and developers are concerned about the real possibility that we will need to support two incompatible multi-octet codes, ISO 10646 and Unicode. 
Some may say that Unicode is not an international standard and therefore deserves neither support nor recognition. However, we live in an imperfect world where regardless of whether Unicode is an international standard or not, many of us will have to choose to support it unless we do something soon. For the reasons stated in the enclosed document, I believe that the world is too small to have two incompatible multi-octet codes with the same goal. I also believe that both DIS 10646 and Unicode complement each other and have features valuable to a multi-octet code. Therefore, an international standard that merges the best features of DIS 10646 and Unicode makes good sense to me, and I hope to you also. That is my goal. In May, the 10646 Working Group, JTC1/SC2/WG2, met in San Francisco, California, USA. This appeared to be the perfect time and place to hold a discussion between the 10646 proponents and the Unicode proponents. The results of such discussions could be extremely useful in resolving issues if the DIS 10646 should fail to obtain a majority of the ballots. Although we wanted to hold these discussions at the WG2 meeting, JTC1 rules prevented discussing any changes to DIS 10646 while it was out for ballot. Accordingly, we did not discuss any changes to DIS 10646 at the WG2 meeting. Rather, after the meeting ended, we met informally to discuss merging the two codes into one. I believe that we achieved a breakthrough because we achieved consensus on several issues that divided 10646 and Unicode. This was particularly encouraging because the participants presented a diverse industry cross-section. We came from eight countries, over a dozen (12) different enterprises, included both product developers and users, and represented both the 10646 and Unicode codes. If it was a breakthrough that we had the discussions, it was a miracle to get consensus among such a diverse group. The initial results are enclosed for you to read and reach your own conclusions. 
While encouraging as a first step, the proposal needs additional work. When the proposal to merge DIS 10646 and Unicode is completed, I will submit it to JTC1/SC2/WG2 and JTC1/SC2 for consideration. Meanwhile, I am making the draft available for your consideration, your comments, and if you think it appropriate, your support.

Thank you for your consideration.

Sincerely,

Edwin Hart
=========================================================================
Date: Thu, 30 May 91 15:52:53 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: Edwin Hart
Subject: Final Version of Draft Document, Part 1

This is part 1 of the final DRAFT document. It is being reproduced now. Thanks for all of your input. If you do not like the final result, yell at me.

Ed
_______________________________________________________________________
Document: 10646M/91-01
Date: 30 May, 1991
Subject: Summary of Results of Informal Meeting to Discuss Merging of DIS 10646 and Unicode into One Code
From: Edwin Hart, Moderator
      10646M (Merger) Ad Hoc Group
Reply to: Edwin Hart
          Johns Hopkins University
          Applied Physics Laboratory
          11100 Johns Hopkins Road
          Laurel, MD 20723-6099
          Electronic Mail: HART@APLVM.BITNET or HART@APLVM.JHUAPL.EDU
          Voice: +1 (301) 953-6926
          Facsimile: +1 (301) 953-1093

This document represents the first draft of what we hope will become a proposal to merge DIS 10646 and Unicode into one code. The primary advantage of this proposal is that it is built on consensus of people supporting ISO 10646 and others supporting Unicode. We plan to submit a final consensus document to WG2 for consideration at the WG2 editing meeting planned for August, 1991 in Geneva, Switzerland. At that time, we plan to work within WG2 to refine the 10646 standard.

Summary

We affirm our strong support of the effort by ISO-IEC JTC1/SC2/WG2 to develop 10646.
We believe that ISO, with its open and responsive procedures, will give careful consideration to our proposal to refine the DIS 10646. In addition, we believe that the Unicode Consortium has provided valuable insight and technical solutions to newer requirements. We also believe that having a single international standard that incorporates the best features of DIS 10646 and Unicode as outlined in this proposal is far superior to having two incompatible standards with the same goal.

Therefore, after the completion of the May, 1991 ISO-IEC JTC1/SC2/WG2 meeting in San Francisco, California in the USA, the delegates attended an informal meeting. At the meeting, we discussed requirements to merge ISO-IEC DIS 10646 and Unicode. The people attending the informal meeting included some who favored the ISO 10646 code and others who favored Unicode. We believed that achieving consensus among these people would lead to a merger proposal more likely to be supported by ISO-IEC JTC1/SC2 and the Unicode Consortium.

In view of the diverse views represented at the meeting, the results are surprisingly positive. We succeeded in reaching a consensus on major design issues that had previously separated the DIS 10646 and Unicode codes and made them incompatible. We believe that this proposal paves the way for a merger of the best features of DIS 10646 and Unicode into one multi-octet code standard. Yet, this is merely a first step; further work and consensus are required to produce a final proposal.

In summary, although ISO and the Unicode Consortium have not yet endorsed this proposal, it is promising because it was the result of a consensus of many people who represented both the ISO 10646 and Unicode Consortium efforts. However, our work would have been almost impossible had it not been preceded by the excellent proposals submitted to WG2 by ECMA, Canada and China.
To form our consensus, we used these proposals and new information on the Chinese, Japanese and Korean Joint Research Group (CJK-JRG) announced at the WG2 meeting in San Francisco. We believe this new proposal is very promising and those attending agreed to work to build support for it within their respective companies, and national and industry standard bodies, including ECMA and the Unicode Consortium.

General Objectives

We adopted the following objectives for the group:

1. Create a proposal to merge the best features of DIS 10646 and Unicode such that the proposal is acceptable to both ISO and the Unicode Consortium.
2. Increase cooperation between ISO-IEC JTC1/SC2 and the Unicode Consortium.
3. Define action items and the timing to complete them.

Participants

Except for Mr. Jenkins, the following people participated in the Wednesday afternoon discussions:

Jerry Andersen           IBM, USA
Lloyd Anderson           Ecological Linguistics, USA
Joseph Becker            Xerox, USA
F. Avery Bishop          Digital, USA
Willy Bohn               University of Hanover, Germany
Mark Davis               Apple, USA
Asmus Freytag            Microsoft, USA
Joachim Friemelt         Siemens, Germany
Edwin Hart               SHARE Inc./Johns Hopkins University, USA
Masami Hasegawa          Digital Japan
Huang, Weimin            CESI, China
Olle Jarnefors           Royal Institute of Technology, Sweden
John Jenkins             Apple, USA
Bo Jensen                IBM Denmark
Mike Ksar                HP, USA
Takayuki Sato            HP Japan
Isai Scheinberg          IBM Canada
Karen Smith-Yoshimura    The Research Libraries Group, USA
Michel Suignard          Microsoft, France
J. G. Van Stee           IBM, USA
Kenneth Whistler         Metaphor, USA
Zhang, Zhoucai           CCID, China

On Thursday, Mr. Jenkins joined the group but Mr. Stee and Mr. Whistler were absent. In addition, Mr. Jenkins left before voting, and Mr. Hasegawa, Mr. Ksar, and Mr. Bohn were unable to stay for all the votes.

On Friday, except for Mr. Friemelt (who had to leave before we concluded the meeting), the following participated in the voting: Mr. Anderson, Mr. Bishop, Mr. Bohn, Mr. Freytag, Mr. Friemelt, Mr. Hart, Mr. Hasegawa, Mr.
Jenkins, Mr. Sato, Mr. Scheinberg, and Mr. Suignard.

Advantages of Having Only One Multi-Octet Code Standard

Here is a list of advantages to having one global multi-octet code standard:

1. Why should we be concerned about two standards?
   a. Inevitable requirement to support both
      i. 10646 because it is an international standard
      ii. Unicode for compatibility with Unicode-based products
   b. Cost of supporting both
      i. The cost to do both is probably very large
      ii. Must consider the costs to convert between the two
   c. Erosion of single code standard mind-set
      i. If two, why not three? four? ten?
   d. Diminishes advantages of either alone without the other
      i. Single code standard solves many problems that would not be solved if we have two or more of them
      ii. May introduce the requirement to switch between the two

2. The importance of de-jure standards
   a. Increasingly used as procurement requirements
      i. Gives customer more options for interconnection of products from different vendors
   b. Integral part of vast, interlocking family of standards, each assuming the others
   c. Better acceptance, because every country can participate
      i. Not perceived as dominated by U.S.

3. Problems of code conversion
   a. Must identify both the source and the target code, but often no way to do this
   b. Conversion is application/subsystem dependent, and it often cannot be confined to one place (that is, it is much more expensive)
   c. Solving same problem in several places introduces probability of getting some solutions out of synchronization with others
   d. An uncontrollable, moving target (that is, you never own more than one of the two codes, you cannot control repertoires, etc.)
   e. Complicated by repertoire differences
   f. No right way to manage the differences
      i. Mismatch can range from minor irritation to catastrophe
   g. Further complicated by differences in character semantics
      i. No tested solution is known
      ii. At best, makes translation even more difficult

4. The Costs of code conversion
   a. Monetary cost of developing, testing, maintaining, etc.
   b. Diversion of human and other resources by developers
   c. Performance and memory penalties (extra overhead)
   d. Errors and other problems are inevitable
   e. Customer dissatisfaction
   f. Customer conversion requirements will divert resources for creating local solutions
   g. Forces tradeoffs between satisfying installed base and meeting new market requirements

5. Other advantages
   a. One reference source for the code

Areas of Consensus

1. Remove the C0 and C1 restrictions.

We support the ECMA proposal, point 1, to remove the restriction on the so-called C1 space. This point is also included in the Canadian proposal, and other national body positions on DIS 10646 including the ones from China and the US.

Vote Thursday: 17 for/ 0 against/ 2 abstain (Davis, Freytag)

In addition, pending a careful review by computer communication, systems, and applications experts, from ISO, ECMA, CCITT, and within our enterprises, we believe it desirable to allow encoding graphic characters in the C0 space presently reserved in DIS 10646. This refines point 2 from the Canadian proposal. Annex ____ provides more details on this refinement (the Bohn refinement, named for Willy Bohn, who proposed it) of the ECMA proposal.

Vote Thursday: 16 for/ 0 against/ 3 abstain (Bishop, Hasegawa, Sato)

Removing the C0 restriction in addition to removing the C1 restriction will provide for flexibility by allowing the encoding of more characters in the base multilingual plane, which is the most important 2-octet plane for interchange and interworking. A consequence of removing the C0 restriction is that 10646 must change the way 1-octet control characters are encoded by placing the 1-octet control character into the least significant octet of the current compaction method and padding the most significant octets to the width of the current compaction method.
In addition, the 1-octet compaction method must be adjusted to ensure that the control characters are correctly handled.
=========================================================================
Date: Thu, 30 May 91 15:56:06 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: Edwin Hart
Subject: Final Version of Draft Document, Part 2

Part 2
________________________________________________________________________

2. Create an International Repertoire of Unified Chinese, Japanese, and Korean Ideographs and Encode This Set of Ideographs into the Base Multilingual Plane.

We propose a refinement to point 5 of the Canadian proposal. We believe that coding an international repertoire of unified Chinese, Japanese, and Korean ideographs in the base multilingual plane is mandatory for international interworking and processing efficiency. The encoding of the international C/J/K repertoire must be completed by the end of 1991. We propose to use the CJK-JRG results if they are available in 1991; otherwise we propose to use the best information available at that time.

Vote Thursday: 17 for/ 0 against/ 1 abstain (Ksar), 1 absent (Hasegawa)

Recent statements by the Japanese delegates to WG2 showed their strong support for the CJK-JRG. From this information, the group concluded that the unification of Chinese, Japanese, and Korean ideographs so highly desired by the international community is feasible. Providing that WG2 continues to recognize the stated Japanese requirement to encode its characters in its own 10646 plane, Japan recognized the need for an international repertoire of Chinese, Japanese, and Korean ideographs. A meeting of the CJK-JRG has been called (Tokyo, July, 1991) to start creating an international repertoire and ordering.

3. Allow the Option to Use Non-Spacing Marks.
Pending careful review by ISO TC46 and CCITT, we propose to refine point iv) 2) of the ECMA proposal for floating diacritical marks as follows:

The third Code Extension Level should specify:

a. In addition to diacritics, non-spacing marks should include stress marks, tone marks, and those used for text processing operations such as underlining or mathematical notation for the name of a vector.
b. Non-spacing marks should follow the base character for consistency.
c. Imaging and the order of multiple non-spacing diacritics should follow well-defined rules. (See Annex ____.)
d. To allow for compliance with future versions of 10646 that may encode additional pre-composed characters, allow encoding a character either as a pre-composed character or as a base character with one or more non-spacing marks. (That is, delete the ECMA statement "if the accented letter is already coded as a single character, the alternative representation by means of floating diacritical marks is not allowed.") This assumes that future revisions of 10646 will take certain characters that used floating marks in the current version of 10646 and encode them as pre-composed characters.
e. All sequences of codes should be allowed because of the difficulty of enforcing a legislation against certain sequences of code positions.

Vote Thursday: 16 for/ 0 against/ 1 abstain (Sato)/ absent (Bohn, Hasegawa, Ksar)

4. Define the merger (10646M) of DIS 10646 and Unicode as a 4-octet code.

Vote Thursday: 16 for/ 0 against/ 0 abstain/ absent (Hasegawa, Ksar, Bohn)

We support the 4-octet definition of the merger of DIS 10646 and Unicode. Using 4 octets allows the flexibility needed to expand the code repertoire to meet all foreseeable requirements.

5. Location of Space for Presentation Forms

We would support a drastic reduction or elimination of the presentation forms in the base multilingual plane while retaining codes necessary to transcode existing standards in plain text.
People were concerned that DIS 10646 reserved too much unused code space in the base multilingual plane. A final determination of the presentation codes will be made in consultation with Arabic and other experts.

Vote Thursday: 15 for/ 0 against/ 1 abstain (Becker)

6. Combine the Repertoires of DIS 10646 and Unicode into the Merged Code.

We propose that the repertoire of the base multilingual plane of the merged code, 10646M, be derived from a superset composed of the union of the repertoires of DIS 10646 and Unicode; for example, the superset should include pre-composed Latin, Greek, Hangul, Vietnamese, and additional symbols.

Vote Friday: 10 for/ 0 against/ 0 abstain

7. Simplify the Compaction Methods.

We are concerned about the complexity of the DIS 10646 compaction forms. For simplicity, we propose that there be several parts to the standard:

Part 1: General introduction, terminology, etc.
Part 2: The base multilingual plane (BMP). This part of the standard will specify the 2-octet implementation of the BMP. Other parts are not required for conforming implementations of the BMP. This part may be implemented without announcers.
Part 3: The full four-octet canonical form.
Part 4: Mechanisms for other compaction methods to be determined.

In the absence of other introducers for 10646 data, Part 2 should be assumed.

Vote Friday: 10 for/ 0 against/ 0 abstain

8. Make Annex H Part of the 10646 Conformance statement.

We recommend moving Annex H of DIS 10646 into the main body of the standard and making it a requirement for conformance.

Vote Friday: 9 for/ 0 against/ 0 abstain/ 1 absent (Bohn)

Due to time limitations we were unable to discuss and make recommendations to resolve the following differences between DIS 10646 and Unicode.

9. Coding of Semantics versus Shape.

For example, parentheses, brackets and braces are coded as open/close in Unicode, and as left/right in DIS 10646.

10. Using Any Multi-Octet Coded-Character-Set Will Require Program Changes.
The following two examples show that neither DIS 10646 nor Unicode may be blindly used with the C programming language.

a. C Language Wide-Character (wchar_t) Model

Padding ISO 8859/1 characters with the decimal 032 value precludes the direct use (without conversion) of 10646 compaction forms 2-4 as the wchar_t data type in the C programming language. This is point 3 in the Canadian position statement.

b. NULL Characters in the C Language

Unicode may use 000 as the first or second octet of the 2-octet code. The C language uses the NULL (000) octet as a character string terminator for 1-octet character data. Therefore, C programs must be rewritten to use Unicode.

11. Other Issues

The above list of differences between Unicode and DIS 10646 is not exhaustive. Other lower priority issues also need to be considered.

Action Items to Promote the Agreement

1. Participants will lobby for this proposal with their country and company constituencies. (All, immediately)
2. Ask the Unicode Consortium member companies to place a discussion of this document on the agenda of the next Unicode Consortium meeting on June 7. The Unicode Consortium should formally state that it agrees or disagrees with the general direction and state any of its concerns with specific points. (Whistler)
3. Form a joint editing committee to help draft the final 10646 merged standard. (Freytag provides updated code tables, Hasegawa provides updated structure and text, 15 Aug. list the areas of the DIS 10646 document that would require changes)
4. For closer cooperation between ISO and the Unicode Consortium, we encourage the Unicode Consortium to pursue becoming a liaison member of JTC1/SC2, and for JTC1/SC2 to accept the Consortium as a liaison member. (Unicode Consortium, Aug., 1991)
5. Send this report to the national bodies and ask them to consider our consensus agreement in their votes on ISO-IEC DIS 10646. (Hart, 29 May)
6.
Provide a list of the advantages of having one multi-octet code rather than two. (Andersen, done) 7. (Point 1) Coordinate an investigation of the impact of coding in C0. (Scheinberg, 15 Aug.) 8. (Point 2) Using formal minutes and other information, summarize the Tokyo CJK-JRG meeting. (Collins, 31 July) 9. (Point 3) Provide the Annex describing the rules to be used with multiple non-spacing marks. (Whistler, 9 June) 10. (Point 3) Coordinate review by ISO TC46 and CCITT of proposed use of non-spacing marks. (Smith-Yoshimura (TC46) and Friemelt (CCITT), Aug. 15) 11. (Point 5) Coordinate a review of the need to reserve so large an area for presentation forms for Arabic and other scripts on the base multilingual plane. (Ksar and Friemelt, 15 Aug.). 12. (Point 6) Investigate need for composed characters from Cyrillic and Polytonic Greek. (Why did WG2 include them in the DIS?) (Whistler, 15 Aug.) 13. (Point 7) Coordinate an investigation of which compaction methods to propose in Part 4. (Jarnefors, 15 Aug.) 14. Create 10646M electronic distribution list. Send electronic mail message to Hart to subscribe. (Hart, done) (End of Document) ========================================================================= Date: Thu, 30 May 91 18:12:19 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> From: schein@TOROLAB5.VNET.IBM.COM Subject: C0 committee I am calling for nominations for this work. Olle already suggested two names of Internet experts: Johny Eriksson (Swedish University Computer Network) Greg Vaudrelli (????) I am also planning to invite Mr. Bernard Marty (SC2 chairman and CCITT expert) as suggested by Michel. Please nominate experts from your companies (particularly from DEC and HP, who obviously have concerns in this area) in addition to people from CCITT, ISO and application/communications areas. Most of the work will be done by EMAIL. 
We may hold one meeting in NA and one in Geneva (adjacent to the WG2 meeting).

Isai

P.S. Olle, can you provide telephone number for Mr. Greg Vaudrelli?
=========================================================================
Date: Thu, 30 May 91 14:42:46 PDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: "F. Avery Bishop 30-May-1991 1441"
Subject: RE: AD-HOC meeting in Geneva

Isai,

I think the attached personal attack on Mike is unwarranted, to say the least. I sat through 1 1/2 days of the WG2 meeting over which Mike presided. It was clear to me that he was fair and unbiased in his management of the meeting, in contrast to when he participates as an individual.

If you have a personal dispute with Mike, you should take it up with him privately rather than sending diatribes over the public channels.

Avery

----------------------------Original message----------------------------
Ed, the importance of ad-hoc vs WG2 meeting is in who is going to control it. From experience, I am afraid that meeting controlled by Mr. Ksar will have much less chance to proceed and end in harmony. It will also allow other people (Klaus) easier participation. After we prepare the agreed document in AD-HOC meeting, it will be much more difficult to kill it later.

Isai
=========================================================================
Date: Thu, 30 May 91 22:08:48 EDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: schein@TOROLAB5.VNET.IBM.COM
Subject: RE: AD-HOC meeting in Geneva

Avery,

>Isai,
>I think the attached personal attack on Mike is unwarranted, to say the
>least.
I did not intend to attack Mike, but just to express my opinion that the next step with 10646M will be easier with Ed as a chair, primarily because Ed's main drive is to succeed with the merge, whatever the final technical solution is. If you see this as an offense, I am taking my remark back. >If you have personal dispute with Mike, you should take it up with him >privately rather than sending diatribes over the public channels. You are absolutely right, but I did not send this note to the public forum, just to Ed, who inadvertently resent it to this forum. Isai ========================================================================= Date: Fri, 31 May 91 09:50:38 EDT Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET> From: Edwin Hart Subject: Re: 10646M Minutes --Notes In-Reply-To: Your message of Tue, 28 May 91 15:36:56 PDT Ken, thanks for your comments. Here is how I handled them. >Here are my more picky notes on the draft, followed by a couple >of more hefty substantive comments on two of the points regarding >Areas of Consensus. > >I concur with Olle's comments re wording of point 6. C0-C1 restriction >and 7. Non-spacing marks. > >Concur with comments circulating re removing the schedule of the next >ad hoc meeting in Geneva from paragraph 5 of the Summary. > I did these and fixed the typos. Unfortunately, I did not fix Van Stee's name--He also sent me a note after I had the "final" reproduced. I have corrected it for the final August version > >===================== > >Substantive fixes: > >Areas of Consensus 4., 2nd paragraph: > I changed the wording to say it would be a 4-byte code and removed the part about the Canadian proposal. I believe (and I could be wrong) that the intent was that 10646M would be a 4-byte code rather than to specify coding of the BMP. 
So far we have concentrated on the architecture and purposefully deferred discussing code assignments, including the number of the plane for the BMP. These decisions need to wait on the report on C0 coding.
>
>This is to be contrasted with the current DIS 10646, where the
>three values would come out to:
>Unicode U+0041 ==> decimal 65
>10646 032/065 ==> decimal 8257
>10646 032/032/032/065 ==> decimal 538,976,321
>
>Incidentally, this numerical values problem is not just numerology.
>Making the ASCII character value = the 16-bit character value
>= the 32-bit canonical form character value is a MAJOR help
>to conversion of existing software, and to my mind is the
>strongest argument by far for agreeing to abandon the C0 restriction.
>The second strongest has to do with value contiguity, range-checks,
>and table-size. Only the third level of the argument has to
>do with the overall coding space-size--and even that one is important!
>
You have made a good point here for removing the C0 restriction. Your examples show why you and several other people from Unicode (Joe Becker included) have been so concerned with multiple representations of the same character in DIS 10646. The programmer must be much more careful in handling the compaction methods. I would only use the compaction methods for storage and transmission. For processing, I would first "normalize" 10646 into either a 2-byte or 4-byte form.
>
>Areas of Consensus 11, NULL Characters in the C language
>
>
>What I am getting at is that all C code designed for 8-bit character
>interfaces has to be rewritten to handle multiple-octet codes. Unicode
>or DIS 10646 both have this problem. All multiple-byte character
>encodings have this problem. And ALL computer languages have this
>problem (Assembly, Cobol, Fortran, Forth, Pascal, Modula, C++, APL,
>Lisp, Snobol, SmallTalk, Eiffel, Icon, ...) --they are ALL broken if
>interfaces to handle strings in 8-bit units get handed strings with
>characters encoded in 16-bit or 32-bit units. They ALL need to be
>fixed.
>
I tried to capture the above thought when I edited the final draft. Isai said that I still failed to capture the right ideas. However, we have not discussed it but can discuss it at our next meeting so that I can get it right. This was only the third mistake I made yesterday. I wonder where I'll get into trouble next?

I expanded that we need to have a statement from the Unicode Consortium saying that they approve or disapprove of the general direction of this 10646M group. (This is for the June 7 meeting.)

Also for Cyrillic pre-composed characters and Polytonic Greek, you need to review the SC2 and WG2 documents to find why they were included in the DIS and any other specific comments. I think the group needs to understand why they were included in the first place, what the arguments are for including them and the arguments for removing them. For the proposed merger, we need to decide what to recommend.

Thanks again for your comments.

Ed
=========================================================================
Date: Fri, 31 May 91 08:15:31 PDT
Reply-To: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender: "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From: "J. G. Van Stee"
Subject: Olle Jarnefors comments (one-byte compaction form)

Is the requirement to be able to compact a four-byte code to one or to be able to continue processing existing one-byte code pages? The former could result in the generation of many new code pages and the latter suggests that a mechanism is required to convert between current code pages and an lcs.
Van
=========================================================================
Date:         Fri, 31 May 91 11:18:54 EDT
Reply-To:     "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
Sender:       "10646M: Multibyte code working group" <10646M@JHUVM.BITNET>
From:         Edwin Hart
Subject:      Guess who did not have his name on the distribution list

Until a few moments ago, I did not have my name on the 10646M list. Color
my face red. I just obtained a copy of the log to see all of the messages.
As the owner, I get all of the mail delivery error messages but none of the
other data traffic.

Lloyd, since I just saw your note now, I did not put any of it into the
final document.

I must also apologize because in my haste I forwarded some personal mail to
the distribution. As one trying to encourage trust between the ISO and
Unicode people, I sure have made a big mess of it. I am sorry.

Since we have the list available, send mail to it for distribution. I will
not redistribute mail sent directly to me unless you direct me to
redistribute it.

Here is a list of the people now on the electronic distribution list:

*
* 10646M: Multibyte code working group
*
* Confidential= Yes
* Files= No
* Mail-via= Dist2
* Notebook= Yes,X1/201,MOnthly,Public
* Owner= HART@APLVM
*
*
* 10646M mailing list
*
* Location: JHUVM
*
* Purpose:
*
* For discussion of merging ISO DIS 10646 and Unicode into one
* global multibyte code.
*
ojarnef@ADMIN.KTH.SE                    Olle Jarnefors
HART@APLVM                              Edwin Hart
jenkinsj@APPLE.COM                      John Jenkins
Davis.Mark@APPLELINK.APPLE.COM          Mark Davis
ecoling@APPLELINK.APPLE.COM             Lloyd Anderson
Bishop@DECWET.ENET.DEC.COM              F. Avery Bishop
ksar@HPCEA.CE.HP.COM                    Mike Ksar
Takayuki_K_Sato%e2@HP8900.DESK.HP.COM   Takayuki K Sato
ma_hasegawa@JRDV04.DEC.COM              Masami Hasegawa
whistler@METAPHOR.COM                   Ken Whistler
andersen@RALVMK.VNET.IBM.COM            Jerry Andersen
bl.kss@RLG.STANFORD.EDU                 Karen Smith-Yoshimura
jvanstee@STLVM7.VNET.IBM.COM            J. G. Van Stee
schein@TOROLAB5.VNET.IBM.COM            Isai Scheinberg
microsoft!asmusf@UUNET.UU.NET           Asmus Freytag
microsoft!michelsu@UUNET.UU.NET         Michel Suignard
becker.osbu_north@XEROX.COM             Joe Becker
*
* Total number of users subscribed to the list:  17
* Total number of local node users on the list:   0
*