L2/00-205

From: Joan_Aliprand@notes.rlg.org
Sent: Monday, June 26, 2000 7:43 PM
Subject: Report on 2000 meeting of ISO/TC46/SC4/WG1

This is an informal report on this year's meeting of ISO/TC46/SC4/WG1 based on my notes. -- jma

The 2000 meeting of ISO/TC46/SC4/WG1 was held on May 9 in Berlin (hosted by DIN). I attended as a member of the US delegation.

In 1998, L2 objected [L2/98-285] to registration of TC46 character sets and a NISO character set (submitted by NSAI for ISO/TC46/SC4 and the National Information Standards Organization respectively). One of L2's grounds for objection was that the mappings to ISO/IEC 10646 characters included in the registrations contained errors.

My purpose in attending the May 9 meeting was to resolve the disagreement between L2 and WG1 over character set mappings. As a member of NCITS/L2 and of NISO (the US TAG to ISO/TC46), RLG has a foot in both camps. Many of the TC46 character sets are based (in whole or in part) on sets from either the Library of Congress or the British Library. RLG has particular experience with the LC sets, having developed the ones for CJK, Hebrew, and Arabic.

To provide a convincing argument for changes to WG1's mappings, I compiled tables comparing WG1's work, L2's recommendations, and independently created mappings for TC46 character sets or their sources. The comparisons included a final column showing the "consolidated opinion of expert groups representing interested parties." Thus, the comparison was not simply L2 head-to-head with WG1; other opinions were represented as well. (Some of the comparative tables are available in PDF format at http://www.niso.org/sc4act.html#wg1.)

RLG submitted these comparisons, together with two summaries intended as working documents for the meeting, as a member contribution to NISO, thus ensuring that they would have national body standing.

WG1 met on May 9, with the Convener, Randy Barry, presiding.
Countries represented were Australia, Austria, China, France, Ireland, Italy, Japan, Slovenia, South Korea, and the USA.

The bulk of the meeting was devoted to the mappings. The most active participants in this work were Randy Barry (WG1 Convener), Michael Everson (Ireland and JTC1/SC2 liaison to SC4), Akira Miyazawa (Japan), and myself (US).

Summary of decisions

Registration of TC46 standards (needed to support data exchange in the UNIMARC bibliographic format) was separated from mapping of characters in the standards. Applications for Registration will be re-submitted without mappings. Randy Barry is to contact Ms. Kimura to determine what the Registration Authority currently requires for registration.

The "consolidated opinion" was accepted in all but these three cases:

ISO 5426-2, character 31 PRIME "Used in Sami" was mapped to U+00B4 ACUTE ACCENT. This decision was based on Michael Everson's implementations for Skolt Sami. The comparative mappings were all different; the "consolidated" mapping was L2's U+02C8 MODIFIER LETTER VERTICAL LINE because the other choices had problems.

ISO 5428, characters 5/4 Koppa (Capital letter) and 7/4 Koppa (Small letter) will be mapped to the code values for the disunified "tailed" forms when WG2 assigns them. (Change proposed by Michael Everson.)

ISO 6862, Table 2, character 5/1 Diagonal rule to left was mapped to U+002F SOLIDUS instead of WG1's earlier choice of U+2215 DIVISION SLASH (with which L2 had concurred). ISO 6862 gives no information about the use of this character, so it was considered preferable to unify it (via mapping) with the generic ASCII slash than to impute a specific function to it through the choice of mapping. (Although the DIVISION SLASH mapping preserves round-trip integrity, I am not aware of any implementations of ISO 6862, so I did not oppose this decision.)
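
The round-trip point in the ISO 6862 decision can be illustrated with a minimal sketch. It is not part of WG1's work. It records two of the decisions above as a table and checks whether any Unicode code point is the target of more than one source character. The assumption, purely for illustration, is an environment in which ISO 6862 Table 2 is used alongside ASCII, whose position 2/15 is SOLIDUS; the names WG1_DECISIONS, EARLIER_CHOICE, ASCII_SLASH, collisions, and round_trips are hypothetical.

```python
import unicodedata

# Decisions taken from this report, keyed by (standard, position as written here).
WG1_DECISIONS = {
    ("ISO 5426-2", "31"): 0x00B4,        # PRIME "Used in Sami" -> ACUTE ACCENT
    ("ISO 6862 Table 2", "5/1"): 0x002F,  # Diagonal rule to left -> SOLIDUS
}

# WG1's earlier choice for the same ISO 6862 character (DIVISION SLASH).
EARLIER_CHOICE = dict(WG1_DECISIONS)
EARLIER_CHOICE[("ISO 6862 Table 2", "5/1")] = 0x2215

# Assumption for illustration only: the data also contains the ASCII slash.
ASCII_SLASH = {("ISO 646 (ASCII)", "2/15"): 0x002F}


def collisions(mapping):
    """Return {code point: [sources]} for targets shared by several sources."""
    by_target = {}
    for source, code_point in mapping.items():
        by_target.setdefault(code_point, []).append(source)
    return {cp: srcs for cp, srcs in by_target.items() if len(srcs) > 1}


def round_trips(mapping):
    """True when every target is unique, so the mapping can be inverted."""
    return not collisions(mapping)


with_solidus = {**ASCII_SLASH, **WG1_DECISIONS}
with_division_slash = {**ASCII_SLASH, **EARLIER_CHOICE}

for label, mapping in (("SOLIDUS", with_solidus),
                       ("DIVISION SLASH", with_division_slash)):
    print(label, "round-trips:", round_trips(mapping))
    for code_point, sources in collisions(mapping).items():
        name = unicodedata.name(chr(code_point))
        print(f"  U+{code_point:04X} {name} is shared by {sources}")
```

Under that assumption, the SOLIDUS mapping cannot be inverted once the data is in ISO/IEC 10646, while the DIVISION SLASH mapping can. Whether this matters in practice depends on whether any implementation of ISO 6862 exists.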
Because ISO 5428 (Greek) and ISO 6862 (Math) now have complete mappings, WG1 recommended their transfer to ISO/IEC JTC1/SC2 (in addition to the six previously transferred).

The comparative analysis demonstrated the need for modifications to mappings for four of the character sets already transferred: ISO 5426 (Extended Latin), ISO 6438 (African), ISO 10585 (Armenian), and ISO 10586 (Georgian). The identity of a character in ISO 11822 (Extended Arabic) was questioned and must be clarified before this mapping can be finalized. There were no errors in the mapping for ISO 5427 (Extended Cyrillic).

Five standards remain to be transferred. Completing the mapping of these character sets will be on the agenda of the next meeting of WG1 (projected for spring 2001).

Appreciation

To Ed Hart and Ken Whistler, who reviewed the comparative mapping tables.

To Kathy Bales of RLG and Pat Harris of NISO for guidance in standards procedures.

To Michael Everson for the promptly issued "Liaison Statement to ISO/TC 46/SC 4, Transfer of ISO/TC 46 Standards to JTC 1/SC 2" (ISO/IEC JTC1/SC2 N3439).

Ken added an appreciation to Joan:

And much appreciation to Joan for working hard on resolving these mapping issues and helping to shepherd these character standards through the transferal process.

-- Joan Aliprand, RLG