IT/02-0689, L2/02-263R

INCITS / L2 ANNUAL REPORT

Covering the Period from June 2001 through August 2002
Title of INCITS Subgroup L2: Character Sets and Internationalization

2002-08-15

L2 Website (Password Protected)

Informal description of work

Executive Summary

Link to L2 Projects List on INCITS Website

Significant accomplishments

Significant challenges

Expected challenges

Previous year’s meetings

Next year’s meetings

Liaison activities

Membership and Officers

Future trends

Other administrative information

 

Informal Description of Work.

L2 is the US-TAG for character sets (JTC1/SC2) and for internationalization (JTC1/SC22/WG20). Both subject areas are essential for the development of well-globalized, internationally usable systems and applications. They are particularly important in the rapidly growing marketplace of international WWW access and electronic commerce.

Executive Summary.

The number of L2 members is presently 13, taking the merger of HP and Compaq into account. The continued interest in L2 and stability of the membership stems from the following:

  • Unicode/ISO 10646 based products continue to grow, both in number and in diversity (more companies are implementing Unicode/ISO 10646);
  • Support for Unicode/ISO 10646 on the World Wide Web continues to increase;
  • L2 is the TAG for SC22/WG20 (Internationalization), which more companies recognize as important because internationalization has become a mission critical feature of their products.

L2 works very closely with the Unicode® Consortium; all technical meetings are co-located.

L2 now meets four times per year. Co-location with the Unicode Technical Committee meetings is economical, since most of the members of L2 are also members of the Unicode Consortium and the subject matters overlap widely.

L2 has switched almost totally to electronic document distribution via our web site (password protected), including sets of 20 zipped documents for faster downloading. Paper documentation is used infrequently, mostly in situations where there currently is no technical solution available (that is, the script is not yet in ISO 10646 or legacy character sets, it is not used electronically and there is no means to input, render or display the text); however, scanned copies of these documents are also used.

The work on ISO 10646, after publication of the second edition of ISO 10646-1:2000, has two aspects at this time. One is the continued addition of scripts and characters to the Basic Multilingual Plane, as manifested in Amendment #2 to ISO 10646-1:2000, currently in FPDAM ballot; the other is the allocation of scripts in the Supplementary Planes, off the Basic Multilingual Plane (as seen in Amendment #1 to ISO 10646-2, which is also in FPDAM ballot). The US is contributing substantially to these projects, as the editor is Michel Suignard from Microsoft in Redmond, who is also the International Representative (IR) of L2.

The work on internationalization as TAG of SC22/WG20 includes contributions to and reviews of draft standards and technical reports, such as the international string ordering standard IS 14651, the guide for the development of programming language standards TR 10176, and the registration standard for cultural elements IS 15897.

Significant Accomplishments.

For all items below, the US has made substantial contributions; for most projects we provide the editor.

  • ISO/IEC 10646-1:2000 (second edition), Universal character set, Basic Multilingual Plane (BMP), has been approved and published on CD.
  • ISO/IEC 10646-1:2000 (second edition), Amendment #1, Universal character set, Basic Multilingual Plane (BMP), has been approved.
  • ISO/IEC 10646-2, Universal character set, Supplementary planes, has been approved and published.
  • ISO/IEC TR 10176:2001 (third edition), Guidelines for the preparation of programming language standards, has been approved and is being published; the fourth edition is in DTR ballot. These revisions of the TR are necessary due to the growing repertoire of 10646.
  • ISO/IEC 10646 aka Unicode in programming languages: at the February 2002 meeting of L2/UTC, we hosted a meeting of programming language experts and conveners of national and international PL committees to discuss the use of the Universal Character Set in programming languages, and the possible addition of a Unicode data type to at least C and C++. The meeting was a success; it will be carried into the international arena with an ad-hoc session at the SC22 plenary in August 2002. L2 will have a representative there for technical leadership, and we hope to continue this fruitful discussion with experts in programming languages in the future.

Significant Challenges.

TAG for SC2:

  • Present challenges are the completion of the work on ISO/IEC 10646-1:2000, Amd. #2 and ISO/IEC 10646-2 Amd. #1. Since these standards contain as-yet-unavailable characters, the creation of high-quality fonts and the printing of code charts are huge tasks for the editors, often with external dependencies.
  • Maintaining synchronization with the Unicode standard, which is widely implemented, is crucial. This is achieved through co-location of technical meetings and strong liaison activities.
  • Additionally, fending off new proposals for 8-bit character sets (especially from emerging markets) presents a constant challenge.

Co-operation with C and C++ committees:

  • L2 hosted a combined meeting of representatives from J11, J16, and L2 to develop a consistent, acceptable model for a data type that allows the use of ISO 10646 data in its UTF-16 format. Work on this effort will continue with an ad-hoc meeting at the SC22 plenary in Finland, August 2002. It is crucial to make sure that ISO 10646 (UTF-16) is a valid data type.
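As an illustration of what such a data type would hold, the following minimal Python sketch converts a single ISO 10646 code point into UTF-16 code units, using a surrogate pair for code points beyond the Basic Multilingual Plane. The function name `to_utf16_code_units` is hypothetical and is not taken from any J11, J16, or L2 proposal; the sketch shows only the UTF-16 encoding form, not the data type under discussion.

```python
def to_utf16_code_units(code_point):
    """Encode one ISO 10646 code point as a list of 16-bit UTF-16 code units.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the range U+0000..U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points do not represent characters")
    if code_point < 0x10000:
        # Basic Multilingual Plane: one code unit, equal to the code point.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


# U+0041 is in the BMP; U+10330 is a Gothic letter encoded in Plane 1.
for cp in (0x0041, 0x10330):
    units = to_utf16_code_units(cp)
    print(f"U+{cp:04X} ->", " ".join(f"{u:04X}" for u in units))
```

The BMP character occupies one 16-bit unit, while the Plane 1 character occupies two (D800 DF30). A 16-bit data type therefore holds code units rather than whole characters, which is one reason a consistent model across C, C++, and ISO 10646 matters.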

TAG for SC22/WG20:

  • Synchronizing the Unicode Collation Algorithm with ISO/IEC 14651 (International string ordering) needs constant US input to the WG20 project. The nature of collation data makes it very difficult to time research on script collation so that the collation data is available at the same time as the new scripts. As a result, the collation data lags behind the repertoire.
  • L2 was successful in having the TR 15435 project withdrawn by SC22.

Expected Challenges.

TAG for SC2:

  • Ensure architectural consistency in both parts of ISO/IEC 10646.
  • Prioritize the encoding of new scripts according to market demands and technical readiness, including fonts.
  • Political challenges:
    • North Korea requesting the change of names and sequence of characters;
    • Various state governments in India requesting the change of the script model and reordering of characters;
    • Cambodia requesting the deprecation of a large set of characters and replacing them with an architecturally different set of characters (same script, different means of representing the script and language).
  • Unreasonable requests for pre-composed characters, especially in the Indic scripts (vis-à-vis the matra model in 10646).
  • Requirements from East Asia, e.g., compliance with the new character set (and PRC standard) GB 18030.
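To make the matra model mentioned above concrete, here is a small Python sketch using the standard `unicodedata` module. It shows a Devanagari syllable represented as a consonant followed by a dependent vowel sign (matra), and that normalization leaves the pair as two code points because no pre-composed character exists for it. The helper name `describe` is hypothetical, and the character data comes from whatever Unicode Character Database version is bundled with the Python interpreter.

```python
import unicodedata


def describe(text):
    """Return (code point, character name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


# The syllable "ki": consonant KA followed by the dependent vowel sign I.
syllable = "\u0915\u093F"

for entry in describe(syllable):
    print(entry)

# NFC composes a sequence only where a pre-composed character is defined;
# there is none for this consonant + matra pair, so the length stays 2.
composed = unicodedata.normalize("NFC", syllable)
print(len(composed))
```

Encoding every consonant and vowel-sign combination as its own pre-composed character would multiply the repertoire without adding any text that the sequence model cannot already represent.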

TAG for SC22/WG20:

  • Ensure a high-quality registration process definition in ISO/IEC 15897.
  • Add further repertoire to the ISO/IEC 14651 ordering standard.

Previous year’s meetings.

# 185    August 7-10, 2001       Redmond (Microsoft)
# 186    November 6-9, 2001      Mountain View (Microsoft)
# 187    February 11-14, 2002    Mountain View (Microsoft)
# 188    April 30-May 3, 2002    Pleasanton (PeopleSoft)

Next year’s meetings.

# 189    August 20-23, 2002      Redmond (Microsoft)
# 190    November 5-8, 2002      Nashua, NH (HP)
# 191    March 4-7, 2003         Mountain View (Microsoft)
# 192    Q2 2003                 San Jose (Adobe)

 

Liaison activities.

Liaison Representatives to L2:

 

Committee                                                    Representative
FCC            Federal Communications Commission             D. Campbell
NISO (Z39)     National Information Standards Organization   S. McCallum
NCITS          NCITS Operational Management Committee        David Thewlis
TC46/SC4/WG1                                                 R. Barry
X3J4           COBOL                                         A. Bennett
SC22/WG4       COBOL                                         A. Bennett

 

Liaison Representatives from L2:

 

Committee                                                    Representative
JTC1/SC2       Character Sets and Information Coding         M. Suignard
SC2/WG2        Universal Coded Character Set                 M. Suignard
SC2/WG3        7-bit and 8-bit Codes                         M. Suignard
NISO (Z39)     National Information Standards Organization   J. Aliprand
WG2/IRG        Ideographic Rapporteur Group                  J. Jenkins

Significant activities with liaisons:

  • COBOL: work mainly on uppercase-to-lowercase mappings for cased scripts, equivalence, and characters for identifiers.
  • TC46: registration of bibliographic character sets with the character set registry (SC2, Japan).
  • The programming language community: the programming language and Unicode meeting in February, as discussed in an earlier section.
  • WG2/IRG: L2 continues to work with the IRG to develop the most workable set of CJK (Chinese-Japanese-Korean) characters for ISO 10646. The liaison work includes locating quality fonts for the standard, proposing new characters as needed and reviewing requests from the IRG.

Membership and Officers.

a. Officers.

Present Officers:

Position                        Name               Organization    Training Date
Chair                           Cathy Wissink      Microsoft       1/29/02
Vice Chair                      Lisa Moore         IBM             7/17/00
International Representative    Michel Suignard    Microsoft       7/17/00
Vocabulary                      Open

b. Membership.

Please see the appendix at the end of the document.

Future trends.

Membership: L2 presently has 13 members: Apple Computer, Inc.; Hewlett-Packard Company; IBM Corporation; Microsoft Corporation; NCR; Oracle Corporation; PeopleSoft; Progress Software Corporation; The Research Libraries Group, Inc.; SHARE, Inc.; Sun Microsystems; Sybase, Inc.; Unicode, Inc.

Changes since last year: Unisys has dropped out. Compaq and HP have merged and the combined company is now listed as HP.

The membership appears to be stable at this time, although it is clear from discussions within the TC that the economy is starting to impact members’ ability to justify the costs of participating in standards.

Market relevance of standards area: The market relevance for this area of standardization (character sets and internationalization) is great. Most major software companies make a significant portion of their profits from outside of the US with globalized software (e.g., Microsoft makes over 50% of their profit outside the US), and both the Universal Character Set and internationalization play a big role in this.

As international interoperability becomes increasingly important, so does the Universal Character Set. ISO 10646 is used increasingly in Java, C#, in XML, on the web (e.g., the W3C’s work in the Character Model), and in other Internet standards (e.g., the IETF’s Internationalized Domain Name work), and is considered the logical character set for world wide use. Programming languages such as C, C++, SQL, COBOL, and FORTRAN are now enabling the use of ISO 10646. A new data type for Unicode (UTF-16) in programming languages is under consideration.

There are huge emerging markets (e.g., SE Asia, India, Africa) now recognizing the importance of communication world-wide. As these markets move towards greater technical capabilities, the standards work in L2 will become even more relevant.

Other administrative information.

None. L2 does not collect or disburse funds.

 

Regards,

Cathy Wissink, L2 chair


Membership appendix:

(Please note that this list from INCITS is severely out of date, and does not take the Compaq/HP merger into account, as that information was not available.  I will be updating this at the next meeting.)

06/14/2002

INCITS L2

Codes and Character Sets

MCGOWAN, RICK (Alternate), APPLE COMPUTER INC, Voting
ONE INFINITE LOOP 302-1NS, CUPERTINO, CA 95014 USA
Email: rmcgowan@apple.com
Phone: 01-(408) 974-1427; Fax: 01-(408) 974-5639

JENKINS, JOHN (Principal), APPLE COMPUTER INC, Voting
MS 302-2IS 2 INFINITE LOOP, CUPERTINO, CA 95014 USA
Email: jenkins@apple.com
Phone: 01-(408) 974-6276; Fax: 01-(408) 862-4566

LONG, WAI MAN (Principal), Compaq Computer Corp, Voting
ZK03-2W/17 110 SPIT BROOK ROAD, NASHUA, NH 03062-2698 USA
Email: longman@zk3.dec.com
Phone: 01-(603) 884-0268; Fax: 01-(603) 884-2257

MARTIN O'DONNELL, SANDRA (Alternate), Compaq Computer Corp, Voting
ZK03-2W/17 110 SPIT BROOK ROAD, NASHUA, NH 03062-2698 USA
Email: sandra.odonnell@compaq.com
Phone: 603-884-6257; Fax:

RANNENBERG, WENDY (Additional Alternate), Compaq Computer Corp, Voting
I18N TECHNICAL DIRECTOR 110 SPIT BROOK ROAD, ZKO3-2W/, NASHUA, NH 03062-2698 USA
Email: wendy.rannenberg@compaq.com
Phone: 603-884-0405; Fax: 603-884-2257

Draper-Campbell, Donald (Liaison), FEDERAL COMMUNICATIONS COMM
Room 7130K 2025 M Street, Washington, DC 20554 USA
Email: campbell@sytex.com
Phone: 202-653-8113; Fax: 202-653-8773

Carroll, Don (Alternate), Hewlett-Packard Co, Voting
Suite 700 101 Stewart Street, Seattle, WA 98101 USA
Email: don_carroll@hp.com
Phone: 206-269-4011; Fax: 206-269-4020

MCCALLUM, SALLY (Liaison), LIBRARY OF CONGRESS
NCMSO PROCESSING SERVICE, WASHINGTON, DC 20540-4020 USA
Email: smcc@loc.gov
Phone: 202-287-6237; Fax: 202-707-0115

Ksar, Mike (Additional Alternate), MICROSOFT CORP, Voting
One Microsoft Way, Redmond, WA 98052 USA
Email: mikeksar@microsoft.com
Phone: 425-707-6973; Fax: 425-936-7329

SARGENT, MURRAY (Alternate), MICROSOFT CORP, Voting
ONE MICROSOFT WAY, REDMOND, WA 98052-6399 USA
Email: murrays@microsoft.com
Phone: 425-936-8942; Fax: 425-936-7329

Wissink, Cathy (Additional Alternate), MICROSOFT CORP, Voting
1 Microsoft Way, Redmond, WA 98052 USA
Email: cwissink@microsoft.com
Phone: 425-705-8738; Fax: 425-936-7329

Roberts, Gary (Principal), NCR CORPORATION, Voting
17095 Via del Campo, San Diego, CA 92127 USA
Email: gary.roberts@sandiegoca.ncr.com
Phone: 858-485-3803; Fax: -

Maghbouleh, Albert (Alternate), NCR CORPORATION, Voting
100 North Sepulveda Boulevard, El Segundo, CA 90245 USA
Email: albert.maghbouleh@ncr.com
Phone: 310-524-6404; Fax: -

YANG, JIANPING (Principal), ORACLE CORPORATION, Voting
BOX 659206 500 ORACLE PARKWAY, REDWOOD SHORES, CA 94065 USA
Email: jiyang@us.oracle.com
Phone: 01-(650) 506-4865; Fax: 01-(650) 506-7223

Yau, Michael (Alternate), ORACLE CORPORATION, Voting
500 Oracle Parkway Box 659409, Redwood Shores, CA 94065 USA
Email: myau@us.oracle.com
Phone: 650-506-0730; Fax: 650-506-7223

PHIPPS, TOBY (Principal), PEOPLESOFT, Voting
4411 PEOPLESOFT PARKWAY, PLEASANTON, CA 94588-3031 USA
Email: tphipps@peoplesoft.com
Phone: 925-694-9525; Fax: 707-221-7432

Kurosu, Hirobumi (Alternate), PEOPLESOFT, Voting
4440 Rosewood Drive, Pleasanton, CA 94588-3100 USA
Email: hirobumi_kuosu@peoplesoft.com
Phone: 01-(925) 694-1401; Fax: 01-(925) 694-3100

TEXIN, TEX (Principal), PROGRESS SOFTWARE CORP, Voting
14 OAK PARK, BEDFORD, MA 01730 USA
Email: texin@progress.com
Phone: 01-(781) 280-4271; Fax: 01-(781) 280-4655

WATT, STEVEN (Alternate), PROGRESS SOFTWARE CORP, Voting
14 OAK PARK, BEDFORD, MA 01730 USA
Email: swatt@progress.com
Phone: 781-280-4569; Fax: 781-280-4655

SMITH-YOSHIMURA, KAREN (Alternate), RESEARCH LIBRARIES GROUP INC, Voting
1200 VILLA STREET, MOUNTAIN VIEW, CA 94041-1100 USA
Email: KSS@NOTES.RLG.ORG
Phone: 650-691-2270; Fax: 650-964-0943

Aliprand, Joan (Principal), RESEARCH LIBRARIES GROUP INC, Voting
1200 Villa Street, Mountain View, CA 94041-1100 USA
Email: Joan_Aliprand@notes.rlg.org
Phone: 01-(650) 691-2258; Fax: 01-(650) 964-0943

HART, EDWIN (Principal), SHARE INC, Voting
C/O APPLIED PHYSICS LABORATORY 11100 JOHNS HOPKINS R, LAUREL, MD 20723-6099 USA
Email: edwin.hart@jhuapl.edu
Phone: 240-228-6926; Fax: 240-228-1093

Thewlis, Dave (Liaison), SHARE INC, Voting
1460 Kane Ridge Road P.O. Box 670, Trinidad, CA 95570-0670 USA
Email: dthewlis@dcta.com
Phone: 707-488-9978; Fax: 707-488-2618

SMITH, WILLIAM (BILL) (Alternate), SUN MICROSYSTEMS INC, Voting
MPK 16-201 901 SAN ANTONIO ROAD, PALO ALTO, CA 94303 USA
Email: bill.smith@eng.sun.com
Phone: 650-786-9127; Fax: 650-786-9553

HIURA, HIDEKI (Principal), SUN MICROSYSTEMS INC, Voting
MPK 16-201 901 SAN ANTONIO ROAD, PALO ALTO, CA 94303 USA
Email: hiura@eng.sun.com
Phone: 650-786-8906; Fax: 650-786-9553

Whistler, Ken (Principal), SYBASE INC, Voting
MLG Building 1301 65th Street, Emeryville, CA 94608 USA
Email: kenw@sybase.com
Phone: 510-922-3611; Fax: -

MacLead, Ian (Alternate), SYBASE INC, Voting
1650 - 65th Street, Emeryville, CA 94608-1012 USA
Email: imaclead@sybase.com
Phone: 510-922-3611; Fax:

Freytag, Asmus (Principal), UNICODE INC, Voting
c/o ASMUS, Inc. / BASIS 6008 Corliss Avenue North, Seattle, WA 98103 USA
Email: asmusf@ix.netcom.com
Phone: 206-523-1670; Fax: 206-523-0517