Summary Narrative
The Unicode character encoding derives its name
from three main goals:
- universal (addressing the needs of world languages),
- uniform (fixed-width codes for efficient access), and
- unique (each bit sequence has only one interpretation as character codes).
The concept of a 16-bit universal code is not new;
even the original principles of the ISO multi-byte character encoding
are directed to that end:
"...Develop the ISO standard for graphic character
repertoire and coding for an international two byte graphic character
set... Consider the needs of programming languages to have the same
amount of storage for each character..."
[ISO/TC97/SC2 N1436, 1984]
Other antecedents to Unicode are found in the Star
workstation introduced by Xerox in 1980 and in aspects of the two-byte
standards found in the Far East.
Unicode began as a project in late 1987 after
discussions between engineers from Apple and Xerox: Joe Becker,
Lee Collins and Mark Davis. By early 1988, three main investigations
had been completed:
- comparisons of fixed-width and mixed-width text access;
- investigations of the total system storage requirements with two-byte text; and
- preliminary character counts for all world alphabets.
Based on these investigations, and their experience
with different character encodings, Becker, Collins and Davis derived
the basic architecture for Unicode.
In the fall of 1988, Collins began building a
database of Unicode characters.
The original design ordered characters
alphabetically within scripts, and excluded all composite characters.
Xerox had already built up a database of Unified Han for font
construction. Collins used a database of EACC characters from RLG (The
Research Libraries Group) to start a Han Unification database at
Apple. Becker and Collins later correlated the two databases, and
Collins continued to extend the database with further character
correspondences added for other national standards.
In early 1989, the scope of the Unicode working
group was extended to gain the participation of other companies. At
this time, Ken Whistler and Mike Kernaghan of Metaphor, Karen
Smith-Yoshimura and Joan Aliprand of RLG, and Glenn Wright of Sun
joined the working group, and began making significant contributions
to the design.
In mid-1989, a number of changes were made to bring
Unicode closer to existing standards: all existing ISO composite
characters were added to Unicode; "round trips" were added (any
entries distinguished as two characters in national standards would be
distinguished in Unicode); and the ordering was changed to use ISO
8859 ordering where possible.
In early 1990, Michel Suignard and Asmus Freytag
joined representing Microsoft. Together with Whistler, they began an
extensive effort to produce mapping tables to other character
encoding standards. Unicode alphabetics and symbols were essentially
complete by the spring of 1990, but the cross-mapping effort
continued. This extensive mapping effort to ISO 10646, IBM, Mac and
national standards proved to be an invaluable aid to producing a
complete, valid encoding.
Joan Winters started representing SHARE at Unicode
meetings. Isai Scheinberg and J.G. Van Stee of IBM joined in mid-1990.
An extensive IBM review and study was organized at Toronto University.
The results of this review include the compatibility zone for
half-width characters and Arabic glyphs. James Caldwell of Pacific Rim
began working as the editor, pulling together the different documents
into a coherent standard document. Most of the character descriptions
were completed at this time.
By October 1990 the Han characters were also in
final draft. The decision was made to have a broad distribution of a
final review draft of Unicode, to allow the working group the
opportunity to assess and incorporate feedback from a variety of
sources. Microsoft and Aldus volunteered to bear the distribution
cost. Rick McGowan of NeXT began a database of characters for addition
to future versions of the standard.
Mike Kernaghan, Bill English, Mark Davis and Asmus
Freytag organized most of the business aspects of Unicode, Inc., which
was incorporated on January 3, 1991, in the state of California.
The original purpose of the Unicode Consortium was
to "standardize, extend and promote the Unicode character encoding, a
fixed-width, 16-bit character encoding for over 60,000 graphic
characters." The statement of purpose has since been updated to
reflect that the Unicode Standard has grown beyond 16 bits.
The original members of the Board of Directors of
Unicode, Inc. were:
- Larry Tesler, Vice President Advanced Products, Apple Computer, Inc.
- Robert Carr, Vice President Software Development, GO Corporation
- Richard Holleman, Director of Telecommunications, IBM Corporation
- Charles Irby, Vice President of Development, Metaphor Computer Systems
- Paul Maritz, Vice President Advanced Operating Systems, Microsoft Corporation
- Bud Tribble, Vice President Software Engineering, NeXT Computer Inc.
- Jay Israel, Vice President Advanced Technology, Novell, Inc.
- David Richards, Director of Development, The Research Libraries Group
- John Gage, Vice President Desktop Development, Sun Microsystems Inc.
The initial officers of Unicode, Inc. were:
- Mark Davis, President
- Mike Kernaghan, Vice-President
- Joe Becker, Technical Vice-President
- Ken Whistler, Secretary
- Bill English, Treasurer