Unicode IUC14
Abstract

This session offers three separate mini-tutorials that will cover the topics of Characters, Glyphs & Rendering.

What You Need To Know About Processing and Rendering Multilingual Text

Edwin Hart, Senior Engineer, The Johns Hopkins University

The advent of multilingual information processing with Unicode requires the designer to have a deeper knowledge of rendering characters for display and printing than is necessary for a single script, like Latin. Rendering technology that is adequate for a language of the Latin script, like English, may prove totally inadequate for scripts such as Arabic or Devanagari. This presentation introduces a framework to characterize a character in terms of its information, associated shape (or glyph) and the relationships between these two attributes. It first differentiates between the domains of characters and of glyphs and explains when it is appropriate to process text in one domain versus the other. Next, it describes three different technologies used to render Unicode characters into glyphs. Finally, it describes several considerations for design.

The Unicode Character-Glyph Model and Rendering Complex Scripts

John H. Jenkins, Engineer, International and Text Group, Apple Computer, Inc.

Statement of Purpose: This is a slightly expanded and refocused version of the session on complex Unicode rendering that I've given at the past three IUCs. It's now aimed at being a tutorial and discusses some of the requirements in greater depth.

Summary: From its beginnings, Unicode has made an explicit separation between the processes of text generation, text storage, and text rendering. The division between text storage and text rendering is perhaps the most fundamental and is explicitly formulated as the character-glyph model.

Although it's possible to represent most Western European and East Asian languages on computers without the character-glyph distinction, there are a number of scripts, notably the various Levantine and South Asian scripts, where this cannot be done. Moreover, the very large number of accented Latin letters in actual use, together with the needs of high-end Western typography, forces Unicode-based systems to take the character-glyph separation into account.

Specific examples from various writing systems illustrating the need for character-glyph separation will be given. Specific Unicode implementations will also be referenced to show how this model is taken into account and how application developers can provide support for it in their own programs.
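
As one small illustration of the character-glyph separation described above, the following Python sketch shows two different character sequences that a renderer would normally draw as the same accented Latin glyph. It is not taken from the session itself; the variable names are chosen here for illustration, and it relies only on the standard `unicodedata` module.

```python
import unicodedata

# Two stored character sequences for what a reader sees as a single "é".
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)

# In the character domain the two strings differ.
sequences_equal = composed == decomposed

# Normalizing to NFC maps the decomposed sequence onto the precomposed one.
normalized_equal = unicodedata.normalize("NFC", decomposed) == composed

print(sequences_equal, normalized_equal)
print(len(composed), len(decomposed))
```

The stored text differs in length and content even though the displayed shape is intended to be identical, which is why comparison and searching belong in the character domain while shaping belongs in the glyph domain.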

Character Sets and Encodings

Brendan Murray, Software Architect, International Product Development, Lotus Corporation

The computing world is made up of heterogeneous systems, each with its own set of character sets and/or encodings. Until such time as the world speaks Unicode, it will be necessary to understand how these character sets are structured, and how they interact with one another.

This talk will attempt to explain the format and contents of the most common character set encodings, including:
- Simple single-byte encodings (a.k.a. SBCS)
- Simple double-byte encodings (Shift-JIS, KS-C, etc.)
- Generic multi-level encodings (EUC, ISO-2022, etc.)
This talk will also briefly address the "new" character encodings, such as JIS-X-0212 and -0213.

The goal of this talk is to provide the listener with the basic knowledge required to understand this modern Tower of Babel, without becoming too confused by the plethora of proprietary, national and international standards.

International Unicode Conferences are organized by Global Meeting Services, Inc. (GMS). GMS is pleased to be able to offer the International Unicode Conferences under an exclusive license granted by the Unicode Consortium. All responsibility for conference finances and operations is borne by GMS. The independent conference board serves solely at the pleasure of GMS and is composed of volunteers active in Unicode and in international software development. All inquiries regarding International Unicode Conferences should be addressed to info@global-conference.com.

Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.

25 January 1999, Webmaster