This talk discusses some problems one encounters in manipulating
multilingual Unicode text. The problems include using multiple fonts to
display Unicode plain text, dealing with neutral characters, deunifying
characters that look different in different scripts, dealing with complex
scripts, entering Unicode characters conveniently from keyboards, remaining
compatible with earlier character sets, handling glyph variants, and navigating
through text that includes "multicharacters" such as Unicode surrogate
pairs, combining-mark sequences, and variable-length end-of-paragraph
marks. Solutions to some of the problems will be demonstrated using the
RichEdit 3.0 controls shipped with Office 2000 and to be shipped with
Windows NT 5.0. Glyph variants and surrogate pairs are two ongoing
topics that lend themselves to several possible approaches.
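
To make the "multicharacter" navigation problem concrete, here is a minimal
sketch of stepping a caret through UTF-16 text so that it never lands inside
a surrogate pair, a combining-mark sequence, or a CR/LF paragraph mark. It is
an illustration only, not the RichEdit algorithm: the function names
(to_utf16_units, code_point_at, next_boundary) are hypothetical, and the rule
used (a base character followed by any characters in the Unicode Mark
categories) is a simplification of full grapheme-cluster segmentation.

```python
import unicodedata


def to_utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[k:k + 2], "little") for k in range(0, len(raw), 2)]


def code_point_at(units, i):
    """Decode the code point starting at index i; return (code_point, unit_count)."""
    u = units[i]
    if 0xD800 <= u <= 0xDBFF and i + 1 < len(units):
        low = units[i + 1]
        if 0xDC00 <= low <= 0xDFFF:
            # Well-formed surrogate pair: two code units, one character.
            return 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00), 2
    # BMP character, or an unpaired surrogate treated as a single unit.
    return u, 1


def next_boundary(units, i):
    """Return the index just past the multicharacter that starts at index i."""
    n = len(units)
    if i >= n:
        return n
    # A CR LF end-of-paragraph mark is navigated as one unit.
    if units[i] == 0x0D and i + 1 < n and units[i + 1] == 0x0A:
        return i + 2
    _, length = code_point_at(units, i)
    i += length
    # Absorb any combining marks (general categories Mn, Mc, Me) that follow.
    while i < n:
        cp, length = code_point_at(units, i)
        if not unicodedata.category(chr(cp)).startswith("M"):
            break
        i += length
    return i


def boundaries(units):
    """List every caret stop after the start of the text."""
    stops, i = [], 0
    while i < len(units):
        i = next_boundary(units, i)
        stops.append(i)
    return stops
```

For example, the text "e" + U+0301 + U+10400 + CR LF + "x" occupies seven
UTF-16 code units, but the sketch offers only four caret stops after the
start of the text: one after each of the four user-perceived units.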