Text Layout for Complex Scripts: Data Structures and Algorithms
Intended Audience: Software Engineers, Systems Analysts
Session Level: Intermediate, Advanced

Text layout systems, whether they are part of word processors,
web browsers, or provided as part of an operating system or
application framework, now have to deal with complex scripts. They
must solve problems such as diacritic placement, line breaking in
the absence of word spaces, bidirectional layout, ligature
building, and caret placement. The advent of Unicode has meant that
the text itself, whatever languages it may contain, now has a
common representation; one text layout system can and should handle
any content. This paper discusses data representations and
algorithms for universal multi-lingual text layout from a critical
and comparative stance, based on the author's experience in writing
such systems, and makes concrete recommendations for building
software that is reliable, fast enough to be used on relatively
slow platforms like palm-tops and mobile phones as well as desk-top
computers, and as simple as possible - but no simpler.

This paper is aimed at a technical audience. Some programming
experience is advisable.