Unicode Collation and Normalization
Intended Audience:
Manager, Software Engineer, Systems Analyst, Marketer
Session Level:
Intermediate, Advanced
The Unicode Collation Algorithm specifies how to order Unicode strings
in a culturally correct way. In this talk we present the basic concepts
of the algorithm: how weights and levels are defined, and how sort keys
are constructed from the default collation weight tables provided by
the Unicode Consortium. We also describe, with examples, a simple syntax
for tailoring the default collation so that it orders text correctly for
a particular language or collection of languages.
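As a rough illustration of how multi-level weights can be turned into a sort key, here is a minimal Python sketch. The weight table, the names TOY_WEIGHTS and sort_key, and the numeric values are all hypothetical: they are not taken from the Unicode default table, and the sketch omits most of what the real algorithm handles (contractions, expansions, variable weighting, and so on). The "tailoring" shown is simply an override of one table entry, not the tailoring syntax presented in the talk.

```python
# Hypothetical three-level weights: (primary, secondary, tertiary).
# Illustrative numbers only, NOT values from the Unicode default table.
TOY_WEIGHTS = {
    "a": (10, 1, 1),
    "A": (10, 1, 2),       # same letter as "a", differs at the tertiary (case) level
    "\u00e1": (10, 2, 1),  # a with acute: differs at the secondary (accent) level
    "b": (20, 1, 1),
    "B": (20, 1, 2),
    "c": (30, 1, 1),
}

def sort_key(text, table=TOY_WEIGHTS):
    """Build a key: all primary weights, then secondary, then tertiary,
    with a 0 separator between levels (0 is lower than any real weight)."""
    elements = [table[ch] for ch in text]
    key = []
    for level in range(3):
        if level:
            key.append(0)  # level separator
        key.extend(e[level] for e in elements if e[level] != 0)
    return tuple(key)

words = ["b", "Ab", "ab", "\u00e1b", "aB", "ac"]

# Default toy ordering: accents and case only break ties between
# strings whose primary weights are identical.
ordered = sorted(words, key=sort_key)

# A toy "tailoring": treat a-acute as a separate letter between a and b
# by giving it its own primary weight.
tailored_table = dict(TOY_WEIGHTS)
tailored_table["\u00e1"] = (15, 1, 1)
tailored = sorted(words, key=lambda w: sort_key(w, tailored_table))

print(ordered)
print(tailored)
```

Under these toy weights, the default ordering places the accented word next to its unaccented counterparts, while the tailored table moves it after every word that begins with a plain "a". Comparing the resulting tuples element by element is what makes a primary difference outrank any secondary or tertiary difference.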
We also discuss the relationship between the Unicode Collation Algorithm
and other emerging standards for international string ordering such
as ISO 14651 and the European Ordering Rules.
On normalization, we discuss UTR #15, Unicode Normalization Forms.
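To show why normalization matters for string comparison, here is a small Python sketch. It uses the standard library's unicodedata module, which is a choice made for this example and not something the talk itself references. It shows that a precomposed character and its decomposed equivalent are different code point sequences until both are brought to the same normalization form.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The two strings look the same but are different code point sequences.
raw_equal = precomposed == decomposed

# NFC composes, NFD decomposes; after either one, the two spellings agree.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed

# The compatibility forms (NFKC, NFKD) additionally fold compatibility
# characters, such as the "fi" ligature U+FB01, to their plain equivalents.
ligature_folded = unicodedata.normalize("NFKC", "\ufb01")

print(raw_equal, nfc_equal, nfd_equal, ligature_folded)
```

Normalizing both inputs to the same form before comparing or collating avoids treating canonically equivalent strings as different.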