L2/02-104

13.7 Variation Selectors (new section)

Unicode characters can be represented by a wide variety of glyphs, as discussed in Chapter 2. Occasionally the need arises in text processing to restrict or change the set of glyphs that are to be used to represent a character. Normally such changes are indicated by choice of font or style in rich-text documents. In special circumstances, such a variation from the normal range of appearance needs to be expressed side by side in the same document in plain-text contexts, where it is impossible or inconvenient to exchange formatted text. For example, in languages employing the Mongolian script, a specific variant range of glyphs is sometimes needed for a textual purpose for which the range of “generic” glyphs is considered inappropriate. The variation selectors are used when characters have essentially the same semantic (?and pronunciation?). [Ed Note: also foreshadow the underlined material below]

Variation selectors provide a mechanism for specifying a restriction on the set of glyphs that are used to represent a particular character. They also provide a mechanism for specifying variants, such as for CJK ideographs and Mongolian letters, that have essentially the same semantic but substantially different ranges of glyphs. A variation sequence, which always consists of a base character followed by a variation selector, may be specified as part of the Unicode Standard. The variation selector affects only the appearance of the base character, and only in the variation sequences defined in this standard. The variation selector is not used as a general code extension mechanism:

Only the variation sequences specifically defined in the Unicode Character Database in the file StandardizedVariants.html are sanctioned for standard use; in all other cases the variation selector cannot change the visual appearance of the preceding base character from what it would have had in the absence of the variation selector.
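The rule above can be illustrated with a minimal sketch that scans plain text for variation sequences and reports whether each one is sanctioned. The names `SAMPLE_SANCTIONED`, `is_variation_selector`, and `find_variation_sequences` are hypothetical, the selector ranges (U+FE00..U+FE0F and U+180B..U+180D) are an assumption about which code points serve as variation selectors, and the three table entries are an assumed sample for demonstration only; the authoritative list is the one in the Unicode Character Database.

```python
# Assumed sample entries for illustration; the authoritative list of
# sanctioned sequences is the StandardizedVariants file in the UCD.
SAMPLE_SANCTIONED = {
    ("\u2229", "\uFE00"),  # INTERSECTION + VARIATION SELECTOR-1
    ("\u222A", "\uFE00"),  # UNION + VARIATION SELECTOR-1
    ("\u1820", "\u180B"),  # MONGOLIAN LETTER A + FREE VARIATION SELECTOR ONE
}


def is_variation_selector(ch):
    """True if ch lies in one of the assumed variation selector ranges."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0x180B <= cp <= 0x180D


def find_variation_sequences(text, sanctioned=SAMPLE_SANCTIONED):
    """Return (index_of_base, base, selector, is_sanctioned) tuples.

    A selector at the very start of the text has no base and is skipped.
    """
    results = []
    for i in range(1, len(text)):
        if is_variation_selector(text[i]):
            base = text[i - 1]
            pair = (base, text[i])
            results.append((i - 1, base, text[i], pair in sanctioned))
    return results
```

A renderer following this sketch would select a variant glyph only for pairs reported as sanctioned; for any other pair it would draw the base character exactly as if the selector were absent.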

The base character in a variation sequence is never a combining character or a composite character. The variation selectors themselves are combining marks of combining class 0. A variation selector is a default ignorable character. Thus if the variation sequence is not supported, the variation selector should be invisible and ignored. As with all default ignorable characters, this does not preclude modes or environments where the variation selectors should be given a visible appearance. For example, a “Show Hidden” mode could reveal the presence of such characters with specialized glyphs, or a particular environment could use or require a visual indication on a base character (such as a wavy underline) to show that it is part of a standardized variation sequence that cannot be supported by the current font.
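The constraints in the preceding paragraph can be sketched with Python's standard `unicodedata` module. The helper names are hypothetical, the selector ranges are assumed, and "composite character" is interpreted here as a character with a decomposition mapping, which is one reading of the text and not a definition taken from it.

```python
import unicodedata

# Assumed variation selector ranges: U+FE00..U+FE0F and U+180B..U+180D.
VS_RANGES = ((0xFE00, 0xFE0F), (0x180B, 0x180D))


def is_variation_selector(ch):
    """True if ch lies in one of the assumed variation selector ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in VS_RANGES)


def is_plausible_base(ch):
    """Reject combining marks and, by assumption, decomposable characters."""
    is_mark = unicodedata.category(ch).startswith("M")
    has_decomposition = unicodedata.decomposition(ch) != ""
    return not is_mark and not has_decomposition


def strip_variation_selectors(text):
    """Fallback for an unsupported sequence: treat selectors as ignorable."""
    return "".join(ch for ch in text if not is_variation_selector(ch))
```

For example, `unicodedata.combining("\uFE00")` returns 0, matching the statement that the selectors have combining class 0, and `strip_variation_selectors` leaves the base character untouched, which is what a process that does not support a sequence would effectively display.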

The standardization or support of a particular variation sequence does not limit the set of glyphs that can be used to represent the base character alone. If a user requires a visual distinction between a base character and a variation sequence with that character as a base, then fonts must be used that make that distinction. The existence of a variation sequence does not preclude the later encoding of a new character with a distinct semantic and a similar or overlapping range of glyphs.

[Note: the term 'variation sequence' may be changed.]