The following text, excerpted from the Unicode Standard, relates to Public Review Issue #5.
5.22 Default Ignorable Code Points
Default ignorable code points are those that should be ignored by default in rendering (unless explicitly supported). They have no visible glyph or advance width in and of themselves, although they may affect the display, positioning, or adornment of adjacent or surrounding characters. Some of the default ignorable code points are assigned characters, while others are reserved for future assignment.
An implementation should ignore default ignorable characters in rendering whenever it does not support the characters.
This can be contrasted with the situation for non-default ignorable characters. If an implementation does not support U+0915 DEVANAGARI LETTER KA, for example, it should still not ignore it in rendering. Displaying nothing would give the user the impression that it does not occur in the text at all. The recommendation in that case is to display a "last-resort" glyph or a visible "missing glyph" box.
With default ignorable characters, such as U+200D ZERO WIDTH JOINER, the situation is different. If the program does not support that character, best practice is to ignore it completely without displaying a last-resort glyph or a visible box because the normal display of the character is invisible: its effects are on other characters. Because the character is not supported, those effects cannot be shown.
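The contrast between the two cases can be sketched in code. The following is a minimal illustration, not a definitive implementation: the range table is a deliberately partial subset of the default ignorable code points (the authoritative list is the Default_Ignorable_Code_Point property in the Unicode Character Database), and the names `is_default_ignorable`, `render_unsupported`, and the `supported` set are hypothetical, introduced only for this example.

```python
# Partial, illustrative subset of default ignorable ranges (an assumption of
# this sketch; consult the Unicode Character Database for the full list).
DEFAULT_IGNORABLE_RANGES = [
    (0x00AD, 0x00AD),    # SOFT HYPHEN
    (0x200B, 0x200F),    # zero width space, non-joiner, joiner, LRM, RLM
    (0x2060, 0x206F),    # WORD JOINER, invisible operators, and others
    (0xFE00, 0xFE0F),    # variation selectors
    (0xFEFF, 0xFEFF),    # ZERO WIDTH NO-BREAK SPACE
    (0xE0000, 0xE0FFF),  # tag characters and reserved code points
]


def is_default_ignorable(cp):
    """Return True if the code point falls in one of the ranges above."""
    return any(lo <= cp <= hi for lo, hi in DEFAULT_IGNORABLE_RANGES)


def render_unsupported(text, supported, missing_glyph="\u25A1"):
    """Return the string a renderer would display.

    `supported` is a hypothetical set of code points the renderer can draw.
    Unsupported default ignorable code points are dropped silently; any other
    unsupported code point is replaced by a visible missing-glyph box.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in supported:
            out.append(ch)
        elif is_default_ignorable(cp):
            continue  # invisible by default: show nothing at all
        else:
            out.append(missing_glyph)  # make the gap visible to the user
    return "".join(out)
```

With a renderer that supports only "a" and "b", the string "a", U+200D, "b", U+0915 would display as "ab" followed by one missing-glyph box: the joiner vanishes, while the unsupported Devanagari letter remains visibly present.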
Other characters will have other effects on adjacent characters. For example:
- U+2060 WORD JOINER does not produce a visible change in the appearance of surrounding characters; instead, its only effect is to indicate that there should be no line break at that point.
- U+2061 FUNCTION APPLICATION has no effect at all on display, and is only used in internal mathematical expression processing.
- U+00AD SOFT HYPHEN has a null default appearance: the appearance of "ther<U+00AD>apist" is simply "therapist", with no visible glyph. In line-break processing, it indicates a possible intra-word break. At any intra-word break that is used for a line break (whether resulting from this character or from an automatic process), a hyphen glyph, perhaps with spelling changes, or some other indication can be shown, depending on language and context.
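The soft hyphen behavior in the last item can be illustrated with a short sketch. It makes several simplifying assumptions: width is measured as a character count rather than by real glyph metrics, the visible indication is always a plain "-" with no language-specific spelling changes, and the function names `display_form` and `break_word` are hypothetical.

```python
SOFT_HYPHEN = "\u00AD"


def display_form(word):
    """Default appearance: soft hyphens contribute no glyph."""
    return word.replace(SOFT_HYPHEN, "")


def break_word(word, max_chars):
    """Break `word` at a soft hyphen so that the first part fits.

    Returns (first_line, rest). If the whole word fits, or if no soft hyphen
    yields a first part that fits, the word is returned unbroken and `rest`
    is empty. Character count stands in for measured width.
    """
    if len(display_form(word)) <= max_chars:
        return display_form(word), ""
    # Try break opportunities from right to left to fill the line.
    positions = [i for i, ch in enumerate(word) if ch == SOFT_HYPHEN]
    for i in reversed(positions):
        head = display_form(word[:i]) + "-"  # visible hyphen at the break
        if len(head) <= max_chars:
            return head, display_form(word[i + 1:])
    return display_form(word), ""
```

For the word "ther", U+00AD, "apist", a limit of 20 characters gives the unbroken "therapist", while a limit of 6 gives "ther-" on the first line and "apist" on the next. The hyphen glyph appears only when the break is actually used.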
This does not imply that default ignorable code points must always be invisible: an implementation can show a visible glyph on request, such as in a "Show Hidden" mode. A particular use of a "Show Hidden" mode is to show a visible indication of "misplaced" or "ineffectual" formatting codes. For example, this would include two adjacent U+200D ZERO WIDTH JOINER characters, where the extra character has no effect at all.
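One possible shape for such a "Show Hidden" mode is sketched below, restricted to U+200D for brevity. The bracketed markers "[ZWJ]" and "[ZWJ?]" and the function names are hypothetical choices made for this example; a real implementation would more likely draw dedicated glyphs and cover other formatting codes as well.

```python
ZWJ = "\u200D"


def redundant_zwj_positions(text):
    """Return the indices of every U+200D that directly follows another one."""
    return [
        i for i in range(1, len(text))
        if text[i] == ZWJ and text[i - 1] == ZWJ
    ]


def show_hidden(text):
    """Return a debugging view in which each U+200D is made visible.

    An effective joiner is shown as "[ZWJ]"; one that immediately follows
    another joiner, and so has no effect, is shown as "[ZWJ?]".
    """
    redundant = set(redundant_zwj_positions(text))
    out = []
    for i, ch in enumerate(text):
        if ch != ZWJ:
            out.append(ch)
        elif i in redundant:
            out.append("[ZWJ?]")  # ineffectual: flag it for the user
        else:
            out.append("[ZWJ]")
    return "".join(out)
```

Applied to "a", U+200D, U+200D, "b", this yields "a[ZWJ][ZWJ?]b", which marks the second joiner as the ineffectual one.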
The default ignorable unassigned code points lie in particular designated ranges. These ranges are designed and reserved for future default ignorable characters, to allow forward compatibility. All implementations should ignore all unassigned default ignorable code points in all rendering. Any new default ignorable characters should be assigned in those ranges, permitting existing programs to ignore them until they are supported in some future version of the program.
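A forward-compatible check along these lines might look like the following sketch. It assumes a partial, illustrative table of designated ranges (not the complete list) and relies on Python's `unicodedata.category`, which reports "Cn" for unassigned code points; the result therefore depends on the Unicode version bundled with the Python interpreter. The function names are hypothetical.

```python
import unicodedata

# Partial, illustrative subset of ranges that contain reserved default
# ignorable code points (an assumption of this sketch, not the full list).
DESIGNATED_RANGES = [
    (0x2060, 0x206F),
    (0xFFF0, 0xFFF8),
    (0xE0000, 0xE0FFF),
]


def is_unassigned_default_ignorable(cp):
    """True if `cp` is in a designated range and is currently unassigned."""
    in_range = any(lo <= cp <= hi for lo, hi in DESIGNATED_RANGES)
    return in_range and unicodedata.category(chr(cp)) == "Cn"


def drop_unassigned_ignorables(text):
    """Remove unassigned default ignorable code points before rendering."""
    return "".join(
        ch for ch in text if not is_unassigned_default_ignorable(ord(ch))
    )
```

Under this sketch, U+FFF0 (reserved, inside a designated range) is dropped from the text, whereas an unassigned code point outside the designated ranges, such as U+0378, is kept so that the renderer can show a missing-glyph indication for it.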
There are other characters that have no visible glyphs: the whitespace characters. These typically have an advance width, however. Line separation characters such as CR do not clearly exhibit this advance width because they always occur at the end of a line, but most GUIs show a visible advance width for them when they are selected.