L2/02-178
To: UTC
Re: Terminology for types of code points
From: Ed Committee
Date: 2001-04-26
The editorial committee is seeking feedback from the UTC on a matter of terminology, having to do with types of code points.
There are the following main types of code points that we need to distinguish:
Notes:
Several unions of these sets are used often and need their own names. In addition, at least 'Open' above needs a good name.
We then have terminology that we have used imprecisely in the past:
Term | Past usage |
---|---|
Assigned | In the UCD docs, non-Cn (the complement of Cn, "Other, Not Assigned") is equated to this |
 | Also used for #1-#5 (i.e. non-open) |
 | Also used for #1-#3 (i.e. code points assigned to characters) |
Unassigned | Inverse of Assigned |
Scalar Value | Code point |
 | Nonsurrogates |
Reserved | Surrogate, Noncharacter, Open in 10646, with different adjectives qualifying the different groups |
In coming up with names, we also need to make sure that negations are reasonable: that nonX means all code points that are not X.
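The negation requirement can be checked mechanically. The following is a minimal sketch in Python, assuming only the fixed code space U+0000..U+10FFFF and the surrogate range U+D800..U+DFFF. The helper names (`non`, `is_surrogate`, `partitions_code_space`) are illustrative only and are not proposed terminology.

```python
from typing import Callable

# The whole code space, U+0000..U+10FFFF.
CODE_SPACE = range(0x110000)

Predicate = Callable[[int], bool]


def is_surrogate(cp: int) -> bool:
    """True for the surrogate code points U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def non(term: Predicate) -> Predicate:
    """Build the predicate for 'nonX': every code point that is not X."""
    return lambda cp: not term(cp)


# 'Nonsurrogate' defined purely as the negation of 'Surrogate'.
is_nonsurrogate = non(is_surrogate)


def partitions_code_space(term: Predicate, negation: Predicate) -> bool:
    """Check that each code point satisfies exactly one of the two terms."""
    return all(term(cp) != negation(cp) for cp in CODE_SPACE)
```

A pair of terms fails this check as soon as some code point falls under both names or under neither, which is the situation the naming scheme should avoid.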
The question is what terms we should choose for the unions A-E listed in the table below. There are a couple of possible positions:
Term | Position1 | Position2 |
---|---|---|
A: all but surrogates | Nonsurrogate code point | Scalar Value code point |
B: Surrogate, Noncharacter, Open | Noncharacter code point | Unassigned code point |
C: Normal, Format, PUA | Character code point | Assigned code point |
D: Open, Noncharacter | ??? code point | ??? code point |
E: Open | Unassigned code point | Nondesignated code point |
not Open | Assigned code point | Designated code point |
Noncharacter | Internal-Use code point (Infernal-Use ;-) | Noncharacter code point |
Surrogates | Surrogate code point | Surrogate code point, Nonscalar value code point |
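To make the unions in the table concrete, here is a minimal sketch in Python. The six basic type names are taken from the rows above. The mapping from General Category values (Cs, Co, Cf, Cn) to those types is an assumption made for illustration, as are the names `basic_type`, `UNIONS`, and `in_union`. Results for 'Open' depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

ALL_TYPES = frozenset(
    {"Normal", "Format", "PUA", "Surrogate", "Noncharacter", "Open"}
)


def is_noncharacter(cp: int) -> bool:
    """U+FDD0..U+FDEF plus the last two code points of every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def basic_type(cp: int) -> str:
    """Classify a code point into one of the six basic types.

    Assumption: General Category Cs/Co/Cf/Cn stand in for
    Surrogate/PUA/Format/Open; everything else counts as Normal.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a code point: {cp:#x}")
    if is_noncharacter(cp):  # noncharacters also report Cn, so test first
        return "Noncharacter"
    gc = unicodedata.category(chr(cp))
    if gc == "Cs":
        return "Surrogate"
    if gc == "Co":
        return "PUA"
    if gc == "Cf":
        return "Format"
    if gc == "Cn":
        return "Open"
    return "Normal"


# The unions A-E from the table, expressed as sets of basic types.
UNIONS = {
    "A": ALL_TYPES - {"Surrogate"},
    "B": frozenset({"Surrogate", "Noncharacter", "Open"}),
    "C": frozenset({"Normal", "Format", "PUA"}),
    "D": frozenset({"Open", "Noncharacter"}),
    "E": frozenset({"Open"}),
}


def in_union(cp: int, label: str) -> bool:
    """True if the code point belongs to union A, B, C, D, or E."""
    return basic_type(cp) in UNIONS[label]
```

Under this sketch, B and C are exact complements of each other, so whichever names are chosen for them should read as negations of one another.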
Feedback from the UTC on which of these choices would be the most reasonable and least confusing would be most appreciated. The committee is not entirely happy with either of these sets of terms; it is very open to different suggestions for a cohesive set!