From: Mark Davis ⌛ (mark@macchiato.com)
Date: Fri Aug 07 2009 - 15:59:48 CDT
In http://unicode.org/glossary/ we do define *Reserved Code Point* =
*Unassigned Code Point* = *Undesignated Code Point*.
These are different from:
*Unassigned Character*. Synonym for *not assigned to an abstract character*.
This refers to surrogate code points, noncharacters, and reserved code
points. (See Section 2.4, Code Points and Characters:
http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf#G25564)
Mark
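
A minimal Python sketch of the distinction between these terms, assuming the
standard `unicodedata` module. Its data reflects whichever Unicode version the
interpreter ships with, so the answer for a reserved code point can change
between releases. The function names and return labels are illustrative only
and are not taken from the glossary.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF and every U+nFFFE/U+nFFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify(cp: int) -> str:
    """Label a code point using its General_Category (labels are illustrative)."""
    gc = unicodedata.category(chr(cp))
    if gc == "Cs":
        return "surrogate"
    if gc == "Co":
        return "private use"
    if gc == "Cn":
        # Cn covers both noncharacters and reserved (unassigned) code points.
        return "noncharacter" if is_noncharacter(cp) else "reserved"
    return "assigned"


def is_unassigned_character(cp: int) -> bool:
    """Glossary sense: not assigned to an abstract character."""
    return classify(cp) in {"surrogate", "noncharacter", "reserved"}


def is_reserved_code_point(cp: int) -> bool:
    """Glossary sense: reserved = unassigned = undesignated code point."""
    return classify(cp) == "reserved"
```

Under this sketch every reserved code point is also an unassigned character,
but a surrogate or noncharacter is an unassigned character without being a
reserved code point.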
On Fri, Aug 7, 2009 at 12:50, karl williamson <public@khwilliamson.com> wrote:
> I forgot to include the public list as a cc to this, which I am now doing,
> but perhaps it is better, as I realize that I'm confused about what reserved
> means. I thought from NamesList.txt that reserved characters were
> unassigned ones that were never going to be assigned because of some
> constraint on them, such as being place-holders. Like the following:
>
> 1D51D <reserved>
> x (black-letter capital z - 2128)
>
> where the code points around it are assigned, but this one essentially
> duplicates 2128, and so is skipped.
>
> But in looking at extracted/DerivedGeneralCategory.txt, it appears that
> reserved is any Cn code point that isn't a non-character.
>
> karl williamson wrote:
>
>> Kenneth Whistler wrote:
>>
>>> Karl Williamson wrote:
>>>
>>>> ... I thought I should add some things I've been thinking about to make
>>>> sure I understand. Feel free to correct me.
>>>>
>>>> Each Unicode property is defined on a subset of the Unicode code points.
>>>> Many are defined on the complete set, but some are not, such as Name, as
>>>> for example, surrogates and private use code points have no name.
>>>>
>>>
>>> Actually Name *is* defined on the complete set. The values for
>>> the Name property are strings, and for reserved code points
>>> (and some other code point types), the value of the Name property
>>> is the null string.
>>>
>>> Since this has been confusing to a lot of people, the Unicode 5.2
>>> text about Unicode character names has been substantially updated
>>> to clarify this. See Section 4.8 Name--Normative in the Chapter 4
>>> pdf posted for review. (Accessible from the Unicode 5.2 beta
>>> page.)
>>>
>>>
>> It was helpful looking at the 5.2 draft. But it brought up another
>> question. I don't see anywhere in the UCD (except in NamesList.txt) any
>> mention of reserved code points. I don't see any way to distinguish between
>> these and code points that are otherwise unassigned, and not permanently
>> non-characters. Perhaps it is thought that that information is not
>> relevant, but the draft mentions "reserved-NNNN" as a possible identifying
>> string for such a code point. Again, perhaps it is assumed that only in the
>> text of the standard would anyone wish to make this distinction.
>>
>>>> It's unclear to me if in releases before the Unknown property value was
>>>> added to the Script property, what the definition was, if any, of code
>>>> points that didn't have any other of the Script property values (and
>>>> similarly for a number of other catalog properties).
>>>>
>>>
>>> The issue of default values is explained now in more detail
>>> in Section 4.2.8 Default Values in UAX #44. See the Unicode 5.2
>>> proposed update:
>>>
>>> http://www.unicode.org/reports/tr44/tr44-3.html#Default_Values
>>>
>>> As far as the default value of the Script property is concerned,
>>> before Script=Unknown was introduced, the Scripts.txt file itself
>>> defined Script=Common as the default value.
>>>
>>
>> I had overlooked this. But there are other examples in which there at one
>> time was no default value given, but now there is, like NaN for numeric
>> value. Was the default the null string for earlier releases, or was it just
>> undefined?
>>
>> [snip]
>>>
>>
>>
>
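
To illustrate the default-value questions in the quoted thread: Python's
`unicodedata` lookups accept a caller-supplied fallback, so the null-string
Name and the NaN numeric value discussed above can be mimicked as below. This
is only a sketch. The helper names are hypothetical, and the fallbacks are
passed in by hand rather than read from the UCD.

```python
import math
import unicodedata


def name_or_empty(cp: int) -> str:
    """Return the character name, or "" when the code point has none."""
    # The "" fallback is our assumption, mirroring the null-string default.
    return unicodedata.name(chr(cp), "")


def numeric_or_nan(cp: int) -> float:
    """Return the numeric value, or NaN when the code point has none."""
    # The NaN fallback is our assumption, mirroring the NaN default.
    return unicodedata.numeric(chr(cp), math.nan)


if __name__ == "__main__":
    # U+1D51D is the reserved hole from NamesList.txt; U+2128 is its stand-in.
    for cp in (0x1D51D, 0x2128, 0x0035):
        print(f"U+{cp:04X}", repr(name_or_empty(cp)), numeric_or_nan(cp))
```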