From: Peter Constable (petercon@microsoft.com)
Date: Mon Jan 22 2007 - 18:02:25 CST
Interesting. This isn’t obvious to me, though perhaps the character property model makes this clear. I presume you say #3 isn’t permitted because the stability policy includes this constraint wrt Noncharacter_Code_Point:
Unicode 3.1+: The Noncharacter_Code_Point property is an immutable code point property, which means that its property values for all Unicode code points will never change.
The reason this isn’t obvious to me is that it’s not clear if Noncharacter_Code_Point as a property is perceived as uni-valued – i.e. a set that does not necessarily partition the code space – or as a binary-valued property that partitions the code space. If it is the former, then the open question is whether that is an open set to which new code points can be added.
If this is spelled out, where is it spelled out?
Peter
________________________________
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Mark Davis
Sent: Monday, January 22, 2007 2:05 PM
To: Eric Muller
Cc: Ruszlan Gaszanov; unicode@unicode.org
Subject: Re: Regulating PUA.
One further correction. The number of noncharacter code points is limited according to the Unicode stability policies (http://www.unicode.org/standard/stability_policy.html), so #3 is also not permitted. However, if people needed a larger range of process-internal values, and didn't want to use private use codes, there is nothing to prevent them from using sequences of noncharacter code points.
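As an illustration of the "sequences of noncharacter code points" idea, here is a minimal sketch of one possible scheme. It is not something the message above specifies: the names encode_internal and decode_internal are hypothetical, and the choice of the 32 code points U+FDD0..U+FDEF as base-32 digits is an assumption made only for this example.

```python
# Hypothetical scheme: treat U+FDD0..U+FDEF as 32 "digits" and write a
# process-internal value as a fixed-width sequence of those code points.
BASE = 0xFDD0   # first noncharacter in the contiguous block
RADIX = 32      # number of code points in U+FDD0..U+FDEF


def encode_internal(value: int, width: int = 2) -> str:
    """Encode value as `width` noncharacters (most significant digit first)."""
    if not 0 <= value < RADIX ** width:
        raise ValueError("value does not fit in the requested width")
    digits = []
    for _ in range(width):
        value, digit = divmod(value, RADIX)
        digits.append(chr(BASE + digit))
    return "".join(reversed(digits))


def decode_internal(text: str) -> int:
    """Inverse of encode_internal; rejects anything outside U+FDD0..U+FDEF."""
    value = 0
    for ch in text:
        digit = ord(ch) - BASE
        if not 0 <= digit < RADIX:
            raise ValueError("not a noncharacter from U+FDD0..U+FDEF")
        value = value * RADIX + digit
    return value
```

With width 2 this gives 32 * 32 = 1024 distinct internal values without touching the private use area; a larger width extends the range further.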
Mark
On 1/21/07, Mark Davis <mark.davis@icu-project.org> wrote:
As Eric said, this is already provided for.
1. There are already 66 code points available for process-internal use, called noncharacters (see below).
2. It would be backwards incompatible for the consortium to make ANY change in PUA characters. There is, IMO, essentially zero chance of this happening. So it is not worth discussing any further.
3. If someone really wanted to propose additional noncharacter code points, on the other hand, that is certainly possible. One would have to make a very good case for the need, however. (And as a reminder, NO proposal that is circulated on this list is taken up by the UTC unless a written proposal is submitted to http://www.unicode.org/reporting.html, or by Unicode members via internal mechanisms.)
FDD0..FDEF      # Cn  [32]
FFFE..FFFF      # Cn   [2]
1FFFE..1FFFF    # Cn   [2]
2FFFE..2FFFF    # Cn   [2]
3FFFE..3FFFF    # Cn   [2]
4FFFE..4FFFF    # Cn   [2]
5FFFE..5FFFF    # Cn   [2]
6FFFE..6FFFF    # Cn   [2]
7FFFE..7FFFF    # Cn   [2]
8FFFE..8FFFF    # Cn   [2]
9FFFE..9FFFF    # Cn   [2]
AFFFE..AFFFF    # Cn   [2]
BFFFE..BFFFF    # Cn   [2]
CFFFE..CFFFF    # Cn   [2]
DFFFE..DFFFF    # Cn   [2]
EFFFE..EFFFF    # Cn   [2]
FFFFE..FFFFF    # Cn   [2]
10FFFE..10FFFF  # Cn   [2]
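The figure of 66 in point 1 can be checked against the listing above. The following is a small sketch that rebuilds the same 18 ranges and sums their sizes; the names NONCHARACTER_RANGES and count_noncharacters are made up for this example.

```python
# One contiguous block of 32 code points, plus the last two code points
# of each of the 17 planes (plane 0 through plane 0x10).
NONCHARACTER_RANGES = [(0xFDD0, 0xFDEF)] + [
    ((plane << 16) | 0xFFFE, (plane << 16) | 0xFFFF) for plane in range(0x11)
]


def count_noncharacters(ranges=NONCHARACTER_RANGES) -> int:
    """Total number of code points covered by the inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


if __name__ == "__main__":
    for first, last in NONCHARACTER_RANGES:
        print(f"{first:04X}..{last:04X}  [{last - first + 1}]")
    print(count_noncharacters())  # 32 + 17 * 2 = 66
```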
Mark
On 1/21/07, Eric Muller <emuller@adobe.com> wrote:
Ruszlan Gaszanov wrote:
> So, why don't we split the PUA into character-PUA (reserved for representing non-standard characters) and non-character-PUA (reserved for process-internal uses)?
This problem is already solved, using noncharacters: the last two
code points of each plane and U+FDD0..U+FDEF. See TUS 5, section 16.7,
page 549, or TUS 4, section 15.7, page 398.
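The rule just stated (last two code points of each plane, plus U+FDD0..U+FDEF) translates directly into a membership test. This is only a sketch; the function name is_noncharacter is chosen for the example and is not taken from any library.

```python
def is_noncharacter(cp: int) -> bool:
    """True if cp is one of the noncharacter code points.

    The last two code points of a plane are exactly those whose low
    16 bits are FFFE or FFFF, so masking with 0xFFFE covers both.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    # Scanning the whole code space should find the same total as the thread.
    print(sum(is_noncharacter(cp) for cp in range(0x110000)))
```

A process that reserves these values internally could use such a check to reject or filter them at its input and output boundaries.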
Eric.
This archive was generated by hypermail 2.1.5 : Mon Jan 22 2007 - 18:05:10 CST