There was a problem in the data that required us to back out a
change in Unicode 6.3. We had added characters to Uppercase, but
not to Alphabetic. That breaks a stability constraint:
* All characters with the Lowercase property and all characters
with the Uppercase property have the Alphabetic property.
This document is about addressing the points of failure.
1. Fixing Derivation.
I tracked down the technical point of failure with Alphabetic.
We have the following definitions:
# Derived Property: Uppercase
# Generated from: Lu + Other_Uppercase
# Derived Property: Lowercase
# Generated from: Ll + Other_Lowercase
# Derived Property: Alphabetic
# Generated from: Lu + Ll + Lt + Lm + Lo + Nl + Other_Alphabetic
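Yet we require that
Alphabetic ⊇ Uppercase and Alphabetic ⊇ Lowercase.
Nothing in the current derivation enforces this: a character in
Other_Uppercase or Other_Lowercase ends up in Alphabetic only if it
is also listed in Other_Alphabetic or has one of the listed
general categories.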
Therefore, I'm planning to propose that the UTC change the
derivation to:
# Derived Property: Alphabetic
# Generated from: Uppercase + Lowercase + Lt + Lm + Lo + Nl + Other_Alphabetic
That solves the general problem for the future, because the
containment then holds by construction.
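As a rough illustration of why the changed derivation helps, here is a minimal Python sketch that models each property as a set. The labels "A", "a" and "X" are hypothetical placeholders, not real code points or the actual characters involved; "X" stands for a non-letter character that was added to Other_Uppercase but not to Other_Alphabetic.

```python
# Toy model: every property is a set of hypothetical character labels.
Lu = {"A"}
Ll = {"a"}
Lt, Lm, Lo, Nl = set(), set(), set(), set()

Other_Uppercase = {"X"}    # hypothetical non-letter given Other_Uppercase
Other_Lowercase = set()
Other_Alphabetic = set()   # "X" was not added here

Uppercase = Lu | Other_Uppercase
Lowercase = Ll | Other_Lowercase


def alphabetic_current():
    """Current derivation: built from general categories only."""
    return Lu | Ll | Lt | Lm | Lo | Nl | Other_Alphabetic


def alphabetic_proposed():
    """Proposed derivation: built from Uppercase and Lowercase directly."""
    return Uppercase | Lowercase | Lt | Lm | Lo | Nl | Other_Alphabetic


def invariant_holds(alphabetic):
    """Check Alphabetic ⊇ Uppercase and Alphabetic ⊇ Lowercase."""
    return Uppercase <= alphabetic and Lowercase <= alphabetic


print(invariant_holds(alphabetic_current()))   # False
print(invariant_holds(alphabetic_proposed()))  # True
```

With the proposed derivation the check cannot fail, whatever is placed in Other_Uppercase or Other_Lowercase.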
2. Clarifying Casing Stability.
There is the separate issue of casing pairs for the particular
characters that exposed the problem, which the UTC needs
to consider. Looking at
http://www.unicode.org/policies/stability_policy.html#Case_Pair,
all we guarantee is that for existing characters, case pairs
cannot be broken or formed. That principle does not prevent
us from adding new characters and forming case pairs with
them. However, a careful look at the preceding principle
(http://www.unicode.org/policies/stability_policy.html#Case_Folding)
shows that this can only happen for (a) a pair of characters
that are both new, or (b) a pair where the uppercase character is
new and the lowercase is not. Because case folding generally maps
to the lowercase form, pairing an existing uppercase character with
a new lowercase one would change the folding of an existing
character.
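The combined effect of the two principles can be written as a small truth table. The following Python sketch is only my reading of the constraints above; the function name and its parameters are hypothetical and do not come from the policy text.

```python
def case_pair_allowed(upper_is_new: bool, lower_is_new: bool) -> bool:
    """Hypothetical helper: may an (uppercase, lowercase) pair be formed?"""
    if not upper_is_new and not lower_is_new:
        # Case_Pair: no pair may be formed between two existing characters.
        return False
    if not upper_is_new and lower_is_new:
        # Case_Folding: this would change the folding of the existing
        # uppercase character (assuming folding maps to lowercase).
        return False
    # Remaining cases: (a) both new, or (b) only the uppercase is new.
    return True


for upper_is_new in (True, False):
    for lower_is_new in (True, False):
        print(upper_is_new, lower_is_new,
              case_pair_allowed(upper_is_new, lower_is_new))
```

Under this reading, the rule reduces to "the uppercase member must be new".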
Editorially, I think it
would help to state that explicitly on the stability
page, by changing:
I'd propose we make this
editorial change.
3. Fixing Invariant Test.
The problem was caught by my invariant tests. However, it was
caught very late in the process. From a process standpoint,
we should make sure to run the invariant tests early in the
BRS process, and review and patch the test until it passes.
That would have caught this kind of problem much more quickly.
(I don't think those tests are taken quite as seriously as they
could be; nobody but me has reviewed them, for example. However,
they are handy for catching problems!)
For #2, I do not currently have an invariant test that
checks for the introduction of a new casing pair that
would be disallowed by the constraints described there.
I suggest that I be given an action to add one.
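One possible shape for such a test, sketched in Python under stated assumptions: the function name, the data layout (sets of (uppercase, lowercase) tuples) and all character labels are hypothetical, and this is not the existing invariant test code. It compares the case pairs of two versions and reports pairs that were broken, or that were newly formed with an uppercase character that already existed.

```python
def check_case_pairs(old_chars, old_pairs, new_pairs):
    """Return (broken, disallowed) lists of (upper, lower) pairs.

    old_chars: characters assigned in the previous version
    old_pairs: case pairs in the previous version
    new_pairs: case pairs in the version under test
    """
    # An existing pair that is missing from the new version was broken.
    broken = sorted(old_pairs - new_pairs)
    # A newly formed pair is only acceptable if its uppercase member is new.
    disallowed = sorted(
        (upper, lower)
        for upper, lower in new_pairs - old_pairs
        if upper in old_chars
    )
    return broken, disallowed


# Hypothetical data: labels only, not real code points.
old_chars = {"U1", "l1", "U2", "l3"}
old_pairs = {("U1", "l1")}
new_pairs = {
    ("U1", "l1"),  # unchanged pair
    ("U2", "l2"),  # existing uppercase, new lowercase: disallowed
    ("U3", "l3"),  # new uppercase, existing lowercase: allowed
    ("U4", "l4"),  # both new: allowed
}

broken, disallowed = check_case_pairs(old_chars, old_pairs, new_pairs)
print(broken)      # []
print(disallowed)  # [('U2', 'l2')]
```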