Re: informative due to variation across languages

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Jun 15 2001 - 19:28:34 EDT


Peter asked:

> It used to be that one could describe informative properties saying, "some
> properties are valid for most languages but not all and so are informative,
> such as case mappings".

This never really was the case, since from the moment that the UTC started
posting informative properties, there were some that had nothing to do
with language differences.

> Case mappings gave an easy example for why to have
> informative properties. Now that the mappings are informative (with
> normative exceptions listed in SpecialCasing.txt),

vice-versa, actually

> it's harder to give an
> easy explanation for why some properties are informative.

This comes down to the lack of what I call a "Character Properties Model"
for Unicode.

Asmus Freytag has been working on one side of this problem in a
not-yet-public draft of UTR #23, "Survey of Unicode Character
Properties and Guidelines", that the UTC has been kicking around.

Chapter 4 *does* define normative and informative properties, but
does so in terms of what a claim of conformance to the property
means.

I think this is basically correct: normativity has to do with what
a claim of conformance means, rather than what kind of real-world
property we are dealing with. This is part of the reason why
a formerly informative property can change its status to become
normative.

>
> Can anyone think of other examples of informative properties that are so
> because the property is typical but not true for all languages?
>
> Can anyone give me a specific example of why Line Breaking or East Asian
> Width properties aren't normative?

Because no one is yet convinced that the specifics of either are
so widely agreed upon that the UTC would want to make
some strong claim about conformance to the particular properties
and their values for implementations of the behavior.

To put it another way, if someone claims that they are doing "Unicode
line breaking", are we yet ready to examine their line breaks
and declare them non-conformant if they make some different
choices than the informative values specified in LineBreak.txt?

On the other hand, if an API purports to be returning the
"Unicode General Category" of a character, and it returns
"Ps" instead of "Lo" for an ideograph at some version of Unicode,
I think we could now agree that that was a non-conformant API, even
though formerly both "Ps" and "Lo" were considered "informative"
values of the General Category.
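
To make that kind of conformance check concrete, here is a minimal
sketch in Python. It assumes that the standard unicodedata module is
an acceptable reference for General Category values at whatever
Unicode version the interpreter ships with. The names SAMPLES,
broken_category, and nonconformant are hypothetical, made up for
illustration only.

```python
import unicodedata

# Sample characters: a CJK ideograph, an opening parenthesis, a Latin capital.
SAMPLES = ["\u4e00", "(", "A"]

def broken_category(ch):
    """Hypothetical faulty API: reports "Ps" for every character."""
    return "Ps"

def nonconformant(api, samples=SAMPLES):
    """Return (char, reported, expected) for each sample where the API
    disagrees with the reference General Category from unicodedata."""
    mismatches = []
    for ch in samples:
        expected = unicodedata.category(ch)
        reported = api(ch)
        if reported != expected:
            mismatches.append((ch, reported, expected))
    return mismatches

if __name__ == "__main__":
    # The reference agrees with itself; the faulty API is flagged for the
    # ideograph (expected "Lo") and for "A" (expected "Lu").
    print(nonconformant(unicodedata.category))
    print(nonconformant(broken_category))
```

Nothing comparable could be written for line breaking today, since
there is no agreed reference result to compare against, which is
the point of the distinction.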

--Ken
