From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Apr 30 2004 - 20:40:14 EDT
----- Original Message -----
From: "Ernest Cline" <ernestcline@mindspring.com>
To: "Kenneth Whistler" <kenw@sybase.com>; <peterkirk@qaya.org>
Cc: <unicode@unicode.org>; <kenw@sybase.com>
Sent: Saturday, May 01, 2004 1:42 AM
Subject: Re: An attempt to focus the PUA discussion [long]
>
> > [Original Message]
> > From: Kenneth Whistler <kenw@sybase.com>
> >
> > On the other hand, I could not expect any software doing
> > Unicode normalization to pay any attention to *my* interpretation
> > of those equivalences, and if I really wanted to process data
> > using such equivalences, it would be up to me to write the
> > software to do so.
>
> Decompositions and canonical combining classes are the
> two things that affect normalization, and are why Unicode
> limits changes to these two to be made only in an upwardly
> compatible manner. This is what makes assigning those
> properties to private use characters so tricky.
As far as I know, the stability of normalization is important only for
interchange of data using and assuming the same standard Unicode conventions.
This is not fundamental for PUAs, which are used under private conventions
agreed between users, so that those users can at the same time apply their own
normalization.
Stability of PUAs will be guaranteed only for applications that either don't
handle PUAs or treat them with the Unicode default properties. If someone needs
to assign new diacritics, new decomposable characters, or new precomposed
characters in PUAs, and to handle them with their own normalization, this
should be OK.
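
To make this concrete, here is a minimal sketch of such a private
normalization, assuming a purely hypothetical agreement in which U+E000 is a
precomposed letter that decomposes to "a" followed by a private diacritic
U+E001. The table PRIVATE_DECOMP and the function private_nfd are invented for
illustration; only unicodedata.normalize is a real library call.

```python
import unicodedata

# Hypothetical private agreement (not part of Unicode):
# U+E000 is a precomposed letter equal to "a" + private diacritic U+E001.
PRIVATE_DECOMP = {"\uE000": "a\uE001"}

def private_nfd(text):
    """Apply the private decompositions, then the standard NFD."""
    expanded = "".join(PRIVATE_DECOMP.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFD", expanded)

# A standard normalizer leaves the PUA character untouched ...
standard = unicodedata.normalize("NFD", "\uE000\u00E9")
# ... while the private one decomposes it as the agreement says.
private = private_nfd("\uE000\u00E9")
```

An application that knows nothing of the agreement still sees stable data,
because the standard normalizer never changes U+E000.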
After all, this is what many fonts do every day: they internally assign some
codes to create ligatures or recognize variant forms, and these new private
"characters" are internally mapped to PUAs, using their own normalizations. As
the resulting string of reordered and rearranged glyphs will not be interchanged
but only used locally to render a text graphically, this already falls within
what is allowed in PUAs.
These fonts (and the text layout engines that use them) don't care about the
normative default properties of PUAs, as they really use them with the
properties they want: joining types, case mappings for special styles like
SmallCaps, mirrored characters, bidirectional properties, etc. are freely
changed from the default assignments in Unicode. GSUB tables can also be viewed
as a normalization step performed by renderers to translate a series of
standard Unicode code points into a string of glyph IDs, whose values generally
match either the standard code point to represent or a PUA code point.
The default combining class 0 of PUAs is necessary in Unicode so that an
application that does not know their contextual semantics will not attempt to
reorder them through the standard normalization algorithm. But I don't think
there is any limitation on applications that would use PUAs contextually, with
other combining classes.
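
As a rough sketch of that distinction, assume a hypothetical private convention
that gives U+E001 combining class 220 and U+E002 combining class 230. The table
PRIVATE_CCC and the helpers ccc and private_reorder are invented here;
unicodedata.combining and unicodedata.normalize are the only real library
calls. The reordering is a stable sort of each run of non-starters, which is
what the canonical ordering step does with the standard classes.

```python
import unicodedata

# Hypothetical private combining classes (Unicode itself assigns 0 to both).
PRIVATE_CCC = {"\uE001": 220, "\uE002": 230}

def ccc(ch):
    """Private combining class if one is agreed, else the Unicode default."""
    return PRIVATE_CCC.get(ch, unicodedata.combining(ch))

def private_reorder(text):
    """Stable-sort each run of non-starters by (private) combining class."""
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)

sample = "a\uE002\uE001"
# The standard algorithm sees class 0 everywhere and leaves the order alone.
unchanged = unicodedata.normalize("NFD", sample)
# Under the private convention, class 220 sorts before class 230.
reordered = private_reorder(sample)
```

Software unaware of the convention keeps the data as it found it, and only the
software that shares the agreement reorders.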
So for me all PUAs can be decomposable and reorderable within the private
convention that defines private semantics for them. It is not the
responsibility of Unicode to forbid this, and Unicode does not need to inspect
what is in such a private convention.