From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Mar 31 2004 - 04:36:48 EST
From: "Michael Everson" <everson@evertype.com>
> At 17:02 -0800 2004-03-30, Mike Ayers wrote:
> >I feel obligated to take this one step further - these folks are
> >forgetting that "P" stands for "private". Their use of this space
> >is their own problem, in all senses. It does not seem reasonable to
> >me that *any* standard behavior could be expected of PUA code
> >points, from operating systems or applications, as such may have
> >chosen to, or may yet choose to, use those code points to
> >encapsulate very un-font-rendering-like behavior, and such a
> >decision, made past, present or future, is a perfectly valid private
> >use.
>
> Which I assume means: "it's wrong for Unicode to make ANY property
> pronouncements for ANY PUA characters, since that defines them, and
> removes the P from the Use."
Do you mean that any properties currently defined in Unicode for PUA code points
should have their normative status deprecated and be left to implementers, so
that no application can be called non-conforming if it implements other
defaults?
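
To make concrete what "properties currently defined for PUAs" means, here is a
minimal sketch in Python. It only assumes the standard unicodedata module; the
helper name is_private_use is mine and is not part of any standard API. It
checks whether a code point lies in one of the three private-use ranges and
reads the General_Category that a typical library reports for it by default.

```python
import unicodedata

# The three private-use ranges defined by Unicode.
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
]


def is_private_use(cp: int) -> bool:
    """Return True if the code point lies in a private-use range.

    Hypothetical helper for illustration only.
    """
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


if __name__ == "__main__":
    cp = 0xE000
    # The library reports the default General_Category assigned by Unicode.
    print(hex(cp), is_private_use(cp), unicodedata.category(chr(cp)))
```

An application following the proposal above would be free to override such
defaults for its own private assignments.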
Maybe this would require some adjustments to the normative wording related to
Unicode conformance...
Likewise, variation selectors used on PUA code points should not be constrained
(the current restrictions on variation selector usage should not apply to PUAs,
given that a VSn should still be fully ignorable, including after PUA code
points that have no normative semantics defined in Unicode, which means that
the combination PUA+VSn has no normative semantics defined in Unicode itself
either).
Leave that to implementations, and maybe we'll ease the development of new
scripts by allowing other groups to work on interchangeable formats based on
PUAs, which could later be integrated into Unicode after an easier phase in
which these scripts have been experimented with. It would ease the adoption of
a later consensus, and would offer a great tool for developers and researchers,
who could safely base their work on Unicode encoding conventions.
Also this would be a good indicator that specialized 8-bit code sets are no
longer necessary, and IANA could then close its 8-bit encodings registry, in
favor of PUA-based encodings defined by some conventional rules which could then
become a standard and open extension mechanism...
This would have the advantage of avoiding pressure on Unicode to standardize new
scripts too fast, and longer open experimentation would avoid many future
errors in the newly standardized scripts.
The CSUR registry is one approach to defining new scripts, and SIL.org has its
own, but for now I see little effort to allow specifying these properties in a
partially interchangeable format. One reason may be that Unicode has placed too
many restrictions on the usage of PUAs, so that developers fear that protocols
which need them will become non-conforming.
I do think that there must be a way to use PUAs safely, without ambiguity or
risk of collision, using extension mechanisms similar to namespaces in XML,
together with some normative declarations and possibly a registry of PUA sets
(why not the IANA charsets registry, if it can reference the associated
properties with a URL to a script definition schema?).
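
The namespace idea can be sketched roughly as follows. This is only an
illustration under my own assumptions: the class names PuaProperties and
PuaRegistry, the property fields, and the urn:example:... identifiers are all
hypothetical, and no such registry exists today. The point is that the same PUA
code point can carry different properties in different declared sets without
colliding, because every lookup is qualified by the set's identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PuaProperties:
    """Hypothetical per-character properties declared by a PUA set."""
    name: str
    general_category: str
    bidi_class: str = "L"


def _is_private_use(cp: int) -> bool:
    # The three private-use ranges defined by Unicode.
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)


class PuaRegistry:
    """Hypothetical registry of PUA sets, keyed by a namespace identifier."""

    def __init__(self):
        self._sets = {}

    def register(self, namespace, assignments):
        # Refuse to redefine an existing set, so declarations stay stable.
        if namespace in self._sets:
            raise ValueError(f"namespace already registered: {namespace}")
        # Only private-use code points may be assigned by a PUA set.
        for cp in assignments:
            if not _is_private_use(cp):
                raise ValueError(f"U+{cp:04X} is not a private-use code point")
        self._sets[namespace] = dict(assignments)

    def lookup(self, namespace, cp):
        # Returns None when the set or the code point is unknown.
        return self._sets.get(namespace, {}).get(cp)


if __name__ == "__main__":
    registry = PuaRegistry()
    registry.register("urn:example:script-a",
                      {0xE000: PuaProperties("SCRIPT A LETTER ONE", "Lo")})
    registry.register("urn:example:script-b",
                      {0xE000: PuaProperties("SCRIPT B MARK ONE", "Mn", "NSM")})
    # The same code point resolves differently depending on the declared set.
    print(registry.lookup("urn:example:script-a", 0xE000))
    print(registry.lookup("urn:example:script-b", 0xE000))
```

A document or protocol would then only need to declare which set it uses, much
as an XML document declares its namespaces.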