Marco wrote:
>So, if William drops it, I will take the challenge -- at the risk of
>repeating things that others and myself already wrote.
>
>The PUA is (or might be) used for, e.g....
People, there are three distinct issues here:
1. Are there legitimate uses for the PUA?
2. How do I get software X to know how to process my PUA characters, or how
do I document my characters for others to understand my data?
3. Is there a need for some protocol to tag data (either internal to the
data, as William suggested, or as metadata) to let a recipient know either what
my PUA characters mean, or where to find documentation that explains that?
I think there is no debate about 1. Marco and others have given lists of
valid scenarios.
Regarding 3, a variety of objections have been made to William's suggestion:
- this is metadata and does not belong internal to the data
- use of PUA characters to create a protocol creates a circular problem of
documenting PUA usage and does not solve anything
- some type of markup protocol could be an appropriate mechanism for doing
this, but UTC will not establish this kind of protocol
- this is not the right forum to discuss higher-level protocols
I think that item 2 is the one thing that isn't getting discussed here, but
which is probably in greatest need of discussion.
>IMHO, It would be more interesting (and less impacting Unicode policies) to
>discuss *what* this "PUA semantics" data could look like.
Bingo!
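To make item 2 concrete, here is a minimal sketch of one shape such "PUA semantics" data could take: a small table, shared alongside the data, that describes each private-use code point. The three private-use ranges are the ones the Unicode Standard defines; everything else (the `PUA_DOC` table, its field names, the entry for U+E000, and the `undocumented_pua` helper) is a hypothetical illustration, not a format anyone on this list has proposed.

```python
# Private-use ranges defined by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_pua(ch):
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


# Hypothetical documentation table for one private agreement.
# The name and properties below are invented purely for illustration.
PUA_DOC = {
    0xE000: {"name": "EXAMPLE LETTER ONE", "category": "Lo"},
}


def undocumented_pua(text, doc=PUA_DOC):
    """Return the sorted PUA code points in text that doc does not describe."""
    return sorted({ord(ch) for ch in text if is_pua(ch) and ord(ch) not in doc})
```

A recipient holding the same table could run `undocumented_pua` over incoming text to spot private-use characters it has no description for, before that data gets passed any further.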
>Let me add that, however, all this subject is *not* exactly the
>highest-priority need that I ever heard. I personally can live even with and
>"undefined PUA", and wouldn't spend my time in developing such a thing.
Lest we think this is unimportant, I will mention that I have heard of at
least one linguist who has created a hacked Unicode (rather than e.g.
hacked cp1252) font in order to get commercial software to give the desired
shaping behaviour with their as-yet-unencoded characters. In this case, I
understand that they were given strong health warnings: "Don't give this to
anybody else lest we start getting garbage data disseminated." It won't
surprise me if these things start cropping up without those efforts to keep
them contained.
- Peter
---------------------------------------------------------------------------
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>
This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:18:16 EDT