>Meanings to the code positions in the Private Use area shall not be
>assigned.
>
>WG2 and UTC are adamant about this.
If I understand him correctly, William's not suggesting that UTC or WG2
assign meanings to PUA codepoints. Rather, he's talking about a
non-UTC/WG2-sanctioned agreement among *users* to assign particular
meanings to a block of PUA codepoints, and the establishment of a
user-owned registry to define meanings of those codepoints.
William is certainly touching on an important issue: how does your software
know how to interpret my PUA codepoints? I commend him for thinking about
the issue, and for thinking outside the box. I don't think I or SIL would
buy into his suggestion, however. The biggest flaw, which thoroughly
undermines the ability of this system to work, is that your software has no
way to actually know whether I'm following these conventions or not.
Effectively, you're still dependent upon individual agreement between users
as to the meaning of PUA codepoints.
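To make the limitation concrete, here is a minimal Python sketch (an illustration, not part of William's proposal). It relies only on the standard `unicodedata` module; the helper name `is_private_use` and the sample string are made up for the example. Software can tell that a codepoint is in a Private Use area, because its general category is Co, but the character data itself gives it nothing about which convention, if any, the sender was following.

```python
import unicodedata

def is_private_use(ch: str) -> bool:
    """Return True if ch has general category Co (Private Use)."""
    return unicodedata.category(ch) == "Co"

# Hypothetical sample: one ordinary letter, one BMP PUA codepoint,
# and one Plane 15 PUA codepoint.
sample = "A\uE000\U000F0000"
flags = [is_private_use(c) for c in sample]

# A PUA codepoint has no character name in the standard's data, so the
# lookup falls back to the default. Nothing here says what the sender meant.
pua_name = unicodedata.name("\uE000", None)
```

Detection is the easy part. The meaning still has to arrive by some other channel.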
There is absolutely no way to get around having some kind of metadata that
can tell you how to interpret my PUA data. William's suggestion merely
changes the nature of the metadata: instead of having to tell you the
semantics of my PUA codepoints, I have to tell you whether I am
following these conventions, in which case I also need to tell you (via a
registry) where to find info on my semantics, and then (via my FTP or
similar repository) I tell you what my semantics are.
With or without the conventions and registry William is suggesting, the
real issue still isn't addressed: in what form do I communicate to you what
my PUA codepoints mean? I can request a "guidance codepoint" for me in some
registry, but we still need to figure out exactly what info I need to
provide you with, and in what form it is organised. Rather than go through
the bother of having guidance codepoints and a registry, which really
accomplish nothing, we would be better off simply working on determining
exactly what info on PUA codepoints is needed for others to make sense of
one's data, and what the best format is. This needs to be done by agreement
between two users anyway. There still is no guarantee that everyone will
use it, but again we can't get around you and me needing to agree on how we
communicate this info to one another. Rather than bother with the guidance
codepoints, etc., I'd rather just provide you with a database containing the
semantics of my PUA codepoints in some format that we've agreed upon (or,
at least, that I've documented). As suggested in a recent message on this
list (I think it was this list; maybe it was the OpenType list), I think
OLAC is a very good context in which to work out what info is needed and
how it should be organised (I envision an XML schema).
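As a rough illustration only, the kind of documented database described above might be sketched like this in Python. No such schema exists yet: the element and attribute names (`pua-semantics`, `char`, `cp`, `name`, `category`), the sample entries, and the function `load_pua_semantics` are all hypothetical, chosen just to show the shape of the idea. Only the standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element name, attribute name, and entry
# below is invented for this example; no agreed schema is implied.
PUA_DOC = """\
<pua-semantics owner="example-user">
  <char cp="E000" name="EXAMPLE LETTER ONE" category="Lo"/>
  <char cp="E001" name="EXAMPLE COMBINING MARK" category="Mn"/>
</pua-semantics>
"""

def load_pua_semantics(xml_text: str) -> dict:
    """Parse the hypothetical format into {character: properties}."""
    root = ET.fromstring(xml_text)
    table = {}
    for el in root.iter("char"):
        ch = chr(int(el.get("cp"), 16))  # "E000" -> U+E000
        table[ch] = {"name": el.get("name"), "category": el.get("category")}
    return table

table = load_pua_semantics(PUA_DOC)

# The receiver consults the sender's table; codepoints the sender did not
# document simply stay uninterpreted.
documented = table.get("\uE000")
undocumented = table.get("\uE002")
```

The receiver only needs the file and the documented format; no guidance codepoint or registry lookup is involved. Which properties belong in each entry is exactly the open question.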
It may (but may not) be appropriate for UTC to eventually publish a TR that
provides recommendations regarding what users should document regarding PUA
codepoints they use and how it should be documented. If that were to ever
happen, though, probably the best we could expect is that it be informative
rather than a normative part of the standard.
- Peter
---------------------------------------------------------------------------
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>
This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:16 EDT