Re: On the possibility of guidance code points for the Private Use Area

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Tue Apr 24 2001 - 07:54:23 EDT


Peter Constable wrote:

The biggest flaw, which thoroughly
undermines the ability of this system to work, is that your software has no
way to actually know whether I'm following these conventions or not.
Effectively, you're still dependent upon individual agreement between users
as to the meaning of PUA codepoints.

end quote

Would use of the sequence U+E880,U+E880,U+E880 help give the receiving
software a good idea that guidance code points were in use?

It would not be an absolute guarantee, but it would be an unlikely per
accidens combination to be received otherwise.
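
As a rough illustration of that check, here is a minimal Python sketch. The
names GUIDANCE_MARKER, has_guidance_marker and is_private_use are invented for
this example, and the sketch assumes the marker may appear anywhere in the
text, which the proposal does not specify. A positive result is only a strong
hint, not a guarantee.

```python
# Hypothetical marker: three consecutive U+E880 characters, as proposed above.
GUIDANCE_MARKER = "\uE880" * 3


def has_guidance_marker(text: str) -> bool:
    """Return True if the text contains U+E880 three times in a row.

    Assumption: the marker may sit anywhere in the text; the proposal
    does not say where it would be placed.
    """
    return GUIDANCE_MARKER in text


def is_private_use(ch: str) -> bool:
    """Return True for a code point in the BMP Private Use Area (U+E000..U+F8FF)."""
    return 0xE000 <= ord(ch) <= 0xF8FF
```

A receiving program could call has_guidance_marker once on the incoming text
and fall back to treating private use characters as unknown when it returns
False.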

Peter Constable wrote:

There is absolutely no way to get around having some kind of metadata that
can tell you how to interpret my PUA data. William's suggestion merely
changes the nature of the metadata: instead of having to tell you the
semantics of my PUA codepoints, I have to tell you whether I am
following these conventions, in which case I also need to tell you (via a
registry) where to find info on my semantics, and then (via my FTP or
similar repository) I tell you what my semantics are.

end quote

I am not suggesting that a piece of software trying to read a plain Unicode
text document would need to look things up at a registry or then access the
internet. Such a piece of software would just work using a local file.

I am suggesting that a software developer may be authoring a piece of
software that, in use, may well need to make sense of any private use
area codes that it receives. As part of the authoring process, that
software author would gather together details of the one or more
registries that existed at that time, perhaps using this list as a
starting point or going straight to a known registry. The software author
would then visit on the web such of those registries as he or she chose to
acknowledge and incorporate details of what he or she found there in the
inner workings of the software package. For example, the details might be
that U+E807 refers to U+E000 to U+E04FF only and that, when a character
code in that range is received while those blocks refer to the registry
specified by the U+E807 character, the font fontE807.ttf should be used to
display the character. The software author would make provision for each of
the registries that he or she chose to acknowledge. Once the software were
made available to the user community, it would simply be a matter of the
software package accepting a plain Unicode text file and displaying it,
based solely on the information about the usage of the private use area
built into the software package. Only people devising new characters or
software authors need be in contact with any of the registries. An end user
would simply use the software package, possibly also needing to make sure
that the appropriate font files were in the font directory of the end
user's computer.
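
The built-in lookup described above might be sketched in Python as follows.
This is only an illustration: RegistryEntry, REGISTRIES and font_for are
names made up for the example. The range is written above as U+E000 to
U+E04FF; the sketch assumes U+E4FF was meant, since U+E04FF lies outside the
Private Use Area of U+E000 to U+F8FF.

```python
from typing import NamedTuple, Optional


class RegistryEntry(NamedTuple):
    first: int  # first private use code point covered by the registry
    last: int   # last private use code point covered by the registry
    font: str   # font file used to display characters in that range


# Table compiled by the software author from the registries he or she chose
# to acknowledge. The end of the range (0xE4FF) is an assumption, see above.
REGISTRIES = {
    0xE807: RegistryEntry(0xE000, 0xE4FF, "fontE807.ttf"),
}


def font_for(code_point: int, active_registry: int) -> Optional[str]:
    """Return the font for a private use code point, or None if unknown.

    active_registry is the guidance code point (for example 0xE807) that
    the text has declared to be in force.
    """
    entry = REGISTRIES.get(active_registry)
    if entry is None:
        return None  # registry not acknowledged by this software
    if entry.first <= code_point <= entry.last:
        return entry.font
    return None  # code point outside the range the registry covers
```

Everything is decided from the table shipped inside the package, so no
registry or network access is needed when a document is displayed.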

William Overington

24 April 2001


