From: Doug Ewell (doug@ewellic.org)
Date: Sun Jan 04 2009 - 17:40:47 CST
Hans Aberg <haberg at math dot su dot se> wrote:
> The idea would be that the hash code is for a file containing all
> information needed for its use, including typesetting - perhaps
> including some default glyph. Then the hash code should be rich enough
> making it unlikely that independently made private files have the
> same - they need then not check if it already exists. One then need
> some URL to search for it, but the URL need not be fixed - any one
> with search capabilities will suffice.
I understand the purpose of hash functions. They are valuable for
verifying data integrity, to make sure that a local file is genuine and
has not been corrupted or maliciously altered.
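As a minimal sketch of that integrity check: the thread does not name a hash function, so SHA-256 is assumed here, and the helper names and sample data are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()

def matches_digest(data: bytes, expected_hex: str) -> bool:
    """True if data hashes to the published digest."""
    return sha256_hex(data) == expected_hex.lower()

# Hypothetical file contents, for illustration only.
original = b"U+E000 HYPOTHETICAL LETTER A\n"
published = sha256_hex(original)          # digest distributed with the file
tampered = original + b"U+E001 EXTRA\n"   # any change, malicious or not

print(matches_digest(original, published))  # True
print(matches_digest(tampered, published))  # False
```

Note that the check needs the file in hand; the digest alone does not say where to find it.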
For a Unicode private-use agreement, however, a much more likely use
case is that the user (1) needs to know which agreement is in place and
(2) needs access to it. In this case, the hash value is useless without
the file itself. Verification against malicious tampering is unlikely
to be an issue, and there may be a legitimate reason for the private
agreement to be updated or otherwise changed, so that a mismatched hash
value does not indicate a genuine problem.
For the Ewellic alphabet, the "private agreement" is the relevant page
on the ConScript Unicode Registry Web site. This page contains
additional things like links, copyright notices, and CSS styling that
can be, and have been, changed without affecting the substance of the
private agreement. In this case, a hash value would be neither
necessary nor sufficient to identify the agreement.
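To illustrate the point with a sketch: the two page versions below are invented, SHA-256 is an assumed choice of hash, and the rule that assignment lines start with "U+" is a hypothetical convention, not the actual ConScript page format.

```python
import hashlib

def digest(text):
    """SHA-256 hex digest of the page text (assumed hash function)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def mapping_lines(page):
    """Keep only the lines that carry code point assignments.
    Hypothetical convention: such lines start with 'U+'."""
    return [ln.strip() for ln in page.splitlines() if ln.strip().startswith("U+")]

page_v1 = """Hypothetical registry page
U+E000 SAMPLE LETTER ONE
U+E001 SAMPLE LETTER TWO
Copyright 2008
"""

# Same assignments, only the copyright notice differs.
page_v2 = page_v1.replace("Copyright 2008", "Copyright 2009")

same_agreement = mapping_lines(page_v1) == mapping_lines(page_v2)
same_hash = digest(page_v1) == digest(page_v2)
print(same_agreement, same_hash)  # True False
```

The substance of the agreement is unchanged, yet the hash of the page no longer matches.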
> I am aware that it is sort of against the current Unicode
> principles. But the PUA characters, except for temporary private use,
> will otherwise be quite unusable. Especially if one gets a file a few
> years old, and it uses some private characters, it may be quite
> impossible to read it. By contrast, archiving is for these practical
> purposes unlimited. So if there is a convenient search method, it
> will be easy to read such a file.
Well, that is one of the risks associated with private use. There is no
standard format for private agreements, and documents on the Web are not
exactly guaranteed to last forever. MUFI has a fairly well-defined
private agreement, and ConScript has another, and some of the mapping
tables from Apple on the Unicode site constitute another. But there is
no standard index to these and no standard way to cite which one is in
use for a given document, and it seems unlikely that Unicode will get
involved in this.
The URL of a private agreement could be embedded directly within the
PUA-using document, or it could be stored in a short accompanying file
or made available from the Web site from which the document can be
downloaded. URL shorteners such as tinyurl.com and is.gd could be used
to reduce the overhead of storing these links, although not all programs
and users accept such URLs, since a shortened URL hides its real
destination.
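The "short accompanying file" option could look something like the following sketch. There is no standard format for this, so the JSON layout, the field name, and the example.org URL are all assumptions made for illustration. The three private-use ranges are the ones defined by Unicode.

```python
import json

# Unicode private-use ranges: BMP PUA, plane 15, plane 16.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))

def uses_private_use(text):
    """True if any character in text is a private-use code point."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in PUA_RANGES)

def sidecar_for(text, agreement_url):
    """Return the contents of a small accompanying file that cites the
    private agreement, or None if the text has no private-use characters.
    The field name 'private_use_agreement' is hypothetical."""
    if not uses_private_use(text):
        return None
    return json.dumps({"private_use_agreement": agreement_url}, indent=2)

doc = "Sample text with a private-use character: \ue000"
note = sidecar_for(doc, "http://example.org/hypothetical-agreement.html")
print(note)
```

A reader who later finds the document would still depend on the cited URL remaining reachable.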
--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages