Re: Private-use agreements (was: Re: Emoji: emoticons vs. literacy)

From: Hans Aberg (haberg@math.su.se)
Date: Mon Jan 05 2009 - 01:48:50 CST

    On 5 Jan 2009, at 00:40, Doug Ewell wrote:

    > I understand the purpose of hash functions. They are valuable for
    > verifying data integrity, to make sure that a local file is genuine
    > and has not been corrupted or maliciously altered.
    >
    > For a Unicode private-use agreement, however, a much more likely use
    > case is that the user (1) needs to know which agreement is in place
    > and (2) needs access to it. In this case, the hash value is useless
    > without the file itself.

    But one is using the hash value to search for the file. Then, once the
    file has been found, it can be verified for integrity, and it can
    contain whatever information is deemed important.

    > Verification against malicious tampering is unlikely to be an issue,
    > and there may be a legitimate reason for the private agreement to be
    > updated or otherwise changed, so that a mismatched hash value does
    > not indicate a genuine problem.

    I did not think about the updating problem. Then the hash value can
    only be valid for a part of the file that is not updated.
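
    One way this could look, as a sketch under stated assumptions: the
    BEGIN/END marker lines are hypothetical (no such convention exists for
    private agreements), and SHA-256 is just one possible hash. Only the
    text between the markers is hashed, so anything outside them can be
    updated without changing the hash value.

```python
import hashlib

# Hypothetical delimiters around the part of the file that must not change.
BEGIN = "-----BEGIN AGREEMENT-----"
END = "-----END AGREEMENT-----"


def stable_part(text: str) -> str:
    """Extract the text between the markers; raises ValueError if missing."""
    start = text.index(BEGIN) + len(BEGIN)
    end = text.index(END, start)
    return text[start:end].strip()


def stable_digest(text: str) -> str:
    """Hash only the stable part, ignoring everything around it."""
    return hashlib.sha256(stable_part(text).encode("utf-8")).hexdigest()
```

    Two versions of a page that differ only outside the markers would give
    the same value, while any change between the markers would give a
    different one.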

    > For the Ewellic alphabet, the "private agreement" is the relevant
    > page on the ConScript Unicode Registry Web site. This page contains
    > additional things like links and copyright notices and CSS style
    > that can be, and have been, changed without affecting the substance
    > of the private agreement. In this case, a hash value would be
    > neither necessary nor sufficient to identify the agreement.

    The problem is that then the text file is only readable as long as
    this site is kept up-to-date.

    >> I am aware that it is sort of against the current Unicode
    >> principles. But the PUA characters, except for temporary private
    >> use, will otherwise be quite unusable. Especially if one gets a
    >> file a few years old, and it uses some private characters, it may
    >> be quite impossible to read it. By contrast, archiving is for these
    >> practical purposes unlimited. So if there is a convenient search
    >> method, it will be easy to read such a file.
    >
    > Well, that is one of the risks associated with private-use. There
    > is no standard format for private agreements, and documents on the
    > Web are not exactly guaranteed to last forever. MUFI has a pretty
    > well defined private agreement, and ConScript has another, and some
    > of the mapping tables from Apple on the Unicode site constitute
    > another. But there is no standard index to these and no standard
    > way to cite which one is in use for a given document, and it seems
    > unlikely that Unicode will get involved in this.
    >
    > The URL of a private agreement could be embedded directly within the
    > PUA-using document, or it could be stored in a short accompanying
    > file or made available from the Web site from which the document can
    > be downloaded. URL shorteners such as tinyurl.com and is.gd could
    > be used to reduce the overhead of storing these links, although not
    > all programs and users accept such URLs, for obvious security reasons.

    Again, then the text files are only readable as long as these URLs are
    kept up-to-date. This might still be good for temporary use. But it
    lessens the use for private characters that should be of general use.
    And anything that isn't automatic is likely to not be of much use
    these days - perhaps PUA characters might be used as a new form of
    cryptography :-).

       Hans



    This archive was generated by hypermail 2.1.5 : Mon Jan 05 2009 - 01:50:33 CST