L2/04-333

Proposed distribution format for the UCD

Eric Muller, Adobe Systems Inc.
August 5, 2004


The UCD is currently offered on http://www.unicode.org/Public as:

The main reason for this organization was to minimize the size of the UCD data, in part to make downloads easier. However, this approach also has a number of problems:

The overall proposal is to stop publishing new Update directories, and instead to publish each version of the UCD as a self-contained set of files.

Here are more specific details of this proposal:

The proposed layout is to have one subdirectory in http://www.unicode.org/Public for each release:

      4.0.0/
           ucd/
                ArabicShaping.txt
                BidiMirroring.txt
                ...

      4.0.1/
           ucd/
                ArabicShaping.txt
                BidiMirroring.txt  (same content as 4.0.0)
                ...
    

The ucd directories would contain all the UCD files for the corresponding releases, and hyperlinks between the files (represented as relative links) would be allowed.
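As a rough illustration of this layout, the following Python sketch builds the URL of a UCD file in a given release and shows that a relative link from one file resolves to another file of the same release. The function names are hypothetical and exist only for this example; the only assumption about the layout is the release/ucd/filename pattern shown above.

```python
from urllib.parse import urljoin

# Root of the published data, as given in this proposal.
BASE_URL = "http://www.unicode.org/Public/"


def ucd_file_url(version: str, filename: str) -> str:
    """Return the URL of one UCD file under the proposed layout.

    Hypothetical helper: assumes the <release>/ucd/<file> pattern.
    """
    return f"{BASE_URL}{version}/ucd/{filename}"


def resolve_relative_link(from_url: str, relative_link: str) -> str:
    """Resolve a relative hyperlink found in the file at from_url."""
    return urljoin(from_url, relative_link)


if __name__ == "__main__":
    shaping = ucd_file_url("4.0.1", "ArabicShaping.txt")
    # A relative link stays inside the same release directory.
    print(resolve_relative_link(shaping, "BidiMirroring.txt"))
```

Because the links are relative, the same file content can be published unchanged under several release directories and each copy still points at its own release.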

The purpose of the new intermediate ucd directory is to provide a home for other data that is part of a release, such as specific versions of the UAXes or the code charts. Ultimately, the last published book plus the content of the directory for a release would form a complete definition of the corresponding version of the standard. However, adding those components is not part of this proposal.

The UNIDATA entry would be retained, and be made to have the same content as the directory of the latest version (either by some linking/redirecting magic, or simply by having a copy of the same content).
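The "simple copy" option could be sketched as follows in Python. This is only an illustration under stated assumptions: release directories are named major.minor.update, versions are compared numerically (so 10.0.0 sorts after 4.0.1), UNIDATA receives the content of the ucd subdirectory of the latest release, and the function names are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Assumption: release directories are named like "4.0.1".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def latest_release(public_root: Path) -> str:
    """Return the name of the highest-numbered release directory."""
    versions = []
    for entry in public_root.iterdir():
        match = VERSION_RE.match(entry.name)
        if entry.is_dir() and match:
            # Compare as integer tuples, not as strings.
            versions.append(tuple(int(part) for part in match.groups()))
    if not versions:
        raise ValueError("no release directories found")
    return ".".join(str(part) for part in max(versions))


def refresh_unidata(public_root: Path) -> Path:
    """Replace UNIDATA with a copy of the latest release's ucd directory."""
    source = public_root / latest_release(public_root) / "ucd"
    target = public_root / "UNIDATA"
    if target.is_symlink():
        target.unlink()
    elif target.exists():
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target
```

A link or redirect would avoid the duplicate storage, at the cost of depending on server or file system features; the copy shown here works anywhere.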

We should also rebuild the directories corresponding to earlier releases, starting with 2.0.0 (it is just not worth going further back in time, and the data is not available in electronic form for all the 1.x releases).

We should also provide a ZIP file for each release, simply to facilitate HTTP-based access. The proposal is to have one ZIP file per release, named ucd-release.zip (where release is the version number), placed in the directory for that release (that is, next to the ucd directory it contains). Filenames in that ZIP file would include directories, starting with the directory of the release; in other words, the file 4.0.1/ucd-4.0.1.zip would contain:

      4.0.1/ucd/ArabicShaping.txt
      4.0.1/ucd/BidiMirroring.txt
      ...
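One possible way to produce such an archive is sketched below in Python, using the standard zipfile module. It assumes a local copy of the Public tree; the function name build_release_zip and the choice of deflate compression are illustrative, not part of the proposal.

```python
import zipfile
from pathlib import Path


def build_release_zip(public_root: Path, version: str) -> Path:
    """Create <version>/ucd-<version>.zip next to the ucd directory.

    Entry names start with the release directory, for example
    "4.0.1/ucd/ArabicShaping.txt".
    """
    release_dir = public_root / version
    ucd_dir = release_dir / "ucd"
    zip_path = release_dir / f"ucd-{version}.zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(ucd_dir.rglob("*")):
            if path.is_file():
                # Store the name relative to the Public root, with "/" separators.
                archive.write(path, path.relative_to(public_root).as_posix())
    return zip_path
```

Since every entry name begins with the release directory, unpacking the archives of several releases into the same place recreates the layout described above without files of one release overwriting those of another.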
    

Document History

Author: Eric Muller

      Revision  Date            Comments
      1         August 5, 2004  Initial version