Re: Where is the First> Last> convention documented?

From: Eric Muller (emuller@adobe.com)
Date: Thu Sep 06 2007 - 12:39:04 CDT


    Stephane Bortzmeyer wrote:
    > This leads to another issue in the database format, which I prefer to
    > discuss here first: why are there ranges in UnicodeData.txt rather than
    > explicit records for every character? Being explicit would avoid
    > generating names for the implicit records (something which is not
    > obvious and not well documented, IMHO).
    >
    > Or, a variant, why not a DerivedUnicodeData.txt file with
    > all the characters?
    >

    Just in case, we currently have a Public Review Issue for an alternate
    representation of the UCD (PRI 109,
    <http://www.unicode.org/review/pr-109.html>). The main goal of that
    representation is to take advantage of the current hardware to require
    as little intelligence as possible from the user of the representation
    (remember that the typical machine at the time the UCD was designed was
    characterized by MHz, kBytes of main memory and MBytes of disk). In
    particular, a full representation of the UCD would/could have all the
    character names explicitly listed, among many other things.
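
    To illustrate the kind of work the range convention leaves to the
    consumer, here is a minimal Python sketch that expands a First>/Last>
    pair into explicit records. The sample records, the deliberately short
    range (4E00..4E02), and the names expand() and derive_name() are
    assumptions made for illustration only; the sketch derives names for
    CJK unified ideographs alone and leaves every other range unnamed
    (Hangul syllable names, for instance, need a separate algorithm).

```python
# Hypothetical sample in the UnicodeData.txt layout (semicolon-separated).
# The range end point is shortened on purpose; it is not the real one.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
4E02;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
"""

FIRST_SUFFIX = ", First>"
LAST_SUFFIX = ", Last>"


def derive_name(label, code_point):
    """Return a derived name for a code point inside a range, or None.

    Only the CJK unified ideograph rule is handled in this sketch.
    """
    if label == "CJK Ideograph":
        return "CJK UNIFIED IDEOGRAPH-%04X" % code_point
    return None


def expand(lines):
    """Yield (code_point, name, general_category) for every character,
    turning each First>/Last> pair into one record per code point."""
    pending = None  # (start code point, range label) after a First> line
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code_point = int(fields[0], 16)
        name = fields[1]
        category = fields[2]
        if name.startswith("<") and name.endswith(FIRST_SUFFIX):
            pending = (code_point, name[1:-len(FIRST_SUFFIX)])
        elif name.startswith("<") and name.endswith(LAST_SUFFIX):
            if pending is None:
                raise ValueError("Last> record without a First> record")
            start, label = pending
            pending = None
            for cp in range(start, code_point + 1):
                yield cp, derive_name(label, cp), category
        else:
            yield code_point, name, category


if __name__ == "__main__":
    for cp, name, category in expand(SAMPLE.splitlines()):
        print("%04X;%s;%s" % (cp, name, category))
```

    With every record listed explicitly, as in the alternate
    representation, a consumer would need none of this pairing or name
    derivation logic.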

    Personally, I think we should invest in that representation rather than
    keep tweaking the current one.

    Eric.


