Re: Clones (was RE: Hexadecimal)

From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Mon Aug 18 2003 - 12:43:53 EDT

    On 18/08/2003 09:06, Jim Allan wrote:

    > Jill Ramonsky posted:
    >
    >> I would really like it if these, and
    >> every single other character which is "only there for reasons of
    >> round trip compatibility" with something else, were explicitly marked
    >> in the machine-readable charts with something meaning "Don't introduce
    >> this character, at all, ever. Don't try to interpret it. Just preserve
    >> it, in case it ever gets turned back to its original character set".
    >
    >
    > That would probably be too strong.
    >
    > If characters are available then some people will use them. :-(
    >
    > See section 2.3 at http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf
    >
    > Unicode 3.0 contained under section D21 on compatibility characters:
    >
    > << Their use is discouraged other than for legacy data. >>
    >
    > I don't know whether this statement was intentionally removed or was
    > accidentally dropped in the changes in 4.0 which distinguish
    > "compatibility character" from "compatibility composite character".
    >
    > In any case people can't be prevented from doing things that are
    > officially discouraged, especially as for some particular use it might
    > be wrong to discourage them. So if you are handling Roman numerals in
    > an application and wish your handling to be complete then
    > unfortunately you do have to take the compatibility Roman numerals
    > into account.

    Yes, but people can be clearly discouraged from using them, and that is
    not currently happening. At present, if you come across a character
    while browsing through the charts and want to discover whether its use
    is officially discouraged, you have to wade through huge databases and
    hundreds of pages of text to find out whether a particular set of
    properties implies that it is. Well, even that won't tell me
    definitively, for I read, "The compatibility decomposable characters are
    precisely defined in the Unicode Character Database, whereas the
    compatibility characters in the more inclusive sense are not." (from
    section 2.3) - and it is the latter whose use is discouraged. But is it
    in fact safe to assume that the list of such characters includes, but is
    not limited to, those which have defined compatibility mappings?
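
    For the narrower, precisely defined set, the lookup can at least be
    automated. What follows is only a minimal sketch, assuming Python's
    standard unicodedata module; the function name compatibility_mapping is
    made up for illustration. It reports whether a character has a
    compatibility mapping in the Unicode Character Database, and it says
    nothing about compatibility characters in the more inclusive sense,
    which the database does not list.

```python
import unicodedata


def compatibility_mapping(ch):
    """Return the compatibility decomposition of ch as a string, or None.

    Hypothetical helper: it covers only characters whose decomposition
    mapping in the Unicode Character Database carries a tag such as
    <compat>, <font> or <circle>. Canonical (untagged) mappings and
    characters with no mapping at all both yield None.
    """
    decomp = unicodedata.decomposition(ch)
    if not decomp.startswith("<"):
        return None
    # Drop the leading tag, then convert the hex code points to text.
    _tag, *codepoints = decomp.split()
    return "".join(chr(int(cp, 16)) for cp in codepoints)


if __name__ == "__main__":
    # U+2167 is a compatibility Roman numeral, "I" is an ordinary letter,
    # and U+00E9 has a canonical decomposition only.
    for ch in ("\u2167", "I", "\u00e9"):
        print("U+%04X" % ord(ch), compatibility_mapping(ch))
```

    A result of None here does not mean that a character is free of
    compatibility status, only that it has no compatibility mapping, which
    is exactly the gap described above.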

    It would be much simpler if each such character were clearly labelled
    "DO NOT USE!" in the code charts etc., with its glyph presented on a grey
    background or in some other way to indicate its special status.

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

