Re: Medievalist ligature character in the PUA

From: Julian Bradfield (jcb+unicode@inf.ed.ac.uk)
Date: Tue Dec 15 2009 - 13:17:23 CST


    Asmus wrote:
    >On 12/15/2009 2:31 AM, Julian Bradfield wrote:
    >> On 2009-12-14, Michael Everson <everson@evertype.com> wrote:
    >>
    >>> On 14 Dec 2009, at 20:56, Julian Bradfield wrote:
    ...
    >> As Asmus has pointed out, the question then is, do you ask users to
    >> understand this, and magically know that two apparently different
    >> strings are actually the same?
    >>
    >This is where the disconnect is, and where you may be misquoting me. The
    >typical user knows a writing system but not the code sequence.
    >Programmers have tools that make code sequences visible to them, so they
    >can distinguish them. Correctly formatted and displayed, ordinary users
    >cannot tell the difference between alternative code sequences for the
    >same abstract character. That is as it should be, because what is
    >encoded is the abstract character.

    Yes - but how many users can distinguish the different abstract
    characters (Latin) o, (Greek) ο and (Cyrillic) о ? I certainly
    can't. Is this inherently different from the distinction between
    precomposed and combining characters?
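
    To make the comparison concrete, here is a rough sketch using Python's
    standard unicodedata module. The variable and function names are my own
    illustration, not anything from the Standard. Under these assumptions,
    normalization treats the precomposed and combining spellings of "é" as
    the same, while the three look-alike letters remain distinct under every
    normalization form:

```python
import unicodedata

# Three visually similar letters from different scripts.
latin_o = "\u006F"     # LATIN SMALL LETTER O
greek_o = "\u03BF"     # GREEK SMALL LETTER OMICRON
cyrillic_o = "\u043E"  # CYRILLIC SMALL LETTER O

# Two code sequences for the same abstract character.
precomposed = "\u00E9"  # LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"   # e followed by COMBINING ACUTE ACCENT


def same_after(form, a, b):
    """Return True if a and b are identical after normalizing to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    for ch in (latin_o, greek_o, cyrillic_o):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
    # Canonically equivalent sequences compare equal once normalized.
    print(same_after("NFC", precomposed, combining))
    # The look-alikes stay different, even under compatibility normalization.
    print(same_after("NFKC", latin_o, greek_o))
    print(same_after("NFKC", latin_o, cyrillic_o))
```

    So the machinery does draw a line between the two cases, even if the
    eye does not: one pair is canonically equivalent, the other is not.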

    >Unix users have inherited the mess created by the design approach that
    >was based on "character set independence". That approach seemed a nice,
    >value-neutral way to handle competing character sets, until it became
    >clear that it would in many instances lead to the creation of
    >effectively uninterpretable byte-streams. Hence Unicode. But all of that
    >is, of course, history.

    I wonder why we didn't settle on ISO 2022 encoded filenames before
    Unicode came along? Just because of the overhead? Or just because of
    the timeline of non-ASCII use of computers?

    >How the encoding relates an abstract character to code sequence(s), on
    >the other hand, is well defined in the Standard.

    But the definition of abstract character doesn't necessarily match
    what users think!

    -- 
    The University of Edinburgh is a charitable body, registered in
    Scotland, with registration number SC005336.
    

