Re: Roundtripping in Unicode

From: John Cowan (jcowan@reutershealth.com)
Date: Tue Dec 14 2004 - 11:47:43 CST

    Peter Kirk scripsit:

    > I think the problem here is that a Unix filename is a string of octets,
    > not of characters. And so it should not be converted into another
    > encoding form as if it were characters; it should be processed at a quite
    > different level of interpretation.

    Unfortunately, that is simply a counsel of perfection.

    Unix filenames are in general input as character strings, output as character
    strings, and intended to be perceived as character strings. The corner
    cases in which this does not work are not sufficient to overthrow the
    power and generality to be achieved by assuming it 99% of the time.

    (A private correspondent has come up with an ingenious trick which
    depends on being able to create files named 0x08 and 0x7F, but it
    truly is a trick, and in any case depends only on an ASCII interpretation.)
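
The trade-off described above can be illustrated with a minimal sketch. It assumes Python 3 and its `surrogateescape` error handler, a convention that postdates this message and is not something the message itself proposes. The helper names `to_text` and `to_bytes` and the sample filenames are hypothetical. The sketch treats filenames as character strings in the common case while still letting an arbitrary octet string survive a round trip.

```python
# Sketch only: to_text/to_bytes and the sample names are hypothetical.

def to_text(raw: bytes) -> str:
    # Decode as UTF-8; each undecodable byte 0x80..0xFF is mapped to a
    # lone surrogate in U+DC80..U+DCFF instead of raising an error.
    return raw.decode("utf-8", errors="surrogateescape")

def to_bytes(name: str) -> bytes:
    # Inverse of to_text: lone surrogates are turned back into the
    # original bytes, everything else is encoded as UTF-8.
    return name.encode("utf-8", errors="surrogateescape")

ordinary = "résumé.txt".encode("utf-8")  # the common case: valid UTF-8
odd = b"report-\xff.txt"                 # a corner case: not valid UTF-8
control = b"\x08\x7f"                    # ASCII backspace and DEL

for raw in (ordinary, odd, control):
    text = to_text(raw)
    # The character-string view round-trips to the same octets.
    assert to_bytes(text) == raw
```

The names made only of ASCII control characters decode without any special handling, which matches the remark that such names depend only on an ASCII interpretation.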

    -- 
    Income tax, if I may be pardoned for saying so,         John Cowan
    is a tax on income.  --Lord Macnaghten (1901)           jcowan@reutershealth.com
    

