RE: Subject: Re: 32'nd bit & UTF-8

From: Lars Kristan (lars.kristan@hermes.si)
Date: Sat Jan 22 2005 - 11:22:40 CST


    Philippe Verdy wrote:
    > > Not that I believe that, but I've been told to process UNIX filenames
    > > as binary data. Guess the same is then true for Windows filenames. Nice.
    >
    > You are completely wrong here!

    OK, who is wrong? I said I don't believe it. Whoever told me that must be
    wrong then.

    My understanding would be that filenames are text when there can be no
    invalid sequences or characters in them. I've just created a file containing
    U+FFFF on NTFS. Now tell me how to process the filenames using a conformant
    Unicode application. I cannot, hence I deduce that this is not text and
    should be treated as binary data.
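
To make the test above concrete, here is a minimal sketch in Python. It assumes the name arrives as raw bytes meant to be UTF-8 (an NTFS name would be UTF-16 and need a different decoding step), and it follows this message's premise that a noncharacter such as U+FFFF disqualifies a name as text. The function names are hypothetical.

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 Unicode noncharacters: U+FDD0..U+FDEF, and the
    last two code points of every plane (U+nFFFE and U+nFFFF)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def filename_is_text(raw: bytes) -> bool:
    """Hypothetical check: the name is well-formed UTF-8 and contains
    no noncharacters. Anything else is treated as binary data."""
    try:
        name = raw.decode("utf-8")  # strict mode rejects ill-formed sequences
    except UnicodeDecodeError:
        return False
    return not any(is_noncharacter(ord(ch)) for ch in name)
```

With this check, an ordinary name such as b"report.txt" counts as text, while a name holding U+FFFF or a stray byte such as 0xFF does not.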

    > Never assume, even on Unix, that filenames are binary safe.

    Let's get one thing straight. I can make every effort to feed text data when
    creating filenames, and act on any other restrictions. But if the filesystem
    cannot guarantee that it will only feed me text data, then I have no other
    option than to store the retrieved data in binary format. Or use a
    non-conformant application.
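
A rough sketch of what "store the retrieved data in binary format" could look like, again in Python and again assuming UTF-8 as the intended encoding. The helper names are hypothetical. The raw bytes stay the source of truth, and a lossy text form is derived only for display.

```python
def display_name(raw: bytes) -> str:
    """Lossy, display-only form: each ill-formed sequence becomes U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def roundtrip_name(raw: bytes) -> bytes:
    """Lossless detour through str using Python's 'surrogateescape' handler.
    The intermediate str may contain lone surrogates, so it is not itself
    well-formed Unicode text; it only serves to carry the bytes through."""
    escaped = raw.decode("utf-8", errors="surrogateescape")
    return escaped.encode("utf-8", errors="surrogateescape")
```

On POSIX systems, Python's os.listdir returns bytes names when it is passed a bytes path, so the raw form can be kept without ever decoding it.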

    Lars


