From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Nov 29 2005 - 10:14:55 CST
From: "Antoine Leca" <Antoine10646@Leca-Marti.org>
> On Tuesday, November 29th, 2005 07:03Z, Chris Jacobs wrote:
>>
>> What happens when two files have different, but canonically equivalent,
>> file names?
>
> The operating system sees two different files (without any relationship
> one with the other), and you (the user, the "human") see two files with
> apparently the same handle to grasp them (the same name).
>
> My idea is that you are going to lose, so probably thou shalt not do
> that.
Why is that? The user interface can disambiguate the "user-friendly" name by
displaying additional metadata about each file, for example by using the
URL-encoding syntax (starting with "file:") if the name must be used in
secure program interfaces.
If a name can't be correctly decoded as valid UTF-8, or if it is different
from its NFC form, or if it starts with "file:", I would suggest storing the
filename only in its URL-encoded form (starting with "file:"), and simply
avoiding any "shell escaping" mechanism (because such mechanisms are not
portable, even on the same Unix/Linux system, as they depend on the
capabilities of the shell, and because the URL syntax is independent of the
filesystem type actually used).
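
As a rough illustration of the rule above, here is a minimal Python sketch.
The function names (needs_url_form, stored_name) are hypothetical, and the
choice to percent-escape every reserved byte is an assumption, not part of
any existing filesystem API:

```python
import unicodedata
from urllib.parse import quote

def needs_url_form(raw: bytes) -> bool:
    """Return True if the raw on-disk name should be kept only in "file:" form."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return True                      # not valid UTF-8
    if unicodedata.normalize("NFC", text) != text:
        return True                      # differs from its NFC form
    return text.startswith("file:")      # would clash with the escape prefix

def stored_name(raw: bytes) -> str:
    """Return the name to show or store for a raw filename (hypothetical helper)."""
    if needs_url_form(raw):
        # Assumption: escape everything outside the unreserved set,
        # including ":" and "/", so the result is unambiguous.
        return "file:" + quote(raw, safe="")
    return raw.decode("utf-8")
```

With this sketch, a name that is valid UTF-8 and already in NFC is left
alone, while the Latin-1 bytes b"caf\xe9" come out as "file:caf%E9" and the
decomposed form of "é" comes out as "file:e%CC%81".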
My opinion is that the OS just needs to support the "file:" URL-encoding
mechanism natively in all its filesystem APIs (file opening, creation,
deletion, linking, dirent...), and all the problems caused by variable
interpretations of the binary encodings of Unix filenames are definitely
gone.
This means that existing filenames that currently start with "file:" or with
another URL scheme must be given to this interface with "%3A" instead of
":".
There's absolutely NO need to override UTF-8.