emacs-devel
Re: Multibyte and unibyte file names


From: Stefan Monnier
Subject: Re: Multibyte and unibyte file names
Date: Sun, 27 Jan 2013 20:55:16 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

>> > OK, but as long as file-name primitives are required to support
>> > unibyte strings, you cannot be sure these situations won't pop up in
>> > the future.
>> I don't see a need to disallow unibyte strings, but I don't see the need
>> to be particularly careful about it either.  Basically Elisp code which
>> provides unibyte file names does so at its own risk.
> What about C code that calls these primitives?  Can we consider every
> such instance a bug in the caller?

Most likely, yes.

>> But that's exactly the behavior stipulated by POSIX (tho for '/' rather
>> than '\\').  I.e. if you use file names on a POSIX host with
>> a coding-system that occasionally uses '/' within its multibyte
>> sequences, you'll get those surprises regardless of Emacs.  And for that
>> reason, Emacs would be right to cut those file names in the middle of
>> a multibyte sequence.
> Then why did you regard this:
>  (let ((file-name-coding-system 'cp932))
>    (expand-file-name "表" "C:/"))
>   => "c:/\225/"
> as a bug?

Because expand-file-name works on Emacs strings, not on
file-system strings.
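
To make the byte-level picture concrete, here is a minimal sketch (the
helper name `my-cp932-bytes` is hypothetical, introduced only for
illustration).  It shows that the character 表 (U+8868) encodes in cp932
to two bytes, the second of which is 0x5C, the ASCII backslash.  That
trailing byte is what gets taken for a directory separator if a name is
encoded before being handed to code that treats it as an Emacs string.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-cp932-bytes (string)
  "Return the byte values of STRING encoded in cp932, as a list."
  ;; `encode-coding-string' returns a unibyte string; `append' with nil
  ;; converts it into a list of its byte values.
  (append (encode-coding-string string 'cp932) nil))

;; The second byte of the encoded character equals ?\\ (92, i.e. #x5C).
(my-cp932-bytes "表")
;; => (149 92)
```

The first byte, 149, is the \225 (octal) that shows up in the result
quoted above.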

>> And since Emacs is largely based on "POSIX semantics for the generic
>> code, plus an emulation layer in w32.c", we have a problem of subtly
>> incompatible semantics.
> Maybe so, but it certainly isn't the only place in Emacs with subtly
> incompatible semantics.  And anyway, I don't see how this observation
> helps to decide what, if anything, to do to fix this.

It helps me understand the problem, at least.
Maybe it also points out that we might like to change the interface so
that generic code does not encode strings before passing them to the
OS-specific primitives.

>> Could you specify a bit more precisely which primitives you have
>> in mind?
> Those in fileio.c and in dired.c.  I could give an explicit list, if
> you want.

At least I disagree with your Ffile_name_directory suggestion: if the
file-name is already encoded and it results in bugs, the fix should be
in the caller.
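
As a rough sketch of what "fix it in the caller" could look like at the
Lisp level (an assumption on my part, not existing code: the function
name `my-directory-of-encoded-name` is hypothetical), the caller decodes
the name first and only then hands it to the file-name primitive:

```elisp
;; Hypothetical caller-side fix, for illustration only.
(defun my-directory-of-encoded-name (encoded coding-system)
  "Return the directory part of ENCODED, a file name encoded in CODING-SYSTEM.
The name is decoded to an ordinary Emacs string before
`file-name-directory' sees it, so no byte inside a multibyte
sequence can be mistaken for a directory separator."
  (file-name-directory (decode-coding-string encoded coding-system)))

;; Usage: the name arrives already encoded, the caller decodes it first.
(my-directory-of-encoded-name
 (encode-coding-string "dir/表" 'cp932) 'cp932)
```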


        Stefan


