emacs-devel

Re: Multibyte and unibyte file names


From: Paul Eggert
Subject: Re: Multibyte and unibyte file names
Date: Wed, 23 Jan 2013 10:08:25 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 01/23/13 09:45, Eli Zaretskii wrote:

>   if (srclen > 1
>       && IS_DIRECTORY_SEP (dst[srclen - 1]))
>     {
>       dst[srclen - 1] = 0;
>       srclen--;
>     }
> 
> If dst[] is an encoded string that uses a multibyte encoding, it is
> wrong to look at just the last byte of the string, because it could be
> a trailing byte of some multibyte sequence, right?

If memory serves, the answer to that question is different for
GNU, POSIX, and similar (GNUish) systems than for MS-Windows systems.
On GNUish systems the kernel doesn't know about encodings and treats
a file name as a plain byte string, so the above code is correct for
the file system even if it produces a byte string that is not properly
encoded for the file name coding system.  On MS-Windows systems, as I
understand it, the operating system is aware of which file name
encoding you're using, so the above is indeed an error.
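
To make the MS-Windows concern concrete, here is a minimal sketch,
assuming the encoded name is Shift-JIS (CP932), where a trail byte
may be 0x5C, the same value as '\'.  The helper names
sjis_is_lead_byte and sjis_ends_with_sep are hypothetical and are
not part of Emacs; the sketch only shows why a check has to walk the
string from the start instead of inspecting the last byte alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if B can start a two-byte character
   in Shift-JIS / CP932 (lead ranges 0x81-0x9F and 0xE0-0xFC).  */
static bool
sjis_is_lead_byte (unsigned char b)
{
  return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Hypothetical helper: true if the Shift-JIS string S of LEN bytes
   ends with SEP as a character of its own, not as the trail byte
   of a two-byte character.  Trail bytes overlap the ASCII range,
   so the scan goes forward from the beginning of the string.  */
static bool
sjis_ends_with_sep (const char *s, size_t len, char sep)
{
  assert (s != NULL);
  bool last_is_sep = false;
  size_t i = 0;
  while (i < len)
    {
      unsigned char b = (unsigned char) s[i];
      if (sjis_is_lead_byte (b) && i + 1 < len)
        {
          /* Two-byte character: skip its trail byte too.  */
          last_is_sep = false;
          i += 2;
        }
      else
        {
          /* Single-byte character (or a truncated lead byte).  */
          last_is_sep = (s[i] == sep);
          i += 1;
        }
    }
  return last_is_sep;
}
```

For example, the katakana letter SO is encoded as the bytes 0x83 0x5C
in Shift-JIS.  A check of the last byte alone would see '\' there,
while sjis_ends_with_sep reports that the name does not end in a
separator.  Note that 0x2F ('/') is not a valid Shift-JIS trail byte,
so the same problem does not arise for '/'.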

In practice nobody in the GNUish world uses encodings that
are unsafe for '/', so in the GNUish world this is largely a
theoretical issue; it doesn't come up.

Unfortunately I don't understand the ins and outs of the
MSish side, or of the Tramp side, so I can't speak to how
that should work.


