bug-gnu-emacs

bug#56469: 29.0.50; Unibyte dir in directory_files_internal


From: Stefan Monnier
Subject: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
Date: Sun, 10 Jul 2022 10:58:30 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Eli Zaretskii [2022-07-10 17:32:17] wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: 56469@debbugs.gnu.org
>> Date: Sun, 10 Jul 2022 10:23:28 -0400
>> 
>> W.r.t. the comment, it's indeed unrelated to the patch (other than
>> the fact that it touches the same code).  The question is when we do:
>> 
>>        finalname = (nchars == nbytes)
>>                    ? make_uninit_string (nbytes)
>>                    : make_uninit_multibyte_string (nchars, nbytes);
>> 
>> the actual bytes are "decoded" (i.e. in our internal UTF-8 encoding), so
>> (nchars == nbytes) checks whether it's "pure ASCII" or not and if it's
>> pure ASCII we return a unibyte string.
>
> I don't think this is true, because early during startup we don't yet
> have the coding-systems set up, and so the file names are unibyte and
> undecoded.  So that place in dired.c doesn't only handle ASCII when it
> sees that nchars == nbytes.

Hmm... the early startup is actually not a worry here (according to my
tests `directory_files_internal` is first called when we get to
native-compile the macroexp/bytecomp, at which point all our coding
systems have been set up).

But indeed, if the file name coding system is something like `binary`,
DECODE_FILE will always return a unibyte string, so we may have non-ASCII
bytes when (nchars == nbytes).
Thanks, I'll update the comment accordingly.
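
To make the distinction concrete, here is a small standalone sketch (not
Emacs code).  It assumes a simplified model in which a decoded name is
plain UTF-8 and an undecoded name is a raw byte sequence that counts as
one character per byte.  The helpers `count_utf8_chars`, `all_ascii` and
`name_nchars` are hypothetical names made up for this illustration; they
are not functions from dired.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: count the characters in a buffer assumed to
   hold valid UTF-8, by counting every byte that is not a
   continuation byte (10xxxxxx).  */
static size_t
count_utf8_chars (const unsigned char *buf, size_t nbytes)
{
  assert (buf != NULL || nbytes == 0);
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    if ((buf[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}

/* Hypothetical helper: true if every byte is below 0x80.  */
static bool
all_ascii (const unsigned char *buf, size_t nbytes)
{
  assert (buf != NULL || nbytes == 0);
  for (size_t i = 0; i < nbytes; i++)
    if (buf[i] >= 0x80)
      return false;
  return true;
}

/* Simplified model of the character count of a file name: a decoded
   name is measured as UTF-8, an undecoded (unibyte) name has exactly
   one character per byte.  */
static size_t
name_nchars (const unsigned char *buf, size_t nbytes, bool decoded)
{
  return decoded ? count_utf8_chars (buf, nbytes) : nbytes;
}
```

With a decoded name, `nchars == nbytes` holds exactly when the name is
pure ASCII.  With an undecoded name (e.g. the bytes E9 2E 74 78 74 read
under a `binary` file name coding system) the two counts are equal by
construction, even though `all_ascii` returns false for that buffer.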


        Stefan





