bug-findutils
Re: Bizzare bug in find, potential security implications


From: Eric Blake
Subject: Re: Bizzare bug in find, potential security implications
Date: Tue, 19 Dec 2017 15:48:20 -0600

On 12/19/2017 03:31 PM, Bernhard Voelker wrote:

> The test case in your attachment is a bit different, but also shows
> the problem.  It seems that gnulib's regex does not find a match for
> the pattern '.*\.exe$' for the files in the following directory:
>
>    $ LC_ALL=C /usr/bin/ls -log htdocs
>    ...
>    drwxr-xr-x 2 4096 Dec 18 20:45 'Zielona G'$'\363''ra'
>    ...
>
> I'm not an expert on UTF and regex, but it seems that the $'\363'
> character is not matched by the dot '.' meta character in your
> locale.

POSIX says that regex only has to match characters (in particular, the regex '.' matches characters, not encoding errors). If you pick a locale with multibyte characters, random bytes can form encoding errors, as happens when a UTF-8 locale is used to process file names stored in a single-byte ISO-8859 encoding; in that case POSIX leaves regex behavior undefined. So while it is indeed annoying that find can't match files whose names contain encoding errors, it is somewhat expected behavior, because there's no sane way to make regex well-specified on encoding errors.
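To make the distinction between bytes and characters concrete, here is a rough Python sketch (not find or gnulib code). It assumes a hypothetical file setup.exe inside the directory from the listing, checks whether the raw name is valid UTF-8, and then applies the same pattern byte by byte, which is only an analogy for treating every byte as a character.

```python
import re

# Raw path bytes: the directory name from the listing plus a hypothetical
# file "setup.exe" (an assumption, not taken from the original report).
# 0xF3 (octal 363) is 'ó' in ISO-8859-1/-2, but here it is not part of a
# valid UTF-8 sequence.
name = b"Zielona G\xf3ra/setup.exe"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A bytes pattern treats every byte as one unit, so '.' can consume 0xF3.
pattern = re.compile(rb".*\.exe$")

valid = is_valid_utf8(name)
bytewise_match = pattern.match(name) is not None

print("valid UTF-8:", valid)
print("byte-wise match:", bytewise_match)
```

The name is an encoding error when read as UTF-8, yet the byte-wise pattern still matches it. Whether a given find build behaves the same way under a single-byte locale is not something this sketch establishes.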

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


