Re: Ensure consistent order of find output

From: Bernhard Voelker
Subject: Re: Ensure consistent order of find output
Date: Tue, 28 Jan 2020 00:49:23 +0100
On 2020-01-27 01:16, James Youngman wrote:
> On Sun, Jan 26, 2020 at 3:23 PM Peng Yu <address@hidden> wrote:
>> I'd like to make sure the `find -printf '%P\n'` output of a directory
>> (i.e., only relative paths are printed) be consistent among different
>> runs as long as the file paths in the directory are the same.
>> I can pass the `find` output to `sort`. Is it the best way to do so? Thanks.
> With one tweak.   Use "LC_ALL=C sort" rather than "sort" because file
> names are not text.

To the OP's question: no, there are no options to sort or guarantee the
order of the output in any way.  Find is for finding, sort is for sorting.
One tool for one purpose. ;-)

A further note regarding security:

With `find -printf '%P\n'` you assume that file names do not contain
newlines ... which is usually true, but - depending on how the actual
post-processing is done - may be quite dangerous.  An attacker
might e.g. prepare this:

  # Create a directory whose name ends in a newline, and 'etc/' inside it.
  $ mkdir -vp dir/bad$'\n'/etc

  # Then create a regular file inside.
  $ touch dir/bad$'\n'/etc/passwd

Now suppose someone runs the above find in good faith and post-processes
the output to remove all regular files in 'dir'.  That would do the
following (here shown with 'echo'):

  $ find dir -type f -printf '%P\n' | LC_ALL=C sort | xargs -n1 echo rm -v
  rm -v /etc/passwd
  rm -v bad

(The 'xargs -n1' option is only used here to demonstrate that 'rm' would
really see 2 arguments, '/etc/passwd' and 'bad', instead of a single
argument.)
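
To see the mismatch without involving 'rm' at all, one can compare the
number of output lines with the number of files actually found.  The
following is only a minimal sketch, assuming bash, GNU find and GNU
coreutils; the scratch directory created with mktemp is purely for
illustration and not part of the original example.

```shell
#!/usr/bin/env bash
# Sketch: one file with a newline in its path shows up as two "lines".
# Assumes GNU find (-printf) and bash ($'\n' quoting).
set -eu

tmp=$(mktemp -d)                        # hypothetical scratch directory
mkdir -p "$tmp/dir/bad"$'\n'"/etc"
touch "$tmp/dir/bad"$'\n'"/etc/passwd"

# Newline-terminated output: count the lines a line-based tool would see.
lines=$(find "$tmp/dir" -type f -printf '%P\n' | wc -l)

# NUL-terminated output: keep only the NUL bytes and count them,
# which gives exactly one per file.
files=$(find "$tmp/dir" -type f -printf '%P\0' | tr -cd '\0' | wc -c)

echo "lines=$lines files=$files"

rm -rf "$tmp"
```

The two counts differ because the newline inside the directory name is
indistinguishable from the newline that terminates each record.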

So it's a good habit to always assume the worst case.
Regarding the combination find|sort, this means using:

  find ... -print0 | LC_ALL=C sort -z

and for the 'rm' example above:

  $ find dir -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0r rm -v
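
If the sorted relative paths are needed in a script rather than piped
straight into xargs, the same NUL-terminated approach can feed a bash
loop.  This is a sketch under assumptions, not the only way to do it: it
assumes bash, GNU find and GNU sort, and the extra files 'a' and 'z' as
well as the scratch directory are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: read sorted, NUL-terminated relative paths into a bash array.
# Assumes GNU find (-printf '%P\0'), GNU sort (-z) and bash (read -d '').
set -eu

tmp=$(mktemp -d)                        # hypothetical scratch directory
mkdir -p "$tmp/dir/bad"$'\n'"/etc"
touch "$tmp/dir/bad"$'\n'"/etc/passwd" "$tmp/dir/a" "$tmp/dir/z"

names=()
# IFS= and -r keep leading blanks and backslashes intact;
# -d '' makes read stop at NUL instead of newline.
while IFS= read -r -d '' name; do
    names+=("$name")
done < <(find "$tmp/dir" -type f -printf '%P\0' | LC_ALL=C sort -z)

printf 'found %d files\n' "${#names[@]}"
# %q prints each name in a shell-quoted form, so the embedded
# newline is visible instead of splitting the output.
printf '%q\n' "${names[@]}"

rm -rf "$tmp"
```

Because both the separator and the sort are byte-based, the order is the
same on every run as long as the set of paths is the same.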

Have a nice day,

