Re: Using .ignore file in find
From: Bernhard Voelker
Subject: Re: Using .ignore file in find
Date: Fri, 14 Feb 2025 00:40:33 +0100
User-agent: Mozilla Thunderbird
On 2/13/25 09:16, Grigorii Sokolik wrote:
> Dear findutils maintainers team!
> In the software development world there is a widespread practice of using
> ignore files. Examples are:
> - .gitignore
> - .dockerignore
> - .clang-format-ignore
> - ...
> All of them have the same syntax and the same logic. It would be super
> great if it would be possible to use the same syntax file in find for
> listing ignored files or unignore files
> Use cases:
> # list all ignored files
> find . -type f -name .gitignore | sed
> 's/\(.*\.gitignore\)/--match-patterns \1/' | xargs find . -type f
> # list all files not violating the rules
> find . -type f -name .my-fancy-system-ignore | sed
> 's/\(.*\.my-fancy-system-ignore\)/--ignore-patterns \1/' | xargs find . |
> xargs add_file_to_my_fancy_system_context
> What should I do to make it possible? I could try to implement it myself,
> but I would like to discuss this feature first.
It's tempting to read in white- or black-lists via a file or file descriptor.
E.g. rsync(1) has --exclude-from=FILE and --include-from=FILE (and FWIW the
--cvs-exclude option).
Also grep(1) has --exclude-from=FILE.
What I'm worried about is getting tied to the particular input file formats of
various seemingly similar systems:
While - as far as I know - CVS only allowed patterns for regular files,
.gitignore files additionally allow:
- to distinguish between files and directories
('configure' vs. '.deps/'),
- to anchor a pattern to the directory of the .gitignore file with a leading
slash ('/ABOUT-NLS'),
- to ignore files only in certain subdirectories
('/xargs/test-sigusr'),
- and even to negate a pattern with a leading '!'.
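To make these subtleties concrete, here is a small sketch which sets up a throwaway git repository and asks 'git check-ignore -v -n' which pattern, if any, matches each path. The helper name gitignore_demo and all file names are made up for illustration, and the '*.log' / '!keep.log' pair is an assumed example of negation which is not taken from the findutils repository.

```shell
# Hypothetical demo helper; all file names below are illustrative only.
# The body runs in a subshell, so the caller's working directory is untouched.
gitignore_demo() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q . || exit 1

    # Line 1: directory-only pattern      Line 2: anchored pattern
    # Line 3: plain glob                  Line 4: negation of line 3
    printf '%s\n' '.deps/' '/ABOUT-NLS' '*.log' '!keep.log' > .gitignore

    mkdir -p .deps sub
    # Note: sub/.deps is a regular file, .deps is a directory.
    touch .deps/x.Po sub/.deps ABOUT-NLS sub/ABOUT-NLS debug.log keep.log

    # -v prints "<source>:<line>:<pattern><TAB><path>" for each match,
    # -n additionally prints "::<TAB><path>" for paths matching no pattern.
    printf '%s\n' .deps/x.Po sub/.deps ABOUT-NLS sub/ABOUT-NLS \
                  debug.log keep.log |
        git check-ignore -v -n --stdin

    rm -rf "$repo"
)
```

Running gitignore_demo should show that 'sub/.deps' (a regular file) and 'sub/ABOUT-NLS' match no pattern, while 'keep.log' is reported against the negated pattern, i.e. it is not ignored. A plain glob list in the style of grep(1) cannot express any of these distinctions.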
Every system or tool will have its own subtleties.
Therefore, I'd rather not add support for specific ignore formats to find(1).
Possibly, an --exclude-from=FILE option in the style of grep(1), which reads
globbing patterns to filter on, would make sense.
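Until something like that exists, the idea can be approximated with today's find(1) by turning each line of a pattern file into a '! -name PATTERN' test. The following is only a sketch: find_exclude_from is a hypothetical helper name, not an existing find option, and it assumes one glob per line which is matched against the base name, similar to what grep(1) does for --exclude-from.

```shell
# Hypothetical helper: find_exclude_from PATTERN_FILE [START_DIR]
# Prints all entries below START_DIR (default: .) whose base name matches
# none of the globs listed in PATTERN_FILE (one glob per line).
find_exclude_from() {
    pattern_file=$1
    start_dir=${2:-.}

    # Reuse the positional parameters as a list of find(1) tests.
    set --
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        # Skip empty lines.
        [ -n "$pattern" ] || continue
        set -- "$@" ! -name "$pattern"
    done < "$pattern_file"

    find "$start_dir" "$@" -print
}
```

For example, with a file 'patterns' containing the two lines '*.o' and '*~', the call 'find_exclude_from patterns src' lists everything below 'src' except object and backup files. Note that this only filters the output: a directory whose name matches is not printed, but find still descends into it, because no -prune is involved.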
Re. this one:
> # list all ignored files
This seems to be feasible with:
$ find | git check-ignore --stdin
One can feed that into another find(1) with the rather new -files0-from option
for further tests; e.g. in my findutils directory, find all ignored, non-versioned
files which are older than 100 days ('check-ignore' seems to have problems with
the 'gnulib' submodule, hence the pruning):
$ find -name gnulib -prune -o -print0 \
    | git check-ignore -z --stdin \
    | find -files0-from - -mtime +100 -ls
7995557 48 -rw-r--r-- 1 berny users 49140 Nov 11 2022 ./po/.reference/id.po
7995578 84 -rw-r--r-- 1 berny users 82251 Nov 28 2022 ./po/.reference/ru.po
...
Have a nice day,
Berny