--- Begin Message ---
Subject: grep -v -l and -v -L fail to early terminate
Date: Wed, 7 May 2014 00:15:48 +0200
Hi,
I have a bunch of very big files which _should_ follow a simple line format.
I spotted some errors in these files and now want to find the files
containing at least one line that violates the specified format.
As soon as such a line is found, grep could terminate, but it doesn't seem to.
The use case I describe is neither plain -l (--files-with-matches) nor -L
(--files-without-match); it's rather --files-with-at-least-one-non-match.
I tried grep -v -l, and this seems to work, but it doesn't do -l's early
termination :(
Jörn
Toy example for line format 'a b c d':
# insert offender as first line:
echo 'a c d c' > test.tmp
# insert many valid lines (warning ~ 70 MB file):
awk 'BEGIN { for (i=0 ; i < 10000000 ; i++) print "a b c d" }' >> test.tmp
# run grep:
time grep -v -l 'a b c d' test.tmp
test.tmp
real 0m2.758s
user 0m2.692s
sys 0m0.060s
# counterexample, which is very fast (matches the 2nd line and quits):
time grep -l 'a b c d' test.tmp
test.tmp
real 0m0.032s
user 0m0.000s
sys 0m0.032s
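
A possible workaround, sketched here under assumptions and not taken from this thread: let awk scan each file and exit at the first line that does not match, which gives the "at least one non-match" behaviour with early termination. The function name files_with_nonmatch is hypothetical. Note that awk treats the pattern as an extended regular expression and applies escape processing to -v values, so patterns containing backslashes may need adjusting compared with grep's default basic syntax.

```shell
# Hypothetical helper (not part of grep): print the name of every file that
# contains at least one line NOT matching the given pattern.
# Assumption: the pattern is a valid awk (extended) regular expression.
files_with_nonmatch() {
    pattern=$1
    shift
    for f in "$@"; do
        # awk stops reading as soon as the first offending line is seen,
        # so a violation near the top of a huge file is reported quickly.
        awk -v re="$pattern" '$0 !~ re { print FILENAME; exit }' "$f"
    done
}
```

Called as files_with_nonmatch 'a b c d' test.tmp, this should print test.tmp for the toy file above, since its first line is the offender.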
--- End Message ---
--- Begin Message ---
Subject: Re: bug#17427: grep -v -l and -v -L fail to early terminate
Date: Thu, 08 May 2014 16:03:11 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
Thanks for the patch. I tweaked its commit message (see first
attachment). While reviewing it I found opportunities to clarify and/or
simplify related code, so I did that too (see second attachment). Both
are installed and I am marking this bug report as done.
0001-grep-improve-performance-of-v-when-combined-with-L-l.patch
Description: Text document
0002-grep-simplify-and-clarify-invert-related-code.patch
Description: Text document
--- End Message ---