--- Begin Message ---
Subject: grep with -m reads the entire input
Date: Fri, 30 May 2014 01:45:19 -0400
With grep 2.18, the -m option would cause grep to stop reading input
after printing the requested number of matching lines. With version
2.19, grep reads the entire input before exiting. Interestingly, grep
does not read the entire input if the -c or -C0 option is combined
with -m, nor when -l or -q is used instead of -m. I believe
this is caused by commit 5122195.
--- End Message ---
--- Begin Message ---
Subject: Re: bug#17640: grep with -m reads the entire input
Date: Fri, 30 May 2014 09:34:55 -0700
On Fri, May 30, 2014 at 8:58 AM, Jim Meyering <address@hidden> wrote:
> On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <address@hidden> wrote:
>> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <address@hidden> wrote:
>>> With grep 2.18, the -m option would cause grep to stop reading input
>>> after printing the requested number of matching lines. With version
>>> 2.19, grep reads the entire input before exiting. Interestingly, grep
>>> does not read the entire input if the -c or -C0 options are added in
>>> addition to -m, and also when using -l or -q instead of -m. I believe
>>> this is caused by commit 5122195.
>>
>> Thanks a lot for the report. Just in time.
>> I confirm that it's a bug introduced in 2.19.
>> To test, run "seq 1000000 > million", then
>> "strace -e read grep 0 million" first using grep-2.18
>> (shows just a few read syscalls), and then with 2.19,
>> which shows grep reading the entire million-line file.
>
> Correction: to reproduce, you'll have to insert -m1 in that grep command.
>
>> Here's an incomplete patch. Obviously there's a lot more
>> to be added, including NEWS and a nontrivial test. This
>> was introduced by commit v2.18-140-g6f07900
This bears some explanation. I've attached a more complete patch
(albeit still hastily composed, so I'll wait a few hours
in case there's feedback).
Prior to grep-2.19, with --max-count=N, this first disjunct would
be true after the Nth match, because pending would be 0:

  if ((!outleft && !pending) || (nlines && done_on_match))
    goto finish_grep;

However, a seemingly unrelated change affected how "pending" is set:

  pending = out_quiet ? 0 : out_after;

We used to ensure that "out_after" was non-negative, because
default_context was always non-negative:

  if (out_after < 0)
    out_after = default_context;

But the recent context-related change invalidated that assumption:

  - default_context = 0;
  + default_context = -1;
Here's the patch:
0001-grep-fix-max-count-N-m-N-to-stop-reading-after-Nth-m.txt
Description: Text document
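
For readers following the reasoning above, here is a minimal sketch in C of
how a negative default_context keeps the early-exit test from firing. It is
a simplified model, not grep's actual code: the function chunks_read, its
parameters, and the assumption that every chunk of input contains exactly
one matching line are all hypothetical. Only the three quoted statements
(the out_after fallback, the "pending" assignment, and the first disjunct
of the exit test) are taken from the message.

```c
/* Hypothetical, simplified model of the read loop discussed above.
   Assumption: each "chunk" of input holds exactly one matching line.
   Returns how many chunks are read before the loop stops.  */
int
chunks_read (int max_count, int default_context, int requested_after,
             int out_quiet, int total_chunks)
{
  /* As quoted: fall back to default_context when no after-context
     was requested (requested_after < 0 means "not given").  */
  int out_after = requested_after;
  if (out_after < 0)
    out_after = default_context;

  int outleft = max_count;   /* matches still allowed by -m */
  int pending = 0;           /* trailing context lines still owed */
  int nread = 0;

  while (nread < total_chunks)
    {
      nread++;               /* read one more chunk */

      if (outleft > 0)
        {
          /* A match is reported; as quoted, pending is reset.  */
          outleft--;
          pending = out_quiet ? 0 : out_after;
        }
      else if (pending > 0)
        pending--;           /* this chunk served as trailing context */

      /* First disjunct of the quoted exit test.  With pending == -1
         the expression !pending is false, so the loop never exits
         early.  */
      if (!outleft && !pending)
        break;
    }

  return nread;
}
```

In this model, chunks_read (1, 0, -1, 0, 1000) stops after 1 chunk (the old
default of 0), while chunks_read (1, -1, -1, 0, 1000) reads all 1000 (the
new default of -1). Passing requested_after = 0 (as with -C0) or
out_quiet = 1 (as with -c) brings the result back to 1, which matches the
behavior described in the original report.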
--- End Message ---