bug#17640: grep with -m reads the entire input

From: Jim Meyering
Subject: bug#17640: grep with -m reads the entire input
Date: Fri, 30 May 2014 09:34:55 -0700

On Fri, May 30, 2014 at 8:58 AM, Jim Meyering <address@hidden> wrote:
> On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <address@hidden> wrote:
>> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <address@hidden> wrote:
>>> With grep 2.18, the -m option would cause grep to stop reading input
>>> after printing the requested number of matching lines.  With version
>>> 2.19, grep reads the entire input before exiting.  Interestingly, grep
>>> does not read the entire input if the -c or -C0 options are added in
>>> addition to -m, and also when using -l or -q instead of -m.  I believe
>>> this is caused by commit 5122195.
>> Thanks a lot for the report.  Just in time.
>> I confirm that it's a bug introduced in 2.19.
>> To test, run "seq 1000000 > million", then
>>  "strace -e read grep 0 million" first using grep-2.18
>> (shows just a few read syscalls), and then with 2.19,
>> which shows grep reading the entire million-line file.
> Correction: to reproduce, you'll have to insert -m1 in that grep command.
>> Here's an incomplete patch.  Obviously there's a lot more
>> to be added, including NEWS and a nontrivial test. This
>> was introduced by commit v2.18-140-g6f07900

This bears some explanation.  I've attached a more complete patch
(albeit still hastily composed, so I'll wait a few hours in case
there's feedback).

Prior to grep-2.19, with --max-count=N, the first disjunct below would
be true after the Nth match, because outleft would have reached 0 and
pending would be 0:

          if ((!outleft && !pending) || (nlines && done_on_match))
            goto finish_grep;

However, a seemingly unrelated change affected how "pending" is set:

      pending = out_quiet ? 0 : out_after;

We used to ensure that "out_after" was non-negative, because
default_context was always non-negative:

      if (out_after < 0)
        out_after = default_context;

But the recent context-related change invalidated that assumption, so
out_after (and hence pending) can now be -1, which makes "!pending" false:

      -  default_context = 0;
      +  default_context = -1;

Here's the patch:

Attachment: 0001-grep-fix-max-count-N-m-N-to-stop-reading-after-Nth-m.txt
Description: Text document
