bug-grep

bug#22655: grep -Pz '^' now fails!


From: Paul Eggert
Subject: bug#22655: grep -Pz '^' now fails!
Date: Fri, 18 Nov 2016 15:37:16 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

Stephane Chazelas wrote:
> 2016-11-18 08:48:04 -0800, Paul Eggert:
>> Stephane Chazelas wrote:
>>> Why would it make it slower? AFAICT, PCRE_MULTILINE *adds*
>>> some overhead.
>>
>> As I understand it, PCRE_MULTILINE lets 'grep' apply a pattern to an
>> entire buffer that contains many lines, and this lets PCRE
>> efficiently find the first match in the whole buffer. If grep
>> doesn't use PCRE_MULTILINE, grep would have to apply the pattern to
>> each line separately, which could be significantly slower.
> [...]
>
> That might have been the case a long time ago, as I remember
> some discussion about it as it explained some wrong information
> in the documentation, but as far as I and gdb can tell, grep
> 2.26 at least calls pcre_exec for every line of the input with
> grep -P.


Although that was true starting with commit a14685c2833f7c28a427fecfaf146e0a861d94ba (2010-03-04), it became false starting with commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5 (2014-09-16).

> If it didn't
>
>   echo test | grep -P '\n$'
>
> would match.

No, because grep omits the trailing newline in that particular input. And for this example:

printf 'test\n\n' | grep -P '\n$'

grep passes "test\n" to jit_exec, determines that jit_exec returns a match that crosses a line boundary, and rejects the match.
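
To make the two strategies under discussion concrete, here is a rough model in Python. It is only a sketch using Python's `re` module, not grep's C code or PCRE itself; the helper names (`line_spans`, `buffer_search`, `per_line_search`) are made up for illustration, and `re.MULTILINE` stands in for PCRE_MULTILINE. The buffer-wide search runs the pattern once over the whole buffer and then discards any match that is not contained in a single line, which is the rejection step described above.

```python
import re


def line_spans(buf):
    """Return (start, end) offsets of each line, excluding its newline."""
    spans = []
    start = 0
    while start < len(buf):
        end = buf.find("\n", start)
        if end == -1:
            end = len(buf)
        spans.append((start, end))
        start = end + 1
    return spans


def buffer_search(pattern, buf):
    """Search the whole buffer at once, keeping only single-line matches."""
    rx = re.compile(pattern, re.MULTILINE)
    spans = line_spans(buf)
    selected = []
    for m in rx.finditer(buf):
        for i, (start, end) in enumerate(spans):
            # Accept the match only if it lies entirely inside one line.
            if start <= m.start() and m.end() <= end:
                if i not in selected:
                    selected.append(i)
                break
        # A match that includes a newline fits no span and is dropped.
    return [buf[spans[i][0]:spans[i][1]] for i in selected]


def per_line_search(pattern, buf):
    """Apply the pattern to each line separately, newline stripped."""
    rx = re.compile(pattern)
    return [buf[s:e] for s, e in line_spans(buf) if rx.search(buf[s:e])]


if __name__ == "__main__":
    print(buffer_search(r"\n$", "test\n\n"))    # []
    print(per_line_search(r"\n$", "test\n\n"))  # []
    print(buffer_search(r"^t", "x\ntest\n"))    # ['test']
```

In this model both strategies select nothing for '\n$': the buffer-wide search does find raw matches, but each one contains the newline and is rejected, while the per-line search never sees a newline at all. The model is simplified in that it does not retry after a rejected match the way a real implementation would need to.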




