bug-grep

bug#22655: grep -Pz '^' now fails!


From: Stephane Chazelas
Subject: bug#22655: grep -Pz '^' now fails!
Date: Fri, 18 Nov 2016 12:52:29 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

Hello,

I'm just finding out that ^ and $ no longer work with grep -Pz:
https://unix.stackexchange.com/questions/324263/grep-command-doesnt-support-and-anchors-when-its-with-pz

$ grep -Pz '^'
grep: unescaped ^ or $ not supported with -Pz

Which points to this bug.

Note that the problem is not that PCRE lacks support for
NUL-delimited records. It is that grep calls PCRE with the wrong
flag (PCRE_MULTILINE), the equivalent of the m flag in perl's
/.../m RE operator, which explicitly tells ^ to match at the
beginning of the subject *but also after every newline* (and
likewise for $).
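
The effect of that flag can be sketched outside grep. The
following is a minimal illustration using Python's re module,
not PCRE; it assumes re.MULTILINE is a fair stand-in for
PCRE_MULTILINE as far as ^ is concerned, and the subject string
is a made-up record:

```python
import re

# One record as grep -z would see it, with the NUL delimiter stripped.
subject = "a\nb"

# Without the multiline flag, ^ matches only at the start of the subject.
plain = [m.start() for m in re.finditer(r"^.", subject)]

# With the multiline flag, ^ also matches right after each newline.
multi = [m.start() for m in re.finditer(r"^.", subject, re.MULTILINE)]

print(plain)  # [0]
print(multi)  # [0, 2]
```

The second list includes offset 2, the "b" after the newline,
which is the match that should not happen for a single
NUL-delimited record.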

As already noted at
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=16871#8

printf 'a\nb\0' | grep -Pz '^b'

did match, which was indeed a bug, but only because of that
PCRE_MULTILINE flag. If you turned that flag off:

printf 'a\nb\0' | grep -Pz '(?-m)^b'

Then it wouldn't match.

With grep 2.10:

$ printf 'a\nb\0c\0' | grep -Poz '^.'
a
b    # BUG
c
$ printf 'a\nb\0c\0' | grep -Poz '(?-m)^.'
a
c

Or use \A and \z in place of ^ and $; they match only at the very
beginning and very end of the subject, regardless of the state of
the "m" flag:

$ printf 'a\nb\0c\0' | grep -Poz '\A.'
a
c
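
The three transcripts above can be approximated with a small
emulation. This is only a sketch under stated assumptions:
match_records is a hypothetical helper, Python's re module stands
in for PCRE, and passing no flags stands in for (?-m):

```python
import re

def match_records(data, pattern, flags=0):
    """Split NUL-delimited data into records and collect every match,
    roughly what grep -Poz prints (empty records are dropped here)."""
    records = [r for r in data.split("\0") if r]
    out = []
    for record in records:
        out.extend(m.group() for m in re.finditer(pattern, record, flags))
    return out

data = "a\nb\0c\0"

# Multiline on, as grep 2.10 did: ^ also matches after the newline.
print(match_records(data, r"^.", re.MULTILINE))   # ['a', 'b', 'c']

# Multiline off: ^ matches only at the start of each record.
print(match_records(data, r"^."))                 # ['a', 'c']

# \A ignores the multiline flag altogether.
print(match_records(data, r"\A.", re.MULTILINE))  # ['a', 'c']
```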

With the new version, we have to use \A and \z. And if we want
to match at the beginning of any line within a NUL-delimited
record, we need ugly things like:

grep -Pz '(?:\A|(?<=\n))'

instead of

grep -Pz '(?m)^'
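
As a rough check that the two patterns above pick out the same
positions, here is a sketch in Python's re module (again an
assumed stand-in for PCRE). The sample subject is hypothetical
and deliberately has no trailing newline, since engines differ on
whether multiline ^ matches after a final newline:

```python
import re

subject = "a\nb\nc"  # hypothetical record contents

def positions(pattern):
    """Return the offsets at which the zero-width pattern matches."""
    return [m.start() for m in re.finditer(pattern, subject)]

workaround = positions(r"(?:\A|(?<=\n))")
multiline = positions(r"(?m)^")

print(workaround)  # [0, 2, 4]
print(multiline)   # [0, 2, 4]
```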


Can that bug please be reopened so it can be addressed
differently (PCRE_MULTILINE removed, PCRE_DOLLAR_ENDONLY added)?

-- 
Stephane





