bug#17245: GREP BUG: grep -P and binary files

From: Jim Meyering
Subject: bug#17245: GREP BUG: grep -P and binary files
Date: Sun, 13 Apr 2014 12:13:12 -0700

On Fri, Apr 11, 2014 at 4:47 PM, damon <address@hidden> wrote:
> Hi there -
> I recently noticed a bug after upgrading grep and have tracked it
> through a few versions now.
> I was using grep -P (PCRE grep) in some scripts to grep through
> directory of files, and the process would keep aborting with a
> segmentation fault.
> The last known good version is grep-2.14.  Every version after that has
> failed in a slightly different way, making me think this could be a bug
> in grep, not in pcre.
> I tried compiling greps 2.14 through 2.18 against the latest pcre
> library, pcre-8.33.  Here's what happens when i try each version against
> a random binary file, attached to this message as test-image.png.  This
> file was just one of many that caused the errors, though not every
> binary file does.
> Below are some results demonstrating what's going wrong.  Note that all
> of these seem to work fine with regular grep or with grep -E.  Please
> let me know what else i can do to help track this down!
> # grep-2.14/src/grep -P '\[.?max' test-image.png
> (works, does not match)
> # grep-2.18/src/grep -P '\[.?max' test-image.png
> Segmentation fault
> # grep-2.18/src/grep -P '.?ma' test-image.png
> Segmentation fault
> # grep-2.18/src/grep -P '.?m' test-image.png
> Binary file test-image.png matches

Thank you for the bug report.
That is due to a bug in libpcre.  I've confirmed that it is still
triggered even when using the latest grep.git linked with
the latest from pcre.git (latest commit has "Final tidies for
8.35 release." as the subject).  I built grep as usual, and
then ran this:

  rm src/grep; make LIB_PCRE=$PWD/../pcre/.libs/libpcre.a

Confirm that grep is not using a shared libpcre (this must print nothing):

  ldd src/grep|grep pcre

The make command above presumes I had already built the latest pcre
in ../pcre.  Then, run this to test it with a non-UTF8 locale; it is
error-free, correctly finding no match:

  LC_ALL=ja_JP.eucJP valgrind src/grep -P '\[.?max' test-image.png

Repeat using a UTF8 locale, and you see that valgrind reports
numerous buffer overrun and heap-use-after-free errors:

  LC_ALL=en_US.utf8 valgrind src/grep -P '\[.?max' test-image.png
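
The two-locale comparison above can also be scripted without valgrind, by looking only at grep's exit status. The following is a minimal sketch, not part of the original report: the function names `classify_status` and `compare_locales` and the `GREP` variable are hypothetical, and it assumes both locales are installed (check with `locale -a`). It relies on grep's documented statuses (0 match, 1 no match, 2 error) and on the shell convention that a status above 128 means the process was killed by signal (status - 128), so a segmentation fault shows up as 139.

```shell
#!/bin/sh
# Hypothetical helper: translate an exit status into a short description.
# grep documents 0 (match), 1 (no match) and 2 (error); statuses above 128
# are the shell's convention for "killed by signal (status - 128)".
classify_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  elif [ "$status" -eq 0 ]; then
    echo "match"
  elif [ "$status" -eq 1 ]; then
    echo "no match"
  else
    echo "error (status $status)"
  fi
}

# Hypothetical helper: run grep -P on one file under each locale in turn.
# GREP is an assumed variable; it defaults to the freshly built src/grep.
compare_locales() {
  pattern=$1
  file=$2
  for loc in ja_JP.eucJP en_US.utf8; do
    LC_ALL=$loc "${GREP:-src/grep}" -P "$pattern" "$file" >/dev/null 2>&1
    rc=$?
    echo "$loc: $(classify_status "$rc")"
  done
}
```

For example, `compare_locales '\[.?max' test-image.png` prints one line per locale. This only detects crashes; the buffer overruns that valgrind reports may not always end in a signal.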

Here is an equivalent but much smaller test case:

  printf 'a\201b\r' | LC_ALL=en_US.utf8 valgrind src/grep -P 'a.?XXb'

That segfaults.  Interestingly, if I replace each X with a ".",
grep gets into an infinite loop within libpcre's match function.
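
To see what the small test case actually feeds to grep, it can help to dump the bytes and check them against UTF-8. The sketch below is an illustration only: the helper names `hex_bytes` and `is_valid_utf8` are hypothetical, and it assumes an `iconv` that exits non-zero on an illegal input sequence. The octal escape `\201` is the single byte 0x81, which in UTF-8 can only appear as a continuation byte; following the ASCII `a` it makes the input malformed. Whether that is the exact trigger inside libpcre is not established here; the sketch only shows that the input is not valid UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: print stdin as one run of hex byte values.
hex_bytes() {
  od -An -tx1 | tr -d ' \n'
  echo
}

# Hypothetical helper: succeed only if stdin is well-formed UTF-8.
# Assumes iconv exits non-zero when it meets an illegal sequence.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# The same four bytes that the reduced test case pipes into grep.
printf 'a\201b\r' | hex_bytes

if printf 'a\201b\r' | is_valid_utf8; then
  echo "valid UTF-8"
else
  echo "not valid UTF-8"
fi
```

The first line of output is the hex dump `6181620d`: `a`, the stray 0x81, `b`, and a carriage return.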
