From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#15759: closed (regression in grep 2.15 with PCRE searches)
Date: Fri, 13 Dec 2013 18:34:03 +0000

Your message dated Fri, 13 Dec 2013 10:33:35 -0800
with message-id <address@hidden>
and subject line Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
has caused the debbugs.gnu.org bug report #15758,
regarding regression in grep 2.15 with PCRE searches
to be marked as done.

(If you believe you have received this mail in error, please contact

15758: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15758
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: regression in grep 2.15 with PCRE searches
Date: Wed, 30 Oct 2013 13:23:10 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

A user reported a regression in grep 2.15 which is easily reproducible
as ``grep -P foo /bin/mount''. The root cause is that pcre_exec is
returning PCRE_ERROR_BADUTF8 when the current locale supports UTF-8.
This is unhandled by grep and causes it to call abort().
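
To illustrate why a binary such as /bin/mount trips this error, here is a
minimal sketch of a structural UTF-8 check. It is not PCRE's validator: the
function name is_valid_utf8 is hypothetical, and the check is deliberately
simplified (it does not reject every overlong form or surrogate). It only
shows that arbitrary binary bytes rarely form well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of grep or PCRE.
   Return 1 if buf[0..len) looks like well-formed UTF-8, 0 otherwise.
   Simplified: checks lead bytes and continuation bytes only. */
int
is_valid_utf8 (const unsigned char *buf, size_t len)
{
  size_t i = 0;

  assert (buf != NULL || len == 0);

  while (i < len)
    {
      unsigned char c = buf[i];
      size_t extra;           /* number of continuation bytes expected */

      if (c < 0x80)
        extra = 0;            /* ASCII */
      else if (c >= 0xC2 && c <= 0xDF)
        extra = 1;            /* 2-byte sequence */
      else if (c >= 0xE0 && c <= 0xEF)
        extra = 2;            /* 3-byte sequence */
      else if (c >= 0xF0 && c <= 0xF4)
        extra = 3;            /* 4-byte sequence */
      else
        return 0;             /* stray continuation or invalid lead byte */

      if (extra > len - i - 1)
        return 0;             /* sequence truncated at end of buffer */

      for (size_t k = 1; k <= extra; k++)
        if ((buf[i + k] & 0xC0) != 0x80)
          return 0;           /* continuation bytes must be 10xxxxxx */

      i += extra + 1;
    }
  return 1;
}
```

A text line such as "foo\n" passes this check, while a buffer containing a
byte like 0xFF does not, which is the kind of input that leads pcre_exec to
report PCRE_ERROR_BADUTF8 in a UTF-8 locale.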

I bisected the breakage to commit 67436786c110bb which essentially
introduces UTF-8 validation for all searched data. In a large number of
file hierarchies, one may easily hit this via a recursive search.

I crafted the following inline diff which fixes the problem. While I'm
not sure of its correctness, it at least describes one possible fix.

  diff --git a/src/pcresearch.c b/src/pcresearch.c
  index ad5999d..ce55ab3 100644
  --- a/src/pcresearch.c
  +++ b/src/pcresearch.c
  @@ -176,6 +176,9 @@ Pexecute (char const *buf, size_t size, size_t 
         switch (e)
           case PCRE_ERROR_NOMATCH:
  +        case PCRE_ERROR_BADUTF8:
             return -1;

           case PCRE_ERROR_NOMEMORY:


--- End Message ---
--- Begin Message ---
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Fri, 13 Dec 2013 10:33:35 -0800

On Tue, Nov 26, 2013 at 6:30 AM, Santiago <address@hidden> wrote:
> This bug was also reported in Debian ( http://bugs.debian.org/730472 ).
> Taking a look on it, I think the most suitable solution for the moment
> is to flag PCRE_NO_UTF8_CHECK instead of PCRE_UTF8, so
> PCRE does not check if inputs are UTF8 valid. Resulting behavior is
> similar to pre-grep-2.15. (See 15758-PCRE-no-check-UTF8.patch)

Thanks for the report and the suggested patches.  Your first patch is
almost right.  The problem is that we cannot remove the PCRE_UTF8 flag.
Doing so would disable PCRE's UTF-8 mode, reverting an older fix.
See tests/pcre-utf8 for examples, or run this:

  printf '\342\202\254\n' | LC_ALL=en_US.UTF-8 src/grep -P '^\p{S}'
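
The three octal bytes in that command are the UTF-8 encoding of a single
character. As a rough sketch (the function name decode_utf8_char is
hypothetical and is not taken from grep or PCRE), decoding them by hand
shows what UTF-8 mode lets the matcher see: one code point, U+20AC, rather
than three unrelated bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of grep or PCRE.
   Decode the UTF-8 sequence starting at s (at most len bytes).
   Return the code point, or -1 if the sequence is malformed or
   truncated.  Simplified: no overlong or surrogate checks. */
long
decode_utf8_char (const unsigned char *s, size_t len)
{
  long cp;
  size_t extra;

  assert (s != NULL || len == 0);

  if (len == 0)
    return -1;

  if (s[0] < 0x80)
    return s[0];                              /* ASCII */
  else if ((s[0] & 0xE0) == 0xC0)
    { cp = s[0] & 0x1F; extra = 1; }          /* 110xxxxx */
  else if ((s[0] & 0xF0) == 0xE0)
    { cp = s[0] & 0x0F; extra = 2; }          /* 1110xxxx */
  else if ((s[0] & 0xF8) == 0xF0)
    { cp = s[0] & 0x07; extra = 3; }          /* 11110xxx */
  else
    return -1;                                /* not a lead byte */

  if (extra > len - 1)
    return -1;                                /* truncated */

  for (size_t k = 1; k <= extra; k++)
    {
      if ((s[k] & 0xC0) != 0x80)
        return -1;                            /* bad continuation */
      cp = (cp << 6) | (s[k] & 0x3F);         /* append 6 payload bits */
    }
  return cp;
}
```

Octal 342 202 254 is hex E2 82 AC, which this sketch decodes to 0x20AC,
the euro sign. A property such as \p{S} applies to that code point, which
is why the pattern depends on PCRE's UTF-8 mode staying enabled.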

I've added a commit log, improved a related test and attached
a slightly different patch, but left you as the "Author".
I'll wait for an explicit ACK before pushing it.

With that, there is no need to handle PCRE_ERROR_BADUTF8
because that should not happen.

--- End Message ---
