bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales

bug-grep


From:	Vincent Lefevre
Subject:	bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales
Date:	Sat, 20 Dec 2014 02:23:27 +0100
User-agent:	Mutt/1.5.23-6371-vl-r75100 (2014-11-04)

On 2014-09-12 03:24:49 +0200, Vincent Lefevre wrote:
> Timings with the Debian packages on my personal svn working copy
> (binary + text files):
> 
> 2.18-2   0.9s with -P, 0.4s without -P
> 2.20-3  11.6s with -P, 0.4s without -P

I've done another test on a large PDF file. Let's forget grep 2.18,
which is indeed too buggy (I could reproduce a buffer overflow). But
let's compare with pcregrep, using the "zzz" pattern:

Debian grep 2.20-3      6.64s (with -P)
Upstream grep 2.21      5.39s (with -P)
Debian pcregrep 8.35    0.71s

In all cases, PCRE is used, but pcregrep is much faster than grep -P
(roughly 7.5 to 9 times faster on this file).
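
For anyone wanting to repeat this kind of comparison, here is a minimal
sketch of a timing harness. It is not the method used for the numbers
above. The helper names time_command and compare are hypothetical, it
assumes that grep and pcregrep are on PATH (tools that are missing are
skipped), and it inherits the locale from the calling environment, so it
should be run in a UTF-8 locale to exercise the case discussed here.

```python
import shutil
import subprocess
import time


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    # Output is discarded; a non-zero exit status (e.g. no match) is not an error here.
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def compare(pattern, path):
    """Time grep -P, pcregrep and plain grep on one file (hypothetical helper)."""
    candidates = {
        "grep -P": ["grep", "-P", pattern, path],
        "pcregrep": ["pcregrep", pattern, path],
        "grep": ["grep", pattern, path],
    }
    results = {}
    for label, argv in candidates.items():
        if shutil.which(argv[0]) is None:
            continue  # assumption: silently skip tools that are not installed
        results[label] = time_command(argv)
    return results
```

Calling compare("zzz", path) on a large binary file returns a dictionary
mapping each available command to its elapsed time. A single run is
noisy, so repeating the measurement and keeping the minimum would give
more stable figures.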

(Note: on this example, "grep" alone is much faster than pcregrep,
but this is not related to the invalid encoding, and depending on
the pattern, either grep or PCRE can be significantly faster.)

So perhaps the right approach would be to do what pcregrep does,
even though "grep -P" can currently be a bit faster than pcregrep in
some cases.

-- 
Vincent Lefèvre <address@hidden> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


