bug#22655: grep -Pz '^' now fails!
From: Paul Eggert
Subject: bug#22655: grep -Pz '^' now fails!
Date: Sat, 19 Nov 2016 23:57:22 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

Stephane Chazelas wrote:
> I don't find a x220 factor, more like a x2.5 factor:

I think I found the factor-of-hundreds slowdown, and fixed it in the 2nd
attached patch.
When I tried your benchmark with pcregrep (pcre 8.39, configured with
--enable-unicode-properties), and with ./grep0 (which has the PCRE_MULTILINE
implementation, i.e., commit da94c91a81fc63275371d0580d8688b6abd85346), and with
./grep (which is grep after the attached patches are installed), I got timings
like the following:

 user   sys
1.972 0.072  LC_ALL=en_US.utf8 pcregrep -u "z.*a" k
0.234 0.076  LC_ALL=en_US.utf8 ./grep0 -P "z.*a" k
1.280 0.064  LC_ALL=en_US.utf8 ./grep -P "z.*a" k
1.487 0.077  LC_ALL=C pcregrep "z.*a" k
0.193 0.067  LC_ALL=C ./grep0 -P "z.*a" k
0.825 0.096  LC_ALL=C ./grep -P "z.*a" k

All times are CPU seconds. This is Fedora 24 x86-64, AMD Phenom II X4 910e. As
before, k was created by the shell command:

  yes 'abcdefg hijklmn opqrstu vwxyz' | head -n 10000000 >k
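
For anyone who wants to rerun the benchmark at a smaller scale, here is a minimal sketch that wraps the same command in a function. The name make_input and its two arguments are hypothetical, not part of grep or of the patches.

```shell
# Hypothetical helper: write N copies of the benchmark line to FILE.
# Usage: make_input N FILE
make_input() {
  n=$1
  file=$2
  # Same pipeline as above, with the line count and output file parameterized.
  yes 'abcdefg hijklmn opqrstu vwxyz' | head -n "$n" >"$file"
}
```

Calling "make_input 10000000 k" should recreate the input used for the table above; each timed run can then be wrapped in the shell's time, e.g. time env LC_ALL=C ./grep -P "z.*a" k.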

So, on this benchmark, using PCRE_MULTILINE gave a speedup of about 4.3x in total
CPU time (user + sys) in a multibyte locale, and about 3.5x in a unibyte locale.
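
As a check on the arithmetic, the sketch below recomputes both factors from the table, assuming they compare total CPU time (user + sys) of ./grep against ./grep0. The function name speedup is hypothetical.

```shell
# Hypothetical helper: ratio of total CPU time between a slower and a faster run.
# Usage: speedup SLOW_USER SLOW_SYS FAST_USER FAST_SYS
speedup() {
  awk -v slow_u="$1" -v slow_s="$2" -v fast_u="$3" -v fast_s="$4" \
    'BEGIN { printf "%.1f\n", (slow_u + slow_s) / (fast_u + fast_s) }'
}

speedup 1.280 0.064 0.234 0.076   # en_US.utf8: ./grep vs ./grep0, prints 4.3
speedup 0.825 0.096 0.193 0.067   # C locale:   ./grep vs ./grep0, prints 3.5
```

Using user time alone would give larger ratios (roughly 5.5 and 4.3), so the quoted factors match the user + sys totals.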

> On the other hand if you change the pattern to "z[^+]*a",
> pcregrep still takes about one second, but GNU grep a lot longer

Yes, that example makes GNU grep -P look really bad. So I installed the 1st
attached patch, which mostly just reverts the January multiline patch, i.e., it
goes back to the slower "./grep -P" lines measured above.
0001-grep-P-no-longer-uses-PCRE_MULTILINE.patch
Description: Text Data
0002-grep-further-P-performance-fix.patch
Description: Text Data