bug#34053: [PATCH] grep: fix slow for multiple word matching
From: Norihiro Tanaka
Subject: bug#34053: [PATCH] grep: fix slow for multiple word matching
Date: Sun, 13 Jan 2019 08:45:47 +0900
Hi,
grep uses the kwset matcher for multiple word matching. It is very slow when
most of the substrings that match a pattern are not words. So, if the first
substring that matches a pattern is not a word, the patch uses the grep matcher
to match that line instead.
Note that if START_PTR is set, the grep matcher uses the regex matcher, which
is very slow at matching words. Therefore, the patch uses the grep matcher only
when START_PTR is not set.
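To illustrate the idea only (this is not the patch, which is C code inside grep), here is a minimal Python sketch. All names in it are hypothetical: `first_fixed_match` is a naive stand-in for the kwset search, and a compiled `re` pattern stands in for the grep matcher. The START_PTR case is left out.

```python
import re


def is_word_char(c):
    # Word constituents for -w: letters, digits, and underscore.
    return c.isalnum() or c == "_"


def word_bounded(line, start, end):
    # True if line[start:end] has no word character directly before or after it.
    before_ok = start == 0 or not is_word_char(line[start - 1])
    after_ok = end == len(line) or not is_word_char(line[end])
    return before_ok and after_ok


def first_fixed_match(line, patterns):
    # Naive stand-in for the kwset search (assumption, not grep's algorithm):
    # leftmost occurrence of any pattern, longest pattern first on ties.
    best = None
    for p in patterns:
        i = line.find(p)
        if i < 0:
            continue
        candidate = (i, -len(p))
        if best is None or candidate < best:
            best = candidate
    if best is None:
        return None
    start, neg_len = best
    return start, start - neg_len


def compile_fallback(patterns):
    # Stand-in for the grep matcher: one regex that only accepts
    # occurrences with no word character on either side.
    alternatives = "|".join(re.escape(p) for p in patterns)
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")


def line_matches_words(line, patterns, fallback):
    hit = first_fixed_match(line, patterns)
    if hit is None:
        return False  # no pattern occurs in the line at all
    if word_bounded(line, *hit):
        return True   # the first occurrence is already a whole word
    # The first occurrence is not a word: instead of retrying the fixed-string
    # search position by position, hand the whole line to the other matcher.
    return fallback.search(line) is not None
```

With the patterns from the example in this message, a line made only of `00` tokens takes the fallback path once and is rejected, while a line such as `a00 0 b` is accepted because the fallback finds the later stand-alone `0`.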
Here is a timing example, although it is a very extreme case:
$ cat >pat <<EOF
0
00 0
00 00 0
00 00 00 0
00 00 00 00 0
00 00 00 00 00 0
00 00 00 00 00 00 0
00 00 00 00 00 00 00 0
00 00 00 00 00 00 00 00 0
00 00 00 00 00 00 00 00 00 0
00 00 00 00 00 00 00 00 00 00 0
00 00 00 00 00 00 00 00 00 00 00 0
00 00 00 00 00 00 00 00 00 00 00 00 0
00 00 00 00 00 00 00 00 00 00 00 00 00 0
EOF
$ yes '00 00 00 00 00 00 00 00 00 00 00 00 00' | head -1000000 >inp
$ env LC_ALL=C time -p src/grep -wf pat inp
real 5.75
user 5.67
sys 0.02
After applying the patch, the same command gives:
$ env LC_ALL=C time -p src/grep -wf pat inp
real 0.32
user 0.31
sys 0.00
Thanks,
Norihiro
Attachment: 0001-grep-fix-slow-multiple-word-matching.patch
Description: Text document