--- Begin Message ---
Subject: [PATCH] dfa: speed-up at initial state
Date: Sat, 24 May 2014 12:31:37 +0900
Thanks to Jim, and thanks for the new release. To start things off, I would
like to add a further improvement for the next release.
In dfaexec, the DFA state is always 0 until a potential match has been found.
So we can speed up matching there by continuing to use the same transition
table without replacing it.
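
To illustrate the idea only, here is a minimal sketch in C. It is not the
attached patch and does not use the real dfa.c data structures: the names
toy_build and toy_exec, the two-state table, and the pattern a|b are all
assumptions made for this example. While the state is 0, the inner loop keeps
reading from the same row of the transition table and skips bytes that lead
back to state 0.

```c
#include <assert.h>
#include <stddef.h>

enum { NSTATES = 2, ACCEPT = 1 };

/* Hypothetical toy DFA for the pattern a|b.
   State 0 is the initial state; state 1 is accepting and absorbing.  */
static void
toy_build (int trans[NSTATES][256])
{
  for (int s = 0; s < NSTATES; s++)
    for (int c = 0; c < 256; c++)
      trans[s][c] = (s == ACCEPT || c == 'a' || c == 'b') ? ACCEPT : 0;
}

/* Return a pointer just past the first matching byte in BUF[0..LEN-1],
   or NULL if no byte matches.  */
static const char *
toy_exec (int trans[NSTATES][256], const char *buf, size_t len)
{
  assert (buf != NULL);
  const unsigned char *p = (const unsigned char *) buf;
  const unsigned char *end = p + len;
  const int *t0 = trans[0];     /* row for the initial state, fetched once */
  int s = 0;

  while (p < end)
    {
      if (s == 0)
        {
          /* Fast path: as long as the next byte leads back to state 0,
             keep using the same table row and just advance.  */
          while (p < end && t0[*p] == 0)
            p++;
          if (p == end)
            break;
        }

      /* General path: one ordinary table-driven transition.  */
      s = trans[s][*p++];
      if (s == ACCEPT)
        return (const char *) p;
    }
  return NULL;
}
```

On input such as the file "k" used in the timings, where no byte ever leaves
state 0, a loop of this shape spends all of its time in the inner fast path.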
I tested the patch and got about a 3x speed-up.
$ yes j | head -10000000 >k
$ env LC_ALL=C time -p src/grep '\(a\|b\)' k
Before: real 1.12 user 1.06 sys 0.04
After : real 0.39 user 0.34 sys 0.04
I also tested in a non-UTF-8 multibyte locale.
$ env LC_ALL=ja_JP.eucJP time -p src/grep '\(a\|b\)' k
Before: real 1.41 user 1.35 sys 0.05
After : real 0.38 user 0.32 sys 0.06
By the way, here is the result with unpatched grep 2.18. (^_^)
$ env LANG=ja_JP.eucJP time -p src/grep '\(a\|b\)' k
real 12.00 user 11.86 sys 0.13
0001-dfa-speed-up-at-initial-state.patch
Description: Text document
--- End Message ---
--- Begin Message ---
Subject: Re: [PATCH] dfa: speed-up at initial state
Date: Sat, 27 Sep 2014 21:01:14 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.2
Norihiro Tanaka wrote:
> The test case "k" is 50%
> faster and "l" is also about 16% faster with GCC 4.8.2 on my platform by
> two changes.

Thanks, I finally got around to looking at this and got similar performance
results to yours. That __attribute__((noinline)) bothers me, though, as it's
not portable and is a bit inelegant. I figured out a different way to avoid the
inlining, and tweaked the commentary a bit, and so installed the attached
additional patch after installing your patches.
0001-dfa-minor-tweaks-mostly-to-remove-__attribute__-noin.patch
Description: Text document
--- End Message ---