|
From: | Paul Eggert |
Subject: | bug#32750: [PATCH 2/2] dfa: optmization of alternation in NFA |
Date: | Wed, 19 Sep 2018 00:48:42 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 |
Jim Meyering wrote:
seq 10000 | sed 's/$/ abcdefghijklmnopqrstuvwxyz/; s/$/./' >in time -p env LC_ALL=C grep -vf in in (before) real 63.43 user 61.67 system 1.65 (after) real 1.64 user 1.61 system 0.03 # If we do not add '.' at last in pattern, not dfa but kwset is used. grep also speeds-up about 3x in following case. time -p env LC_ALL=C grep -vf /usr/share/dict/linux.words /usr/share/dict/linux.words (before) real 2.48 user 2.09 system 0.38 (after) real 7.69 user 6.32 system 1.29Thank you for the patches. However, the before/after numbers you show here suggest that the "after" code takes more than triple of the time of "before". > Also, when I compared grep compiled at 123620af88f55c3e0cc9f0aed7311c72f625bc82 (latest, including your changes) and that compiled at the prior commit, 9c11510507ebcd31671f10d9b88532f8e6657ad2, I find that the new version takes over 30 seconds, while the prior one took about 20 seconds.
Is that last pair of times for the second benchmark he gave? I confess to being lazy and not trying that benchmark, as I was on an Ubuntu system that didn't have the linux.words file.
Did you try the first benchmark? On my Ubuntu 18.04 x86-64 (Xeon E3-1225 V2) platform, I got (before) real 55.51 user 51.53 sys 3.98, (after) real 0.64 user 0.60 sys 0.04, so the change is a big performance win there.
On the other hand, I just now did the 2nd benchmark with a copy of a Fedora 28 linux.words file, and got (before) real 8.06 user 6.20 sys 1.85, (after) real 21.69 user 21.21 sys 0.47, so it's about three times slower. Ouch.
[Prev in Thread] | Current Thread | [Next in Thread] |