bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#32750: [PATCH 2/2] dfa: optmization of alternation in NFA


From: Norihiro Tanaka
Subject: bug#32750: [PATCH 2/2] dfa: optmization of alternation in NFA
Date: Mon, 17 Sep 2018 23:14:52 +0900

Hi,

Even when similer states exists in alternation, DFA treats them as
separate items.  It may complicate the transition in NFA and cause
slowdown.  This change assembles the states into one.

For example, ab|ac is changed into a(b|c).

This change speeds-up matching for many branched pattern.  For
example, grep speeds-up more than 30x in following case.

  seq 10000 | sed 's/$/ abcdefghijklmnopqrstuvwxyz/; s/$/./' >in
  time -p env LC_ALL=C grep -vf in in

  (before) real 63.43 user 61.67 system 1.65
  (after)  real  1.64 user  1.61 system 0.03

  # If we do not add '.' at last in pattern, not dfa but kwset is used.

grep also speeds-up about 3x in following case.

  time -p env LC_ALL=C grep -vf /usr/share/dict/linux.words 
/usr/share/dict/linux.words

  (before) real  2.48 user  2.09 system 0.38
  (after)  real  7.69 user  6.32 system 1.29

Thanks,
Norihiro

Attachment: 0001-dfa-simplify-initial-state.patch
Description: Text document

Attachment: 0002-dfa-optmization-of-alternation-in-NFA.patch
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]