faster DFA.C state merge

bug-grep

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

faster DFA.C state merge

From:	Ivan Yanikov
Subject:	faster DFA.C state merge
Date:	Sun, 25 Aug 2013 04:57:25 +0400
User-agent:	Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8

Hello! Thanks for great command line tool.

I have a large regexp, which works very slow. I take a profiler, and it shows, that bottleneck is 'insert function', which is called too often from this code during file matching, and makes O(N^2)assigmend, O(N) memory reallocatons:


-------------------
for (j = 0; j < d->states[0].elems.nelem; ++j) { 
insert(d->states[0].elems.elems[j], &follows); }
-------------------

I have a patch, which makes dfa-states merge in dfa.c faster. Also I rewrite 
this code, using priority-queue data-structure:

-------------------
 /* Find the union of the follows of the positions of the group. This is a 
hideously inefficient loop.  Fix it someday. */
 for (j = 0; j < grps[i].nelem; ++j)
   for (k = 0; k < d->follows[grps[i].elems[j]].nelem; ++k)
     insert (d->follows[grps[i].elems[j]].elems[k], &follows);
-------------------

Will it be usefull for you?

How I can properly send/commit it for review?
Currently I have copied the patch to http://pastebin.com/p6TbvM8Y and  
https://gist.github.com/dobrokot/6331243
it need little work for proper styling and optimization shortcut for common 
cases when merging 2 sets or only one set(i.e. copy).

Thanks for attention, Ivan Yanikov.

[Prev in Thread]

Current Thread

[Next in Thread]

faster DFA.C state merge, Ivan Yanikov <=
- Re: faster DFA.C state merge, Paul Eggert, 2013/08/25
  - bug#15191: faster DFA.C state merge, Ivan Yanikov, 2013/08/26
    - Message not available
    - bug#15191: Ivan Yanikov, Ivan Yanikov, 2013/08/27

Prev by Date: Invitation to connect on LinkedIn
Next by Date: Re: faster DFA.C state merge
Previous by thread: Invitation to connect on LinkedIn
Next by thread: Re: faster DFA.C state merge
Index(es):
- Date
- Thread