Re: performance/hang bug in regex.c

From: Thomas Lord
Subject: Re: performance/hang bug in regex.c
Date: Tue, 02 Sep 2008 20:48:52 -0700

Richard M. Stallman wrote:
    Regexps can be implemented in mainly 2 ways: backtracking or not.
    If you don't do backtracking, the above problem doesn't occur.
    But many/most regexp implementations use a backtracking algorithm
    because it's almost indispensable in order to handle backreferences.

    Emacs provides backreferences and doesn't bother to provide 2 regexp
    implementations (a backtracking one for regexps with backrefs and
    a non-backtracking one for all the others), so you get pathological
    behaviors for regexps such as the one above.  It's too bad, and I hope
    we can fix it at some point, but don't hold your breath.

This comes up often enough that maybe it should be explained in one
of the manuals.
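
As a rough illustration of the distinction in the quoted text, here is a
minimal sketch. It assumes a hypothetical pattern, (a|aa)*b, which is not
the regexp from the original report, and a hand-built NFA for it. The
names NFA, ACCEPT, match_backtracking and match_simulation are made up
for this example and have nothing to do with regex.c or Rx. Both matchers
test the whole string and count how many (state, position) visits they make.

```python
# Hypothetical NFA for the pattern (a|aa)*b, written out by hand.
# state -> list of (input character, next state); state 0 is the start.
NFA = {
    0: [("a", 0), ("a", 1), ("b", 2)],
    1: [("a", 0)],
}
ACCEPT = {2}


def match_backtracking(text):
    """Follow one path at a time, backing up on failure.

    Returns (matched, steps).
    """
    steps = 0

    def run(state, pos):
        nonlocal steps
        steps += 1
        if pos == len(text):
            return state in ACCEPT
        for char, nxt in NFA.get(state, []):
            if char == text[pos] and run(nxt, pos + 1):
                return True
        return False

    return run(0, 0), steps


def match_simulation(text):
    """Track the set of all reachable states at once.

    Returns (matched, steps).
    """
    steps = 0
    states = {0}
    for ch in text:
        steps += len(states)
        states = {nxt
                  for s in states
                  for char, nxt in NFA.get(s, [])
                  if char == ch}
    return bool(states & ACCEPT), steps


if __name__ == "__main__":
    for n in (10, 15, 20):
        text = "a" * n  # no trailing "b", so every path must fail
        print(n, match_backtracking(text), match_simulation(text))
```

On a run of a's with no final b, the backtracking matcher has to try every
way of splitting the run into "a" and "aa" pieces, so its step count grows
exponentially with the length, while the simulation visits at most two
states per character. The simulation keeps no record of which path it
took, which fits the quoted point that backreferences push implementations
toward backtracking.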

You can probably clean up Rx (found in GNU Arch) well
enough to fix all of those problems, although doing so will
require the worker to understand regexp stuff deeply. That
means not much (but some) beyond undergrad courses about
DFAs, NFAs and all that, but you have to be really into it:
it's a fairly involved application of that stuff.


