Re: [Gnu-arch-users] Preventing matches in regular expressions

From: Tom Lord
Subject: Re: [Gnu-arch-users] Preventing matches in regular expressions
Date: Wed, 11 Aug 2004 00:51:48 -0700 (PDT)

    > From: Andrew Suffield <address@hidden>

    > More normally you just step outside the bounds of regexps though. Perl
    > does it by having regexps that aren't; perlre cannot be expressed as
    > an NFA, although it still uses one to do *most* of the work.

Right.  perlre is kind of like a programmable tree traversal exploring
a particular part of the state space of a small array of indexes into
some fixed string.  Its resemblance to regular expressions is mostly
coincidental, imo: there aren't very many simple programs, so many
simple programs somewhat resemble one another.  Thus, things like
negation and conjunction are trivially simple in perlre (though,
especially in combination with disjunction, they can have absolutely
horrible computational complexity) while, in contrast, everything in
real regular expressions is fast.  Fast or flexible, pick one.
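
To make the "trivially simple" half concrete, here is a minimal sketch
of conjunction plus negation written with lookahead assertions.  It
uses Python's `re` module as a stand-in, on the assumption that its
backtracking matcher behaves like perlre for this purpose; the pattern,
the `accepts` helper and the sample strings are all made up for
illustration and come from neither Perl nor Rx.

```python
import re

# Hypothetical pattern: accept a line that contains "foo" AND does
# NOT contain "bar".
#   (?=.*foo)  positive lookahead: conjunction with "contains foo"
#   (?!.*bar)  negative lookahead: negation of "contains bar"
# Both assertions are checked at the start of the line and consume no
# input; the trailing .* then matches the line itself.
HAS_FOO_NOT_BAR = re.compile(r"^(?=.*foo)(?!.*bar).*$")


def accepts(line: str) -> bool:
    """Return True if the line satisfies both conditions."""
    return HAS_FOO_NOT_BAR.match(line) is not None


if __name__ == "__main__":
    for sample in ["foo", "a foo b", "foo bar", "bar", ""]:
        print(repr(sample), accepts(sample))
```

Each assertion re-scans the input from the position where it is
tried, which is where the cost comes from once assertions get nested
inside alternations and repetitions.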

So, those of us into "fast" work on "flexible" just by trying to nail
special cases.   Hence things like `cut'.

Hey: here's a volunteer project for any brave readers.  There's been
a long pause in hackerlab history while Unicode support comes on line.
During that time, the POSIX regexp stuff in Rx hasn't been
unicodified.  There's a separate Unicode-capable true regular
expression matcher in Rx, but not via the POSIX interfaces.

So: someone needs to unicodify the POSIX regexp engine in Rx.  It's
very mechanical, astoundingly detail-oriented work, and an opportunity
to spend a few hundred hours playing around with profilers.

