regular expression causing regex engine to go into a lengthy computation
From: Mel Hatzis
Subject: regular expression causing regex engine to go into a lengthy computation
Date: Fri, 19 Sep 2003 19:47:08 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624
libregex maintainers...
I have a fairly simple regular expression that appears to send
the regex engine into a very long computation on certain search strings.
The regular expression is as follows:
^(([[:alnum:]._-]\+(@([[:alnum:]._-]\+)\?juniper\.net)\?)\+[[:blank:]]*,[[:blank:]]*)*$
and the search string I'm matching it against is:
address@hidden, address@hidden, address@hidden, address@hidden, address@hidden, address@hidden
I have attached a simple C program to demonstrate the problem.
If I shorten the search-string by removing some of the 's' characters,
the program returns. It seems to return progressively faster with
each 's' removed.
If I remove the last '\+' in the regular expression
(just prior to the [[:blank:]]*,) the problem goes away.
I am unable to determine why the regex engine is degrading
so rapidly because of this additional '\+' character.
Can someone please shed some light on what's going on?
Is this caused by a bug in the regex engine perhaps?
--
Mel Hatzis
#define _GNU_SOURCE  /* glibc only declares the GNU re_* API when this is set */
#include <stdio.h>
#include <stdlib.h>  /* malloc, exit */
#include <string.h>  /* memset, strlen */
#include <regex.h>

/* Compile PATTERN with the GNU regex API and match it against VALUE.
   Returns 1 on a match, 0 on no match or on an internal matcher error. */
static int
tst_regcmp (const char *pattern, const char *value)
{
  struct re_pattern_buffer buf;
  struct re_pattern_buffer *bufp;
  const char *comp_retval;
  int i;

  bufp = &buf;
  memset ((void *) &buf, 0, sizeof (buf));
  if (bufp->translate == NULL)
    {
      bufp->fastmap = (char *) malloc (256);
      comp_retval = re_compile_pattern (pattern, strlen (pattern), bufp);
      if (comp_retval)
        {
          fprintf (stderr, "unable to compile pattern: %s\n", comp_retval);
          exit (1);
        }
    }
  i = re_match (bufp, value, strlen (value), 0, 0);
  buf.translate = NULL;
  regfree (&buf);
  switch (i)
    {
    case -2:
      fprintf (stderr, "re_match failed\n");
      /* fall through: an internal error is reported as "no match" */
    case -1:
      return 0;
    default:
      return 1;
    }
}

int
main(int argc, char **argv)
{
  int retval;
  const char *pattern =
    "^(([[:alnum:]._-]\\+(@([[:alnum:]._-]\\+)\\?juniper\\.net)\\?)\\+[[:blank:]]*,[[:blank:]]*)*$";
  /* The literal is split with adjacent-string concatenation so that it
     no longer contains a raw newline. */
  const char *search_string =
    "address@hidden, address@hidden, address@hidden, "
    "address@hidden, address@hidden, address@hidden";

  re_set_syntax((RE_SYNTAX_POSIX_EXTENDED | RE_BK_PLUS_QM) & ~RE_DOT_NEWLINE);
  retval = tst_regcmp(pattern, search_string);
  printf("match returned %d\n", retval);
  exit(0);
}