bug-gnu-utils

regular expression causing regex engine to go into a lengthy computation


From: Mel Hatzis
Subject: regular expression causing regex engine to go into a lengthy computation
Date: Fri, 19 Sep 2003 19:47:08 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624

libregex maintainers...

I have a fairly simple regular expression which appears to send
the regex engine into a very lengthy computation for certain
search strings.

The regular expression is as follows:

^(([[:alnum:]._-]\+(@([[:alnum:]._-]\+)\?juniper\.net)\?)\+[[:blank:]]*,[[:blank:]]*)*$

and the search string I'm matching it against is:

address@hidden, address@hidden, address@hidden, address@hidden, address@hidden, 
address@hidden

I have attached a simple C program to demonstrate the problem.

If I shorten the search-string by removing some of the 's' characters,
the program returns. It seems to return progressively faster with
each 's' removed.

If I remove the last '\+' in the regular expression
(just prior to the [[:blank:]]*,) the problem goes away.

I am unable to determine why the regex engine is degrading
so rapidly because of this additional '\+' character.

Can someone please shed some light on what's going on?
Is this caused by a bug in the regex engine perhaps?

--
Mel Hatzis
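#define _GNU_SOURCE   /* current glibc exposes the GNU re_* interface only with this */
#include <stdio.h>
#include <stdlib.h>   /* malloc, exit */
#include <string.h>   /* memset, strlen */
#include <regex.h>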

static int
tst_regcmp (const char *pattern, const char *value)
{
  struct re_pattern_buffer buf;
  struct re_pattern_buffer *bufp;
  const char *comp_retval;
  int i;

  bufp = &buf;
  memset ((void *) &buf, 0, sizeof (buf));

  if (bufp->translate == NULL)
    {
      bufp->fastmap = (char *) malloc (256);
      comp_retval = re_compile_pattern (pattern, strlen (pattern), bufp);
      if (comp_retval)
        {
          fprintf (stderr, "unable to compile pattern: %s\n", comp_retval);
          exit (1);
        }
    }
  i = re_match (bufp, value, strlen (value), 0, 0);
  buf.translate = NULL;
  regfree (&buf);
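  switch (i)
    {
    case -2:
      /* re_match reports an internal error with -2 */
      fprintf (stderr, "re_match failed\n");
      /* fall through: an internal error is treated as "no match" */
    case -1:
      return 0;
    default:
      return 1;
    }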
}


int
main(int argc, char **argv)
{
  int retval;
  const char *pattern = 
"^(([[:alnum:]._-]\\+(@([[:alnum:]._-]\\+)\\?juniper\\.net)\\?)\\+[[:blank:]]*,[[:blank:]]*)*$";
  const char *search_string =
    "address@hidden, address@hidden, address@hidden, "
    "address@hidden, address@hidden, address@hidden";

  re_set_syntax((RE_SYNTAX_POSIX_EXTENDED | RE_BK_PLUS_QM) & ~RE_DOT_NEWLINE);

  retval = tst_regcmp(pattern, search_string);

  printf("match returned %d\n", retval);
  exit(0);
}

