bug-gnu-utils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

grep-2.5g: grep -i --color is very slow! and sometimes fails; w/patch


From: Jim Meyering
Subject: grep-2.5g: grep -i --color is very slow! and sometimes fails; w/patch
Date: Sat, 26 Jan 2002 23:09:48 +0100
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.2.50 (i686-pc-linux-gnu)

Thanks for taking the time to make a release.

I noticed that `grep --color -i' is much slower than
grep --color or grep -i, when there are more than a few matches:

Using just -i takes almost no time:

  $ seq 30000|time grep -i 999 >/dev/null
  0.00user 0.00system 0:00.08elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (128major+25minor)pagefaults 0swaps

In this example, using -i and --color=always takes 1000(!) times longer:

  $ seq 30000|time grep -i --color=always 999 >/dev/null
  99.96user 0.02system 1:40.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (135major+282minor)pagefaults 0swaps

After inspecting the code, I discovered another problem:
This highlights foo:

  $ printf 'foo\0bar'|grep -a --color=always -i foo|cat -A
  ^[[01;address@hidden

but this doesn't highlight `bar'

  $ printf 'foo\0bar'|grep -a --color=always -i bar|cat -A
  address@hidden

Those problems are related.
The trouble is that the current code uses strlen to compute
a length, but shouldn't.  For one, the data may contain NUL bytes
(as in the example above) and make it stop early.  For another,
since strlen counts bytes up until the first NUL, it may go way
beyond the desired limit and make grep do lots of unnecessary work
(that is quadratic in the size of the output).  Finally, we already
know the length.

Here's a patch that fixes both of those problems as well as a leak:
(I also took the liberty of adding a few spaces)

2002-01-26  Jim Meyering  <address@hidden>

        * src/grep.c (prline): Use xmalloc, not malloc.  Don't leak.
        Don't use strlen on data.  We already know the length: lim - beg.

--- grep-2.5g/src/grep.c        Wed Jan 23 10:38:24 2002
+++ grep-2.5g-new/src/grep.c    Sat Jan 26 20:20:34 2002
@@ -559,26 +559,27 @@ prline (char const *beg, char const *lim
       if(match_icase)
         {
          /* Yuck, this is tricky */
-          char *ibeg, *ilim;
+          char *buf = (char*) xmalloc (lim - beg);
+         char *ibeg = buf;
+         char *ilim = ibeg + (lim - beg);
          int i;
-         ibeg=(char*)malloc(strlen(beg));
-         ilim=ibeg+(lim-beg);
-         for(i=0; i<strlen(beg); i++)
-           ibeg[i]=tolower(beg[i]);
-         while ((match_offset = (*execute) (ibeg, ilim - ibeg, &match_size, 1))
-             != (size_t) -1)
+         for (i = 0; i < lim - beg; i++)
+           ibeg[i] = tolower (beg[i]);
+         while ((match_offset = (*execute) (ibeg, ilim-ibeg, &match_size, 1))
+                != (size_t) -1)
            {
-             char const *b=beg + match_offset;
-             if(b == lim)
+             char const *b = beg + match_offset;
+             if (b == lim)
                break;
-             fwrite(beg, sizeof (char), match_offset, stdout);
-             printf("\33[%sm", grep_color);
-             fwrite(b, sizeof(char), match_size, stdout);
-             fputs("\33[00m", stdout);
+             fwrite (beg, sizeof (char), match_offset, stdout);
+             printf ("\33[%sm", grep_color);
+             fwrite (b, sizeof (char), match_size, stdout);
+             fputs ("\33[00m", stdout);
              beg = b + match_size;
              ibeg = ibeg + match_offset + match_size;
            }
          fwrite (beg, 1, lim - beg, stdout);
+         free (buf);
          lastout = lim;
          return;
        }



reply via email to

[Prev in Thread] Current Thread [Next in Thread]