bug-grep

Vague bug in egrep -i with enormous pattern


From: Karl Fox
Subject: Vague bug in egrep -i with enormous pattern
Date: Mon, 9 Oct 2006 23:20:46 -0400

Hi,

I don't know whether this report will be useful. I have a 3,515-character pattern file that I use with egrep -i -f filename, and it exhibits bizarre behavior. I use it to search a proprietary form of Windows event log files. When I omit the -i option it behaves normally. With -i, I sometimes get segmentation faults, and if I add the -n option it often reports line numbers that are far off. Here is an example of the strange output. Look at the three calls to egrep -i below:

% rpm -q --all | grep -i suse-release
suse-release-9.2-3
% uname -a
Linux richwood 2.6.8-24-smp #1 SMP Wed Oct 6 09:16:23 UTC 2004 i686 i686 i386 GNU/Linux
% grep --version
grep (GNU grep) 2.5.1

Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

% wc event.tmp
   793  27392 356521 event.tmp
% egrep -f pattern.grep < event.tmp | wc
    646   24032  326794
% egrep -f pattern.grep < event.tmp | egrep -f pattern.grep | wc
    646   24032  326794
%
%
% egrep -i -f pattern.grep < event.tmp | wc
    415   14900  200704    <-- Note the strange line counts in these three examples
% egrep -i -f pattern.grep < event.tmp | egrep -i -f pattern.grep | wc
    135    4857   65536
% egrep -i -f pattern.grep < event.tmp | egrep -i -f pattern.grep | egrep -i -f pattern.grep | wc
    136    4857   65537
%
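
The transcript relies on one property: filtering a file through the same pattern a second time should change nothing, because every line that survived the first pass matches by definition. The case-sensitive runs satisfy this (646 lines both times), while the -i runs do not. It may or may not be significant that the byte counts 200704 and 65536 are both exact multiples of 4096. A minimal sketch of an automated version of that check follows; the helper names `run_filter` and `is_idempotent` are hypothetical and not part of grep.

```python
import subprocess


def run_filter(cmd, data):
    """Feed `data` (bytes) to the command on stdin and return its stdout.

    check=False because grep exits with status 1 when nothing matches,
    which is not an error for this purpose.
    """
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=False)
    return result.stdout


def is_idempotent(cmd, data):
    """Run the filter once, then again on its own output.

    Returns (same, lines_once, lines_twice). For a correct line filter,
    `same` should be True and the two line counts should be equal.
    """
    once = run_filter(cmd, data)
    twice = run_filter(cmd, once)
    return once == twice, once.count(b"\n"), twice.count(b"\n")
```

Assuming the files from the transcript are in the current directory, it could be called as `is_idempotent(["egrep", "-i", "-f", "pattern.grep"], open("event.tmp", "rb").read())`; a result whose first element is False would reproduce the discrepancy shown above.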

I've attached the pattern file, but the event.tmp file contains customer proprietary information that I cannot share. I can try to reproduce the problem with artificial data if you cannot reproduce it yourself, but my initial attempts suggest that it needs a large file, not just a few lines. The event.tmp file consists of newline-terminated lines, each containing seven tab-separated fields, the last of which can be several hundred characters long.
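
For anyone attempting a reproduction without the original data, here is a rough sketch of a generator for a file with the shape described above: newline-terminated lines of seven tab-separated fields, the last up to a few hundred characters. Everything else is an assumption made for illustration: the function names, the character set, the lengths of the first six fields, the 400-character cap, and the default of 793 lines (taken from the wc output above). Random text like this is unlikely to match the real patterns, so strings that the pattern file matches would need to be mixed in, in varying case, before it can exercise -i.

```python
import random
import string

# Hypothetical character set; contains neither tab nor newline, so the
# field and line structure stays intact.
ALPHABET = string.ascii_letters + string.digits + " .:/\\-_"


def make_event_line(rng, max_last_field=400):
    """Return one synthetic line: seven tab-separated fields plus a newline."""
    short_fields = [
        "".join(rng.choice(ALPHABET) for _ in range(rng.randint(4, 20)))
        for _ in range(6)
    ]
    # The seventh field is the long one ("several hundred characters").
    last = "".join(
        rng.choice(ALPHABET) for _ in range(rng.randint(1, max_last_field))
    )
    return "\t".join(short_fields + [last]) + "\n"


def write_synthetic_log(path, n_lines=793, seed=0):
    """Write n_lines synthetic lines to path; a fixed seed makes it repeatable."""
    rng = random.Random(seed)
    with open(path, "w", newline="\n") as f:
        for _ in range(n_lines):
            f.write(make_event_line(rng))
```

Calling `write_synthetic_log("synthetic.tmp")` writes such a file; `synthetic.tmp` is only a placeholder name.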

Please let me know how I can help.

Sincerely,

Karl Fox

Attachment: pattern.grep
Description: Binary data

