emacs-bug-tracker

From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#23052: closed (Make grep be able to separate output by NULL characters?)
Date: Fri, 18 Mar 2016 16:49:02 +0000

Your message dated Fri, 18 Mar 2016 09:48:25 -0700
with message-id <address@hidden>
and subject line Re: bug#23052: Make grep be able to separate output by NULL 
characters?
has caused the debbugs.gnu.org bug report #23052,
regarding Make grep be able to separate output by NULL characters?
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
23052: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=23052
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: Make grep be able to separate output by NULL characters?
Date: Fri, 18 Mar 2016 10:20:06 +0100

Suppose we are doing a multiline regex pattern search on a bunch of files and we want to extract the matches, e.g. for further processing. By default, grep outputs matches separated by newlines, but since we are doing multiline patterns this creates the inconvenience that we cannot easily extract the individual matches. So we would want to have the matches separated by null bytes. This seems to be a very straightforward feature, and I was surprised that this was not already possible.

Here is a tiny example:

grep -rzPIho '}\n\n\w\w\b' | od -a

Depending on the files in your file tree, this may yield an output like

0000000   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl  nl   m
0000020   y  nl   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl
0000040  nl   m   y  nl
0000044

As you can see, we cannot split on newlines to obtain the matches for further processing, since the matches contain newline characters themselves.
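
To make the ambiguity concrete, here is a minimal shell sketch (not part of the original report). It uses hypothetical stand-in data shaped like the od dump above, namely three matches that each contain two embedded newlines, instead of real grep output, so it runs without any input files. The variable names are illustrative only.

```shell
# Hypothetical data: three matches of '}\n\n\w\w', each with two embedded newlines.

# Newline-terminated records: every embedded newline looks like a record boundary.
newline_terminated=$(printf '}\n\nmy\n}\n\nif\n}\n\nmy\n' | wc -l)

# NUL-terminated records: keep only the NUL bytes and count them.
nul_terminated=$(printf '}\n\nmy\000}\n\nif\000}\n\nmy\000' | tr -cd '\000' | wc -c)

echo "records seen when splitting on newlines: $newline_terminated"
echo "records seen when splitting on NUL bytes: $nul_terminated"
```

Splitting on newlines reports 9 records for 3 matches, whereas counting NUL terminators recovers exactly 3.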

Now grep already has the -z/--null flag, but that works only in conjunction with the -l flag, which makes grep output filenames instead of matches. 

So here is the feature request: can we make the -z flag also affect the normal output?


Regards,

Chiel

--- End Message ---
--- Begin Message ---
Subject: Re: bug#23052: Make grep be able to separate output by NULL characters?
Date: Fri, 18 Mar 2016 09:48:25 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 03/18/2016 02:20 AM, Chiel ten Brinke wrote:
> Suppose we are doing a multiline regex pattern search on a bunch of files
> and we want to extract the matches, e.g. for further processing. By
> default, grep outputs matches separated by newlines, but since we are doing
> multiline patterns this creates the inconvenience that we cannot easily
> extract the individual matches.

Thanks for pointing this out. The problem is more than just an inconvenience: grep's behavior is incorrect, since (as you mention) its output can be ambiguous. -o and -z were added separately to GNU grep, and it appears that their combination wasn't considered. I installed the attached patch to the grep master so that -z uses null bytes to terminate output lines even when -o is used.
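
Once -o output is NUL-terminated, a consumer can split it unambiguously. The following is a minimal sketch, not taken from the patch, and it assumes an xargs that supports the -0 option (GNU and BSD xargs do). The emit_matches function is a hypothetical stand-in for a call such as grep -ozP '}\n\n\w\w\b' FILE on a grep build that includes this change; it prints fixed sample data so the sketch runs without input files.

```shell
# Hypothetical stand-in for NUL-terminated `grep -oz` output:
# three matches, each containing embedded newlines.
emit_matches() {
    printf '}\n\nmy\000}\n\nif\000}\n\nmy\000'
}

# xargs -0 splits its input on NUL bytes only, so each multiline match
# is passed to the command as one intact argument. The inner shell prints
# one marker line per match, and wc -l counts those lines.
match_count=$(emit_matches | xargs -0 -n 1 sh -c 'echo x' sh | wc -l)

echo "matches: $match_count"
```

Replacing the inner `echo x` with a command that uses "$1" would process each match as a whole, embedded newlines included.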

Attachment: 0001-grep-oz-now-outputs-null-bytes-not-newlines.patch
Description: Source code patch


--- End Message ---
