From: | GNU bug Tracking System |
Subject: | [debbugs-tracker] bug#23052: closed (Make grep be able to separate output by NULL characters?) |
Date: | Fri, 18 Mar 2016 16:49:02 +0000 |
Your message dated Fri, 18 Mar 2016 09:48:25 -0700
with message-id <address@hidden>
and subject line Re: bug#23052: Make grep be able to separate output by NULL characters?
has caused the debbugs.gnu.org bug report #23052,
regarding Make grep be able to separate output by NULL characters?
to be marked as done.

(If you believe you have received this mail in error, please contact address@hidden)

--
23052: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=23052
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: Make grep be able to separate output by NULL characters?
Date: Fri, 18 Mar 2016 10:20:06 +0100

Suppose we are doing a multiline regex pattern search on a bunch of files and we want to extract the matches, e.g. for further processing. By default, grep outputs matches separated by newlines, but since we are doing multiline patterns this creates the inconvenience that we cannot easily extract the individual matches. So we would want to have the matches separated by null bytes. This seems to be a very straightforward feature, and I was surprised that this was not already possible.

Here is a tiny example:

    grep -rzPIho '}\n\n\w\w\b' | od -a

Depending on the files in your file tree, this may yield an output like

    0000000   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl  nl   m
    0000020   y  nl   }  nl  nl   m   y  nl   }  nl  nl   i   f  nl   }  nl
    0000040  nl   m   y  nl
    0000044

As you can see, we cannot split on newlines to obtain the matches for further processing, since the matches contain newline characters themselves.

Now grep already has the -z/--null flag, but that works only in conjunction with the -l flag, which makes grep output filenames instead of matches.

So here is the feature request: can we make the -z flag also affect the normal output?

Regards,
Chiel
--- End Message ---
--- Begin Message ---
Subject: Re: bug#23052: Make grep be able to separate output by NULL characters?
Date: Fri, 18 Mar 2016 09:48:25 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 03/18/2016 02:20 AM, Chiel ten Brinke wrote:

> Suppose we are doing a multiline regex pattern search on a bunch of files and we want to extract the matches, e.g. for further processing. By default, grep outputs matches separated by newlines, but since we are doing multiline patterns this creates the inconvenience that we cannot easily extract the individual matches.

Thanks for pointing this out. The problem is more than just an inconvenience: grep's behavior is incorrect, since (as you mention) its output can be ambiguous. -o and -z were added separately to GNU grep, and it appears that their combination wasn't considered. I installed the attached patch to the grep master so that -z uses null bytes to terminate output lines even when -o is used.

0001-grep-oz-now-outputs-null-bytes-not-newlines.patch
Description: Source code patch
--- End Message ---
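
For readers who want to check the behavior described in the reply, the following is a minimal sketch, not part of the original thread. It assumes a GNU grep build that already contains the patch mentioned above. The sample file contents, the variable names, and the use of `[[:space:]]` in place of the report's `-P '\n'` (so that PCRE support is not required) are all choices made for illustration.

```shell
# Hypothetical sample input modelled on the report: two blocks ending in "}",
# each followed by a blank line and a two-letter word.
sample=$(mktemp)
printf 'sub a {\n}\n\nmy $x;\nsub b {\n}\n\nif ($x) {\n' > "$sample"

# -z: treat input (and output) lines as NUL-terminated, so the whole file
#     is a single record and a match may span newlines.
# -o: print only the matching parts.
# [[:space:]]{2} stands in for the two newlines of the original pattern.
matches=$(mktemp)
grep -zoE '}[[:space:]]{2}[[:alpha:]]{2}' "$sample" > "$matches"

# With the patched behavior each match ends in a NUL byte, so counting
# NUL bytes counts matches even though the matches contain newlines.
match_count=$(tr -cd '\0' < "$matches" | wc -c | tr -d ' ')
echo "matches: $match_count"

rm -f "$sample" "$matches"
```

On a grep without the patch, the matches would be terminated by newlines instead, and the NUL count would not reflect the number of matches.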