--- Begin Message ---
Subject: GNU grep matching discrepancy between -a/--text and not.
Date: Sat, 2 Apr 2016 15:00:12 +0300
Hi all,
as can be seen in this repository:
https://github.com/shlomif/gnu-grep-trailing-space-and-CR-on-riddles.he-false-match
GNU grep reports a match in a document it suspects to be binary when run
without -a/--text, but reports no match and prints no results when that flag is
applied. perl sides with the latter.
I'm on Mageia Linux x86-64 v6 and have built GNU grep from the latest git
commit ( c767ed70eca9a82d76f07dcdbcaafa21ec7f86d6 ) to test.
Regards,
Shlomi Fish
P.S: it seems the build system uses gperf but configure does not verify that it
exists in the path.
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
Interview with Ben Collins-Sussman - http://shlom.in/sussman
Can I SCO now? Sue who you wanna sue, it doesn't matter anyhoo, it's time to
litigate.
— http://www.shlomifish.org/humour/bits/Can-I-SCO-Now/
Please reply to list if it's a mailing list post - http://shlom.in/reply .
--- End Message ---
--- Begin Message ---
Subject: Re: bug#23185: GNU grep matching discrepancy between -a/--text and not.
Date: Tue, 5 Apr 2016 23:56:22 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
Thanks for pointing out the seeming inconsistency. The documentation mentions
the issue but is perhaps not clear enough, so I installed the attached patch.
The input file contains NUL bytes and so is treated as binary data, and the grep
documentation (section "File and Directory Selection", option "--binary-files")
says "When processing binary data, ‘grep’ may treat non-text bytes as line
terminators". This behavior was added to GNU grep in release 2.21 dated 2014,
partly for performance reasons.
There are two instances in riddles.he of a space followed by a NUL byte, so
grep -P '[ \t]\r?$' riddles.he
finds a match when the $ matches just before the NUL byte.
-a is one way to get the behavior you evidently expected. Another (perhaps
better) way is -z. The command:
grep -zP '[ \t]\r?\n' riddles.he
outputs nothing and exits with status 1.
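
The difference can be reproduced on a small scale. The following is only a
sketch under stated assumptions: the sample contents are made up (they are not
taken from riddles.he), the "NUL as line terminator" case is emulated with tr
rather than by relying on grep's internal binary-file handling, and the pattern
is simplified to a POSIX bracket expression so that -P is not needed.

```shell
#!/bin/sh
# Hypothetical sample: a space followed by a NUL byte, then more text.
sample=$(mktemp)
printf 'question \0answer\n' > "$sample"

# Emulate "NUL treated as a line terminator" by converting NULs to newlines.
# The first line is then "question " and ends in a blank, so one line matches.
as_binary=$(tr '\0' '\n' < "$sample" | grep -c '[[:blank:]]$')

# With -a the NUL is an ordinary byte inside a single line that ends in
# "answer", so nothing matches. grep -c prints 0 and exits with status 1,
# hence the "|| true".
as_text=$(grep -a -c '[[:blank:]]$' "$sample" || true)

echo "NUL as terminator: $as_binary matching line(s); with -a: $as_text"
rm -f "$sample"
```

The tr step only imitates the documented behavior; how a given grep release
actually treats NUL bytes in binary input may differ.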
0001-Give-another-example-of-binary-file-processing.patch
Description: Text Data
--- End Message ---