bug#29668: grep: Fatal problem with (big) file


From: Paul Eggert
Subject: bug#29668: grep: Fatal problem with (big) file
Date: Wed, 13 Dec 2017 16:03:57 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0

On 12/13/2017 03:25 PM, Norihiro Tanaka wrote:
> I don't think that's the problem.  The user passes the output of grep to wc -l,
> so the `Binary file ... matches' line is also counted by `wc' as one line.

The intent of 'grep PATTERN | wc -l' is to count the number of matches, like 'grep -c PATTERN' would. But it doesn't work that way here. E.g., on Fedora 27 with LANG=en_US.UTF-8:

$ grep -c Volvo Tieliikenne5.0.csv
266175
$ grep Volvo Tieliikenne5.0.csv | wc -l
241264
$ grep Volvo Tieliikenne5.0.csv | tail -n 1
Binary file Tieliikenne5.0.csv matches

If the "Binary file ... matches" line were sent to stderr instead of to stdout, the problem would be more obvious to the user:

$ grep -c Volvo Tieliikenne5.0.csv
266175
$ grep Volvo Tieliikenne5.0.csv | wc -l
Binary file Tieliikenne5.0.csv matches
241264
$ grep Volvo Tieliikenne5.0.csv | tail -n 1
Binary file Tieliikenne5.0.csv matches
T;2017-09-29;75;01;;;19550000;;;;;1;1570;;3000;2595;1670;;01;2200;20.6;4;false;false;Volvo;;;;;01;;01;977;;;841;;5092946

I believe that in the past I thought the "Binary file" message should be sent to stdout, but these examples are a reasonably compelling reason to send it to stderr instead.
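
A minimal sketch of the count discrepancy discussed above, on a small made-up file rather than the reporter's CSV. The sample contents and the variable names (tmp, count_c, count_a) are assumptions for illustration. The sketch uses grep's -a option, which treats the file as text, so the pipeline count does not depend on where a given grep version sends the "Binary file ... matches" message:

```shell
# Build a small sample file: three matching lines, with a NUL byte
# before the last one so that grep classifies the file as binary.
tmp=$(mktemp)
printf 'Volvo 1\nVolvo 2\n\0\nVolvo 3\n' > "$tmp"

# Count matching lines directly with -c.
count_c=$(grep -c Volvo "$tmp")

# Count via a pipeline, forcing text mode with -a so that every
# matching line is printed instead of a "Binary file ... matches" notice.
# tr strips the padding that some wc implementations add.
count_a=$(grep -a Volvo "$tmp" | wc -l | tr -d ' ')

echo "grep -c:         $count_c"
echo "grep -a | wc -l: $count_a"

rm -f "$tmp"
```

With -a the two counts should agree. Without it, the pipeline count can fall short of the -c count, as in the transcripts above.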




