bug#29668: grep: Fatal problem with (big) file
From: Paul Eggert
Subject: bug#29668: grep: Fatal problem with (big) file
Date: Wed, 13 Dec 2017 16:03:57 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0

On 12/13/2017 03:25 PM, Norihiro Tanaka wrote:
> I don't think that's the problem. The user passes the output of grep to wc -l,
> so the `Binary file ... matches' line is also counted by `wc' as one line.

The intent of 'grep PATTERN | wc -l' is to count the number of matches,
like 'grep -c PATTERN' would. But it doesn't work that way here. E.g.,
on Fedora 27 with LANG=en_US.UTF-8:
$ grep -c Volvo Tieliikenne5.0.csv
266175
$ grep Volvo Tieliikenne5.0.csv | wc -l
241264
$ grep Volvo Tieliikenne5.0.csv | tail -n 1
Binary file Tieliikenne5.0.csv matches
If the "Binary file ... matches" line were sent to stderr instead of to
stdout, the problem would be more obvious to the user:
$ grep -c Volvo Tieliikenne5.0.csv
266175
$ grep Volvo Tieliikenne5.0.csv | wc -l
Binary file Tieliikenne5.0.csv matches
241264
$ grep Volvo Tieliikenne5.0.csv | tail -n 1
Binary file Tieliikenne5.0.csv matches
T;2017-09-29;75;01;;;19550000;;;;;1;1570;;3000;2595;1670;;01;2200;20.6;4;false;false;Volvo;;;;;01;;01;977;;;841;;5092946
I believe that in the past I've thought that the "Binary file" message
should be sent to stdout, but these examples are a reasonably compelling
reason to send it to stderr instead.
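
For anyone hitting the same count mismatch, here is a minimal sketch of one possible workaround. It is not something proposed in this thread. It assumes the file is really text with a few stray bytes, so that it is acceptable to have grep treat it as text with -a (--text). The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample: two matching lines around a line containing a NUL byte.
tmp=$(mktemp)
printf 'Volvo 1\nbad\000byte\nVolvo 2\n' > "$tmp"

# With -a, grep treats the file as text, so every matching line is printed
# and no "Binary file ... matches" message is produced.
count_c=$(grep -a -c Volvo "$tmp")
count_wc=$(grep -a Volvo "$tmp" | wc -l | tr -d ' ')

echo "grep -a -c:       $count_c"
echo "grep -a | wc -l:  $count_wc"

rm -f "$tmp"
```

With -a, both counts should agree because the pipeline no longer stops at the first binary data. The trade-off is that raw binary bytes can then reach the terminal or downstream tools.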