bug-coreutils

bug#61300: wc -c doesn't advance stdin position when it's a regular file


From: Pádraig Brady
Subject: bug#61300: wc -c doesn't advance stdin position when it's a regular file
Date: Sun, 5 Feb 2023 19:59:58 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Thunderbird/109.0

On 05/02/2023 18:27, Stephane Chazelas wrote:
"wc -c" without filename arguments is meant to read stdin until
EOF and report the number of bytes it has read.

When stdin is a regular file, GNU wc has an optimisation whereby
it skips the reading: it calls pos = lseek(0,0,SEEK_CUR) to find
its current position within the file, then fstat(0), and
reports st_size - pos (assuming st_size > pos).
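
As a rough illustration of the shortcut described above (this is not the actual wc.c code; the helper name size_shortcut is made up for this sketch, and returning 0 when the offset is at or past the end is an assumption), the logic amounts to:

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not taken from wc.c: return the number of bytes
   between the current offset of FD and the end of the file, or -1 if
   the shortcut does not apply.  The file offset is left untouched,
   which is the behaviour being reported in this bug.  */
off_t
size_shortcut (int fd)
{
  struct stat st;
  if (fstat (fd, &st) != 0 || !S_ISREG (st.st_mode))
    return -1;

  off_t pos = lseek (fd, 0, SEEK_CUR);
  if (pos < 0)
    return -1;

  /* Assumption of this sketch: report 0 when already at or past EOF.  */
  return pos < st.st_size ? st.st_size - pos : 0;
}
```

Calling such a helper twice on the same descriptor gives the same answer both times, since nothing is consumed.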

However, it does not move the position to the end of the file.
That means, for instance, that the following prints 5 twice:

$ echo test > file
$ { wc -c; wc -c; } < file
5
5

instead of 5, then 0. Similarly, cat still outputs the file's contents:

$ { wc -c; cat; } < file
5
test

So the optimisation is incomplete.

It also reports the size of the file even if it could not possibly read it
because it's not open in read mode:

$ { wc -c; } 0>> file
5

IMO, it should only do the optimisation if:
- fcntl(F_GETFL) shows that the file is opened in O_RDONLY or O_RDWR mode
- the current checks for /proc- and /sys-like filesystems pass
- pos < st_size
- lseek(0,st_size,SEEK_SET) is successful.

(that leaves a race window above where it could move the cursor
backward, but I would think that can be ignored as if something
else reads at the same time, there's not much we can expect
anyway).
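
The conditions listed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the helper name skip_to_end is hypothetical, and the existing /proc- and /sys-style checks are left out.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: take the shortcut only when every condition
   holds.  Returns the number of bytes skipped, or -1 if the caller
   should fall back to counting with read().  */
off_t
skip_to_end (int fd)
{
  /* The descriptor must be open for reading.  */
  int flags = fcntl (fd, F_GETFL);
  if (flags < 0)
    return -1;
  int mode = flags & O_ACCMODE;
  if (mode != O_RDONLY && mode != O_RDWR)
    return -1;

  struct stat st;
  if (fstat (fd, &st) != 0 || !S_ISREG (st.st_mode))
    return -1;

  /* The checks for /proc- and /sys-like files would go here (omitted).  */

  off_t pos = lseek (fd, 0, SEEK_CUR);
  if (pos < 0 || pos >= st.st_size)
    return -1;

  /* Advance the offset to EOF so a later reader sees nothing left.  */
  if (lseek (fd, st.st_size, SEEK_SET) < 0)
    return -1;

  return st.st_size - pos;
}
```

With a helper like this, a second call on the same descriptor returns -1, and the fallback read() then finds 0 bytes.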

Yes, I agree.

Adjusting would also avoid the following inconsistencies:

$ { wc -c; wc -c; } < file
5
5

$ { wc -l; wc -l; } < file
1
0

$ truncate -s $(getconf PAGESIZE) file
$ { wc -c; wc -c; } < file
4096
0

Hopefully the attached addresses this.
Note it doesn't add the constraint on the input being readable,
which I'll think a bit more about.

cheers,
Pádraig

Attachment: wc-update-offset.patch
Description: Text Data