From: Eric Blake
Subject: bug#10282: change in behavior of du with multiple arguments (commit efe53cc)
Date: Tue, 13 Dec 2011 10:16:12 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20111115 Thunderbird/8.0

On 12/13/2011 09:46 AM, Kamil Dudka wrote:
>> I agree that printing "0 X" for these seems inconsistent with the
>> elision mandated for the second and subsequent encounter of a file,
>> but I suppose command line arguments are intrinsically different
>> enough that handling them specially makes sense.  Maybe even as
>> the default.
>>> Perhaps 'du' needs a new option to control what to do with
>>> files that 'du' has already seen before. something that
>>> generalizes --count-links.
>> That sounds like a good way to do it.
>> Anyone interested?
> Thank all of you for looking at the issue.  If I understand it correctly, the 
> old behavior was violating POSIX whereas the current default behavior is 
> correct.

Not quite.  The POSIX wording does not match historical practice, and
appears to be contradictory (or at least ambiguous), so we may need to
ask for clarification from the Austin Group.  The problem is that POSIX
says that if an inode is encountered more than once, it is only listed
once (without reference to whether those encounters were from recursion
on a single command line argument, recursion across multiple command
line arguments, or even if the duplication occurs on the command line
itself); but it also says that with '-s', listings are output for all
command line arguments.  Historically, du implementations elided output
for inode duplication found within a single command line argument, but
not across multiple command line arguments.

The coreutils behavior was changed to elide duplicates across multiple
command line arguments; particularly so that in the -s case, you can sum
the total usage and get an accurate feel, no matter which order the
command line arguments were listed in.  But in doing so, we elided
duplicate command line arguments, which goes against the POSIX wording
that -s will list a summary for all arguments.  Hence our proposal of
using '0' for a directory previously counted.
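To make the order-independence point concrete, here is a minimal sketch, not du's actual implementation. It assumes a simplified model that counts only regular files by apparent size (st_size) and ignores directory entries and block rounding. The names summarize and demo are hypothetical. A single set of (st_dev, st_ino) pairs is shared across all command line arguments, and an argument whose files were all counted earlier still gets a row, with a total of 0.

```python
import os
import tempfile


def summarize(args):
    """Return one (total_bytes, arg) row per argument, like 'du -s'.

    One seen-set is shared across every argument, so each inode is
    counted at most once for the whole command line.
    """
    seen = set()  # (st_dev, st_ino) pairs already counted
    rows = []
    for arg in args:
        total = 0
        for root, _dirs, files in os.walk(arg):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # elide the second and later encounters
                seen.add(key)
                total += st.st_size
        # Always emit a row, even when everything was already counted.
        rows.append((total, arg))
    return rows


def demo():
    """Build a/ and b/ sharing one hard-linked file, then summarize twice."""
    with tempfile.TemporaryDirectory() as top:
        a = os.path.join(top, "a")
        b = os.path.join(top, "b")
        os.mkdir(a)
        os.mkdir(b)
        with open(os.path.join(a, "data"), "wb") as f:
            f.write(b"x" * 1000)
        # b/data_link is a second name for the same inode as a/data.
        os.link(os.path.join(a, "data"), os.path.join(b, "data_link"))
        with open(os.path.join(b, "own"), "wb") as f:
            f.write(b"y" * 10)
        return summarize([a, b, a]), summarize([b, a, a])


if __name__ == "__main__":
    for rows in demo():
        for total, arg in rows:
            print(total, os.path.basename(arg))
        print("sum:", sum(total for total, _ in rows))
```

The per-argument totals depend on the order of the arguments, but the sum of the first column is the same for both orderings, and the repeated argument is reported with 0 rather than dropped.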

>  I tried du --count-links with the original reproducer and it seemed 
> to work fine.  So what would be the point in adding a new option?

I think the proposal is to add a new option that forces du to reset its
duplicate inode hash table for each command line argument, making the
behavior more like traditional du.  The cost is that summing the first
column of -s output can then give a larger usage than the default
behavior would, whenever a command line argument contains an inode
already traversed earlier in the command line.  --count-links isn't
quite right, because you still want to elide links within the directory
named by a single command-line argument.  Or maybe --count-links gains
an optional argument that says how to count links:

--count-links=none -> POSIX behavior (if POSIX requires elision across
command line arguments)
--count-links=per-directory -> traditional behavior, resetting hash
between command line arguments
--count-links == --count-links=all -> count every file on every encounter
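
The three modes can be compared on a toy model.  This is only a sketch: the optional argument to --count-links is a proposal in this message, not an existing du interface, and du_summary, TREE, and the in-memory representation (each argument maps to a list of (inode, size) entries) are hypothetical names chosen for illustration.

```python
def du_summary(tree, args, mode="none"):
    """Return one (total, arg) row per argument under the given mode.

    mode "none":          one seen-set for the whole command line
    mode "per-directory": seen-set is reset for each argument
    mode "all":           no elision, every encounter is counted
    """
    if mode not in ("none", "per-directory", "all"):
        raise ValueError("unknown mode: %r" % (mode,))
    seen = set()
    rows = []
    for arg in args:
        if mode == "per-directory":
            seen = set()  # forget inodes from earlier arguments
        total = 0
        for inode, size in tree[arg]:
            if mode != "all":
                if inode in seen:
                    continue
                seen.add(inode)
            total += size
        rows.append((total, arg))
    return rows


# "a" holds inode 1 under two names plus inode 2;
# "b" holds another link to inode 1 plus inode 3.
TREE = {
    "a": [(1, 100), (1, 100), (2, 50)],
    "b": [(1, 100), (3, 25)],
}

if __name__ == "__main__":
    for mode in ("none", "per-directory", "all"):
        print(mode, du_summary(TREE, ["a", "b", "a"], mode))
```

With arguments "a b a", mode none gives totals 150, 25, 0; per-directory gives 150, 125, 150; all gives 250, 125, 250.  Only the first mode yields a first-column sum that matches the actual usage.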

Eric Blake   address@hidden    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
