bug#10281: change in behavior of du with multiple arguments (commit efe5

From: Elliott Forney
Subject: bug#10281: change in behavior of du with multiple arguments (commit efe53cc)
Date: Mon, 18 Feb 2013 01:18:23 -0700

> My reading of <http://austingroupbugs.net/view.php?id=527#c1104>
> is that POSIX allows but does not require the current GNU behavior,
> and that future versions of POSIX may require the current GNU behavior.

Thanks Paul, I agree with your reading of this.  Sounds like POSIX
allows both the new and old behaviors.

I must express, however, that I think this is a case where both the
standard and the current implementation were well-intentioned but not
well thought out.  Please allow me to state some reasons why I am
opposed to the current behavior followed by an example.  If I fail to
persuade people then I will let this issue be.

1.  I find it unintuitive that the number in a line of output from du
does not necessarily reflect the size of the corresponding directory
or file.  Without being privy to du's behavior regarding links and
multiple command-line arguments, this would be my expectation.

2.  Although I can see how it might add functionality to avoid
recounting files with a link count greater than one (although I don't
find it personally useful) I do not see any added benefit of not
recounting files with link count equal to one (e.g., across multiple
command-line arguments).  This is where I think the implications of
the POSIX standard were not well thought out.  I think that the
intention was to prevent counting files multiple times if there were
multiple links to the same file.  As an ill-considered side effect of
this, and particularly in the current implementation, we now find that
du will not recount across multiple command-line arguments.  I have an
example of how this is confusing below.

3.  I find it surprising that the order of command-line arguments to
du may affect the output of du.  Users don't expect this.

4.  This deviates significantly from other implementations and
historical behavior.  To my knowledge, GNU coreutils is the
odd man out, with all other implementations following the previous
behavior.
5.  I couldn't agree with Bernhard Voelker more:
> That reminds me about a real-life question you could ask your little
> daughter: "how many pupils are in each class and in total at school?".
> I guess you would send her to extra math courses if she said "Class A
> has 20, class B and class C have 25 each, and the school has 0."
> This example doesn't claim to be 100% relevant for du, but shows
> how "counting" and "summarizing" is burnt into human brains.

Personally, I think that du should recount links and command-line
arguments everywhere except in the total, as reported by the -c flag.
This would add to the information reported by du without violating 1

Let's consider an example.  I have actually had several people in my
office confused over variations of this same problem since du's
behavior has changed.  If several people have come looking for help,
this means that many more are confused and, worse yet, some
people/scripts probably haven't even caught the inconsistency.

Let's say that I want to answer the following question:  What are the
sizes of the directories "one" and "two" and "two/three" and all of
these directories combined?  Perhaps I know that "one" and "two" are
often too large and that "three" often causes "two" to grow too large.

Below is what my first principles would suggest I do.  Interestingly,
however, I notice that "two/three" is not reported at all.  So my
question is not answered.

$ du -ksc one two two/three
75096   one
4283824 two
4358920 total

Next, I might try reversing the order of arguments to see what
happens.  Now, I see that all are reported.  A hurried user might stop
here and go about their day.  A sly user will notice, however, that
"two/three" appears larger than "two".  How is this possible?!

$ du -ksc one two/three two
75096   one
3184072 two/three
1099752 two
4358920 total
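
The numbers above are consistent with du charging the contents of
"two/three" to whichever argument reaches them first.  As a quick
check (the variable names are only for illustration), the reduced
"two" plus "two/three" adds back up to the figure reported for "two"
in the first run:

```shell
two_full=4283824      # "two" when listed before "two/three"
three=3184072         # "two/three" when listed before "two"
two_reduced=1099752   # "two" when listed after "two/three"

# The reduced figure is "two" minus whatever was already counted.
echo $((two_reduced + three))   # prints 4283824
```

So "two" has not shrunk; the second line just excludes what the first
line already counted.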

So, I might wander down the hall and visit with a friend.  He suggests
that I use the --count-links flag to allow recounting (even though
there are not multiple links in this scenario).  Now, everything is
reported and the numbers on each line of output match, but what
happened to my total?  This can't be right, it's larger than my entire
disk quota!

$ du -ksc --count-links one two two/three
75096   one
4283824 two
3184072 two/three
7542992 total

Turns out that --count-links sums every line of output, even entries
that are recounted, yielding an inflated total.  Finally, I break down
and use three commands to get the answer I was looking for.  Not pretty.

$ du -ks one two; du -ks two/three; du -ksc one two | tail -1
75096   one
4283824 two
3184072 two/three
4358920 total
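
For more than a couple of arguments, the same workaround could be
wrapped in a small shell function.  This is only a sketch: the name
du_each is made up, and the total line relies on du not recounting
across arguments (the current GNU behavior discussed in this thread).

```shell
# Hypothetical helper: report each argument on its own, then one total.
du_each() {
    # A separate du run per argument, so nothing is skipped or
    # reduced because an earlier argument already covered it.
    for arg in "$@"; do
        du -ks -- "$arg"
    done
    # One combined run for the total.  Assumes du counts each file
    # once across all arguments, as current GNU du does.
    du -ksc -- "$@" | tail -n 1
}
```

Called as "du_each one two two/three", it should print one line per
argument in the order given, followed by a single total line.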

Is it just me or does anyone else think this is convoluted?
