Re: Bug in find's -size (+patch)
Sun, 6 Feb 2005 23:47:49 -0500
> > It's simply broken. There's no other way to put it.
> Well, it complies with both the POSIX standard and historic usage of
> "find". I might concede that under some circumstances the current
> behaviour is _inconvenient_, but I find it hard to agree that it is
So, searching with 'find . -size -2G' and getting only files which are
really under 1G, not under 2G, is an inconvenience?
> > In any case, according to the reference you gave, it seems that 'k' is a
> > GNU extension anyway, which means we only need to preserve the broken
> > part when it's on 'b'. Will you apply a patch that will do that if I
> > submit it?
> No. I think that violates the principle of least surprise. People
> expect that changing the suffix changes only the multiplier, and won't
> change the semantics of the predicate itself, I think.
That's exactly what's _not_ happening right now. If that were the
situation right now, then -size -2048c and -2k would have yielded the
same results. As an administrator (and I was wearing that hat before I
delved into the code), I was assuming that the two were completely
identical, which they are (well, roughly; I won't get into the Mega/Mebi
thing). You're thinking as a programmer. As a user, I don't care about
algorithm semantics, and I shouldn't. You're telling me that as an
administrator, I _expect_ it to behave like it does? Apparently I
didn't, and I don't think I should have. You said it yourself. When I
change a multiplier, that's all I want it to do. I want -2M to find
files that are smaller than 2*1024*1024 bytes in size. It doesn't do
> > Assume that I'll submit a patch that will add 'm' (for megabyte)
> > support,
> But 'M' is already provided.
Sorry about that, I missed it in the CVS version. By the way, why are
M and G uppercase while k is lowercase? Any special reason for this?
> > and the code for comparison doesn't change. You'll get +/-1MB
> > inaccuracies in the results.
> Are you saying that you would prefer there to be a test which would
> match only if the size of the file is an exact multiple of 1048576
> bytes (in the case of m/M)? That doesn't seem that useful to me.
> Also if you want to select files which are an exact number of bytes,
> you can do it this way for example:-
> find . -size $(expr 6 '*' 1024)c -print
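The workaround quoted above multiplies the count out to bytes and uses the 'c' suffix, so no rounding to larger units is involved. A minimal sketch of the same conversion follows; the helper name byte_exact_size_arg and the UNIT_BYTES table are hypothetical, made up for illustration, and are not part of find:

```python
# Bytes per unit for the -size suffixes discussed in this thread.
# (Hypothetical table for illustration; not taken from the find source.)
UNIT_BYTES = {"c": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def byte_exact_size_arg(prefix: str, n: int, suffix: str) -> str:
    """Rewrite a -size argument such as ('-', 2, 'k') in bytes: '-2048c'."""
    if prefix not in ("", "+", "-"):
        raise ValueError("prefix must be '', '+' or '-'")
    # Multiply the count out to bytes and switch to the 'c' suffix.
    return f"{prefix}{n * UNIT_BYTES[suffix]}c"


print(byte_exact_size_arg("", 6, "k"))   # same value as $(expr 6 '*' 1024)c
print(byte_exact_size_arg("-", 2, "G"))  # byte-granular "under 2G"
```

Passing the second result to find, as in 'find . -size -2147483648c', selects files strictly smaller than 2*1024*1024*1024 bytes.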
No, that's not what I meant. I meant that as an administrator, a
file the size of 1234567 bytes is less than 2m just as it's less than
> > I think we should confine this POSIXLY CORRECT (and broken)
> > behaviour as much as we can, rather than let it cascade into GNU
> > extensions.
> I disagree; size suffixes should change only the multiplier on the
> number to which they are applied, and not modify the semantics of the
> test in which they are used. However, there is still scope for having
> a new _prefix_ for the number, which normally does affect the
> semantics of the match (as do + and - for example). If there's a
> coherent and useful alternative that does what you want, it would be
> possible to use a new prefix to accommodate that.
Right. Again this is not the behaviour that it exhibits. If anything,
it's _most surprise_ now rather than _least surprise_. Are you saying
that finding out that -2048c and -2k are not the same and don't yield
the same results is "least surprise"?
This way the programmatic algorithm is the same for all suffixes, and
yet for a human this is anything but logical (again, for people who
haven't read the source). Am I missing something here? I mean, correct
me if I'm wrong, but the current code simply does not find files between
one and two gigabytes in size when used with -size -2G? This is supposed
to be behaviour that find users predict and expect?
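To make the disagreement concrete, here is a small model of the comparison as I understand it from this thread: the file size is rounded up to a whole number of units, and only then compared against the number given. This is a sketch in Python, not the actual find code; size_less_than and UNIT_BYTES are names invented for the example.

```python
# Bytes per unit for the -size suffixes discussed in this thread.
# (Hypothetical table for illustration; not taken from the find source.)
UNIT_BYTES = {"c": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_less_than(file_bytes: int, n: int, suffix: str) -> bool:
    """Model of '-size -N<suffix>': round up to whole units, then compare."""
    unit = UNIT_BYTES[suffix]
    units = -(-file_bytes // unit)  # ceiling division
    return units < n


small = 1500                  # a 1500-byte file
big = 3 * 1024 ** 3 // 2      # a 1.5G file

print(size_less_than(small, 2048, "c"))  # True: 1500 < 2048
print(size_less_than(small, 2, "k"))     # False: 1500 bytes rounds up to 2k
print(size_less_than(big, 2, "G"))       # False: 1.5G rounds up to 2G
```

Under this model -2048c and -2k disagree on the 1500-byte file, and -2G misses every file larger than exactly 1G, which is the surprise described above.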