bug-coreutils

Re: [PATCH] Makes sort create random order


From: Paul Eggert
Subject: Re: [PATCH] Makes sort create random order
Date: Sat, 11 Sep 2004 15:48:54 -0700
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (gnu/linux)

Thomas Habets <address@hidden> writes:

> Once upon a midnight dreary, Paul Eggert pondered, weak and weary:
>> > Or should a random permutation merge all equal values?
>> Only if the ordinary sort would merge the equal values (i.e., if the
>> -u option is specified).
>
> I mean merge them, then sort, then randomize, then split them. With no 
> randomization after the split, causing equal values to always turn up next to 

Well, I'm not sure what you mean by "randomize then split".

Under the proposed semantics, "sort" uses a particular random
permutation of the correct sort order when it compares two elements.
You're proposing a different algorithm, and are asking me whether it's
equivalent.  The algorithm is a bit fuzzy, so I'm not sure how to
answer.  However, if it is equivalent to the proposed semantics, then
yes, it would be an acceptable implementation.
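
To make the proposed semantics concrete, here is a minimal sketch in Python. It is not the patch under discussion; the function name `random_order_sort` and the `seed` parameter are assumptions made for illustration. It picks one seeded random permutation of the distinct keys and then sorts by position in that permutation, so equal values always land next to each other and the result depends only on the seed and the set of input values, not on the input order.

```python
import random

def random_order_sort(items, seed):
    """Sort items by one seeded random permutation of the correct sort order.

    Hypothetical helper for illustration only; not code from sort.
    """
    rng = random.Random(seed)
    # Start from the correct sort order of the distinct keys...
    distinct = sorted(set(items))
    # ...and pick one random permutation of that order.
    rng.shuffle(distinct)
    # Comparing two elements now means comparing their ranks in the permutation.
    rank = {key: i for i, key in enumerate(distinct)}
    # Equal items share a rank, so they come out adjacent.
    return sorted(items, key=rank.__getitem__)
```

With the same seed, `random_order_sort("a b c b d b d".split(), seed=1)` and the same call on any reordering of that input return the same list, with the three `b`s and the two `d`s grouped together.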

>> > c b b b d d a
> (note that all equal values are next to each other)
>> > That's not what my patch does, so are you saying that is the right thing
>> > to do?
>> Yes, that's what I was thinking of, given the same seed.
>
> That, in my view, is wrong. It's not random nor arbitrary.

Sure it is, if you use the definition of "random" that I mentioned above,
i.e., sort uses a random permutation of the correct sort order.

It may not be right for some other definition of "random", but if so,
we need to figure out exactly what that other definition is, and why
it belongs in a sort program rather than in some other program.

> "sort -ns" doesn't sort permutations of the same input to the same 
> output. :-)

That's true: it preserves some of the input structure.  But you're
asking for an option that destroys the input structure.  In that case,
why does it need to be part of "sort"?  It could be a separate
program, which can operate either as a prepass or as a postpass to
"sort", or to any other utility.  That sounds more useful than putting
it in some poorly-understood corner of "sort".
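
As a rough sketch of what such a separate program might look like (an assumption for illustration, not an existing utility), the following Python filter destroys the input order entirely and knows nothing about sort keys. The names `shuffle_lines` and `main` are hypothetical.

```python
import random

def shuffle_lines(lines, seed=None):
    """Return the input lines in a uniformly random order.

    Unlike a random permutation of the sort order, this does not
    keep equal lines next to each other.
    """
    rng = random.Random(seed)
    out = list(lines)      # reads everything into memory
    rng.shuffle(out)
    return out

def main(infile, outfile, seed=None):
    """Filter entry point: read lines from infile, write them shuffled."""
    outfile.writelines(shuffle_lines(infile, seed))
```

Calling `main(sys.stdin, sys.stdout)` from a small script would let it run before or after "sort" in a pipeline. Because it holds all lines in memory, this sketch does not address the large-file case.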

> Sort's ability to handle large files makes this a non-oneliner

OK, but I still don't see why it's useful to make this a part of
"sort".  If it hasn't been well thought out how this option interacts
with the other options, it ought to be a separate program.
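
One way a separate program could still lean on an existing sort for large inputs is to tag each line with a random key, sort on that key, and strip the key afterwards. The Python sketch below only illustrates the idea; `decorate`, `undecorate`, and `shuffle_via_sort` are hypothetical names, and the built-in `sorted` stands in for whatever external sort would do the heavy lifting.

```python
import random

def decorate(lines, seed=None):
    """Pair each line with a random key (prepass)."""
    rng = random.Random(seed)
    for line in lines:
        yield (rng.random(), line)

def undecorate(pairs):
    """Drop the random key again (postpass)."""
    for _key, line in pairs:
        yield line

def shuffle_via_sort(lines, seed=None):
    """Shuffle by sorting on the random keys only."""
    # sorted() is a stand-in here for an external sort on the key field.
    ordered = sorted(decorate(lines, seed), key=lambda pair: pair[0])
    return list(undecorate(ordered))
```

The sorting step needs no new option: it sees ordinary numeric keys, so the randomization stays outside "sort" altogether.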



