bug#9780: sort -u throws out non-duplicates

From: Jim Meyering
Subject: bug#9780: sort -u throws out non-duplicates
Date: Fri, 17 Aug 2012 23:09:15 +0200

Paul Eggert wrote:
> OK, I scratched my head for a bit and came up with the following
> further patch, which addresses the issues that I mentioned.
> Subject: [PATCH] sort: simpler fix for sort -u data-loss bug
> * src/sort.c (overlap): Remove.
> (fillbuf): Do not try to copy saved lines, as that is too risky
> in the presence of parallelism, reallocated buffers, etc.
> (sort): Invalidate any saved line before sorting a new batch.
> ---
>  src/sort.c |   36 +-----------------------------------

Very nice!  That fixes not just the original bug, but also the FMR
(free-memory read), and makes my entire patch unnecessary.  The only
cost is in writing at most one more line per buffer.

I hate to look such a nice gift horse in the mouth, but it's getting
late here...  Would you mind adjusting that to add a NEWS entry and to
mention that you've fixed the second, free-memory-read bug, too?

And even add the test?
If you don't find time, I'll get to that over the weekend.

Regarding your patch...

For the record, at first I thought an input that used one (long) line per
buffer would make --unique a no-op, but then I realized that in that case,
each buffer's worth (one line each) would be written to its own temporary
file, and the merge phase would handle the --unique semantics.
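
To make that reasoning concrete, here is a minimal sketch in Python (not
the C of src/sort.c).  It only models the idea: each buffer's worth of
input is sorted and deduplicated on its own, and a final merge removes
whatever duplicates remain across runs.  The names make_runs and
merge_unique are hypothetical, in-memory lists stand in for temporary
files, and none of this mirrors the actual sort.c code.

```python
import heapq
from itertools import groupby


def make_runs(lines, lines_per_buffer):
    """Split the input into buffer-sized batches (hypothetical model).

    Each batch is sorted and deduplicated independently; no line is
    carried over from one batch to the next.
    """
    runs = []
    for start in range(0, len(lines), lines_per_buffer):
        batch = sorted(lines[start:start + lines_per_buffer])
        # Per-batch uniqueness: drop adjacent duplicates in this batch only.
        runs.append([key for key, _ in groupby(batch)])
    return runs


def merge_unique(runs):
    """Merge the sorted runs, dropping duplicates that span runs."""
    merged = heapq.merge(*runs)
    return [key for key, _ in groupby(merged)]


lines = ["b", "a", "b", "c", "a", "b"]

# One line per buffer: per-batch deduplication can never remove anything,
# so every input line lands in its own run...
one_line_runs = make_runs(lines, 1)
# ...yet the merge phase still produces unique output.
result = merge_unique(one_line_runs)
```

With three lines per buffer instead, the same input yields runs that
together hold a few more lines than the final output does, since a line
already present in an earlier run is written again.  In this model that
is the only cost of not remembering lines across batches; the merged
result is unchanged.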

Thanks again!
