coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: potential feature addition to coreutils' sort.c: print at most N lin


From: Pádraig Brady
Subject: Re: potential feature addition to coreutils' sort.c: print at most N lines
Date: Wed, 06 Mar 2013 09:24:25 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 03/06/2013 01:26 AM, James Dowdell wrote:
> Pádraig,
> 
> Thanks for the pointers.  I am slowly making my way through sort.c,
> and preparing a patch; I estimate I'll have something ready in a few
> days or weeks.
> 
> One final question before I go make this thing, regarding the thread
> conversation you included from 2004, in which Philippe De Muyter
> suggested he already had a functional patch involving a heap.  Is
> there a code change I should be starting from, or has that been lost /
> never submitted?

I can't see it submitted.
At least there are performance numbers to compare against.

> Originally I also thought that the 'optimal' solution would involve
> heaps in some way, as everyone in that thread suspected, but I've
> surprised myself by discovering an n-log-k algorithm based off of
> merge sort, which I think would be much easier to dovetail with the
> existing code in sort.c and may actually work better too.  If there is
> no patch to begin from, I'll be preparing my patch based on that
> unless someone objects.

n-log-k sounds good.
Is that dependent on the input data?

Is it worth special casing for performance,
when only getting a single top/bottom entry?

Have you started the copyright assignment process in parallel?

It like Jim's suggestion of a single --range option,
rather than the more traditional separated --head and --tail.
There are arguments for calling this --slice, to align
with javascript/php/python/..., but that's just minor renaming.

thanks!
Pádraig.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]