Re: Generate random numbers with shuf
From: Pádraig Brady
Subject: Re: Generate random numbers with shuf
Date: Fri, 05 Jul 2013 19:12:49 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
On 07/05/2013 07:04 PM, Assaf Gordon wrote:
> Hello,
>
> On 07/04/2013 05:40 PM, Pádraig Brady wrote:
>> On 07/04/2013 09:41 PM, Assaf Gordon wrote:
>>>
>>> Regarding old discussion here:
>>> http://lists.gnu.org/archive/html/coreutils/2011-02/msg00030.html
>>>
>>> Attached is a patch which adds a "--repetition" option to shuf, enabling
>>> random number generation with repetitions.
>>>
>>
>> I like this.
>> --repetition seems to be a very good interface too,
>> since it aligns with standard math nomenclature in regard to permutations.
>>
>> I'd prefer to generalize it though, to supporting stdin as well as -i.
>
> Attached is an updated patch, supporting "--repetitions" with STDIN/FILE/-e
> (using the naive implementation ATM).
> e.g.
> $ shuf --repetitions --head-count=100 --echo Head Tail
> or
> $ shuf -r -n100 -e Head Tail
Excellent, thanks.
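For anyone following the thread, selection with repetitions amounts to drawing each output item independently from the whole input. Below is a minimal Python sketch of that idea. It is only an illustration, not the code in the patch; the function name and the `rng` parameter are made up for the example.

```python
import random

def sample_with_repetition(items, count, rng=random):
    """Draw `count` items from `items`, allowing the same item to recur.

    Hypothetical helper for illustration; it mirrors the idea behind
    `shuf -r -n COUNT -e ITEM...`, not the shuf implementation.
    """
    pool = list(items)
    if not pool:
        raise ValueError("cannot sample from an empty input")
    # Each draw is independent, so an item may appear many times.
    return [rng.choice(pool) for _ in range(count)]

if __name__ == "__main__":
    # Roughly comparable to: shuf -r -n100 -e Head Tail
    for flip in sample_with_repetition(["Head", "Tail"], 100):
        print(flip)
```

Because each draw is independent, the output can be longer than the input, which is what makes the coin-flip example above possible.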
> But the code is getting a bit messy, I guess from evolving features over time.
> I'd like to re-organize it a bit, re-factor some functions and make the code
> clearer - what do you think?
> it will make the code slightly more verbose (and slightly bigger), but
> shouldn't change the running performance.
If you're getting your head around the code enough to refactor,
then it would be great if you could handle the TODO: item in shuf.c.
That would address a performance regression in the common case
with reservoir sampling, and would be a good fit for the
upcoming release, given its performance theme.
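For readers unfamiliar with the term, reservoir sampling picks k items uniformly from a stream of unknown length while holding only k items in memory. A minimal Python sketch of the textbook technique (often called Algorithm R) follows. It is not the code in shuf.c, and the function name and `rng` parameter are assumptions for the example.

```python
import random

def reservoir_sample(stream, k, rng=random):
    """Return up to k items chosen uniformly at random from `stream`.

    Textbook Algorithm R, shown for illustration only; this is not
    how shuf.c is written.
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    # Algorithm R leaves survivors in input-biased positions, so shuffle
    # when the output order should also be random.
    rng.shuffle(reservoir)
    return reservoir

if __name__ == "__main__":
    # Roughly comparable in effect to: seq 1000 | shuf -n 10
    print(reservoir_sample(range(1, 1001), 10))
```

The trade-off is the usual one: memory stays bounded by k, but a random number is drawn for nearly every input item, which can cost more than reading a small input whole and permuting it.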
cheers,
Pádraig.