coreutils
Re: Generate random numbers with shuf


From: Pádraig Brady
Subject: Re: Generate random numbers with shuf
Date: Wed, 10 Jul 2013 16:20:27 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 07/05/2013 10:43 PM, Assaf Gordon wrote:
> 
> On 07/05/2013 12:12 PM, Pádraig Brady wrote:
>> On 07/05/2013 07:04 PM, Assaf Gordon wrote:
>>> Hello,
>>>>
>>>>> Regarding old discussion here:
>>>>> http://lists.gnu.org/archive/html/coreutils/2011-02/msg00030.html
>>>>>
>>>>> Attached is a patch which adds a "--repetition" option to shuf, enabling 
>>>>> random number generation with repetitions.
>>>>>
>>>>
>>>> I like this.
>>>> --repetition seems to be a very good interface too,
>>>> since it aligns with standard math nomenclature in regard to permutations.
>>>>
>>>> I'd prefer to generalize it though, to supporting stdin as well as -i.
>>>
>>> Attached is an updated patch, supporting "--repetitions" with STDIN/FILE/-e 
>>> (using the naive implementation ATM).
>>> e.g.
>>>    $ shuf --repetitions --head-count=100 --echo Head Tail
>>> or
>>>    $ shuf -r -n100 -e Head Tail
>>
>> Excellent thanks.
>>
>>> But the code is getting a bit messy, I guess from evolving features over 
>>> time.
>>> I'd like to re-organize it a bit, re-factor some functions and make the 
>>> code clearer - what do you think?
>>> it will make the code slightly more verbose (and slightly bigger), but 
>>> shouldn't change the running performance.
>>
>> If you're getting your head around the code enough to refactor,
>> then it would be great if you could handle the TODO: item in shuf.c
> 
> Attached is an updated patch, with some code cleanups (not including said 
> TODO item yet).
> 
> -gordon

I've split this into two patches:
1. Unrelated test improvements.
2. All the rest.

Note that in both patches I adjusted the tests, like:

-c=$(cat exp | wc -l) || framework_failure_
+c=$(wc -l < exp) || framework_failure_

-c=$(cat exp | sort -nu | fmt ) || framework_failure_
+c=$(sort -nu exp | paste -s -d ' ') || framework_failure_

I.e., avoid cat unless needed. Also, paste is more general than fmt in this
usage: fmt wraps its output at a default width, while paste -s always joins
all lines onto one.
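
A minimal sketch of the two adjusted idioms, run outside the test framework.
The sample data and the temporary file are made up for illustration (the
tests use a file named exp), and this assumes GNU paste, which reads stdin
when no file operand is given:

```shell
# Hypothetical sample data standing in for the tests' "exp" file.
exp=$(mktemp)
printf '%s\n' 3 1 2 3 1 > "$exp"

# Count lines without an extra cat process. Reading from stdin also
# keeps wc from printing a file name after the count.
c=$(wc -l < "$exp")

# Sort numerically, drop duplicates, then join every line into one
# space-separated line. paste -s never re-wraps, however long it gets.
u=$(sort -nu "$exp" | paste -s -d ' ')

echo "lines: $c"
echo "unique: $u"

rm -f "$exp"
```

With fmt in place of paste, a long enough list of values would be split over
several lines, and the comparison in the test would then depend on the width.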

Also, I simplified the --help a little, like:

-  -r, --repetitions         output COUNT values, with repetitions.\n\
-                            with -iLO-HI, output random numbers.\n\
-                            with -e, stdin or FILE, output random lines.\n\
-                            count defaults to 1 if -n COUNT is not used.\n\
+  -r, --repetitions         output COUNT items, allowing repetition.\n\
+                              -n 1 is implied if not specified.\n\
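
For illustration, a short sketch of how the option described in this help
text might be used. It assumes a shuf build that has the -r option from this
thread. The long option name and the default count are as proposed here and
may differ in a released version, so the sketch uses the short form and
always passes -n explicitly:

```shell
# 100 coin flips: with -r the same item may be picked again, so the
# requested count can exceed the number of input items.
flips=$(shuf -r -n 100 -e Head Tail)

# 10 dice rolls: random numbers from a range, with repetition.
rolls=$(shuf -r -n 10 -i 1-6)

# Tally the flips and show the rolls on one line.
printf '%s\n' "$flips" | sort | uniq -c
printf '%s\n' "$rolls" | paste -s -d ' '
```

Without -r, shuf outputs a permutation, so -n 100 on a two-item input would
yield at most two lines.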

I'll push the 2 attached patches soon.

thanks!
Pádraig.

Attachment: shuf-repetition.patch
Description: Text Data

