bug#49217: 'shuf' returns nothing if the low range number is higher by 1 than the high number
Fri, 25 Jun 2021 20:00:13 +0200
On Fri, Jun 25, 2021 at 09:29:04AM -0700, Paul Eggert wrote:
> On 6/24/21 11:49 PM, Erik Auerswald wrote:
> > $ shuf -i 2-0 ; echo %exit code $?
> > shuf: invalid input range: ‘2-0’
> > %exit code 1
> > $ shuf -i 1-0 ; echo %exit code $?
> > %exit code 0
> > This looks inconsistent and possibly not exactly as intended.
> It's exactly what I intended and there's no inconsistency. When you
> say 'shuf -i M-N' you select from a collection of N-M+1 lines.
It also specifies the contents of those lines, unless there is less than
> N-M+1 = 0 (no input lines) makes sense, but N-M+1 < 0 (negative number
> of input lines?) does not.
I do not think that it makes sense to specify the contents of no input
lines. Perhaps we can agree to disagree on this?
Then the documentation does not describe it that way. I think that can
lead to confusion.
The documentation describes the option as simulating input "from a file
containing the range of unsigned decimal integers LO...HI, one per line."
From this description it is not obvious that "1-0" is OK, but "2-0"
is not. In both cases LO > HI, but one is accepted without error,
but the other is not.
I think that "select from a negative number of lines" makes just as much
sense as "select from no lines at all." Here we seem to disagree, which
is OK with me.
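
To make the arithmetic behind the quoted rule concrete, here is a minimal
sketch. The function name range_status is hypothetical, and the code only
models the HI-LO+1 line count as described in this thread; it is not taken
from the shuf source.

```shell
# Hypothetical model of the rule described in this thread; NOT shuf's code.
# range_status LO HI prints the line count HI-LO+1 and how the rule treats it.
range_status() {
  lo=$1
  hi=$2
  count=$(( hi - lo + 1 ))
  if [ "$count" -lt 0 ]; then
    # A negative line count is what the rule rejects as an invalid range.
    echo "$lo-$hi: count=$count -> rejected (invalid input range)"
  elif [ "$count" -eq 0 ]; then
    # Zero lines: accepted, nothing to select from.
    echo "$lo-$hi: count=0 -> accepted, no output"
  else
    echo "$lo-$hi: count=$count -> accepted"
  fi
}

range_status 0 0
range_status 1 0
range_status 2 0
```

Under this model "1-0" is the only range with LO > HI that yields a count
of exactly zero, which is why it alone passes without a diagnostic.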
Similarly to "shuf -iLO-HI", "seq FIRST LAST" produces LAST-FIRST+1 lines.
But seq does allow asking, to adapt your wording, for a negative number
of lines:
$ seq 2 0 ; echo %exit code $?
%exit code 0
$ seq 1 0 ; echo %exit code $?
%exit code 0
$ seq 0 0 ; echo %exit code $?
%exit code 0
The problem I see is that the intention behind "shuf -i" that can be
gleaned from your implementation, and that you have described above,
is not obvious from the documentation or from similar functionality in
the GNU Core Utilities.
I see three views regarding the case of LO > HI in this thread:
1. The bug reporter expected LO > HI to always produce an error,
or possibly to never produce an error.
2. Your "shuf" implementation sees LO == HI+1 as the one allowed
possibility to specify no input, based on the HI-LO+1 formula for
the number of lines to choose from.
3. The "seq" implementation in the GNU Core Utilities allows LO > HI
and interprets it as the empty sequence. I actually like this best.
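
As an illustration of view 3, here is a minimal sketch of a range generator
that treats every LO > HI as the empty sequence, mirroring the "seq" output
shown above. The helper emit_range is hypothetical and not part of the GNU
Core Utilities; it assumes non-negative decimal integer arguments.

```shell
# Hypothetical helper, NOT part of coreutils: print LO..HI, one per line.
# Any LO > HI simply produces no lines and still succeeds.
emit_range() {
  lo=$1
  hi=$2
  i=$lo
  while [ "$i" -le "$hi" ]; do
    echo "$i"
    i=$(( i + 1 ))
  done
  return 0
}

emit_range 2 0; echo "%exit code $?"
emit_range 1 0; echo "%exit code $?"
emit_range 0 0; echo "%exit code $?"
```

With this approach "1-0" and "2-0" are handled identically, so there is no
special case to document.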
Thus I think that the current "shuf" behavior is not as obviously
correct as you seem to expect.
No offense intended!
I do not care deeply which behavior is selected. I just want to make
it easier for others, including me, to see that the current
implementation is as intended. Adding to the documentation (for users)
and the tests (for developers) seems helpful to me.
> > I'd like to document it and add test cases.
> Feel free,
Thanks, I'll think about a wording both simple to understand and including
the special case. I intend to send a patch to this bug report in a
couple of days.
> though we need to reserve the right to extend 'shuf' in
> the future. In other words, not every invocation of 'shuf' that
> provokes a diagnostic now will provoke a diagnostic in the future.
I like that.
Be water, my friend.
-- Bruce Lee
bug#49217: [PATCH] shuf: fix bug with "-i 1-0", Erik Auerswald, 2021/06/25
bug#49217: [PATCH] doc: clarify valid ranges for shuf -i, Erik Auerswald, 2021/06/26
bug#49217: [PATCH] tests: exercise shuf --input-range edge cases, Erik Auerswald, 2021/06/26