[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: remove minor arbitrary numeric restrictions from join, sort, uniq

From: Jim Meyering
Subject: Re: remove minor arbitrary numeric restrictions from join, sort, uniq
Date: Wed, 13 Dec 2006 22:19:01 +0100

Paul Eggert <address@hidden> wrote:
> While looking into some POSIX conformance issues for 'sort', I noticed
> an arbitrary limit in 'sort' that we can remove.  "sort -k
> 1000000000000000000" currently complains that the count is too large,
> but we can implement that case correctly without too much trouble.
> The same issue affects join and uniq.
> It's kind of silly, I suppose, but since the fix is easy and (except
> for uniq) simplifies the code we might as well do it.

I went through the "do what I mean" vs. "do what I say" debate for
a couple of minutes, but simplification won.

> Here is a proposed patch.
> This patch verifies that SIZE_MAX <= ULONG_MAX in a couple of places,
> but there are many, many more places in coreutils where that
> assumption is made, so perhaps that test should be moved to system.h?
> Your call.

I'd like the code that verifies such assumptions to be as close as
possible to the affected code.  That will help maintainers (me at least)
keep such assumptions in mind when changing nearby code.
It wouldn't hurt to add a "package-global" one in system.h, as
long as that doesn't serve as an excuse for not adding them elsewhere.

> 2006-12-12  Paul Eggert  <address@hidden>
>       Remove some arbitrary restrictions on size fields, so that
>       commands like "sort -k 18446744073709551616" no longer fail merely
>       because 18446744073709551616 doesn't fit in uintmax_t.  The trick
>       is that these fields can all be treated as effectively infinity;
>       their exact values don't matter, since no internal buffer can be
>       that long.
>       * src/join.c (string_to_join_field): Verify that SIZE_MAX <=
>       ULONG_MAX if the code assumes this.  Silently truncate too-large
>       values to SIZE_MAX, as the remaining code will do the right thing
>       in this case.
>       * src/sort.c (parse_field_count): Likewise.
>       * src/uniq.c (size_opt, main): Likewise.
>       * tests/join/Test.pm (bigfield): New test.
>       * tests/sort/Test.pm (bigfield): New test.
>       * tests/uniq/Test.pm (121): New test.

Thanks.  I liked all but the latter hunk of your uniq.c patch.
I couldn't swallow the four occurrences of "skip_fields" and three
of "SIZE_MAX" all in one expression :-)
I do understand your sentiment (it's ugly to mix assignment through
the DECIMAL_DIGIT_ACCUMULATE macro and an explicit assignment),
but the alternative requires too much syntax, and imho, leaves too
much room for error.

So the commit with your name on it has this additional change.
Hope you don't mind :)

--- src/uniq.c  2006-12-13 21:27:16.000000000 +0100
+++ src/uniq.c  2006-12-13 21:58:11.000000000 +0100
@@ -480,12 +480,8 @@ main (int argc, char **argv)
            if (skip_field_option_type == SFO_NEW)
              skip_fields = 0;

-           skip_fields =
-             ((skip_fields < SIZE_MAX / 10
-               || (skip_fields == SIZE_MAX / 10
-                   && optc - '0' <= (int) (SIZE_MAX % 10)))
-              ? 10 * skip_fields + (optc - '0')
-              : SIZE_MAX);
+           if (!DECIMAL_DIGIT_ACCUMULATE (skip_fields, optc - '0', size_t))
+             skip_fields = SIZE_MAX;

            skip_field_option_type = SFO_OBSOLETE;

reply via email to

[Prev in Thread] Current Thread [Next in Thread]