bug-coreutils

sort --batch-size non-merge bug


From: Bo Borgerson
Subject: sort --batch-size non-merge bug
Date: Thu, 19 Jun 2008 16:01:55 -0400
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Hi,

I'm embarrassed to say that I've discovered a bug in the recently added
--batch-size option of sort.

If --batch-size is used with a non-merge sort (where it governs the merge
of temp files) and --buffer-size is not also given, then the still-unset
SORT_SIZE is raised to MIN_SORT_SIZE. The sort then runs with the minimum
buffer size, resulting in severe performance degradation.
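
To illustrate the difference, here is a minimal standalone sketch, not the
actual sort.c code. It assumes that a sort_size of 0 means "not set by the
user". DEMO_MIN_SORT_SIZE, clamp_always and clamp_if_set are hypothetical
names used only for this example, and 4096 is an arbitrary stand-in for the
real MIN_SORT_SIZE.

```c
#include <stddef.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in; the real MIN_SORT_SIZE is defined in src/sort.c.  */
#define DEMO_MIN_SORT_SIZE ((size_t) 4096)

/* Old behaviour: clamp unconditionally.  An unset size (0) is raised to
   the minimum, so it no longer looks unset to later code.  */
static size_t
clamp_always (size_t sort_size)
{
  return MAX (sort_size, DEMO_MIN_SORT_SIZE);
}

/* Patched behaviour: clamp only a size that was explicitly set, and
   leave 0 alone so a default can still be chosen later.  */
static size_t
clamp_if_set (size_t sort_size)
{
  if (0 < sort_size)
    sort_size = MAX (sort_size, DEMO_MIN_SORT_SIZE);
  return sort_size;
}
```

With these definitions, clamp_always (0) returns the minimum, while
clamp_if_set (0) returns 0. Both functions raise an explicitly set but
too-small size to the minimum and leave a larger one unchanged.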

I've attached a fix for this bug, including a test that exercises it.
I've also pushed to repo.or.cz.

Sorry for introducing this.

Thanks,

Bo
From 91aa3fb5a2636dc918bafa67f3a097d646cac075 Mon Sep 17 00:00:00 2001
From: Bo Borgerson <address@hidden>
Date: Thu, 19 Jun 2008 15:37:21 -0400
Subject: [PATCH] sort: Fix bug where --batch-size option shrank SORT_SIZE.

* src/sort.c (specify_nmerge, main): Only adjust SORT_SIZE if it's already set.
* tests/misc/sort-merge: Test bug fix.

Signed-off-by: Bo Borgerson <address@hidden>
---
 src/sort.c            |   14 ++++++--------
 tests/misc/sort-merge |    7 +++++++
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index 1393521..2039dab 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1105,14 +1105,7 @@ specify_nmerge (int oi, char c, char const *s)
              e = LONGINT_OVERFLOW;
            }
          else
-           {
-             /* Need to re-check that we meet the minimum
-                requirement for memory usage with the new,
-                potentially larger, nmerge. */
-             sort_size = MAX (sort_size, MIN_SORT_SIZE);
-
-             return;
-           }
+           return;
        }
     }
 
@@ -3320,6 +3313,11 @@ main (int argc, char **argv)
       files = &minus;
     }
 
+  /* Need to re-check that we meet the minimum requirement for memory
+     usage with the final value for NMERGE. */
+  if (0 < sort_size)
+    sort_size = MAX (sort_size, MIN_SORT_SIZE);
+
   if (checkonly)
     {
       if (nfiles > 1)
diff --git a/tests/misc/sort-merge b/tests/misc/sort-merge
index a2524c4..fb7c63c 100755
--- a/tests/misc/sort-merge
+++ b/tests/misc/sort-merge
@@ -27,6 +27,8 @@ my $prog = 'sort';
 # three empty files and one that says 'foo'
 my @inputs = (+(map{{IN=> {"empty$_"=> ''}}}1..3), {IN=> {foo=> "foo\n"}});
 
+my $big_input = "aaa\n" x 1024;
+
 # don't need to check for existence, since we're running in a temp dir
 my $badtmp = 'does/not/exist';
 
@@ -66,6 +68,11 @@ my @Tests =
      ['nmerge-no', "-m --batch-size=2 -T$badtmp", @inputs,
         {ERR_SUBST=>"s|: $badtmp/sort.+||"},
         {ERR=>"$prog: cannot create temporary file\n"}, {EXIT=>2}],
+
+     # This used to fail because setting batch-size without also setting
+     # buffer size would cause the buffer size to be set to the minimum.
+     ['batch-size', "--batch-size=16 -T$badtmp", {IN=> {big=> $big_input}},
+       {OUT=>$big_input}],
     );
 
 my $save_temps = $ENV{DEBUG};
-- 
1.5.4.3

