[Findutils-patches] [PATCH 1/3] xargs: patch for varying parallelism


From: James Youngman
Subject: [Findutils-patches] [PATCH 1/3] xargs: patch for varying parallelism
Date: Sat, 6 Dec 2008 14:30:59 +0000

From: John Gilmore <address@hidden>

Increase/decrease parallelism with SIGUSR1 and SIGUSR2.
---
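The mechanism described above can be illustrated outside xargs. What follows is a minimal standalone sketch of the handler pattern, not the code in this patch: `demo_proc_max`, `demo_increment`, `demo_decrement`, `demo_install_handlers` and `demo_run` are hypothetical names, and the process signals itself with `raise()` instead of being signalled from a shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the process limit; 1 means serial execution. */
static volatile sig_atomic_t demo_proc_max = 1;

/* SIGUSR1 handler: allow one more simultaneous child. */
static void demo_increment(int signo)
{
  (void) signo;
  demo_proc_max++;
}

/* SIGUSR2 handler: allow one fewer child, but never go below 1. */
static void demo_decrement(int signo)
{
  (void) signo;
  if (demo_proc_max > 1)
    demo_proc_max--;
}

/* Install both handlers.  Returns 0 on success, -1 on failure. */
static int demo_install_handlers(void)
{
  struct sigaction sa;

  memset(&sa, 0, sizeof sa);
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;

  sa.sa_handler = demo_increment;
  if (sigaction(SIGUSR1, &sa, NULL) != 0)
    return -1;

  sa.sa_handler = demo_decrement;
  if (sigaction(SIGUSR2, &sa, NULL) != 0)
    return -1;

  return 0;
}

/* Reset the limit to 1, send ourselves UPS SIGUSR1 signals followed by
   DOWNS SIGUSR2 signals, and return the resulting limit (-1 on error).
   In a single-threaded process raise() does not return until the
   handler has run, so no signals are lost here. */
static int demo_run(int ups, int downs)
{
  int i;

  if (demo_install_handlers() != 0)
    return -1;

  demo_proc_max = 1;
  for (i = 0; i < ups; i++)
    raise(SIGUSR1);
  for (i = 0; i < downs; i++)
    raise(SIGUSR2);

  /* Invariant kept by demo_decrement: the limit never drops below 1. */
  assert(demo_proc_max >= 1);
  return (int) demo_proc_max;
}
```

Calling `demo_run(3, 1)` raises the limit from 1 to 4 and then lowers it to 3; calling `demo_run(1, 5)` shows the floor, since the extra SIGUSR2 signals leave the limit at 1.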
 ChangeLog     |    6 +++
 NEWS          |    5 +++
 doc/find.texi |  103 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 xargs/xargs.1 |   11 ++++++
 xargs/xargs.c |   71 +++++++++++++++++++++++++++++++++++++--
 5 files changed, 193 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 87ebd4a..36f9e59 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2008-12-06  John Gilmore  <address@hidden>
+
+       * xargs/xargs.c: Increase parallelism in mid-run with SIGUSR1;
+       decrease it with SIGUSR2.
+       * doc/find.texi, xargs/xargs.1, NEWS: Document SIGUSR1/2.
+
 2008-11-08  James Youngman  <address@hidden>
 
        * import-gnulib.config (gnulib_version): Update version of gnulib,
diff --git a/NEWS b/NEWS
index c10c317..45dc2e5 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,11 @@ Performance improvements may only exist for some find command lines
 (performance testing was done for the fts implementation itself but
 we haven't done the analogous performance tests in find).
 
+** Functional enhancements to xargs
+
+You can now increase the parallelism of xargs in mid-run by sending
+it SIGUSR1, and decrease the parallelism with SIGUSR2.
+
 * Major changes in release 4.5.2, 2008-09-07
 
 ** Bug Fixes
diff --git a/doc/find.texi b/doc/find.texi
index 56d3e32..e6c93ad 100644
--- a/doc/find.texi
+++ b/doc/find.texi
@@ -2210,6 +2210,7 @@ with the @samp{--eof} option.
 * Safe File Name Handling::
 * Unusual Characters in File Names::
 * Limiting Command Size::
+* Controlling Parallelism::
 * Interspersing File Names::
 @end menu
 
@@ -2440,7 +2441,20 @@ less 2048 bytes of headroom.  If this value is more than 128KiB,
 128Kib is used as the default value; otherwise, the default value is
 the maximum.
 
+@node Controlling Parallelism
+@subsubsection Controlling Parallelism
 
+Normally, @code{xargs} runs one command at a time.  This is called
+"serial" execution; the commands happen in a series, one after another.
+If you'd like @code{xargs} to do things in "parallel", you can ask it
+to do so, either when you invoke it, or later while it is running.
+Running several commands at one time can make the entire operation
+go more quickly, if the commands are independent, and if your system
+has enough resources to handle the load.  When parallelism works in
+your application, @code{xargs} provides an easy way to get your work
+done faster.
+
+@table @code
 @item --max-procs=@var{max-procs}
 @itemx -P @var{max-procs}
 Run up to @var{max-procs} processes at a time; the default is 1.  If
@@ -2450,6 +2464,88 @@ with @samp{-P}; otherwise chances are that the command will be run
 only once.
 @end table
 
+For example, suppose you have a directory tree of large image files
+and a @code{makeallsizes} script that takes a single file name and
+creates various sized images from it (thumbnail-sized, web-page-sized,
+printer-sized, and the original large file).  The script is doing enough
+work that it takes significant time to run, even on a single image.
+You could run:
+
+@example
+find originals -name '*.jpg' | xargs -n 1 makeallsizes
+@end example
+
+This will run @code{makeallsizes @var{filename}} once for each @code{.jpg}
+file in the @code{originals} directory.  However, if your system has
+two central processors, this script will only keep one of them busy.
+Instead, you could probably finish in about half the time by running:
+
+@example
+find originals -name '*.jpg' | xargs -n 1 -P 2 makeallsizes
+@end example
+
+@code{xargs} will run the first two commands in parallel, and then
+whenever one of them terminates, it will start another one, until
+the entire job is done.
+
+The same idea can be generalized to as many processors as you have handy.
+It also generalizes to other resources besides processors.  For example,
+if xargs is running commands that are waiting for a response from a
+distant network connection, running a few in parallel may reduce the
+overall latency by overlapping their waiting time.
+
+@code{xargs} also allows you to "turn up" or "turn down" its parallelism
+in the middle of a run.  Suppose you are keeping your four-processor
+system busy for hours, processing thousands of images using @code{-P 4}.
+Now, in the middle of the run, you or someone else wants you to reduce
+your load on the system, so that something else will run faster.
+If you interrupt @code{xargs}, your job will be half-done, and it
+may take significant manual work to resume it only for the remaining
+images.  If you suspend @code{xargs} using your shell's job controls
+(e.g. @code{control-Z}), then it will get no work done while suspended.
+
+Find out the process ID of the @code{xargs} process, either from your
+shell or with the @code{ps} command.  After you send it the signal
+@code{SIGUSR2}, @code{xargs} will run one fewer command in parallel.
+If you send it the signal @code{SIGUSR1}, it will run one more command
+in parallel.  For example:
+
+@example
+shell$ xargs <allimages -n 1 -P 4 makeallsizes &
+[4] 27643
+   ... at some later point ...
+shell$ kill -USR2 27643
+shell$ kill -USR2 %4
+@end example
+
+The first @code{kill} command will cause @code{xargs} to wait for
+two commands to terminate before starting the next command (reducing
+the parallelism from 4 to 3).  The second @code{kill} will reduce it from
+3 to 2.  (@code{%4} works in some shells as a shorthand for the process
+ID of the background job labeled @code{[4]}.)
+
+Similarly, if you started a long xargs job without parallelism, you
+can easily switch it to start running two commands in parallel by sending
+it a @code{SIGUSR1}.
+
+@code{xargs} will never terminate any existing commands when you ask it
+to run fewer processes.  It merely waits for the excess commands to
+finish.  If you ask it to run more commands, it will start the next
+one immediately (if it has more work to do).
+
+If you send several identical signals quickly, the operating system
+does not guarantee that each of them will be delivered to @code{xargs}.
+This means that you can't rapidly increase or decrease the parallelism by
+more than one command at a time.  You can avoid this problem by sending
+a signal, observing the result, then sending the next one; or merely by
+delaying for a few seconds between signals (unless your system is very
+heavily loaded).
+
+Whether or not parallel execution will work well for you depends on
+the nature of the command you are running in parallel, on the
+configuration of the system on which you are running the command, and
+on the other work being done on the system at the time.
+
 @node Interspersing File Names
 @subsubsection Interspersing File Names
 
@@ -5308,6 +5404,13 @@ return status 255.
 
 @item <program>: terminated by signal 99
 See the description of the similar message for @code{find}.
+
+@item cannot set SIGUSR1 signal handler
+@code{xargs} is having trouble preparing for you to be able to send it
+signals to increase or decrease the parallelism of its processing.
+If you don't plan to send it those signals, this warning can be ignored
+(though if you're a programmer, you may want to help us figure out
+why @code{xargs} is confused by your operating system).
 @end table
 
 @node Error Messages From locate
diff --git a/xargs/xargs.1 b/xargs/xargs.1
index 413b55e..d840bf1 100644
--- a/xargs/xargs.1
+++ b/xargs/xargs.1
@@ -300,6 +300,16 @@ possible at a time.  Use the
 option with 
 .BR \-P ;
 otherwise chances are that only one exec will be done.
+While
+.B xargs
+is running, you can
+send its process
+a SIGUSR1 signal to increase the number of commands to run simultaneously,
+or a SIGUSR2 to decrease the number.  You cannot decrease it below 1.
+.B xargs
+never terminates its commands; when asked to decrease, it merely
+waits for more than one existing command to terminate before starting
+another.
 .SH "EXAMPLES"
 .nf
 .B find /tmp \-name core \-type f \-print | xargs /bin/rm \-f
@@ -402,6 +412,7 @@ current system.
 .SH "SEE ALSO"
 \fBfind\fP(1), \fBlocate\fP(1), \fBlocatedb\fP(5), \fBupdatedb\fP(1),
 \fBfork\fP(2), \fBexecvp\fP(3), 
+\fBkill\fP(1), \fBsignal\fP(2),
 \fBFinding Files\fP (on-line in Info, or printed)
 .SH "BUGS"
 The
diff --git a/xargs/xargs.c b/xargs/xargs.c
index 1365a71..1791a6a 100644
--- a/xargs/xargs.c
+++ b/xargs/xargs.c
@@ -206,7 +206,8 @@ static boolean initial_args = true;
 
 /* If nonzero, the maximum number of child processes that can be running
    at once.  */
-static unsigned long int proc_max = 1uL;
+/* TODO: check conversion safety (i.e. range) for -P option. */
+static volatile sig_atomic_t proc_max = 1;
 
 /* Did we fork a child yet? */
 static boolean procs_executed = false;
@@ -220,6 +221,9 @@ static pid_t *pids = NULL;
 /* The number of allocated elements in `pids'. */
 static size_t pids_alloc = 0u;
 
+/* If nonzero, we've been signaled that we can start more child processes. */
+static volatile sig_atomic_t stop_waiting = 0;
+
 /* Exit status; nonzero if any child process exited with a
    status of 1-125.  */
 static volatile int child_error = 0;
@@ -269,6 +273,8 @@ static void exec_if_possible PARAMS ((void));
 static void add_proc PARAMS ((pid_t pid));
 static void wait_for_proc PARAMS ((boolean all, unsigned int minreap));
 static void wait_for_proc_all PARAMS ((void));
+static void increment_proc_max PARAMS ((int));
+static void decrement_proc_max PARAMS ((int));
 static long parse_num PARAMS ((char *str, int option, long min, long max, int fatal));
 static void usage PARAMS ((FILE * stream));
 
@@ -412,6 +418,8 @@ main (int argc, char **argv)
   enum BC_INIT_STATUS bcstatus;
   enum { XARGS_POSIX_HEADROOM = 2048u };
   
+  struct sigaction sigact;
+
   program_name = argv[0];
   original_exit_value = 0;
   
@@ -639,6 +647,26 @@ main (int argc, char **argv)
   act_on_init_result();
   assert (BC_INIT_OK == bcstatus);
 
+#ifdef SIGUSR1
+#ifdef SIGUSR2
+  /* Accept signals to increase or decrease the number of running
+     child processes.  Do this as early as possible after setting
+     proc_max.  */
+  sigact.sa_handler = increment_proc_max;
+  sigemptyset(&sigact.sa_mask);
+  sigact.sa_flags = 0;
+  if (0 != sigaction (SIGUSR1, &sigact, (struct sigaction *)NULL))
+         error (0, errno, _("Cannot set SIGUSR1 signal handler"));
+
+  sigact.sa_handler = decrement_proc_max;
+  sigemptyset(&sigact.sa_mask);
+  sigact.sa_flags = 0;
+  if (0 != sigaction (SIGUSR2, &sigact, (struct sigaction *)NULL))
+         error (0, errno, _("Cannot set SIGUSR2 signal handler"));
+#endif /* SIGUSR2 */
+#endif /* SIGUSR1 */
+
+
   if (0 == strcmp (input_file, "-"))
     {
       input_stream = stdin;
@@ -1193,7 +1221,8 @@ add_proc (pid_t pid)
 
 
 /* If ALL is true, wait for all child processes to finish;
-   otherwise, wait for at least MINREAP child processes to finish.
+   otherwise, wait for one child process to finish, or for another signal
+   that tells us that we can run more child processes.
    Remove the processes that finish from the list of executing processes.  */
 
 static void
@@ -1221,6 +1250,7 @@ wait_for_proc (boolean all, unsigned int minreap)
            }
        }
 
+      stop_waiting = 0;
       do
        {
          /* Wait for any child.   We used to use wait() here, but it's 
@@ -1231,6 +1261,17 @@ wait_for_proc (boolean all, unsigned int minreap)
            {
              if (errno != EINTR)
                error (1, errno, _("error waiting for child process"));
+
+             if (stop_waiting && !all)
+               {
+                 /* Receipt of SIGUSR1 gave us an extra slot and we
+                  * don't need to wait for all processes to finish.
+                  * We can stop reaping now, but in any case check for 
+                  * further dead children without waiting for another 
+                  * to exit.
+                  */
+                 wflags = WNOHANG;
+               }
            }
          
          /* Find the entry in `pids' for the child process
@@ -1263,7 +1304,7 @@ wait_for_proc (boolean all, unsigned int minreap)
            }
          break;
        }
-      
+
       /* Remove the child from the list.  */
       pids[i] = 0;
       procs_executing--;
@@ -1311,6 +1352,30 @@ wait_for_proc_all (void)
   
 }
 
+
+/* Increment or decrement the number of processes we can start simultaneously,
+   when we receive a signal from the outside world.
+   
+   We must take special care around proc_max == 0 (unlimited children),
+   proc_max == 1 (don't decrement to zero).  */
+static void
+increment_proc_max (int ignore)
+{
+       /* If user increments from 0 to 1, we'll take it and serialize. */
+       proc_max++;
+       /* If we're waiting for a process to die before doing something,
+          no need to wait any more. */
+       stop_waiting = 1;
+}
+
+static void
+decrement_proc_max (int ignore)
+{
+       if (proc_max > 1)
+               proc_max--;
+}
+
+
 /* Return the value of the number represented in STR.
    OPTION is the command line option to which STR is the argument.
    If the value does not fall within the boundaries MIN and MAX,
-- 
1.5.6.5
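
The find.texi text in this patch warns that several identical signals sent in quick succession may not all be delivered. A minimal standalone sketch of that effect follows; it is not part of the patch, and `demo_deliveries`, `demo_count` and `demo_blocked_burst` are hypothetical names. It blocks SIGUSR1, raises it several times so the signals arrive while one is already pending, then unblocks it and counts how often the handler ran. POSIX does not require ordinary signals to be queued, so the count is normally smaller than the number of signals sent.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Number of times the handler has actually run. */
static volatile sig_atomic_t demo_deliveries = 0;

static void demo_count(int signo)
{
  (void) signo;
  demo_deliveries++;
}

/* Raise SIGUSR1 N times while it is blocked, then unblock it and
   return how many times the handler ran (-1 on setup failure).
   Assumes a single-threaded process. */
static int demo_blocked_burst(int n)
{
  struct sigaction sa;
  sigset_t block, old;
  int i;

  memset(&sa, 0, sizeof sa);
  sa.sa_handler = demo_count;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction(SIGUSR1, &sa, NULL) != 0)
    return -1;

  sigemptyset(&block);
  sigaddset(&block, SIGUSR1);
  if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
    return -1;

  demo_deliveries = 0;
  for (i = 0; i < n; i++)
    raise(SIGUSR1);          /* stays pending while SIGUSR1 is blocked */

  /* Restoring the old mask delivers any pending SIGUSR1 before
     sigprocmask() returns. */
  if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
    return -1;

  return (int) demo_deliveries;
}
```

With `n` greater than 1 the returned count is at least 1 and at most `n`; on systems that do not queue ordinary signals it is 1. This is why the documentation suggests pausing between signals when changing the parallelism by more than one step.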
