bug-findutils

[Patch] Xargs: vary parallelism with SIGUSR1/2


From: John Gilmore
Subject: [Patch] Xargs: vary parallelism with SIGUSR1/2
Date: Sat, 18 Jun 2005 04:21:50 -0700

I've been pleasantly surprised at how well xargs --max-procs N works.
I've been using it with wget, and it lets me trivially parallelize
some kinds of web spidering.  As more people get multiprocessor
systems and multicore processors, it will become even more useful.

The enclosed patch lets the parallelism of a long-running xargs be
turned up or down with SIGUSR1 and SIGUSR2.  This lets me dynamically
increase or decrease the network bandwidth I consume, without a
cumbersome cycle of stopping, reworking the arguments, and restarting.

The cleanliness of the existing code made this an easy improvement to
contribute.  I've also improved the documentation of --max-procs with
a few examples.

        John

Suggested ChangeLog-entry:

2005-06-18  John Gilmore  <address@hidden>

        * xargs/xargs.c: Increase parallelism in mid-run with SIGUSR1;
        decrease it with SIGUSR2.
        * doc/find.texi, xargs/xargs.1, NEWS: Document SIGUSR1/2.
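
For readers who want to try the mechanism outside of xargs, here is a minimal standalone sketch of the handler logic the patch below adds.  It is an illustration only, not the patched xargs.c: proc_max and the two handler names follow the patch, while install_handlers() and signal_demo() are hypothetical names invented for this example.

```c
/* Standalone sketch of the SIGUSR1/SIGUSR2 handling; not xargs code.
   install_handlers() and signal_demo() are hypothetical helpers.  */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Same type the patch uses, so a signal handler may safely modify it.  */
static volatile sig_atomic_t proc_max = 1;

void
increment_proc_max (int signo)
{
  (void) signo;
  proc_max++;
}

void
decrement_proc_max (int signo)
{
  (void) signo;
  if (proc_max > 1)             /* never drop to 0, which means "unlimited" */
    proc_max--;
}

/* Install both handlers.  Returns 0 on success, -1 on failure.  */
int
install_handlers (void)
{
  struct sigaction sa;

  sa.sa_handler = increment_proc_max;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;              /* no SA_RESTART: blocking calls see EINTR */
  if (sigaction (SIGUSR1, &sa, NULL) != 0)
    return -1;

  sa.sa_handler = decrement_proc_max;
  if (sigaction (SIGUSR2, &sa, NULL) != 0)
    return -1;
  return 0;
}

/* Exercise the handlers by signalling ourselves; returns the final limit.  */
int
signal_demo (void)
{
  proc_max = 1;
  if (install_handlers () != 0)
    return -1;
  raise (SIGUSR2);              /* already at the floor of 1: stays 1 */
  raise (SIGUSR1);              /* 1 -> 2 */
  raise (SIGUSR1);              /* 2 -> 3 */
  raise (SIGUSR2);              /* 3 -> 2 */
  assert (proc_max >= 1);
  return (int) proc_max;
}
```

Calling signal_demo() from a main() returns 2.  Because each raise() returns only after its handler has run, the signals here cannot be merged the way rapid external signals can be.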


diff -ru findutils-4.2.22/NEWS findutils-4.2.22+gnu/NEWS
--- findutils-4.2.22/NEWS       Sun Jun 12 15:28:13 2005
+++ findutils-4.2.22+gnu/NEWS   Sat Jun 18 03:55:17 2005
@@ -1,4 +1,8 @@
 GNU findutils NEWS - User visible changes.     -*- outline -*- (allout)
+
+You can now increase the parallelism of xargs in mid-run by sending
+it SIGUSR1, and decrease the parallelism with SIGUSR2.
+
 * Major changes in release 4.2.22
 
 ** Security Fixes
diff -ru findutils-4.2.22/doc/find.texi findutils-4.2.22+gnu/doc/find.texi
--- findutils-4.2.22/doc/find.texi      Sun Jun 12 14:47:02 2005
+++ findutils-4.2.22+gnu/doc/find.texi  Sat Jun 18 03:50:31 2005
@@ -1773,6 +1773,7 @@
 * Safe File Name Handling::
 * Unusual Characters in File Names::
 * Limiting Command Size::
+* Controlling Parallelism::
 * Interspersing File Names::
 @end menu
 
@@ -1981,7 +1982,22 @@
 the argument strings.  If you specify a value for this option which is
 too large or small, a warning message is printed and the appropriate
 upper or lower limit is used instead.
+@end table
+
+@node Controlling Parallelism
+@subsubsection Controlling Parallelism
+
+Normally, @code{xargs} runs one command at a time.  This is called
+"serial" execution; the commands happen in a series, one after another.
+If you'd like @code{xargs} to do things in "parallel", you can ask it
+to do so, either when you invoke it, or later while it is running.
+Running several commands at one time can make the entire operation
+go more quickly, if the commands are independent, and if your system
+has enough resources to handle the load.  When parallelism works in
+your application, @code{xargs} provides an easy way to get your work
+done faster.
 
+@table @code
 @item --max-procs=@var{max-procs}
 @itemx -P @var{max-procs}
 Run up to @var{max-procs} processes at a time; the default is 1.  If
@@ -1991,6 +2007,83 @@
 once.
 @end table
 
+For example, suppose you have a directory tree of large image files
+and a @code{makeallsizes} script that takes a single file name and
+creates various sized images from it (thumbnail-sized, web-page-sized,
+printer-sized, and the original large file).  The script is doing enough
+work that it takes significant time to run, even on a single image.
+You could run:
+
+@example
+find originals -name '*.jpg' | xargs -n 1 makeallsizes
+@end example
+
+This will run @code{makeallsizes @var{filename}} once for each @code{.jpg}
+file in the @code{originals} directory.  However, if your system has
+two central processors, this script will only keep one of them busy.
+Instead, you could probably finish in about half the time by running:
+
+@example
+find originals -name '*.jpg' | xargs -n 1 -P 2 makeallsizes
+@end example
+
+@code{xargs} will run the first two commands in parallel, and then
+whenever one of them terminates, it will start another one, until
+the entire job is done.
+
+The same idea can be generalized to as many processors as you have handy.
+It also generalizes to other resources besides processors.  For example,
+if @code{xargs} is running commands that are waiting for a response from a
+distant network connection, running a few in parallel may reduce the
+overall latency by overlapping their waiting time.
+
+@code{xargs} also allows you to "turn up" or "turn down" its parallelism
+in the middle of a run.  Suppose you are keeping your four-processor
+system busy for hours, processing thousands of images using @code{-P 4}.
+Now, in the middle of the run, you or someone else wants you to reduce
+your load on the system, so that something else will run faster.
+If you interrupt @code{xargs}, your job will be half-done, and it
+may take significant manual work to resume it only for the remaining
+images.  If you suspend @code{xargs} using your shell's job controls
+(e.g. @code{control-Z}), then it will get no work done while suspended.
+
+Find out the process ID of the @code{xargs} process, either from your
+shell or with the @code{ps} command.  After you send it the signal
+@code{SIGUSR2}, @code{xargs} will run one fewer command in parallel.
+If you send it the signal @code{SIGUSR1}, it will run one more command
+in parallel.  For example:
+
+@example
+shell$ xargs <allimages -n 1 -P 4 makeallsizes &
+[4] 27643
+   ... at some later point ...
+shell$ kill -USR2 27643
+shell$ kill -USR2 %4
+@end example
+
+The first @code{kill} command will cause @code{xargs} to wait for
+two commands to terminate before starting the next command (reducing
+the parallelism from 4 to 3).  The second @code{kill} will reduce it from
+3 to 2.  (@code{%4} works in some shells as a shorthand for the process
+ID of the background job labeled @code{[4]}.)
+
+Similarly, if you started a long @code{xargs} job without parallelism, you
+can easily switch it to start running two commands in parallel by sending
+it a @code{SIGUSR1}.
+
+@code{xargs} will never terminate any existing commands when you ask it
+to run fewer processes.  It merely waits for the excess commands to
+finish.  If you ask it to run more commands, it will start the next
+one immediately (if it has more work to do).
+
+If you send several identical signals quickly, the operating system
+does not guarantee that each of them will be delivered to @code{xargs}.
+This means that you can't rapidly increase or decrease the parallelism by
+more than one command at a time.  You can avoid this problem by sending
+a signal, observing the result, then sending the next one; or merely by
+delaying for a few seconds between signals (unless your system is very
+heavily loaded).
+
 @node Interspersing File Names
 @subsubsection Interspersing File Names
 
@@ -3019,7 +3112,9 @@
 @itemx -P @var{max-procs}
 Run up to @var{max-procs} processes at a time; the default is 1.  If
 @var{max-procs} is 0, @code{xargs} will run as many processes as
-possible at a time.
+possible at a time.  This parameter can be incremented in mid-run
+by sending @code{SIGUSR1}, or decremented with @code{SIGUSR2}.
+It will not decrement to zero with a signal.
 @end table
 
 @node Security Considerations, Error Messages, Reference, Top
@@ -3587,6 +3682,13 @@
 
 @item <program>: terminated by signal 99
 See the description of the similar message for @code{find}.
+
+@item cannot set SIGUSR1 signal handler
+@code{xargs} is having trouble preparing for you to be able to send it
+signals to increase or decrease the parallelism of its processing.
+If you don't plan to send it those signals, this warning can be ignored
+(though if you're a programmer, you may want to help us figure out
+why @code{xargs} is confused by your operating system).
 @end table
 
 @node Error Messages From locate, Error Messages From updatedb, Error Messages From xargs, Error Messages
diff -ru findutils-4.2.22/xargs/xargs.1 findutils-4.2.22+gnu/xargs/xargs.1
--- findutils-4.2.22/xargs/xargs.1      Tue Jun  7 15:18:41 2005
+++ findutils-4.2.22+gnu/xargs/xargs.1  Sat Jun 18 03:50:31 2005
@@ -139,6 +139,16 @@
 \fImax-procs\fR is 0, \fBxargs\fR will run as many processes as
 possible at a time.  Use the \fI\-n\fR option with \fI\-P\fR;
 otherwise chances are that only one exec will be done.
+While
+.B xargs
+is running, you can
+send its process 
+a SIGUSR1 signal to increase the number of commands to run simultaneously,
+or a SIGUSR2 to decrease the number.  You cannot decrease it below 1.
+.B xargs
+never terminates its commands; when asked to decrease, it merely
+waits for more than one existing command to terminate before starting
+another.
 .SH "EXAMPLES"
 .nf
 .B find /tmp \-name core \-type f \-print | xargs /bin/rm \-f
@@ -189,6 +199,7 @@
 
 .SH "SEE ALSO"
 \fBfind\fP(1), \fBlocate\fP(1), \fBlocatedb\fP(5), \fBupdatedb\fP(1),
+\fBkill\fP(1), \fBsignal\fP(2),
 \fBFinding Files\fP (on-line in Info, or printed)
 .SH "BUGS"
 .P 
diff -ru findutils-4.2.22/xargs/xargs.c findutils-4.2.22+gnu/xargs/xargs.c
--- findutils-4.2.22/xargs/xargs.c      Tue Jun  7 15:18:41 2005
+++ findutils-4.2.22+gnu/xargs/xargs.c  Sat Jun 18 03:53:53 2005
@@ -248,7 +248,7 @@
 
 /* If nonzero, the maximum number of child processes that can be running
    at once.  */
-static int proc_max = 1;
+static volatile sig_atomic_t proc_max = 1;
 
 /* Total number of child processes that have been executed.  */
 static int procs_executed = 0;
@@ -262,6 +262,9 @@
 /* The number of allocated elements in `pids'. */
 static int pids_alloc = 0;
 
+/* If nonzero, we've been signaled that we can start more child processes. */
+static volatile sig_atomic_t stop_waiting = 0;
+
 /* Exit status; nonzero if any child process exited with a
    status of 1-125.  */
 static int child_error = 0;
@@ -307,6 +310,8 @@
 static void add_proc PARAMS ((pid_t pid));
 static void wait_for_proc PARAMS ((boolean all));
 static void wait_for_proc_all PARAMS ((void));
+static void increment_proc_max PARAMS ((int));
+static void decrement_proc_max PARAMS ((int));
 static long parse_num PARAMS ((char *str, int option, long min, long max, int fatal));
 static long env_size PARAMS ((char **envp));
 static void usage PARAMS ((FILE * stream));
@@ -353,6 +358,7 @@
   long size_of_environment = env_size(environ);
   char *default_cmd = "/bin/echo";
   int (*read_args) PARAMS ((void)) = read_line;
+  struct sigaction sigact;
 
   program_name = argv[0];
 
@@ -499,7 +505,7 @@
           break;
 
        case 'v':
-         printf (_("GNU xargs version %s\n"), version_string);
+         printf (_("GNU xargs version %s + SIGUSR patch\n"), version_string);
          return 0;
 
        default:
@@ -508,6 +514,25 @@
        }
     }
 
+#ifdef SIGUSR1
+#ifdef SIGUSR2
+  /* Accept signals to increase or decrease the number of running
+     child processes.  Do this as early as possible after setting
+     proc_max.  */
+  sigact.sa_handler = increment_proc_max;
+  sigemptyset(&sigact.sa_mask);
+  sigact.sa_flags = 0;
+  if (0 != sigaction (SIGUSR1, &sigact, (struct sigaction *)NULL))
+         error (0, errno, _("Cannot set SIGUSR1 signal handler"));
+
+  sigact.sa_handler = decrement_proc_max;
+  sigemptyset(&sigact.sa_mask);
+  sigact.sa_flags = 0;
+  if (0 != sigaction (SIGUSR2, &sigact, (struct sigaction *)NULL))
+         error (0, errno, _("Cannot set SIGUSR2 signal handler"));
+#endif /* SIGUSR2 */
+#endif /* SIGUSR1 */
+
   if (0 == strcmp (input_file, "-"))
     {
       input_stream = stdin;
@@ -969,7 +994,7 @@
   
   if (!query_before_executing || print_args (true))
     {
-      if (proc_max && procs_executing >= proc_max)
+      while (proc_max && procs_executing >= proc_max)
        wait_for_proc (false);
       if (!query_before_executing && print_command)
        print_args (false);
@@ -1039,7 +1064,8 @@
 }
 
 /* If ALL is true, wait for all child processes to finish;
-   otherwise, wait for one child process to finish.
+   otherwise, wait for one child process to finish, or for another signal
+   that tells us that we can run more child processes.
    Remove the processes that finish from the list of executing processes.  */
 
 static void
@@ -1053,9 +1079,16 @@
        {
          pid_t pid;
 
+         stop_waiting = 0;
          while ((pid = wait (&status)) == (pid_t) -1)
-           if (errno != EINTR)
-             error (1, errno, _("error waiting for child process"));
+           {
+             if (errno != EINTR)
+               error (1, errno, _("error waiting for child process"));
+             /* If a signal handler has given us more processes, no
+                need to keep waiting (unless waiting for all). */
+             if (stop_waiting && !all)
+               return;
+           }
 
          /* Find the entry in `pids' for the child process
             that exited.  */
@@ -1099,6 +1132,30 @@
   waiting = false;
 }
 
+
+/* Increment or decrement the number of processes we can start simultaneously,
+   when we receive a signal from the outside world.
+   
+   We must take special care around proc_max == 0 (unlimited children),
+   proc_max == 1 (don't decrement to zero).  */
+static void
+increment_proc_max (int ignore)
+{
+  /* If user increments from 0 to 1, we'll take it and serialize. */
+  proc_max++;
+  /* If we're waiting for a process to die before doing something,
+     no need to wait any more. */
+  stop_waiting = 1;
+}
+
+static void
+decrement_proc_max (int ignore)
+{
+  if (proc_max > 1)
+         proc_max--;
+}
+
+
 /* Return the value of the number represented in STR.
    OPTION is the command line option to which STR is the argument.
    If the value does not fall within the boundaries MIN and MAX,



