
Re: sshloginfile changes like -j procfile


From: Ole Tange
Subject: Re: sshloginfile changes like -j procfile
Date: Mon, 27 Jun 2011 00:24:55 +0200

On Thu, Jun 23, 2011 at 2:47 AM, Jon Wilson <parallel@wilsonjc.us> wrote:

>  1) submit jobs that will take several hours to run, during which time I
> won't have anything else in particular to do
>  2) Go work on bringing cluster nodes back up
>  3) Change ~/.parallel/sshloginfile
>  4) GNU parallel notices that the file has changed, just like if I were
> using -j procfile, and immediately starts jobs on those additional nodes.
>
> I am using parallel 20110522.  Is this behavior already implemented?  If
> not, I would like to request this feature.

That is currently not implemented.

A workaround for you may be to put all the nodes in
~/.parallel/sshloginfile and use --retries to retry a job if it fails
on a node (e.g. because the node is not up). Set --retries to
number_of_nodes_down+1, so that even if GNU Parallel retries on other
nodes that are down, it keeps retrying until it reaches one that is
up.

This abuses --retries, though: if a job genuinely _does_ fail, that
job will be run number_of_nodes_down+1 times.
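
As a rough sketch of the workaround, assuming 3 nodes are down right
now (long_job.sh and inputs.txt are made-up placeholder names, not
anything from your setup), the invocation could be built like this:

```shell
#!/bin/sh
# Assumption: 3 of the nodes listed in the sshloginfile are currently down.
nodes_down=3

# One more try than there are dead nodes, so a job can still reach a live one.
retries=$((nodes_down + 1))

# long_job.sh and inputs.txt are hypothetical placeholders.
# The command is only printed here; remove the leading "echo" to run it.
echo parallel --retries "$retries" \
     --sshloginfile "$HOME/.parallel/sshloginfile" \
     ./long_job.sh {} :::: inputs.txt
```

The number of down nodes has to be estimated by hand before starting,
so it is safer to err on the high side.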

If you still want the feature, file a wishlist item at
https://savannah.gnu.org/bugs/?group=parallel

/Ole


