Re: fault tolerance, retry task on different node, recovery orientation?
From: Ole Tange
Subject: Re: fault tolerance, retry task on different node, recovery orientation?
Date: Thu, 29 May 2014 13:56:26 +0200

On Tue, May 27, 2014 at 7:22 PM, Mitchell Wyle <mfw@wyle.org> wrote:
> If a manifest job fails or times out, I want to re-try the job on a
> different host and continue until all the tasks complete.
>
> Can gnu parallel be used in such a way that it retries jobs on different
> hosts?
How does that differ from what --retries does now?

    --retries n
        If a job fails, retry it on another computer. Do
        this n times. If there are fewer than n computers
        in --sshlogin GNU parallel will re-use the
        computers. This is useful if some jobs fail for no
        apparent reason (such as network failure).

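
To make the quoted behaviour concrete, here is a toy model in Python. It is only a sketch of how the text above reads, not GNU parallel's actual code. The names run_with_retries and attempt, and the host names, are hypothetical. It assumes that hosts are tried in list order, wrapping around when there are fewer than n of them, and that n counts the total number of tries. The man page text could also be read as n retries after the first try.

```python
from itertools import cycle


def run_with_retries(job, hosts, retries, attempt):
    """Toy model of the quoted --retries text (not GNU parallel's code).

    attempt(job, host) is a hypothetical callback returning True on success.
    The job is tried on the hosts in turn, at most `retries` times in total,
    re-using hosts when the list is shorter than `retries`.
    Returns (host, tries) on success, or (None, tries) if every try failed.
    """
    if not hosts:
        raise ValueError("need at least one host")
    host_iter = cycle(hosts)  # wraps around, so hosts get re-used
    tries = 0
    for tries in range(1, retries + 1):
        host = next(host_iter)
        if attempt(job, host):
            return host, tries
    return None, tries


if __name__ == "__main__":
    down = {"node1"}  # assumption: node1 is unreachable

    def attempt(job, host):
        # Stand-in for really running the job over ssh.
        return host not in down

    print(run_with_retries("task-7", ["node1", "node2"], 3, attempt))
```

In this model a job that fails on node1 is moved to node2, and a job that fails everywhere is given up on after n tries.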
/Ole