
fault tolerance, retry task on different node, recovery orientation?


From: Mitchell Wyle
Subject: fault tolerance, retry task on different node, recovery orientation?
Date: Tue, 27 May 2014 10:22:53 -0700

I am slowly working through the tutorials and looking at the --retries and --filter-hosts options.

I would like to engineer a robust, recovery-oriented parallel job that retries a task on "the next" (usually different) host, round-robin, when the task times out or fails.

I have 10 hosts that mount the same NFS file system, and I want to dispatch manifests of tens of thousands of files to those hosts round-robin style. If a manifest job fails or times out, I want to retry it on a different host and continue until all the tasks complete.
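
The behavior described above can be sketched in plain bash. This is only an illustration of the round-robin retry logic, not GNU Parallel usage, and it runs tasks one at a time rather than in parallel. The host names, the `run_on_host` stub, and the `process_manifest` command mentioned in its comment are hypothetical placeholders.

```bash
#!/usr/bin/env bash
# Sketch: round-robin dispatch that retries a failed task on the next host.
# HOSTS and run_on_host are hypothetical placeholders, not part of any tool.

HOSTS=(host01 host02 host03)   # assumed host names

# Hypothetical runner. A real version might wrap the remote call in a
# time limit, for example:
#   timeout 3600 ssh "$host" process_manifest "$manifest"
# where process_manifest is an assumed command on the remote side.
# As written, this stub only simulates: host02 always fails.
run_on_host() {
    local host=$1 manifest=$2
    [[ $host != host02 ]]
}

# Try a manifest on its starting host, then on each following host in
# round-robin order, until one succeeds or every host has been tried once.
# Prints the host that succeeded; returns 1 if all hosts failed.
dispatch_with_failover() {
    local start=$1 manifest=$2
    local n=${#HOSTS[@]} i host
    for (( i = 0; i < n; i++ )); do
        host=${HOSTS[(start + i) % n]}
        if run_on_host "$host" "$manifest"; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# Read manifest paths from stdin, one per line, rotating the starting host
# for each new manifest. Returns non-zero if any manifest failed everywhere.
dispatch_all() {
    local idx=0 failed=0 manifest host
    while IFS= read -r manifest; do
        if host=$(dispatch_with_failover "$idx" "$manifest"); then
            echo "ok $manifest $host"
        else
            echo "FAILED $manifest"
            failed=1
        fi
        idx=$(( idx + 1 ))
    done
    return "$failed"
}
```

With a real runner in place, the manifests could be fed in with something like `find . -type f -name "*manifest*" | dispatch_all`.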

Can GNU Parallel be used in such a way that it retries jobs on different hosts?

find -type f -name "*manifest*" | parallel . . .

Thanks in advance.


