Re: Parallel Digest, Vol 52, Issue 10

From: Ole Tange
Subject: Re: Parallel Digest, Vol 52, Issue 10
Date: Tue, 26 Aug 2014 16:15:27 +0200

On Sat, Aug 16, 2014 at 6:12 PM, Mitchell Wyle wrote:
> Here are some other ideas to consider for flexibly and dynamically adding /
> removing servers:
> Consider implementing what Hadoop calls "speculative execution," where you
> send the same job to two or more servers and the first to complete the job
> wins.

Look at --halt. Using that you can make the first "failing" win. Maybe
we should extend it to include a value for
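
To make the "first to complete wins" idea concrete, here is a minimal Python sketch. It is not how GNU Parallel implements --halt; it only illustrates the pattern of starting the same job several times and stopping the rest as soon as one copy finishes. The names run_replica and first_to_finish, and the simulated delays, are hypothetical and exist only for this illustration.

```python
import concurrent.futures as cf
import threading
import time


def run_replica(name, delay, stop):
    """Hypothetical job: pretend to work for `delay` seconds unless told to stop."""
    deadline = time.monotonic() + delay
    while time.monotonic() < deadline:
        if stop.is_set():
            return None  # another replica already won, give up early
        time.sleep(0.01)
    return name


def first_to_finish(replicas):
    """Run every (name, delay) replica concurrently; return the first finisher's name."""
    stop = threading.Event()
    with cf.ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(run_replica, n, d, stop) for n, d in replicas]
        for fut in cf.as_completed(futures):
            result = fut.result()
            if result is not None:
                stop.set()  # ask the remaining replicas to stop
                return result
    return None


winner = first_to_finish([("slow-server", 1.0), ("fast-server", 0.05)])
print(winner)
```

The stop event plays the role of the "halt everything now" signal: the first copy to finish sets it, and the other copies notice and exit instead of running to completion.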

> Consider using aggressive timeouts for each job -- keep the jobs small and
> schedule very many of them to run; don't wait long for an individual one to
> be considered a failure.

Look at --timeout with a percentage value. This is useful if you know
your jobs take approximately the same amount of time, but do not know
how long that is in seconds. So --timeout 300% will kill any job that
runs longer than three times the median runtime, i.e. 200% longer
than the median.
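
The arithmetic behind a percentage timeout can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not GNU Parallel's actual code; the function name percent_timeout and the sample runtimes are assumptions made up for the example.

```python
import statistics


def percent_timeout(runtimes, percent):
    """Return the kill threshold in seconds: `percent` % of the median runtime."""
    median = statistics.median(runtimes)
    return median * percent / 100.0


# Hypothetical runtimes (in seconds) of jobs that have already finished.
runtimes = [8.0, 10.0, 12.0]

# 300% of the median (10 s) gives a 30 s limit, i.e. 200% longer than the median.
limit = percent_timeout(runtimes, 300)
print(limit)
```

With these sample numbers a job would be killed once it had run for more than 30 seconds, without anyone having to know in advance that a typical job takes about 10 seconds.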

> Consider "heart beats" of some kind where parallel on remote servers respond
> to the parallel dispatching jobs that they are available

The only way I can imagine heartbeats working is by having some sort
of daemon monitoring the servers. As discussed elsewhere, that could
be implemented, but it needs further discussion.

