Re: New behavior proposal --halt -% with job killing


From: Ole Tange
Subject: Re: New behavior proposal --halt -% with job killing
Date: Sat, 25 Apr 2015 09:25:01 +0200

On Thu, Apr 23, 2015 at 5:10 PM, Martin d'Anjou
<address@hidden> wrote:

Good summary:

> You could come up with many options here:
>
> When to halt:
> --halt [condition for halt]
> --timeout [condition for halt is an amount of time]
> --memfree [condition for halt is an amount of memory]
> kill -TERM [condition for halt is the signal]

I think our solution should make it possible to extend this list. E.g.
maybe it will be possible to detect whether the remote job failed or
the network connection to the remote server failed.

--memfree is special, however: if a job gets killed due to low memory,
it is retried indefinitely.

> How to handle jobs after a halt:
> --halt-job-handling [killpending[,killrunning]]
> --timeout-job-handling [killpending[,killrunning]]
> and so on. Users could use both kills if they wanted both.

And --kill-TERM-job-handling

killrunning will always imply killpending, but the opposite is not the
case, right?
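
To make the implication concrete, here is a minimal Python sketch of how
such a value could be interpreted. It is only an illustration: the
option values come from the proposal quoted above, the function name
parse_job_handling is hypothetical, and none of this exists in GNU
parallel.

```python
def parse_job_handling(spec):
    """Interpret a hypothetical 'killpending[,killrunning]' value.

    Returns a dict saying which groups of jobs would be killed.
    """
    allowed = {"killpending", "killrunning"}
    words = {w.strip() for w in spec.split(",") if w.strip()}
    unknown = words - allowed
    if unknown:
        raise ValueError(f"unknown job handling: {sorted(unknown)}")
    kill_running = "killrunning" in words
    # killrunning implies killpending; the reverse does not hold.
    kill_pending = kill_running or "killpending" in words
    return {"killpending": kill_pending, "killrunning": kill_running}
```

Under this reading, "killrunning" and "killpending,killrunning" mean the
same thing, while "killpending" alone lets running jobs finish.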

--retries should be thrown into the mix, too.

I can easily think of real life situations where the handling of a
death due to --memfree is different from a --timeout, and I think the
current behaviour (retrying indefinitely) is correct. But do we have a
real life situation where we want --halt-job-handling to be different
from --timeout-job-handling given that we have --retries?
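
The distinction can be sketched as a small decision function. This is a
rough Python model, not GNU parallel's code: the cause labels and the
name should_retry are hypothetical, and it assumes that a job killed by
--timeout counts as an ordinary failure limited by --retries.

```python
def should_retry(cause, attempts, retries):
    """Decide whether a dead job goes back on the queue.

    cause    -- why the job died: "memfree", "timeout" or "failed"
                (hypothetical labels)
    attempts -- how many times the job has been run so far
    retries  -- the value given to --retries (total runs allowed)
    """
    if cause == "memfree":
        # Killed due to low memory: always requeue, regardless of --retries.
        return True
    # Assumption: timeouts and ordinary failures share the --retries budget.
    return attempts < retries
```

With retries set to 3, a job that timed out on its third run is given
up on, while a job killed for low memory is queued again no matter how
often that has happened.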

I am reluctant to put in 3 options that are extremely rarely used
(there are plenty of options as it is, and testing becomes harder the
more combinations need to be tested).

> Or, you could use an explicit "plus" sign to mean halt and kill all running
> and pending jobs:
> --halt +1-99%

POLA (the principle of least astonishment) would say that
--halt +1-99% == --halt 1-99%


/Ole


