Re: New behavior proposal --halt -% with job killing


From: Martin d'Anjou
Subject: Re: New behavior proposal --halt -% with job killing
Date: Thu, 23 Apr 2015 11:10:04 -0400

On Thu, Apr 23, 2015 at 9:56 AM, Rasmus Villemoes <address@hidden> wrote:
On Thu, Apr 23 2015, Ole Tange <address@hidden> wrote:

> On Tue, Apr 21, 2015 at 9:42 PM, Martin d'Anjou
> <address@hidden> wrote:
>> Hi Ole,
>>
>> When I run parametrized tests and the failure rate exceeds a certain
>> threshold, I need to kill the remaining running tests and abort with a
>> non-zero exit status, without starting new tests.
>>
>> GNU Parallel supports --halt with a positive percentage; I was thinking it
>> could support this feature when given a negative percentage.
>>
>> Does that sound good?
>
> So the current --halt X% works similarly to --halt 1. As I understand it,
> you want a --halt with % that works similarly to --halt 2.
>
> I am a bit reluctant to let -X% be similar to 2, because it is not
> symmetrical with --halt -1 and --halt -2 (which look at successes).
>
> Another idea is to let even percentages be similar to --halt 2, while
> odd percentages behave like --halt 1:
>
> 1-99%  (odd numbers) If val% of the jobs fail, and at least 3 have failed:
> Do not start new jobs, but complete the running jobs including cleanup.
> The exit status will be the exit status from the last failing job.
>
> 2-100%  (even numbers) If val% of the jobs fail, and at least 3 have failed:
> Kill off all jobs immediately and exit without cleanup. The exit status
> will be the exit status from the failing job.
>
> I am not really sure if any of those adhere to the Principle of Least
> Astonishment.

Hm, I don't think that would be particularly intuitive. How about
allowing --halt to be given multiple times, with each invocation overriding
certain aspects of the behaviour requested so far? Then one could do

--halt 25% --halt 2

with --halt 25% --halt 1 being the same as --halt 25% by
itself. Encountering --halt 0 would restore the default behaviour.

I really prefer when any command-line option can be overridden (or reset
to the default) by a later option - that makes it a lot easier to write
wrapper scripts which contain a few default flags but allow the user to
tweak these by just appending options. For example in Martin's case, one
could have the above --halt 25% --halt 2 hardcoded in whatever script
runs the tests, and the script could then support a --no-halt-on-error
which would just append --halt 0 to the parallel options (instead of the
error-prone process of trimming the current list of options, where one
has to be careful with --halt=25% versus --halt 25% etc.).
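
As a rough illustration of that override idea, the last-one-wins logic could
look like the Python sketch below. This is not GNU Parallel's actual option
parsing: resolve_halt is a hypothetical helper, and the rule that a percentage
keeps a previously chosen mode (rather than resetting it) is an assumption.

```python
def resolve_halt(values):
    """Fold repeated --halt values, in order, into (threshold_percent, mode).

    Hypothetical sketch: mode 0 means never halt, 1 means stop starting new
    jobs, 2 means kill running jobs too. threshold_percent is None when no
    percentage has been requested.
    """
    threshold = None
    mode = 0
    for value in values:
        if value.endswith("%"):
            # A percentage overrides only the threshold. On its own it
            # behaves like mode 1, as "--halt 25%" does today.
            threshold = int(value[:-1])
            if mode == 0:
                mode = 1
        elif value == "0":
            # --halt 0 restores the default behaviour entirely.
            threshold, mode = None, 0
        else:
            # A plain number overrides only the mode.
            mode = int(value)
    return threshold, mode


print(resolve_halt(["25%", "2"]))       # (25, 2)
print(resolve_halt(["25%", "2", "0"]))  # (None, 0)
```

A wrapper script could then hardcode ["25%", "2"] and simply append "0" when
the user asks for --no-halt-on-error.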

> Can we find a better way to express the meaning and which is not
> surprising to the user?

Not sure my proposal is better - just thinking out loud.

Rasmus


So today, there is an option to control when and how to kill jobs after some have succeeded:
--halt [-1,-2]

And there is an option to control when and how to kill jobs after some have failed:
--halt [0,1,2,1-99%]

It seems that deciding when to halt (the halt condition) and deciding how to handle the jobs once that condition is met are two separate questions: the "when" and the "how". Currently, --halt attempts to answer both. The same situation presents itself when other "halt" conditions are met, such as kill -TERM, timeout, or memfree (are there others?). There are also two types of jobs to consider: the running ones and the pending ones. You could come up with many options here:

When to halt:
--halt [condition for halt]
--timeout [condition for halt is an amount of time]
--memfree [condition for halt is an amount of memory]
kill -TERM [condition for halt is the signal]

How to handle jobs after a halt:
--halt-job-handling [killpending[,killrunning]]
--timeout-job-handling [killpending[,killrunning]]
And so on. Users could specify both killpending and killrunning if they wanted both.
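
To make the when/how split a bit more concrete, here is a minimal Python
sketch. Everything in it is hypothetical and for illustration only: HaltPolicy,
should_halt and apply_halt are made-up names, not part of GNU Parallel, and the
floor of 3 failures is taken from the "minimum 3" wording quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class HaltPolicy:
    # "When": the halt condition.
    fail_percent: float
    min_failures: int = 3
    # "How": what happens to each kind of job once the condition is met.
    kill_pending: bool = True
    kill_running: bool = False


def should_halt(policy, failed, total):
    """Return True once the failure threshold is reached."""
    if total == 0:
        return False
    enough_failures = failed >= policy.min_failures
    over_threshold = 100.0 * failed / total >= policy.fail_percent
    return enough_failures and over_threshold


def apply_halt(policy, running, pending):
    """Return (to_kill, still_running, still_pending) after a halt."""
    to_kill = list(running) if policy.kill_running else []
    still_running = [] if policy.kill_running else list(running)
    still_pending = [] if policy.kill_pending else list(pending)
    return to_kill, still_running, still_pending


# Equivalent of "--halt 25%" plus "--halt-job-handling killpending,killrunning"
policy = HaltPolicy(fail_percent=25, kill_pending=True, kill_running=True)
if should_halt(policy, failed=3, total=10):
    print(apply_halt(policy, running=["job4", "job5"], pending=["job6"]))
```

The point of the sketch is only that the condition and the job handling can be
chosen independently; a timeout or memfree condition could reuse the same
apply_halt step.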

Or, you could use an explicit "plus" sign to mean halt and kill all running and pending jobs:
--halt +1-99%

Just some ideas.

Cheers,
Martin
