
Re: GNU Parallel host sticky jobs - a first proposal


From: Michel Courtine
Subject: Re: GNU Parallel host sticky jobs - a first proposal
Date: Wed, 12 Nov 2014 19:54:39 +0100

Hi Ole!

Thanks for your response. In my particular case, I did not care much about core detection, since my script handles core usage optimization for our build tasks.
You are right: the number of cores became mandatory, and that makes the syntax more cryptic than it should be.
I completely agree on the wording 'label' rather than 'tags', by the way, and even if 'hostgroup' sounds less fancy, it is very clear.
I also back your redefinition of the syntax, and I'm happy you see this as an option of interest.
None of your competitors implement anything similar, although it is as simple as it is powerful.
I should be able to try your commit this weekend and will let you know whether it still covers my particular needs in the same way.

Thank you!

M

On Tue, Nov 11, 2014 at 5:30 AM, Ole Tange <ole@tange.dk> wrote:
First of all thanks for putting effort into the idea.

On Mon, Nov 10, 2014 at 8:54 PM, Michel Courtine <michaK@ivoltage.me> wrote:

> I guess you must be busy with other things at the moment. So I went along
> and forked the source code to make the necessary changes and make my
> solution work.

So I did some rudimentary tests on the code. These fail:

    parallel echo ::: a@b c@d
    parallel -S "3//usr/bin/ssh localhost" echo ::: a
    cd /; parallel -S "3/usr/bin/ssh localhost" echo ::: a

At the very least the changes must not break current behaviour.

This should either complete or fail:

    parallel -S 2/l/localhost echo {}\;hostname ::: a@l b@k

I would find it reasonable that since there is no sshlogin for 'k', it
will just be run on the first available sshlogin.

From your description I cannot see how I can ask for a job to be run
on the machine 'foo' without having to specify the number of cpus on
'foo'. Normally you will let GNU Parallel figure out the number of
cpus on remote systems.

I can see you intend to be able to give an sshlogin multiple labels.
It seems your code also supports putting one label on multiple sshlogins.
And are we then really defining something similar to host groups? If
we are, we should consider an already known syntax for host groups.

I am not fond of calling the grouping 'tags', because we already have
the --tag option that does something completely different. I would
prefer hostgroup or label or something similar.

Inspired by your code, I have committed [2aeb79b]. It has changed the
syntax of the SSHLogin to:

    @group1+group2/ncpu//path/to/ssh server

Because of the starting @ it does not break the functionality of the above examples.
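
To make the shape of that string concrete, here is a minimal sketch in Python of how it could be split into its parts. This is not the code from the commit (GNU Parallel is written in Perl); the names `SshLogin` and `parse_sshlogin` are hypothetical, and the sketch assumes the groups end at the first `/` and that ncpu is a plain integer followed by `/`.

```python
import re
from typing import NamedTuple, Optional, Tuple


class SshLogin(NamedTuple):
    """Hypothetical container for the parts of an sshlogin string."""
    groups: Tuple[str, ...]   # hostgroup labels given after the leading '@'
    ncpu: Optional[int]       # None means "let the tool detect it"
    login: str                # whatever follows: ssh command and/or server


def parse_sshlogin(spec: str) -> SshLogin:
    """Split '@group1+group2/ncpu/login' into its parts (illustrative only)."""
    groups: Tuple[str, ...] = ()
    if spec.startswith("@"):
        # Assumption: the group list runs up to the first '/'.
        group_part, _, spec = spec[1:].partition("/")
        groups = tuple(g for g in group_part.split("+") if g)

    ncpu = None
    match = re.match(r"(\d+)/(.*)$", spec, re.S)
    if match:
        # An integer followed by '/' is taken as the number of cpus.
        ncpu = int(match.group(1))
        spec = match.group(2)

    return SshLogin(groups, ncpu, spec)


if __name__ == "__main__":
    print(parse_sshlogin("@group1+group2/4//path/to/ssh server"))
    print(parse_sshlogin("3//usr/bin/ssh localhost"))
    print(parse_sshlogin("@grp1/serverA"))
```

Because the group list is only read when the string starts with @, an existing sshlogin such as `3//usr/bin/ssh localhost` passes through the sketch with no groups attached.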

I really do not like that you cannot have @ in your argument, so I
have instead created the option --hostgroups. If you use that, then @
takes on the magical function of determining which hostgroups are to
be used for this arg:

    parallel --hostgroup -S @grp1/serverA -S @grp2+grp3/serverB echo ::: arg@grp1

I think it would also be reasonable that a job could be in multiple hostgroups:

    parallel --hostgroup -S @grp1/serverA -S @grp2/serverB echo ::: arg@grp1+grp2

The sshlogin string (the stuff after ncpu/) is now also a hostgroup:

    parallel --hostgroup -S serverA -S serverB echo ::: arg@serverA+serverB
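
The selection rule described in this mail can be sketched in Python as follows. This is only an illustration, not the committed implementation: `split_arg` and `eligible_hosts` are hypothetical names, the hosts are modelled as a plain dict from sshlogin string to its set of hostgroups, and the fallback to every host when no group matches is the behaviour suggested above as reasonable, not something confirmed for the commit.

```python
def split_arg(arg):
    """Split 'value@grp1+grp2' at the last '@' into (value, {groups})."""
    value, sep, group_part = arg.rpartition("@")
    if not sep:
        # No '@' at all: the whole string is the value, no groups requested.
        return arg, set()
    return value, set(group_part.split("+"))


def eligible_hosts(arg, hosts):
    """Return (value, sshlogins that may run the job).

    hosts maps each sshlogin string to the set of hostgroups it was given.
    The sshlogin string itself also counts as a hostgroup.
    """
    value, wanted = split_arg(arg)
    if not wanted:
        return value, list(hosts)
    matches = [
        login for login, groups in hosts.items()
        if wanted & (groups | {login})
    ]
    # Assumed fallback: an unknown group means "any available sshlogin".
    return value, matches or list(hosts)


if __name__ == "__main__":
    hosts = {"serverA": {"grp1"}, "serverB": {"grp2", "grp3"}}
    for arg in ["arg@grp1", "arg@grp1+grp2", "arg@serverB", "b@k", "plain"]:
        print(arg, "->", eligible_hosts(arg, hosts))
```

A job naming several hostgroups is eligible for the union of the matching sshlogins, which is one possible reading of "a job could be in multiple hostgroups".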

I haven't really thought through what it would be reasonable for
--onall to do:

    parallel --onall -S @grp1/serverA -S @grp2+grp3/serverB echo ::: arg@grp1+grp2


/Ole



--
Michel Courtine | CEO | iVoltage
+33 6 189 354 89 | michaK@ivoltage.me
http://ivoltage.me
