Re: run parallel via qsub


From: Ole Tange
Subject: Re: run parallel via qsub
Date: Tue, 27 Jan 2015 23:28:32 +0100

On Tue, Jan 27, 2015 at 10:42 AM, drizzle <drizzle2013@foxmail.com> wrote:
> Hi, everyone. I am going to invoke GNU parallel on a multi-machine (nodes)
> cluster via "qsub".  a toy pbs script looks like:
> #!/bin/sh
> #PBS -l nodes=2:ppn=2
> (...skip..)
> cat parameters.txt | parallel bash MYSCRIPT.sh {}
>
> However, the GNU parallel appears to not accurately spawn across nodes (i.e.
> all MYSCRIPT.sh are invoked on the same nodes and consume cores/processes
> more than I request by PBS).

You have not told GNU Parallel how many jobs to run at once, so it
defaults to one job per core on the local machine. That is not what
you want.

If you request ppn=2, you should pass -j2, so that at most 2 jobs run
at a time.
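If you would rather not hard-code the number, it can be read from the node file. This is only a rough sketch: it assumes the node file lists each host once per granted slot and that `hostname` prints the name exactly as it appears in that file. `slots_on_node` is a made-up helper, not part of GNU Parallel or PBS.

```shell
# Hypothetical helper: count the lines in a PBS node file that name a host.
# Assumes the node file has one line per granted slot (ppn).
slots_on_node() {
    # $1 = node file, $2 = host name exactly as written in the node file
    grep -c -x -F -e "$2" "$1"
}

# Only run inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    JOBS=$(slots_on_node "$PBS_NODEFILE" "$(hostname)")
    # Fall back to 1 if the host name did not match; -j0 would mean
    # "as many jobs as possible", which is the opposite of what we want.
    [ "$JOBS" -gt 0 ] || JOBS=1
    # With ppn=2 this is the same as: parallel -j2 bash MYSCRIPT.sh {}
    cat parameters.txt | parallel -j"$JOBS" bash MYSCRIPT.sh {}
fi
```

Note that this only runs jobs on the machine where the job script starts; it does not reach the second node.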

> Following the tutorial, I add "-S" option and
> manually specify the hosts and cores. Now the command looks like:
> (...skip...)
> HOSTS=`cat $PBS_NODEFILE | uniq -c | awk 'BEGIN{OFS=""}{print
> $1,"/",$2}'|tr '\n' ','|sed 's/,$/ /'`
> cat parameters.txt | parallel -S $HOSTS MYSCRIPT.sh {}

It just so happens that you can use PBS_NODEFILE directly:

    cat parameters.txt | parallel --slf $PBS_NODEFILE -j2 MYSCRIPT.sh {}

(I would love to say that is due to the designer of GNU Parallel
having done this on purpose - but it is actually just a happy
coincidence that the file format is compatible).
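For comparison, here is a tidier version of the HOSTS pipeline, for the cases where an explicit -S list is wanted anyway. It is a sketch under assumptions: the node file has one line per granted slot, the host names in the comment are invented, and `nodefile_to_sshlogins` is a hypothetical helper name.

```shell
# Hypothetical helper: turn a PBS node file (one line per granted slot) into
# the "slots/host,slots/host" form that parallel -S accepts.
nodefile_to_sshlogins() {
    # sort first, so that uniq -c also counts hosts whose lines are not adjacent
    sort "$1" | uniq -c |
        awk '{ printf "%s%s/%s", sep, $1, $2; sep = "," } END { print "" }'
}

# Only run inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    # For nodes=2:ppn=2 the node file is assumed to look like this
    # (host names made up):
    #   node1
    #   node1
    #   node2
    #   node2
    # which the helper turns into: 2/node1,2/node2
    HOSTS=$(nodefile_to_sshlogins "$PBS_NODEFILE")
    cat parameters.txt | parallel -S "$HOSTS" bash MYSCRIPT.sh {}
fi
```

With --slf none of this is needed; the helper only makes explicit what the node file already says.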

> Seems this naive modification works for me. But my questions are:
>  (1) what is a more optimal setting to invoke parallel on cluster via PBS?
> Since the job management system (like PBS) have assigned cores for the job,
> It should be better to just use these resources.

I might be wrong, but I am pretty sure PBS does not assign cores. It
allocates a resource that is not tied to a specific core. But this
might just be splitting hairs.

>  (2) similar as the question 1. Does the parallel just spawn jobs on the
> cores assigned by PBS, or randomly spawn on all available cores?

Neither. It spawns a number of jobs. By default that number is the
number of cores on each machine. You adjust this with --jobs (-j).

Depending on your script, a single run of it might actually use 2
cores, in which case you need to divide --jobs by 2.
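The division can be sketched like this. `jobs_per_node` is a made-up helper, and the numbers passed to it are only examples; you have to know yourself how many cores one run of your script uses.

```shell
# Hypothetical helper: how many jobs fit on one node.
# $1 = slots per node (the ppn value), $2 = cores used by one run of the script
jobs_per_node() {
    n=$(( $1 / $2 ))
    # Integer division can give 0 when one run needs more cores than
    # were requested; run at least 1 job in that case.
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# Only run inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    # ppn=2 and a script that uses 2 cores per run gives -j1
    cat parameters.txt |
        parallel --slf "$PBS_NODEFILE" -j"$(jobs_per_node 2 2)" bash MYSCRIPT.sh {}
fi
```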


/Ole