Re: Properly scheduling multi-threaded/process jobs

From: Ole Tange
Subject: Re: Properly scheduling multi-threaded/process jobs
Date: Sun, 31 Aug 2014 09:59:34 +0200

On Tue, Aug 26, 2014 at 6:52 PM, Douglas A. Augusto <> wrote:
> On 26/08/2014 at 15:55,
> Ole Tange <> wrote:
>> There is no such way per job, but there is per invocation of GNU
>> Parallel using the % in --jobs:
>> I have never in practice seen or even heard of a situation where the
>> number of cores used depended on the arguments, so I don't think it
>> happens very often for GNU Parallel use cases.
> I'm facing this situation right now. I've a 40-core machine and lots of MPI
> jobs ranging from 1 to 32 processes each. I'd like to run simultaneously as
> many jobs as possible, but they cannot exceed 40 slots/processes.

I believe that if we go down that road, the first step will be CPU allocation,
then RAM allocation, then disk I/O allocation, then network
allocation, then GPU allocation, and then generic resource allocation,
where you specify how much of a given resource each server has and how
much each job takes up.
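
To make that kind of bookkeeping concrete, here is a minimal Python sketch of the CPU-slot case only. It is not GNU Parallel code and does not reflect any existing option; the function name `schedule`, the job tuples, and the durations are all hypothetical. It simulates a first-fit policy: each job declares how many slots it needs, and a job starts as soon as enough slots are free.

```python
import heapq


def schedule(jobs, total_slots):
    """Simulate first-fit, slot-aware scheduling (illustrative only).

    jobs: list of (name, slots_needed, duration) tuples in queue order.
    Returns a dict mapping each job name to its simulated start time.
    """
    for name, need, _ in jobs:
        if need > total_slots:
            raise ValueError(f"{name} needs {need} slots, only {total_slots} exist")

    pending = list(jobs)
    running = []  # min-heap of (finish_time, slots_held)
    free = total_slots
    now = 0
    starts = {}

    while pending:
        # Start every pending job that fits into the currently free slots.
        still_waiting = []
        for name, need, duration in pending:
            if need <= free:
                free -= need
                starts[name] = now
                heapq.heappush(running, (now + duration, need))
            else:
                still_waiting.append((name, need, duration))
        pending = still_waiting

        if pending:
            # Nothing else fits: jump to the next completion and free its slots.
            now, released = heapq.heappop(running)
            free += released

    return starts


if __name__ == "__main__":
    # Hypothetical workload on a 40-slot machine.
    jobs = [("a", 32, 10), ("b", 16, 5), ("c", 8, 5), ("d", 1, 2)]
    print(schedule(jobs, 40))
```

Even this toy version has to make policy decisions (here, small jobs may overtake a large one that is waiting), and every additional resource type multiplies them, which is exactly what real queuing systems already deal with.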

>> If many users start to request the option and can argue why a full
>> cluster queue system is not the right way to go for them, I might be
>> persuaded to consider it.
> Nowadays many tools exploit some sort of parallelism, be it multi-thread or
> multi-process. Handling those tools properly would extend the applicability of
> GNU Parallel. GNU Parallel is straightforward to use and has so many useful
> features that switching to a full-fledged queue system would overcomplicate
> things and the user would lose all of GNU Parallel's nice features.

GNU Parallel was never intended to be a fully featured job queuing
system - it seems there are already plenty of good tools for that.

So while I am happy to learn that you appreciate GNU Parallel's
features, I would rather not expand GNU Parallel into a full queuing
system. Instead, I would prefer to see an extension that lets GNU
Parallel interface with real queuing systems that have proper
resource management (such as Torque). That way we can re-use
development already done instead of re-inventing the wheel.

