reproduce-devel

[task #15737] slurm - openmpi - (PMIx+libevent+hwloc)


From: Boud Roukema
Subject: [task #15737] slurm - openmpi - (PMIx+libevent+hwloc)
Date: Wed, 29 Jul 2020 13:21:41 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0

URL:
  <https://savannah.nongnu.org/task/?15737>

                 Summary: slurm - openmpi - (PMIx+libevent+hwloc) 
                 Project: Reproducible paper template
            Submitted by: boud
            Submitted on: Wed 29 Jul 2020 05:21:39 PM UTC
         Should Start On: Wed 29 Jul 2020 12:00:00 AM UTC
   Should be Finished on: Wed 29 Jul 2020 12:00:00 AM UTC
                Category: None
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
        Percent Complete: 0%
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                  Effort: 0.00

    _______________________________________________________

Details:

Parallel processing over possibly non-shared memory is currently supported in
Maneage through openmpi, an implementation of MPI (the Message Passing
Interface, a standard rather than any particular piece of software). How should
we compile openmpi for reproducibility?

In practice, openmpi is normally used on a cluster or supercomputer where jobs
are submitted to, queued by, and run (or rejected) by a (hopefully
free-software) job/user manager such as Slurm:
https://slurm.schedmd.com/ .

The computer on which a job is run is (in general) not the one from which a
batch job is submitted to slurm, e.g. with 'srun'.

So roughly speaking, as I understand it:
* the user uses srun or sbatch to submit a script _X.sh_ to the slurm daemon
on the frontend H;
* slurm queues the request, and after some time may choose one or more
computers K and try to run _X.sh_ under the user's identity on those computers
K;
* the computers K each run _X.sh_, which can include a Maneage package that
compiles and runs a program P, which uses openmpi to ask the host computer and
Slurm which cpus/cores/threads it is allowed to use;
* the interaction between _X.sh_ on K -> openmpi on K (precompiled library) ->
host K + Slurm on H (and in some sense on K) is done through _PMIx_ (pmi or
pmi2), _libevent_, and _hwloc_;
* MPI means that data (arrays of bytes :)) can be sent/received among the
computers K.
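The submission and launch steps above might look like the following batch
script. This is only a hedged sketch: the script name _X.sh_ and the program P
come from the description above, while the job name, node counts, and paths are
hypothetical examples:

```shell
#!/bin/bash
# X.sh: hypothetical Slurm batch script, submitted from the frontend H
# with:  sbatch X.sh
# The #SBATCH directives request resources; slurm picks the computers K.
#SBATCH --job-name=maneage-mpi    # hypothetical job name
#SBATCH --nodes=2                 # run on two computers K
#SBATCH --ntasks-per-node=4      # four MPI ranks per node

# Inside the allocation, srun launches the MPI program P on all ranks;
# openmpi learns its rank/slot layout from slurm through PMIx.
srun ./P
```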

So the question is: for reproducibility, how much of the chain _openmpi ->
(pmi + libevent + hwloc)_ do we want compiled internally within Maneage, and
how much should be left to _autotools_-style automatic searching on the
machine for its preferred default libraries?
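For reference, recent openmpi releases (the 4.x series) expose exactly this
choice through configure flags. A hedged sketch of the two extremes (the prefix
and the external paths are hypothetical placeholders):

```shell
# Option 1: build openmpi's bundled copies (no dependence on whatever
# the cluster happens to have installed):
./configure --prefix="$HOME/.local/maneage" \
            --with-pmix=internal \
            --with-libevent=internal \
            --with-hwloc=internal

# Option 2: point openmpi at externally built (e.g. Maneage-built)
# copies of the three dependencies:
./configure --prefix="$HOME/.local/maneage" \
            --with-pmix=/path/to/pmix \
            --with-libevent=/path/to/libevent \
            --with-hwloc=/path/to/hwloc
```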

There is no point trying to include _slurm_ itself in Maneage: the whole point
of slurm is that the sysadmins managing a cluster use it to automatically
manage many users. It is system-level software that the user's script has to
interact with.
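Part of that interaction is visible to the user's script through environment
variables that slurm sets inside an allocation (SLURM_NTASKS,
SLURM_CPUS_PER_TASK, SLURM_JOB_NUM_NODES). A minimal sketch; the
single-task fallback defaults are my own choice, not anything slurm mandates:

```python
import os

def slurm_layout():
    """Read the task/cpu layout that slurm advertises to a job step.
    Falls back to single-task defaults when not running under slurm."""
    return {
        "ntasks": int(os.environ.get("SLURM_NTASKS", "1")),
        "cpus_per_task": int(os.environ.get("SLURM_CPUS_PER_TASK", "1")),
        "nodes": int(os.environ.get("SLURM_JOB_NUM_NODES", "1")),
    }

print(slurm_layout())
```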

Official guide: https://slurm.schedmd.com/mpi_guide.html#open_mpi

The official guide doesn't offer much in terms of practical, up-to-date
experience. Some URLs that seem useful:

https://bugs.schedmd.com/show_bug.cgi?id=5323

https://github.com/open-mpi/ompi/issues/5871

I'm trying some experiments, but any prior experience with this would help
speed things up. :)





    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/task/?15737>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/



