guix-patches

[bug#39588] gnu: Add mpich, scalapack-mpich, mumps-mpich, pt-scotch-mpic


From: Ludovic Courtès
Subject: [bug#39588] gnu: Add mpich, scalapack-mpich, mumps-mpich, pt-scotch-mpich, python-mpi4py-mpich
Date: Fri, 21 Feb 2020 12:32:44 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Hi,

I actually managed to reproduce it with a minimal test case (attached):

--8<---------------cut here---------------start------------->8---
$ guix build -f mpich-test.scm
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
The following derivation will be built:
   /gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv
building /gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv...
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
/gnu/store/pkbg6kllx5xb8vb6kwrwm7qm4rnpmhia-mpich-3.3.2/bin/mpicc: line 215: expr: command not found
Invalid error code (-2) (error ring index 127 invalid)
INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373
Invalid error code (-2) (error ring index 127 invalid)
INTERNAL ERROR: invalid error code fffffffe (Ring Index out of range) in MPID_nem_tcp_init:373
Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(586)..............: 
MPID_Init(224).....................: channel initialization failed
MPIDI_CH3_Init(105)................: 
MPID_nem_init(324).................: 
MPID_nem_tcp_init(175).............: 
MPID_nem_tcp_get_business_card(401): 
MPID_nem_tcp_init(373).............: gethostbyname failed, localhost (errno 0)
Backtrace:
           1 (primitive-load "/gnu/store/iykxzg1n018sigd4c23kx1c4ngz?")
In guix/build/utils.scm:
    652:6  0 (invoke _ . _)

guix/build/utils.scm:652:6: In procedure invoke:
Throw to key `srfi-34' with args `(#<condition &invoke-error [program: "mpiexec" arguments: ("-np" "2" "/gnu/store/8i1dci1wxd6c0q6a2cz4kgb8adfk8rrz-mpi-init") exit-status: 15 term-signal: #f stop-signal: #f] 7ffff6022f40>)'.
builder for `/gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv' failed with exit code 1
build of /gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv failed
View build log at '/var/log/guix/drvs/rg/r7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv.bz2'.
guix build: error: build of `/gnu/store/rgr7wnxbgxnp6s96zcnb4ryn3rqfcl7b-mpi-init.drv' failed
--8<---------------cut here---------------end--------------->8---

The same program outside the container works just fine:

--8<---------------cut here---------------start------------->8---
$ guix environment --ad-hoc mpich -- mpiexec -np 2 "/gnu/store/8i1dci1wxd6c0q6a2cz4kgb8adfk8rrz-mpi-init"
np = 2, rank = 0
np = 2, rank = 1
--8<---------------cut here---------------end--------------->8---

‘MPL_get_sockaddr’ uses ‘getaddrinfo’ for host name lookup.
Interestingly, ‘getaddrinfo’ fails in the build environment when passed
the flags that ‘MPL_get_sockaddr’ uses:

--8<---------------cut here---------------start------------->8---
(computed-file "getaddrinfo"
               #~(pk #$output
                     (getaddrinfo "localhost" #f
                                  (logior AI_ADDRCONFIG AI_V4MAPPED)
                                  AF_INET
                                  SOCK_STREAM
                                  IPPROTO_TCP)))
--8<---------------cut here---------------end--------------->8---

However, if you comment out AF_INET, SOCK_STREAM, and IPPROTO_TCP, it works.

Now we need to see why the ‘ai_family’ hint is causing trouble in
glibc, and perhaps in parallel try to work around it in MPICH…

Ludo’.

PS: I’ll be mostly away from keyboard in the coming days.

(use-modules (guix) (gnu))

(define code
  (plain-file "mpi.c" "
#include <assert.h>
#include <stdio.h>
#include <mpi.h>

int main (int argc, char *argv[]) {
  int err, np, rank;
  err = MPI_Init (&argc, &argv);
  assert (err == 0);
  err = MPI_Comm_size(MPI_COMM_WORLD, &np);
  assert (err == 0);
  err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  assert (err == 0);
  printf (\"np = %i, rank = %i\\n\", np, rank);
  return 0;
} "))

(define toolchain (specification->package "gcc-toolchain"))
(define mpich (specification->package "mpich"))

(computed-file "mpi-init"
               (with-imported-modules '((guix build utils))
                 #~(begin
                     (use-modules (guix build utils))

                     (setenv "PATH"
                             (string-append #$(file-append toolchain "/bin") ":"
                                            #$(file-append mpich "/bin")))
                     (setenv "CPATH" #$(file-append mpich "/include"))
                     (setenv "LIBRARY_PATH"
                             (string-append #$(file-append mpich "/lib") ":"
                                            #$(file-append toolchain "/lib")))
                     (invoke "mpicc" "-o" #$output "-Wall" "-g"
                             #$code)

                     ;; Run the MPI code in the build environment.
                     (invoke "mpiexec" "-np" "2" #$output))))
