bug-guix

bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27


From: Ludovic Courtès
Subject: bug#31925: 'guix substitutes' sometimes hangs on glibc 2.27
Date: Wed, 04 Jul 2018 18:58:30 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

(+Cc: Andy as the ultimate authority for all these things.  :-))

address@hidden (Ludovic Courtès) writes:

> (let loop ((files files)
>            (n 0))
>   (match files
>     ((file . tail)
>      (call-with-input-file file
>        (lambda (port)
>          (call-with-decompressed-port 'gzip port
>            (lambda (port)
>              (let loop ()
>                (unless (eof-object? (get-bytevector-n port 777))
>                  (loop)))))))
>      ;; (pk 'loop n file)
>      (display ".")
>      (loop tail (+ n 1)))))

One problem I’ve noticed is that the child process that
‘call-with-decompressed-port’ spawns gets stuck trying to acquire the
allocation lock:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f9fd8d5cb25 in __GI___pthread_mutex_lock (mutex=0x7f9fd91b3240 <GC_allocate_ml>) at ../nptl/pthread_mutex_lock.c:78
#2  0x00007f9fd8f8ef8f in GC_call_with_alloc_lock (address@hidden <do_copy_weak_entry>, address@hidden) at misc.c:1929
#3  0x00007f9fd92b1270 in copy_weak_entry (dst=0x7ffe4b9a0d70, src=0x759ed0) at weak-set.c:124
#4  weak_set_remove_x (closure=0x8850c0, pred=0x7f9fd92b0440 <eq_predicate>, hash=3944337866184184181, set=0x70cf00) at weak-set.c:615
#5  scm_c_weak_set_remove_x (address@hidden<weak-set 756df0>, raw_hash=<optimized out>, address@hidden <eq_predicate>, address@hidden) at weak-set.c:791
#6  0x00007f9fd92b13b0 in scm_weak_set_remove_x (set=#<weak-set 756df0>, address@hidden<port 2 8850c0>) at weak-set.c:812
#7  0x00007f9fd926f72f in close_port (port=#<port 2 8850c0>, explicit=<optimized out>) at ports.c:884
#8  0x00007f9fd92ad307 in vm_regular_engine (thread=0x7f9fd91b3240 <GC_allocate_ml>, vp=0x7adf30, registers=0x0, resume=-657049556) at vm-engine.c:786
#9  0x00007f9fd92afb37 in scm_call_n (proc=<error reading variable: ERROR: Cannot access memory at address 0xd959b030>0x7f9fd959b030, address@hidden, address@hidden) at vm.c:1257
#10 0x00007f9fd9233017 in scm_primitive_eval (exp=<optimized out>, address@hidden<error reading variable: ERROR: Cannot access memory at address 0xd5677cf8>0x855280) at eval.c:662
#11 0x00007f9fd9233073 in scm_eval (exp=<error reading variable: ERROR: Cannot access memory at address 0xd5677cf8>0x855280, address@hidden<error reading variable: ERROR: Cannot access memory at address 0xd95580d8>0x83d140) at eval.c:696
#12 0x00007f9fd927e8d0 in scm_shell (argc=2, argv=0x7ffe4b9a1668) at script.c:454
#13 0x00007f9fd9249a9d in invoke_main_func (body_data=0x7ffe4b9a1510) at init.c:340
#14 0x00007f9fd922c28a in c_body (d=0x7ffe4b9a1450) at continuations.c:422
#15 0x00007f9fd92ad307 in vm_regular_engine (thread=0x7f9fd91b3240 <GC_allocate_ml>, vp=0x7adf30, registers=0x0, resume=-657049556) at vm-engine.c:786
#16 0x00007f9fd92afb37 in scm_call_n (address@hidden<smob catch-closure 795120>, address@hidden, address@hidden) at vm.c:1257
#17 0x00007f9fd9231e69 in scm_call_0 (address@hidden<smob catch-closure 795120>) at eval.c:481
#18 0x00007f9fd929e7b2 in catch (address@hidden, thunk=#<smob catch-closure 795120>, handler=<error reading variable: ERROR: Cannot access memory at address 0x400000000>0x7950c0, pre_unwind_handler=<error reading variable: ERROR: Cannot access memory at address 0x400000000>0x7950a0) at throw.c:137
#19 0x00007f9fd929ea95 in scm_catch_with_pre_unwind_handler (address@hidden, thunk=<optimized out>, handler=<optimized out>, pre_unwind_handler=<optimized out>) at throw.c:254
#20 0x00007f9fd929ec5f in scm_c_catch (address@hidden, address@hidden <c_body>, address@hidden, address@hidden <c_handler>, address@hidden, address@hidden <pre_unwind_handler>, pre_unwind_handler_data=0x7a9bc0) at throw.c:377
#21 0x00007f9fd922c870 in scm_i_with_continuation_barrier (address@hidden <c_body>, address@hidden, address@hidden <c_handler>, address@hidden, address@hidden <pre_unwind_handler>, pre_unwind_handler_data=0x7a9bc0) at continuations.c:360
#22 0x00007f9fd922c905 in scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:456
#23 0x00007f9fd929d3ec in with_guile (address@hidden, address@hidden) at threads.c:661
#24 0x00007f9fd8f8efb8 in GC_call_with_stack_base (address@hidden <with_guile>, address@hidden) at misc.c:1949
#25 0x00007f9fd929d708 in scm_i_with_guile (dynamic_state=<optimized out>, address@hidden, address@hidden <invoke_main_func>) at threads.c:704
#26 scm_with_guile (address@hidden <invoke_main_func>, address@hidden) at threads.c:710
#27 0x00007f9fd9249c32 in scm_boot_guile (address@hidden, address@hidden, address@hidden <inner_main>, address@hidden) at init.c:323
#28 0x0000000000400b70 in main (argc=2, argv=0x7ffe4b9a1668) at guile.c:101
(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 0x7f9fd972eb80 (LWP 15573) "guile" __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
--8<---------------cut here---------------end--------------->8---

So it seems quite clear that the alloc lock is already taken in the
child, even though the child has a single thread and thus nothing that
could ever release it.  I suppose this can happen if one of the libgc
threads runs right when we call fork and takes the alloc lock, right?
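
The suspected scenario can be reproduced outside of Guile and libgc.  The
following is a minimal C sketch under stated assumptions, not libgc's code:
`alloc_lock` is a plain pthread mutex standing in for `GC_allocate_ml`,
`holder` stands in for a libgc helper thread, and `child_sees_lock_held` is
a hypothetical name chosen for illustration.  The child calls
`pthread_mutex_trylock` rather than `pthread_mutex_lock` so that it reports
the stuck lock through its exit status instead of hanging as in the
backtrace.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for libgc's allocation lock (an assumption, not GC_allocate_ml). */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

static int locked_pipe[2];   /* helper -> forking thread: "I hold the lock" */
static int release_pipe[2];  /* forking thread -> helper: "you may unlock"  */

/* Stand-in for a libgc helper thread: take the lock and keep it for a while. */
static void *holder(void *arg)
{
    char c = 'x';
    ssize_t n;

    (void) arg;
    pthread_mutex_lock(&alloc_lock);
    n = write(locked_pipe[1], &c, 1);
    if (n == 1)
        n = read(release_pipe[0], &c, 1);  /* block while holding the lock */
    (void) n;
    pthread_mutex_unlock(&alloc_lock);
    return NULL;
}

/* Fork while another thread holds alloc_lock.  Returns 1 if the child found
   the lock already taken, 0 if it could take it, -1 on error. */
int child_sees_lock_held(void)
{
    pthread_t thread;
    char c = 'x';
    int status = 0;
    pid_t pid;

    if (pipe(locked_pipe) != 0 || pipe(release_pipe) != 0)
        return -1;
    if (pthread_create(&thread, NULL, holder, NULL) != 0)
        return -1;

    /* Wait until the helper thread owns the lock, then fork. */
    if (read(locked_pipe[0], &c, 1) != 1)
        return -1;
    pid = fork();

    if (pid == 0) {
        /* Child: only the forking thread was copied, but the mutex memory
           still says "locked", and no thread is left to unlock it. */
        _exit(pthread_mutex_trylock(&alloc_lock) == 0 ? 0 : 1);
    }

    /* Parent: let the helper release the lock and finish. */
    if (write(release_pipe[1], &c, 1) != 1)
        return -1;
    pthread_join(thread, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;

    close(locked_pipe[0]);
    close(locked_pipe[1]);
    close(release_pipe[0]);
    close(release_pipe[1]);
    return WIFEXITED(status) && WEXITSTATUS(status) == 1;
}
```

Called from a `main` function in a program built with `-pthread`,
`child_sees_lock_held ()` returns 1: the child inherits the mutex in its
locked state without the thread that owns it.  A blocking
`pthread_mutex_lock` at that point would wait forever, which matches the
single-threaded child stuck in `__lll_lock_wait`.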

If that is correct, the fix would be to call fork within
‘GC_call_with_alloc_lock’.
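
The idea of forking with the lock held can be sketched in plain C as well.
This is only an analogy under stated assumptions: `alloc_lock` is an
ordinary default-type pthread mutex standing in for libgc's allocation
lock, `churn` stands in for a libgc thread that keeps taking it, and
`fork_holding_lock` and `child_can_take_lock` are hypothetical names; none
of this is libgc's or Guile's actual code.  It relies on the forking thread
being able to unlock, in the child, a mutex it locked before `fork`, which
is the pattern `pthread_atfork` handlers traditionally use.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for libgc's allocation lock (an assumption, not GC_allocate_ml). */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int stop_churning;

/* Stand-in for a libgc thread that keeps taking and releasing the lock. */
static void *churn(void *arg)
{
    (void) arg;
    while (!atomic_load(&stop_churning)) {
        pthread_mutex_lock(&alloc_lock);
        pthread_mutex_unlock(&alloc_lock);
        sched_yield();
    }
    return NULL;
}

/* Fork with the lock held by the forking thread, so that no other thread
   can own it at the moment the address space is copied. */
pid_t fork_holding_lock(void)
{
    pid_t pid;

    pthread_mutex_lock(&alloc_lock);
    pid = fork();
    /* Runs in the parent, in the child, and on failure: in each process the
       lock belongs to the thread that called fork, which releases it. */
    pthread_mutex_unlock(&alloc_lock);
    return pid;
}

/* Returns 1 if the child could take the lock after fork_holding_lock,
   0 if it found the lock taken, -1 on error. */
int child_can_take_lock(void)
{
    pthread_t thread;
    int status = 0;
    pid_t pid;

    atomic_store(&stop_churning, 0);
    if (pthread_create(&thread, NULL, churn, NULL) != 0)
        return -1;

    pid = fork_holding_lock();
    if (pid == 0) {
        /* Child: the lock was released by fork_holding_lock above. */
        _exit(pthread_mutex_trylock(&alloc_lock) == 0 ? 0 : 1);
    }

    atomic_store(&stop_churning, 1);
    pthread_join(thread, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Here `child_can_take_lock ()` returns 1 even though another thread
contends for the lock, because the forking thread is the only possible
owner at the time of `fork`.  Whether wrapping `fork` in
‘GC_call_with_alloc_lock’ behaves the same way inside libgc is exactly the
question raised in this message; the sketch only illustrates the principle.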

How does that sound?

As a workaround on the Guix side, we might achieve the same effect by
calling ‘gc-disable’ right before ‘primitive-fork’.

Ludo’.




