bug-guix

bug#43565: cuirass: Fibers scheduling blocked.


From: Mathieu Othacehe
Subject: bug#43565: cuirass: Fibers scheduling blocked.
Date: Mon, 26 Oct 2020 15:22:19 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Hey!

Many thanks for your help, you rock!

> But does Cuirass create file descriptors as O_NONBLOCK?  This has to be
> done explicitly, Fibers won’t do it for us.  As it turns out, the answer
> is no, in at least one important case: the connection to the daemon
> (untested patch below).
>
> While GC is running, Cuirass typically sends ‘build-derivations’ RPCs
> and they block until the GC lock is released.  That can lead to the
> situation above: a bunch of threads blocked in ‘read’ from their daemon
> socket, waiting for the RPC reply.  OTOH, ‘build-derivations’ RPCs are
> made from a fresh thread created by ‘build-derivations&’.

While I agree that not opening file descriptors with O_NONBLOCK is an
issue, build-derivations is called in a separate thread. Blocking this
separate thread should not block the fibers.

For instance, the following program:

--8<---------------cut here---------------start------------->8---
(use-modules (fibers)
             (ice-9 threads))

(run-fibers
 (lambda ()
   (spawn-fiber
    (lambda ()
      (call-with-new-thread
       (lambda ()
         (read (car (pipe)))))))
   (spawn-fiber
    (lambda ()
      (while #t
        (format #t "alive~%")
        (sleep 1)))))
 #:hz 10
 #:drain? #t)
--8<---------------cut here---------------end--------------->8---

keeps displaying "alive" even though the spawned thread is blocked in
'read'. I guess that's also what's happening in Cuirass, because the log
shows that some fibers are still scheduled while the GC is running.

Now the question is: why is there no fetching while the GC is running?
The answer is that "latest-repository-commit", called by "fetch-input",
blocks the only fiber dedicated to fetching. Having multiple fibers
trying to fetch wouldn't solve anything, because fetching requires some
building from the daemon.

Long story short, I think we can apply your patch, which is useful to
prevent fibers that talk directly to the daemon from blocking, even
though it won't help with this particular hang. That will only be fixed
once the GC time is reduced to something more acceptable.
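
For reference, here is a minimal sketch of what setting O_NONBLOCK on a
port looks like in Guile. This is not your patch:
'make-port-non-blocking!' is a hypothetical helper name, and the pipe
merely stands in for the daemon socket.

--8<---------------cut here---------------start------------->8---
;; Hypothetical helper, for illustration only.
(define (make-port-non-blocking! port)
  "Set O_NONBLOCK on PORT's file descriptor and return PORT."
  (let ((flags (fcntl port F_GETFL)))
    ;; Keep the existing flags and add O_NONBLOCK.
    (fcntl port F_SETFL (logior O_NONBLOCK flags))
    port))

;; A throwaway pipe stands in for the daemon socket here.
(define p (pipe))
(make-port-non-blocking! (car p))
--8<---------------cut here---------------end--------------->8---

As far as I understand, once the flag is set, a 'read' that would block
gives Fibers the chance to suspend the calling fiber instead of blocking
the whole kernel thread.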

Thanks,

Mathieu




