shepherd, fibers, and signals (asyncs)
From: Attila Lendvai
Subject: shepherd, fibers, and signals (asyncs)
Date: Fri, 15 Dec 2023 22:03:21 +0000
dear Guix,
context:
Shepherd stops responding during "guix system reconfigure"
https://issues.guix.gnu.org/67538
https://issues.guix.gnu.org/65178
https://issues.guix.gnu.org/67230
i've added a ton of logging and asserts in my fork:
https://codeberg.org/attila-lendvai-patches/shepherd
which resulted in this report:
https://github.com/wingo/fibers/issues/29#issuecomment-1858319291
to which @emixa-d kindly responded:
https://github.com/wingo/fibers/issues/29#issuecomment-1858497720
which essentially identifies the following:
--------------
posix signal handlers are async, and shepherd uses the fibers API from inside
signal handlers, specifically in at least handle-SIGCHLD.
this violates the fibers API, and is most probably the root cause of the
reconfigure hang: the parameter called (current-process-monitor) loses its
value, a match-error flies out from service-controller as a result, and that
fiber then exits.
i have little experience with posix signal handlers, so i probably won't come
up with a fix for this, or at least not without guidance from someone who has
a bird's-eye view.
maybe the solution could be something like packaging up posix signals and
delivering them to the fibers universe by some form of polling of an atomic
variable? or is there some signal-safe semaphore facility in guile that could
be used in accordance with the fibers API?
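
to make the polling idea above concrete, here is a minimal sketch. it is
written in Python rather than Guile, purely as an illustration of the pattern;
it is not shepherd code and it does not use the fibers API. all names
(record_signal, drain_pending, signal_pump) are hypothetical. the assumption
is that the handler only records the signal in a plain container and never
touches the scheduler, while a cooperative task polls that container and
forwards the signals as ordinary messages:

```python
import asyncio
import collections
import signal

# Signals recorded by the handler and not yet delivered.
# deque.append and deque.popleft are each atomic in CPython.
pending = collections.deque()


def record_signal(signum, frame):
    """Signal handler: only record the signal number.

    It deliberately makes no event-loop (scheduler) calls.
    """
    pending.append(signum)


def drain_pending():
    """Remove and return all recorded signals, oldest first."""
    drained = []
    while True:
        try:
            drained.append(pending.popleft())
        except IndexError:
            return drained


async def signal_pump(queue, interval=0.05):
    """Cooperative task: poll `pending` and forward signals to `queue`."""
    while True:
        for signum in drain_pending():
            await queue.put(signum)
        await asyncio.sleep(interval)


async def main():
    # Unix-only demo: install the handler, raise a signal at ourselves,
    # then receive it as a normal message inside the cooperative world.
    signal.signal(signal.SIGUSR1, record_signal)
    queue = asyncio.Queue()
    pump = asyncio.create_task(signal_pump(queue))
    signal.raise_signal(signal.SIGUSR1)
    signum = await queue.get()
    print("received", signal.Signals(signum).name)
    pump.cancel()


if __name__ == "__main__":
    asyncio.run(main())
```

the trade-off of this design is latency: a signal is noticed at most one
polling interval late, in exchange for a handler that cannot interfere with
the scheduler's state. whether an equivalent built from an atomic variable
plus a polling fiber is acceptable under the fibers API is exactly the open
question.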
--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Virtue is never left to stand alone. He who has it will have neighbors.”
— Confucius (551–479 BC)