guix-devel

Re: Improving Shepherd


From: Carlo Zancanaro
Subject: Re: Improving Shepherd
Date: Tue, 06 Feb 2018 02:56:12 +1100
User-agent: mu4e 0.9.18; emacs 25.3.1

Hey Ludo,

On Mon, Feb 05 2018, Ludovic Courtès wrote:
>> User services - Alex has already sent a patch to the list to allow generating user services from the Guix side. The idea is to generate a Shepherd config file, allowing a user to invoke shepherd manually to start their services. A further extension to this would be to have something like systemd's "user sessions", where the pid 1 Shepherd automatically starts a user's services when they log in.

> After replying to Alex’ message, I realized that we could just as well have a separate “guix service” or similar tool to take care of this?
>
> This needs more thought (and perhaps taking a look at systemd user sessions, which I’m not familiar with), but I think Alex’ approach is a good starting point.

We were thinking it might work like this:
- services->package constructs a package which places a file in the profile containing the necessary references
- pid 1 shepherd listens to elogind login/logout events, and starts the services when necessary

Admittedly this isn't the nicest way for it to work, but it might be a good starting point.

There were some discussions on the list a while ago about how to have `guix environment` automatically start services, too, so I wonder what overlap there could be there. Although maybe environment services (in containers) have more in common with system services than user services.

>> Child process control - this is my personal frustration, where Shepherd loses track of processes that fork away (e.g. "emacs --daemon"). I barely know anything about Linux process management, but from my reading this can be solved through Linux namespaces (if user namespaces are available). Could someone who knows more about this let me know if that's a productive direction for me to investigate? Or tell me a better way to go about it?

> Currently shepherd monitors SIGCHLD, and it’s not supposed to miss those; in some cases it might handle them later than you’d expect, which means that in the meantime you see a zombie process, but otherwise it seems to work.
>
> ISTR you reported an issue when using ‘shepherd --daemonize’, right? Perhaps the issue is limited to that mode?

I no longer use the daemonize function. My user shepherd runs "in the foreground" (it's started when my X session starts), so it's not that. Jelle fixed the problem I was having by delaying the SIGCHLD handler registration until it's needed. It is still buggy if a process is started before the daemonize command is given to the root service, though.

If you try running "emacs --daemon" with "make-forkexec-constructor" (and #:pid-file, and put something in your emacs config to make it write out the pid) you should be able to reproduce what I am seeing. If you kill emacs (or if it crashes) then shepherd continues to report that it is started and running. When I look at htop's output I can also see that my emacs process is not a child of my shepherd process.
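
To illustrate why a supervisor stops seeing a process like this, here is a minimal Python sketch (not Shepherd code; the function name and the returned keys are made up for illustration, and a Unix-like system is assumed). It mimics a daemon that forks twice: the supervisor can only wait on the short-lived intermediate process, while the real daemon gets a new parent.

```python
import os
import time


def spawn_daemonizing_child():
    """Fork a child that forks again and exits, as many daemons do.

    Returns what the original ("supervisor") process is able to observe.
    """
    read_fd, write_fd = os.pipe()
    intermediate_pid = os.fork()
    if intermediate_pid == 0:
        # Intermediate process: start the real daemon, then exit at once.
        os.close(read_fd)
        try:
            me = os.getpid()
            if os.fork() == 0:
                # Daemon (grandchild): wait until the intermediate process
                # has exited, then report who our parent is now.
                deadline = time.monotonic() + 5.0
                while os.getppid() == me and time.monotonic() < deadline:
                    time.sleep(0.01)
                report = "%d %d" % (os.getpid(), os.getppid())
                os.write(write_fd, report.encode())
        finally:
            # Neither forked process may return into the caller's code.
            os._exit(0)

    # Supervisor: the only direct child it can wait for is the intermediate.
    os.close(write_fd)
    _, status = os.waitpid(intermediate_pid, 0)
    data = b""
    while True:
        chunk = os.read(read_fd, 64)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    daemon_pid, daemon_parent = (int(field) for field in data.split())
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return {
        "supervisor_pid": os.getpid(),
        "intermediate_pid": intermediate_pid,
        "intermediate_exit_code": exit_code,
        "daemon_pid": daemon_pid,
        "daemon_parent": daemon_parent,
    }


if __name__ == "__main__":
    print(spawn_daemonizing_child())
```

On a typical system "daemon_parent" is the PID of init (or of a subreaper process) rather than "supervisor_pid", so the supervisor is not sent a SIGCHLD when the daemon later dies, and a PID read from a pid file says nothing about whether the process is still alive.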

I would like to add a --daemon/--daemonize command line argument to shepherd instead of the current "send the root service a daemonize message". I think the use cases of turning it into a daemon later are limited, and it just gives you an additional way of shooting yourself in the foot.

>> Concurrency/parallelism - I think Jelle was planning to work on this, but I might be wrong about that. Maybe I volunteered? We're keen to see Shepherd starting services in parallel, where possible. This will require some changes to the way we start/stop services (because at the moment we just send a "start" signal to a single service to start it, which makes it hard to be parallel), and will require us to actually build some sort of real dependency resolution. Longer-term our goal should be to bring fibers into Shepherd, but Efraim mentioned that fibers doesn't compile on ARM at the moment, so we'll have to get that working first at least.

> I’d really like to see that happen. I’ve become more familiar with Fibers, and I think it’ll be perfect for the Shepherd (and we’ll fix the ARM build issue, no doubt.)

I'm not going to put much time/effort into this until we have fibers building on ARM. I think these changes are likely to break shepherd's config API, too. In particular, with higher levels of concurrency I want to move the mutable state out of <service> objects.
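
As a rough idea of what "real dependency resolution" could look like, here is a small Python sketch using the standard library's graphlib module (Python 3.9 or later). It is not how Shepherd does or will do it; the function name and the service names are hypothetical. Each service maps to the services it requires, and the result is a list of batches, where everything in one batch could be started in parallel.

```python
from graphlib import TopologicalSorter


def start_batches(requirements):
    """Group services into batches that can be started in parallel.

    `requirements` maps each service name to the names it depends on.
    A service only appears in a batch after all of its requirements
    have appeared in earlier batches.
    """
    sorter = TopologicalSorter(requirements)
    sorter.prepare()  # raises graphlib.CycleError on circular requirements
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        # A real supervisor would start these concurrently and call done()
        # for each service as it finishes starting.
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    # Hypothetical service graph, for illustration only.
    example = {
        "udev": [],
        "networking": ["udev"],
        "nscd": ["udev"],
        "ssh-daemon": ["networking"],
    }
    print(start_batches(example))
```

With the hypothetical graph above, "networking" and "nscd" end up in the same batch because both only need "udev".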

> It seems that signalfd(2) is Linux-only though, which is a bummer. The solution might be to get over it and have it implemented on GNU/Hurd… (I saw this discussion: <https://www.gnu.org/software/hurd/glibc/signal/signal_thread.html>; I suspect it’s within reach.)

Failing that, could we have our signal handlers just convert the signal to a message in our event loop? I have a very rudimentary understanding of signal handling, but I assume we could have our main event loop just reading things off of two channels: one of signal events, one of fd events.
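
For what it's worth, the "turn signals into messages" idea can be sketched portably without signalfd(2). The Python example below is only an illustration of the pattern, not a proposal for Shepherd's code: it uses signal.set_wakeup_fd so that each signal arrives as a byte on a socket, and one loop then reads both the signal channel and an ordinary file descriptor. The function name and the choice of SIGUSR1 are assumptions made for the demo; it must run in the main thread on a Unix-like system.

```python
import os
import selectors
import signal
import socket


def demo_event_loop(timeout=2.0):
    """Collect one signal event and one fd event from a single loop."""
    sig_r, sig_w = socket.socketpair()
    sig_r.setblocking(False)
    sig_w.setblocking(False)  # the wakeup fd must be non-blocking
    fd_r, fd_w = os.pipe()    # stands in for e.g. a client connection

    # A Python-level handler must be installed for the wakeup fd to be used;
    # the handler itself does nothing, the loop below does the real work.
    old_handler = signal.signal(signal.SIGUSR1, lambda signum, frame: None)
    old_wakeup = signal.set_wakeup_fd(sig_w.fileno())

    selector = selectors.DefaultSelector()
    selector.register(sig_r, selectors.EVENT_READ, "signal")
    selector.register(fd_r, selectors.EVENT_READ, "fd")

    events = []
    try:
        # Produce one event of each kind so the demo terminates.
        os.kill(os.getpid(), signal.SIGUSR1)
        os.write(fd_w, b"client request")
        while len(events) < 2:
            ready = selector.select(timeout)
            if not ready:
                break  # timed out, give up
            for key, _ in ready:
                if key.data == "signal":
                    # Each byte received is the number of a delivered signal.
                    for number in sig_r.recv(64):
                        events.append(("signal", signal.Signals(number).name))
                else:
                    events.append(("fd", os.read(fd_r, 64)))
    finally:
        signal.set_wakeup_fd(old_wakeup)
        if old_handler is None:
            old_handler = signal.SIG_DFL
        signal.signal(signal.SIGUSR1, old_handler)
        selector.close()
        sig_r.close()
        sig_w.close()
        os.close(fd_r)
        os.close(fd_w)
    return events


if __name__ == "__main__":
    print(demo_event_loop())
```

The point of the design is that the handler never touches service state; everything is processed in one place, in the order the loop reads it.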

>> This would mean that Shepherd could decide the best way to start/stop services, including doing so in parallel if possible.

> Sounds good. That’s annoyed most of us already, so if you get that fixed, you’ll make a lot of people happy.  :-)

I'll have a go at this in the next few weeks. I'll be travelling until the end of February, so I'm not expecting much, but we'll see!

Carlo

