[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[shepherd] 06/12: service: 'stop' blocks when a service is already being
From: |
Ludovic Courtès |
Subject: |
[shepherd] 06/12: service: 'stop' blocks when a service is already being stopped. |
Date: |
Sun, 19 Feb 2023 16:58:36 -0500 (EST) |
civodul pushed a commit to branch wip-service-monitor
in repository shepherd.
commit 310bdb265a10d6046cf847383f640434c11f122b
Author: Ludovic Courtès <ludo@gnu.org>
AuthorDate: Sat Feb 18 16:00:44 2023 +0100
service: 'stop' blocks when a service is already being stopped.
This change allows the monitor to keep track of services that are being
stopped. The 'stop' method now blocks when called on a service already
being stopped, until the service has actually been stopped. This was
the case before 0.9.0; since 0.9.0 though, invoking 'stop' on a service
already being stopped would lead to a second invocation of its 'stop'
method.
* modules/shepherd/service.scm (stop): Send a 'stop' message
to (current-monitor-channel). Call the 'stop' slot if and only if the
monitor returns a channel.
(service-monitor)[*service-stopped*, stopped-message?]: New variables.
Add 'stopping' loop variable and thread it in recursive calls. Handle
'stop' and 'stopped-message?' messages.
* tests/stopping-status.sh: New file.
* Makefile.am (TESTS): Add it.
---
Makefile.am | 3 +-
modules/shepherd/service.scm | 126 ++++++++++++++++++++++++++++++++++---------
tests/stopping-status.sh | 115 +++++++++++++++++++++++++++++++++++++++
3 files changed, 217 insertions(+), 27 deletions(-)
diff --git a/Makefile.am b/Makefile.am
index b50f535..58acfc5 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -1,6 +1,6 @@
# Makefile.am -- How to build and install the Shepherd.
# Copyright © 2002, 2003 Wolfgang Jährling <wolfgang@pro-linux.de>
-# Copyright © 2013-2016, 2018-2020, 2022 Ludovic Courtès <ludo@gnu.org>
+# Copyright © 2013-2016, 2018-2020, 2022-2023 Ludovic Courtès <ludo@gnu.org>
# Copyright © 2018 Carlo Zancanaro <carlo@zancanaro.id.au>
#
# This file is part of the GNU Shepherd.
@@ -231,6 +231,7 @@ SUFFIXES = .go
TESTS = \
tests/basic.sh \
tests/starting-status.sh \
+ tests/stopping-status.sh \
tests/replacement.sh \
tests/respawn.sh \
tests/respawn-throttling.sh \
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 29f568e..34e23b1 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -499,17 +499,27 @@ is not already running, and will return SERVICE's
canonical name in a list."
(slot-set! service 'handle-termination (const #f))
;; Stop the service itself.
- (catch #t
- (lambda ()
- (apply (slot-ref service 'stop)
- (service-running-value service)
- args))
- (lambda (key . args)
- ;; Special case: 'root' may quit.
- (and (eq? root-service service)
- (eq? key 'quit)
- (apply quit args))
- (caught-error key args)))
+ (let ((reply (make-channel)))
+ (put-message (current-monitor-channel)
+ `(stop ,service ,reply))
+ (match (get-message reply)
+ (#f
+ #f)
+ ((? channel? notification)
+ (catch #t
+ (lambda ()
+ (define stopped?
+ (not (apply (slot-ref service 'stop)
+ (service-running-value service)
+ args)))
+ (put-message notification stopped?))
+ (lambda (key . args)
+ ;; Special case: 'root' may quit.
+ (and (eq? root-service service)
+ (eq? key 'quit)
+ (apply quit args))
+ (put-message notification #f)
+ (caught-error key args))))))
;; SERVICE is no longer running.
(put-message (current-monitor-channel)
@@ -702,10 +712,13 @@ clients."
requests arriving on @var{channel}."
(define *service-started* (list 'service 'started!))
(define (started-message? obj) (eq? *service-started* obj))
+ (define *service-stopped* (list 'service 'stopped!))
+ (define (stopped-message? obj) (eq? *service-stopped* obj))
(let loop ((registered vlist-null)
(running vlist-null)
- (starting vlist-null))
+ (starting vlist-null)
+ (stopping vlist-null))
(define (unregister services)
;; Return REGISTERED minus SERVICE.
(vhash-fold (lambda (name service result)
@@ -728,36 +741,37 @@ requests arriving on @var{channel}."
(let ((name (canonical-name service)))
(match (vhash-assq name registered)
(#f
- (loop (register service) running starting))
+ (loop (register service) running starting stopping))
((_ . old)
(if (vhash-assq old running)
(begin
(slot-set! old 'replacement service)
- (loop registered running starting))
+ (loop registered running starting stopping))
(loop (register service (unregister (list old)))
- running starting))))))
+ running starting stopping))))))
(('unregister services) ;no reply
(match (filter (cut vhash-assq <> running) services)
(()
- (loop (unregister services) running starting))
+ (loop (unregister services) running starting stopping))
(lst ;
(local-output
(l10n "Cannot unregister service ~a, which is still running"
"Cannot unregister services~{ ~a,~} which are still running"
(length lst))
(map canonical-name lst))
- (loop registered running starting))))
+ (loop registered running starting stopping))))
(('unregister-all) ;no reply
(let ((root (cdr (vhash-assq 'root registered))))
(loop (fold (cut vhash-consq <> root <>)
vlist-null
(provided-by root))
(vhash-consq root #t running)
- starting)))
+ starting
+ stopping)))
(('lookup name reply)
(put-message reply
(vhash-foldq* cons '() name registered))
- (loop registered running starting))
+ (loop registered running starting stopping))
(('service-list reply)
(let ((names (delete-duplicates
(vhash-fold (lambda (key _ result)
@@ -773,13 +787,14 @@ requests arriving on @var{channel}."
result))
'()
names))
- (loop registered running starting)))
+ (loop registered running starting stopping)))
(('running service reply)
(put-message reply
(match (vhash-assq service running)
(#f #f)
((_ . value) value)))
- (loop registered running starting))
+ (loop registered running starting stopping))
+
(('start service reply)
;; Attempt to start SERVICE, blocking if it is already being started.
;; Send #f on REPLY if SERVICE was already running or being started;
@@ -792,7 +807,7 @@ requests arriving on @var{channel}."
(match pair
((_ . value)
(put-message reply #f)
- (loop registered running starting)))))
+ (loop registered running starting stopping)))))
((vhash-assq service starting)
=>
;; SERVICE is being started: wait until it has started and then
@@ -804,7 +819,7 @@ requests arriving on @var{channel}."
(lambda ()
(wait condition)
(put-message reply #f)))
- (loop registered running starting)))))
+ (loop registered running starting stopping)))))
(else
;; Become the one who starts SERVICE.
(let ((condition (make-condition))
@@ -820,7 +835,8 @@ requests arriving on @var{channel}."
(canonical-name service))
(put-message reply notification)
(loop registered running
- (vhash-consq service condition starting))))))
+ (vhash-consq service condition starting)
+ stopping)))))
(((? started-message?) service value) ;no reply
(local-output (l10n "Service ~a running with value ~s.")
(canonical-name service) value)
@@ -831,11 +847,69 @@ requests arriving on @var{channel}."
(if (or (one-shot? service) (not value))
running
(vhash-consq service value running))
- (vhash-delq service starting)))))
+ (vhash-delq service starting)
+ stopping))))
+
+ (('stop service reply)
+ ;; Attempt to stop SERVICE, blocking if it is already being stopped.
+ ;; Send #f on REPLY if SERVICE was already running or being started;
+ ;; otherwise send a channel on which to send a notification once it has
+ ;; been stopped.
+ (cond ((vhash-assq service stopping)
+ =>
+ ;; SERVICE is being stopped: wait until it is stopped and then
+ ;; send #f on REPLY.
+ (lambda (pair)
+ (match pair
+ ((_ . condition)
+ (spawn-fiber
+ (lambda ()
+ (wait condition)
+ (put-message reply #f)))
+ (loop registered running starting stopping)))))
+ ((not (vhash-assq service running))
+ =>
+ ;; SERVICE is not running: send #f on REPLY.
+ (lambda (pair)
+ (match pair
+ ((_ . value)
+ (put-message reply #f)
+ (loop registered running starting stopping)))))
+ (else
+ ;; Become the one that stops SERVICE.
+ (let ((condition (make-condition))
+ (notification (make-channel)))
+ (spawn-fiber
+ (lambda ()
+ (let ((stopped? (get-message notification)))
+ (if stopped?
+ (local-output (l10n "Service ~a stopped.")
+ (canonical-name service))
+ (local-output (l10n "Failed to stop ~a.")
+ (canonical-name service)))
+ (put-message channel
+ (list *service-stopped* service)))))
+ (local-output (l10n "Stopping service ~a...")
+ (canonical-name service))
+ (put-message reply notification)
+ (loop registered running starting
+ (vhash-consq service condition stopping))))))
+ (((? stopped-message?) service) ;no reply
+ (local-output (l10n "Service ~a is now stopped.")
+ (canonical-name service))
+ (match (vhash-assq service stopping)
+ ((_ . condition)
+ (signal-condition! condition)
+ (loop registered
+ (vhash-delq service running)
+ starting
+ (vhash-delq service stopping)))))
+
(('notify-termination service) ;no reply
(loop registered
(vhash-delq service running) ;XXX: complexity
- starting)))))
+ starting
+ stopping)))))
(define (spawn-service-monitor)
"Spawn a new service monitor fiber and return a channel to send it requests."
diff --git a/tests/stopping-status.sh b/tests/stopping-status.sh
new file mode 100644
index 0000000..e7dce8c
--- /dev/null
+++ b/tests/stopping-status.sh
@@ -0,0 +1,115 @@
+# GNU Shepherd --- Test the "stopping" status.
+# Copyright © 2023 Ludovic Courtès <ludo@gnu.org>
+#
+# This file is part of the GNU Shepherd.
+#
+# The GNU Shepherd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or (at
+# your option) any later version.
+#
+# The GNU Shepherd is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with the GNU Shepherd. If not, see <http://www.gnu.org/licenses/>.
+
+shepherd --version
+herd --version
+
+socket="t-socket-$$"
+conf="t-conf-$$"
+confdir="t-confdir-$$"
+datadir="t-datadir-$$"
+log="t-log-$$"
+stamp="t-stamp-$$"
+pid="t-pid-$$"
+
+herd="herd -s $socket"
+
+trap "cat $log || true; rm -f $socket $conf $stamp $log;
+ test -f $pid && kill \`cat $pid\` || true; rm -f $pid" EXIT
+
+cat > "$conf" <<EOF
+(register-services
+ (make <service>
+ #:provides '(test)
+ #:start (const #t)
+ #:stop (lambda _
+ (let loop ((n 30))
+ (if (or (file-exists? "$stamp") (zero? n))
+ (begin
+ (delete-file "$stamp")
+ (zero? n)) ;failure if N is zero
+ (begin
+ ((@ (fibers) sleep) 1)
+ (loop (- n 1))))))
+
+ #:respawn? #f))
+EOF
+
+rm -f "$pid" "$stamp"
+shepherd -I -s "$socket" -c "$conf" -l "$log" --pid="$pid" &
+
+# Wait till it's ready.
+while ! test -f "$pid" ; do sleep 0.3 ; done
+
+shepherd_pid="`cat $pid`"
+
+$herd start test
+$herd status test | grep started
+
+$herd stop test &
+herd_pid=$!
+
+# Currently, 'test' is considered as "running" while being stopped.
+$herd status
+$herd status test
+$herd status test | grep started
+
+$herd stop test &
+herd_pid2=$!
+sleep 1
+kill -0 "$herd_pid"
+kill -0 "$herd_pid2"
+
+# Trigger actual service stop.
+touch "$stamp"
+
+# Make sure the service is marked as "stopped" shortly after.
+n=0
+while : ; do
+ if $herd status test | grep stopped
+ then break
+ else n=$(expr $n + 1)
+ fi
+
+ test $n -le 10
+ sleep 1
+done
+
+# Make sure the 'herd' processes terminated.
+n=0
+while : ; do
+ if kill -0 "$herd_pid" || kill -0 "$herd_pid2"
+ then
+ n=$(expr $n + 1)
+ test $n -le 10
+ sleep 1
+ else
+ break
+ fi
+done
+
+$herd stop test
+! test -f "$stamp"
+
+$herd stop root
+! kill -0 $shepherd_pid
+
+test $(grep "Stopping service test" "$log" | wc -l) = 1
+
+rm -rf "$confdir"
+rm -rf "$datadir"
- [shepherd] branch wip-service-monitor created (now da158c5), Ludovic Courtès, 2023/02/19
- [shepherd] 01/12: service: Remove 'stop-delay?' and 'waiting-for-termination?'., Ludovic Courtès, 2023/02/19
- [shepherd] 02/12: service: Handle service state in a monitoring agent., Ludovic Courtès, 2023/02/19
- [shepherd] 06/12: service: 'stop' blocks when a service is already being stopped.,
Ludovic Courtès <=
- [shepherd] 09/12: service: Rename "service monitor" to "service registry"., Ludovic Courtès, 2023/02/19
- [shepherd] 10/12: service: Clarify expected statuses., Ludovic Courtès, 2023/02/19
- [shepherd] 05/12: service: 'start-in-the-background' starts services in parallel., Ludovic Courtès, 2023/02/19
- [shepherd] 03/12: service: 'start' blocks when a service is already being started., Ludovic Courtès, 2023/02/19
- [shepherd] 04/12: service: Start dependent services in parallel., Ludovic Courtès, 2023/02/19
- [shepherd] 07/12: service: 'service-running-value' uses a fresh channel for each reply., Ludovic Courtès, 2023/02/19
- [shepherd] 11/12: service: Communicate the service status symbol to clients., Ludovic Courtès, 2023/02/19
- [shepherd] 12/12: herd: Report 'starting' and 'stopping' service statuses., Ludovic Courtès, 2023/02/19
- [shepherd] 08/12: service: Associate a control fiber with each service., Ludovic Courtès, 2023/02/19