Re: Killing processes on system shutdow

From: Roland McGrath
Subject: Re: Killing processes on system shutdow
Date: Thu, 23 Aug 2001 21:20:08 -0400 (EDT)

You are right.  This was even true before the split-init change.  How odd.
Well, SIGTERM to go to single-user was implemented in the old init, but
isn't all there in runttys.  I guess I forgot some things were unfinished.
In split-init, any signal like SIGTERM or SIGHUP sent to init is just sent
on to its child.  So using "kill 1" to go to single-user can be implemented
just by runttys or sysvinit.

In the Hurd's BSD-style init world, it's actually the /libexec/runsystem
script that's init's child and it runs runttys.  The script, which you can
see in daemons/runsystem.sh in the hurd source tree, catches the signals
and passes them on to runttys.  runttys is supposed to do BSD-style
response to signals, and exit when the system should go to single-user.
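
To illustrate the forwarding step only, here is a minimal Python sketch of a
parent that passes SIGTERM and SIGHUP on to its child.  It is not the
runsystem script (that is a shell script); the names make_forwarder and
install_forwarding are made up for this example, and the kill argument is
injectable only so the handler can be exercised without signalling anything.

```python
import os
import signal


def make_forwarder(child_pid, kill=os.kill):
    """Return a signal handler that re-sends the received signal to child_pid."""
    def handler(signum, frame):
        kill(child_pid, signum)
    return handler


def install_forwarding(child_pid, signals=(signal.SIGTERM, signal.SIGHUP),
                       kill=os.kill):
    """Install the forwarder for each listed signal (main thread only)."""
    handler = make_forwarder(child_pid, kill)
    for sig in signals:
        signal.signal(sig, handler)
    return handler
```

A parent would call install_forwarding(child.pid) right after starting its
child and then simply wait for the child to exit.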

In fact, runttys only implements SIGHUP properly.  For SIGTERM it just dies,
but doesn't kill the children as it should.  It doesn't implement the
traditional SIGTSTP behavior at all (which is to not kill anything but stop
spawning any new ttys programs until further notice).

In traditional BSD (NetBSD matches the traditional BSD implementation), you
go to single-user with "kill 1" (i.e. SIGTERM to init); when init gets
SIGTERM, it does a logwtmp and then does kill(-1, {SIGHUP,SIGTERM,SIGKILL}).
In NetBSD "shutdown" is just a wrapper program that delays and runs wall and
then runs the "reboot" or "halt" program.  The "reboot" program similarly
does a logwtmp, then uses SIGTSTP to tell init not to start anything, then
does kill(-1, SIGTERM), waits a few seconds, calls sync, waits a few more
seconds, and then does kill(-1, SIGKILL) repeatedly with a pause of a few
seconds in between (and complains if processes still remain, i.e. it has
not gotten ESRCH, after 5 tries), and then calls the reboot system call.  (halt and
reboot are the same program, differing only in the flags they pass to reboot
at the end of the procedure, and the string they log in wtmp.)
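
The signal sequence described above can be sketched roughly as follows.
This is an illustration in Python, not the NetBSD source: the function name,
the pause length and the injectable kill/sleep/sync parameters are
assumptions of the sketch, and the logwtmp step and the final reboot system
call are left out.  Run with the defaults as root, it really would signal
every process.

```python
import os
import signal
import time


def bsd_style_shutdown(kill=os.kill, sleep=time.sleep, sync=os.sync,
                       pause=5, max_tries=5):
    """Run the SIGTSTP/SIGTERM/SIGKILL sequence; True if no processes remain."""
    kill(1, signal.SIGTSTP)            # tell init not to start anything new
    try:
        kill(-1, signal.SIGTERM)       # ask every process to exit
    except ProcessLookupError:
        pass                           # ESRCH: nothing to signal
    sleep(pause)
    sync()
    sleep(pause)
    for _ in range(max_tries):
        try:
            kill(-1, signal.SIGKILL)
        except ProcessLookupError:     # ESRCH: no processes left
            return True
        sleep(pause)
    return False                       # caller complains, then reboots anyway
```

Passing a recording function as kill and a no-op as sleep gives a dry run
that shows the order of signals without sending any.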

In the sysvinit world on Linux, shutdown is structured fairly similarly.
There, halt and reboot are in fact wrappers around "shutdown", unless given
their "cut right to it" options that make them simply call the reboot system
call.  shutdown works either by just telling init to do it (runs "init 0" or
"init 6"), or by doing a kill(1,SIGTSTP), kill(-1,SIGTERM), kill(-1,SIGKILL)
sequence very similar to BSD's and then logging to wtmp, running some random
things, and calling the reboot system call.  When it's done by init, all
init ever does is run the scripts specified in inittab, which wind up running
a final script (/etc/init.d/halt) that does a very similar sequence:
sending everything SIGTERM, waiting a few seconds, sending everything
SIGKILL, writing to wtmp, doing some random things, and finally using the
reboot or halt programs in the mode where they do nothing but the system call.

The various programs (halt, reboot, shutdown) in both systems have a variety
of options to skip some of the steps described.  The reboot system call in
both BSD and Linux does essentially what our Hurd reboot function (that
translates to the startup_reboot RPC to /hurd/init) does now, i.e. just sync
the filesystem and reboot the machine.

The upshot of all this is that the implementation of the orderly shutdown is
left to the various shutdown programs or the higher-level features of init.
So it seems reasonable enough to leave /hurd/init as it is, and just add the
appropriate friendly shutdown features in these other places
(runttys/sysvinit, the reboot/halt/shutdown programs).

From my perspective, both schemes are the same on all the interesting
issues.  For the Hurd, there are two troubling bits they both share.  The
first is the kill(-1,SIGTERM), and the second is the kill(-1,SIGKILL).
(For BSD-style init going to single-user, there is also SIGHUP before SIGTERM.)

At the moment, I think most every filesystem getting SIGHUP or SIGTERM will
just die (the default action for those signals).  Probably the most useful
thing is for filesystems to canonically handle these signals and respond to
them by starting a syncfs and then dying as soon as they have no live users.
Filesystems could just ignore the signals and wait for shutdown
notification, but it seems appropriate for going to single-user to kill any
random filesystem processes not still in use.
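
A toy model of that behavior, purely to pin down the state transitions: on
SIGHUP or SIGTERM the server syncs and enters a dying state, and it exits
once its last user is gone.  The class and method names are invented for
this sketch and do not correspond to any Hurd library interface; the sync is
just a flag standing in for a real syncfs.

```python
class FilesystemServer:
    """Toy filesystem server that dies gracefully once it has no users."""

    def __init__(self, users=0):
        self.users = users      # number of live users
        self.dying = False
        self.synced = False
        self.alive = True

    def on_term_signal(self):
        """Handle SIGHUP/SIGTERM: start a sync, then exit when unused."""
        self.dying = True
        self._sync()
        self._maybe_exit()

    def user_gone(self):
        """Called when one user drops its last reference."""
        self.users -= 1
        self._maybe_exit()

    def _sync(self):
        self.synced = True      # stands in for a real syncfs

    def _maybe_exit(self):
        if self.dying and self.users == 0:
            self._sync()        # final sync before going away
            self.alive = False
```

A server with no users dies at the first signal; one with users stays up,
already synced, until the last of them goes away.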

However, the SIGKILL will be a real problem.  In the Hurd, SIGKILL (unlike
all other signals) works by fetching the process's task port and terminating
the Mach task directly.  This immediately kills any process (just like
SIGKILL in Unix is supposed to, but doesn't always really do).  This stage
of shutdown really should kill most processes with extreme prejudice (after
they failed to die from SIGTERM).  But killing filesystems that are still in
use is a bad thing, and killing tasks like auth or exec will cause init to
crash the system because it becomes impossible to return to normal function.

The old unified init tried to avoid this problem in its BSD-style
shutdown-to-single-user behavior by skipping the tasks marked essential when
sending signals.  In the split-init world, nobody has any way to know which
tasks are marked as essential except for /hurd/init itself; so that's not
possible.  But anyway, this did not necessarily exclude all filesystems that
needed a chance to shut down properly.

I'm really not sure whether we want to have any kind of general feature to
try to make a process immune to SIGKILL (if chosen by root); I tend to think
not.  We could instead add some sort of flag that could be set on a process
and passed as required in a proc_getallpids variant call, that kill(-1,)
would use to intentionally overlook certain processes.
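
In rough terms, the flag idea amounts to filtering the pid list before the
broadcast.  The sketch below is hypothetical throughout: ProcEntry, the
exempt flag and broadcast_targets are made-up names, not part of the proc
server interface, and skipping the caller is an assumption about how the
broadcast would behave.

```python
from dataclasses import dataclass


@dataclass
class ProcEntry:
    pid: int
    exempt: bool = False    # hypothetical "overlook me in kill(-1)" flag


def broadcast_targets(procs, self_pid, honor_exempt=True):
    """Return the pids a kill(-1, sig) broadcast would signal."""
    return [p.pid for p in procs
            if p.pid != self_pid and not (honor_exempt and p.exempt)]
```

With honor_exempt=False the same function gives the plain "everybody but
me" list, which is what a variant call without the flag would return.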

I think that if we have a delay of a few seconds between SIGTERM and SIGKILL
(as the others all do), then it is fairly reasonable to leave no way to
avoid the SIGKILL for all but the essential system servers.  In the case of
a nonessential filesystem, at the initial SIGTERM it will go into "dying
mode", and all its users will have gotten a SIGTERM too and should be going
away shortly; when they've all gone away, the server syncs and dies
(hopefully all that doesn't take 5 seconds).  So they should be gone by the
time the SIGKILL comes.  Now, if some user program is wedged so it doesn't
die on SIGTERM and keeps a file open, then the filesystem will stay around
and won't have been able to finish unmounting by the time it and its user
both get SIGKILL.  But I'm not sure of a good way to avoid that problem.
You'd like to have something like SIGTERM, wait, then SIGKILL to all
non-translators, wait, then SIGKILL to everybody nonessential.
I can't see a way to wedge that into sane semantics of the kill function.
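
Written as an explicit loop over a process list rather than as kill(-1, sig),
the staged sequence would look something like this.  It is only a sketch:
the Task record with its translator and essential fields, and the
list_tasks/kill/sleep parameters, are assumptions made for illustration, and
it skips essential tasks in every phase, including the SIGTERM one.

```python
import signal
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    translator: bool = False
    essential: bool = False


def _signal_all(tasks, sig, kill):
    for t in tasks:
        try:
            kill(t.pid, sig)
        except ProcessLookupError:
            pass                      # already gone


def staged_shutdown(list_tasks, kill, sleep, pause=5):
    """SIGTERM, wait, SIGKILL non-translators, wait, SIGKILL the rest."""
    victims = lambda: [t for t in list_tasks() if not t.essential]
    _signal_all(victims(), signal.SIGTERM, kill)
    sleep(pause)
    _signal_all([t for t in victims() if not t.translator],
                signal.SIGKILL, kill)
    sleep(pause)                      # translators get time to sync and exit
    _signal_all(victims(), signal.SIGKILL, kill)
```

The task list is re-read before each phase, so anything that exited during
a pause is not signalled again.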
