Re: busyloop in sigchld_handler

From: Kim F. Storm
Subject: Re: busyloop in sigchld_handler
Date: Wed, 14 Mar 2007 10:24:05 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.95 (gnu/linux)

David Kastrup <address@hidden> writes:

> Andreas Schwab <address@hidden> writes:
>> David Kastrup <address@hidden> writes:
>>> The CPU is claimed by the process with the loop, so no other process
>>> may actually progress to a state which can be "wait"ed for.
>> If there is no child to be waited for then there is no loop.
> In order to make sophistics solve the problem, you need to convince
> the kernel.

This happens in the sigchld handler - which is only invoked when there
is a dead child (zombie) to "wait3" for - so we should not have to wait
for the dead child to "really die".

In addition, we call wait3 with WNOHANG, so it is not supposed to block
if there are no dead children.
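To make the WNOHANG point concrete, here is a minimal sketch, not taken from Emacs: it forks a child that stays alive and polls once with wait3 and WNOHANG. The helper names poll_children_once and demo_wnohang are made up for illustration, and the sketch assumes the calling process has no other children.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: poll once for a child that changed state.
   Returns the pid reaped, 0 if children exist but none has changed
   state, or -1 with errno set (ECHILD if there are no children).  */
static pid_t
poll_children_once (int *status)
{
  return wait3 (status, WNOHANG, NULL);
}

/* Hypothetical demo: returns 1 if wait3 came back immediately with 0
   while the only child was still running.  */
static int
demo_wnohang (void)
{
  int status;
  pid_t child, polled;

  child = fork ();
  assert (child >= 0);
  if (child == 0)
    {
      pause ();                 /* stay alive until killed */
      _exit (0);
    }

  /* The child exists but is not dead, so this must not block.  */
  polled = poll_children_once (&status);

  /* Clean up: kill the child and reap it with a blocking wait.  */
  kill (child, SIGKILL);
  waitpid (child, &status, 0);

  return polled == 0;
}
```

Calling demo_wnohang () from a small main should return 1 straight away; a blocking wait3 at the same spot would hang until the child died.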

That's why Andreas and I can't really see where the busy loop can
happen, but since the loop _is_ observed, it is important to
understand why it happens, not just install a "semi-random" patch
which fixes the problem without anybody being able to explain why.

Perhaps we need to ask a Linux kernel hacker?

Here's the code in condensed form:

  while (1)
    {
      while (1)
        {
          errno = 0;
          pid = wait3 (&w, WNOHANG | WUNTRACED, 0);
          if (! (pid < 0 && errno == EINTR))
            break;
          /* Avoid a busyloop: wait3 is a system call, so we do not want
             to prevent the kernel from actually sending SIGCHLD to emacs
             by asking for it all the time.  */
          sleep (1);
        }

      if (pid <= 0)
        return;
      /* handle death of child `pid' */
    }

So the problem is the interpretation of an EINTR error from
wait3(..., WNOHANG, ...).

The Linux man page says:

       EINTR  if WNOHANG was not set and an unblocked signal or a SIGCHLD  was

So WNOHANG => EINTR is not explained, but the usual meaning is that
the wait3 was interrupted by some other signal - and if there is a
loop, that signal is repeated somehow ...

However, with the test code I inserted into the sigchld handler, and
then executing M-x compile once after starting emacs -Q, it clearly
shows that:

a) the sigchld handler is entered exactly once.

b) the first wait3 returns immediately with the pid
   of the compile process,

c) the next wait3 returns immediately with 0, since
   there are no more processes to wait for.

So where's the busy loop?

The above code is the version for Linux - other variations of the code
are used for other platforms, but the OP said this was observed on a
GNU/Linux system.

Thinking more about it, I wonder why we use the WUNTRACED flag on wait3.

              which means to also return for children which are  stopped,  and
              whose status has not been reported.

Why do we care about stopped processes?
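To illustrate what the flag changes (this does not answer why Emacs wants it), here is a small sketch with a hypothetical helper, demo_wuntraced. The child stops itself with SIGSTOP, and a blocking wait3 with WUNTRACED returns for it even though it has not died. Unlike the handler above, this sketch does not use WNOHANG, so that the parent reliably sees the stop.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if wait3 reported the child as
   stopped (not terminated) when WUNTRACED was given.  */
static int
demo_wuntraced (void)
{
  int status;
  pid_t child, pid;
  int stopped;

  child = fork ();
  assert (child >= 0);
  if (child == 0)
    {
      raise (SIGSTOP);          /* stop ourselves; we are not dead */
      _exit (0);
    }

  /* Blocks until the child stops (or terminates).  Without
     WUNTRACED this call would keep waiting for the child to exit.  */
  pid = wait3 (&status, WUNTRACED, NULL);
  stopped = (pid == child && WIFSTOPPED (status));

  /* Clean up: SIGKILL also works on a stopped process.  */
  kill (child, SIGKILL);
  waitpid (child, &status, 0);

  return stopped;
}
```

So with WUNTRACED in the handler, a positive pid from wait3 does not by itself mean the child is dead; the status has to be checked with WIFSTOPPED before treating it as a death.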

Kim F. Storm <address@hidden> http://www.cua.dk
