screen-devel

Re: [screen-devel] [bug #25089] screen produces zombies


From: Nicholas Marriott
Subject: Re: [screen-devel] [bug #25089] screen produces zombies
Date: Thu, 3 Feb 2022 06:36:37 +0000

Hi Vincent & all - FWIW I believe your analysis of the problem with utempter is correct. I saw the same problem in tmux, and exactly the change you are suggesting (an explicit kill(getpid(), SIGCHLD)) completely fixed it, both then and since.

You are also correct that GotSigChld (and any other globals accessed by signal handlers) should be typed "volatile sig_atomic_t" to avoid signal races. AFAIK this is the same as "volatile int" on any platform you are likely to see.
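
The two points above can be sketched together in C. This is only an illustration under stated assumptions, not the actual screen or tmux code: got_sigchld stands in for a flag like GotSigChld (the type is spelled sig_atomic_t in <signal.h>), and library_call_resetting_sigchld() is a hypothetical stand-in for a library call that sets SIGCHLD to SIG_DFL while its helper runs. Where exactly the kill() belongs in screen is an assumption; here it is placed right after the library call returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a flag like GotSigChld: written by the handler,
 * read by the main loop, hence volatile sig_atomic_t. */
static volatile sig_atomic_t got_sigchld = 0;

/* The handler only records that SIGCHLD arrived; reaping happens later. */
static void on_sigchld(int sig)
{
    (void)sig;
    got_sigchld = 1;
}

/* Set the SIGCHLD action, optionally saving the previous one in *old. */
static void set_sigchld_action(void (*handler)(int), struct sigaction *old)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGCHLD, &sa, old);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical stand-in for a library call that resets SIGCHLD to SIG_DFL
 * while it runs a helper, then restores the caller's action. */
static void library_call_resetting_sigchld(void)
{
    struct sigaction saved;
    set_sigchld_action(SIG_DFL, &saved);
    /* A SIGCHLD for one of our own children arriving here is discarded. */
    int rc = sigaction(SIGCHLD, &saved, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Workaround: once our handler is installed again, send ourselves a SIGCHLD
 * so the main loop checks for children that exited in the meantime. */
static void library_call_with_workaround(void)
{
    library_call_resetting_sigchld();
    kill(getpid(), SIGCHLD);
}

/* Main-loop side: if the flag is set, clear it and reap every exited child.
 * Returns the number of children reaped, or -1 if the flag was not set. */
static int reap_if_signalled(void)
{
    if (!got_sigchld)
        return -1;
    got_sigchld = 0;
    int reaped = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

A caller would install on_sigchld once with set_sigchld_action(on_sigchld, NULL), use library_call_with_workaround() in place of the bare library call, and call reap_if_signalled() from its main loop. The extra signal costs one spurious pass through the reaping loop when no child exited during the window.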


On Thu, 3 Feb 2022, 01:48 Vincent Lefèvre, <INVALID.NOREPLY@gnu.org> wrote:
Follow-up Comment #25, bug #25089 (project screen):

[comment #24:]
> It doesn't seem to depend on the OS, so far it seems to depend on two
> things:
>
> a) using libutempter (a compile time decision by the configure script), and
> b) using a slow computer or a VM.

This is actually more complex. To reproduce the issue, the SIGCHLD needs to be
received while the action has been set to SIG_DFL by libutempter, basically
while the libutempter helper is running. I had noticed that reducing or
disabling the compiler optimizations for screen made the zombies less likely
to appear on my VM. And this probably depends very much on the hardware (e.g.,
on my laptop, I recently noticed that a race condition in the display manager
triggered an issue only after I changed the SSD disk, just because the new SSD
disk is faster).
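
The window described above can be reproduced deterministically in a small standalone C sketch. It is not taken from screen or libutempter; all names are hypothetical. The child is forced to exit while SIGCHLD is set to SIG_DFL by waiting for it with waitid() and WNOWAIT, which blocks until the child has exited but leaves it unreaped, and the handler is restored only afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler when a SIGCHLD is delivered. */
static volatile sig_atomic_t sigchld_seen = 0;

static void on_sigchld(int sig)
{
    (void)sig;
    sigchld_seen = 1;
}

/* Install the given action for SIGCHLD (a handler or SIG_DFL). */
static void set_sigchld(void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGCHLD, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Let a child exit while SIGCHLD is SIG_DFL, then restore the handler.
 * Returns 1 if the handler was notified of the exit, 0 if it was not. */
static int child_exit_during_default_window(void)
{
    set_sigchld(on_sigchld);
    sigchld_seen = 0;

    set_sigchld(SIG_DFL);               /* window opens */
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0)
        _exit(0);                       /* child exits immediately */

    /* Block until the child has exited, but leave it as a zombie. */
    siginfo_t info;
    int rc;
    do {
        rc = waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT);
    } while (rc == -1 && errno == EINTR);
    assert(rc == 0);

    set_sigchld(on_sigchld);            /* window closes */

    int notified = sigchld_seen;

    /* The zombie stays until someone reaps it explicitly. */
    waitpid(pid, NULL, 0);
    return notified;
}
```

In this sketch the function returns 0: the signal was discarded under the default action, so a program that reaps only in response to its handler would keep the zombie until some later SIGCHLD arrives.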

> The exact part which "needs" to be slow on a host to trigger the race
> condition or why it shows up especially on VMs is though unclear to me. I was
> so far unable to reproduce this on neither a Xen VM nor on a KVM (ProxMox)
> based VM nor on real hardware.

My VM (at Gandi) is based on Xen. And the issue is 100% reproducible.

I've just tried to reproduce the issue on my laptop by adding a "sleep(1);"
after the SIGCHLD action has been set to SIG_DFL in libutempter (iface.c), but
I couldn't. I suppose that the SIGCHLD arrives earlier. I don't know which
stress-ng options could be used to have it delayed ("--all 1" completely
freezes my machine).

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?25089>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/


