echo interrupted by SIGCHLD from a dying coprocess

From: Tomáš Trnka
Subject: echo interrupted by SIGCHLD from a dying coprocess
Date: Wed, 24 Mar 2010 20:30:00 +0100
User-agent: KMail/1.13.1 (Linux/; KDE/4.4.1; x86_64; ; )

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu'
-DCONF_VENDOR='unknown' -DLOCALEDIR='/home/trnka/opt/share/locale'
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -g -
uname output: Linux a324-2 #1 SMP Wed Feb 20 12:36:17 CET 2008 x86_64 
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.1
Patch Level: 2
Release Status: release

I've started using coprocesses heavily and found a nasty problem related
(but not limited) to them: after the coprocess finishes its job, the resulting
SIGCHLD is not properly blocked by bash's signal-processing logic and
interferes with script I/O. In my case, I've been using something like:

read var1 var2 < <( a | long | pipeline | here)
echo "var1=$var1"
echo "var2=$var2"

Sometimes the SIGCHLD arrived just as one of the echoes was writing its
output, and the result was:

echo: write error: Interrupted system call

As this is a race, it occurs only when the stars are right: during normal
usage the probability of the SIGCHLD hitting exactly the echo is quite low.
However, as soon as anything makes the I/O take significantly longer, the bug
appears. I've been hitting it quite often (30%?) when running the script over
SSH.
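
Until the builtin itself is fixed, a script can at least notice the failed
write, because the builtin returns a non-zero status when it reports the write
error. The following is only a sketch of a retry wrapper: echo_retry is a
hypothetical helper (not part of bash), the retry count of 5 is arbitrary, and
I have not verified that a retry never duplicates partially written output.

```bash
#!/usr/bin/env bash
# echo_retry: hypothetical helper, not part of bash.
# Prints its arguments joined by spaces, followed by a newline, and retries
# a few times if the write is reported as failed (for example with
# "write error: Interrupted system call").
echo_retry() {
    local attempt
    for attempt in 1 2 3 4 5; do
        # Running printf as an `if` condition keeps a failed write from
        # aborting the script under `set -e`.
        if printf '%s\n' "$*" 2>/dev/null; then
            return 0
        fi
    done
    return 1    # every attempt failed
}
```

It would be used in place of the plain builtin, e.g. echo_retry "var1=$var1".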

This bug has probably been reported years ago here:

I've reduced one of my scripts to this (nothing exceptionally intelligent, but 
it does the job):


while [[ 1 ]]; do
set +e
read tmp tmp2 < <( echo "blabla" | wc | tr -s " " "\n" | tail -n 2 | tr "\n" " " )
set -e
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
done
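
For measuring how often the error occurs, rather than stopping at the first
failure, the reproducer can be bounded and made to count failed echoes. This
is a sketch under my own assumptions: count_echo_failures and ECHO_FAILURES
are hypothetical names, and the default of 100 iterations is arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical bounded variant of the reproducer: no `set -e`, a fixed
# number of iterations, and a counter of echoes that returned failure.
ECHO_FAILURES=0

count_echo_failures() {
    local iterations=${1:-100} i j tmp tmp2
    ECHO_FAILURES=0
    for ((i = 0; i < iterations; i++)); do
        # Same pipeline as in the reproducer. Its output has no trailing
        # newline, so `read` typically returns non-zero even though both
        # variables are set; that status is deliberately ignored here.
        read -r tmp tmp2 < <(echo "blabla" | wc | tr -s " " "\n" | tail -n 2 | tr "\n" " ") || true
        # Nine pairs of echoes, as in the reproducer.
        for ((j = 0; j < 9; j++)); do
            echo "$tmp"  || ECHO_FAILURES=$((ECHO_FAILURES + 1))
            echo "$tmp2" || ECHO_FAILURES=$((ECHO_FAILURES + 1))
        done
    done
}
```

After running count_echo_failures 1000 in the slow environment (over SSH in
my case), the value of $ECHO_FAILURES gives the number of interrupted echoes.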

Using this script I can reliably reproduce the bug (i.e. get an "Interrupted
system call" error) with bash 4.1.2 (compiled myself from the vanilla tarball)
and 3.1.17 (Debian lenny) over SSH, and with 4.0.35 (stock Fedora 12) under
strace.

Applying the following simple patch (against 4.1.2) fixes the bug:

--- builtins/echo.def.orig      2010-03-24 19:40:54.000000000 +0100
+++ builtins/echo.def   2010-03-24 19:47:07.000000000 +0100
@@ -27,6 +27,7 @@
 #include "../bashansi.h"
+#include <signal.h>
 #include <stdio.h>
 #include "../shell.h"
@@ -108,6 +109,7 @@
   int display_return, do_v9, i, len;
   char *temp, *s;
+  sigset_t nmask, omask;
   do_v9 = xpg_echo;
   display_return = 1;
@@ -159,6 +161,10 @@
   clearerr (stdout);   /* clear error before writing and testing success */
+  sigemptyset(&nmask);
+  sigaddset(&nmask, SIGCHLD);
+  sigprocmask(SIG_BLOCK, &nmask, &omask);
   while (list)
@@ -193,6 +199,8 @@
   if (display_return)
     putchar ('\n');
+  sigprocmask(SIG_SETMASK, &omask, NULL);
   return (sh_chkwrite (EXECUTION_SUCCESS));
