Racing condition leads to unstable exit code

Hello,

I'm using 'GNU bash, version 4.3.46(1)-release (x86_64-suse-linux-gnu)' provided by OpenSUSE Tumbleweed. I recently ran into a problem where bash returns different exit codes depending on the system load. I traced it to trap processing.

My test uses three blocks of code:

1) installs a trap for USR1 and calls function testme

2) calls (3) in the background, waits for it and returns its exit code

3) sends USR1 to the main process (1) with kill -USR1

The way (1) calls (2) has some variants:

a) direct call: testme

b) as a subshell: (testme)

c) as a command substitution: $(testme)
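
For reference, the three variants can be sketched as follows. This is only an illustration: the testme below is a hypothetical stand-in that returns a fixed status of 2 instead of doing the background work, and the variable names direct, subshell and cmdsub are mine. Without any signal involved, all three variables should end up as 2.

```shell
#!/bin/bash
# Hypothetical stand-in for the real testme: it only returns a fixed status.
testme() {
    return 2
}

direct=0; subshell=0; cmdsub=0

# `|| var=$?` records the status and keeps the script alive under `set -e`.
testme      || direct=$?      # (a) direct call
(testme)    || subshell=$?    # (b) subshell
x=$(testme) || cmdsub=$?      # (c) command substitution

echo "direct=$direct subshell=$subshell cmdsub=$cmdsub" >&2
```

In variant (c) the status of the assignment is the status of the command substitution, which is why $? is expected to be 2 there as well.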

I added some sleeps in order to have these variations:

x) the main bash instance receives the signal after the code in (2) reaches the wait (which might run in a subprocess in cases (b) and (c))

y) the main bash instance receives the signal before the code in (2) reaches the wait

w) the main bash instance finishes signal processing before the code in (3) returns

z) the main bash instance finishes signal processing after the code in (3) returns

In scenarios "a,x,{w,z}", the signal is received while the main instance is running wait, and wait returns exit code 138. I know that returning 128+signal is documented behavior. However, letting the moment at which a signal reaches the process change the behavior of wait seems fragile to me. I would expect wait to ignore the signal when a trap is installed for it. I really don't understand why wait should return at all when the process receives a signal. Anyway, this is not the bug.
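
A minimal sketch of that documented behavior, separate from the bug itself. The variable names (got_usr1, first, second) and the timings are assumptions chosen for illustration. The first wait is interrupted by the trapped signal and returns a status greater than 128 (138 on my system, where USR1 is signal 10); waiting again on the same PID then yields the real exit status of the child.

```shell
#!/bin/bash
got_usr1=0
trap 'got_usr1=1' USR1          # trap handler: just record that it ran

( sleep 1; exit 2 ) &           # child that outlives the signal delivery
child=$!

( sleep 0.3; kill -USR1 $$ ) &  # helper: signal the main shell during wait

first=0; second=0
# `|| var=$?` records the status and keeps the script alive under `set -e`.
wait "$child" || first=$?       # interrupted by USR1: status is 128+signal
wait "$child" || second=$?      # child still running: now its real status

echo "first=$first second=$second got_usr1=$got_usr1" >&2
```

Calling wait a second time is a common way to get the real status after an interruption, but it only helps in the scenarios where wait itself is interrupted; it does not address the command substitution case described next.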

In scenarios "a,y,{w,z}", bash works nicely. In scenarios "b,{x,y},{w,z}", bash also works nicely, even with the subshell. The scenario "c,{x,y},w" is also OK. The only problem is with "c,{x,y},z", when the command substitution subshell returns while the trap is still running. Even though the substitution returns code 2, the caller that installed the trap gets 0. It seems that trap processing clears the exit code of any command substitution that has already finished. I guess this is not expected :-)

A simplified version of the test code that covers only scenario c,x,z is this:

#!/bin/bash

trap_USR1() {
    echo "$BASHPID:Trap USR1 start"
    sleep 3
    echo "$BASHPID:Trap USR1 end"
}

testme() {
    (
        sleep 0.5
        kill -USR1 $parent
        sleep 0.5
        err=2
        echo "$BASHPID:exit $err..." >&2
        exit $err
    ) &
    _pid=$!
    echo "$BASHPID:waiting $_pid" >&2
    wait $_pid
    err=$?
    echo "$BASHPID:return $err" >&2
    return $err
}

parent=$BASHPID
echo "$BASHPID:I'm installing trap USR1..." >&2
trap trap_USR1 USR1

x=$(testme)
echo "$BASHPID:err=$? (should be equal previous return) command substitution" >&2

testme returns 2, but $? on the last line is 0. I'll attach the full version of the script.

Regards,

From:	Luiz Angelo Daros de Luca
Subject:	Racing condition leads to unstable exit code
Date:	Fri, 23 Sep 2016 19:04:53 +0000