[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [gpsd-dev] [PATCH] Use pselect() to reduce wakeups and avoid race co

From: Mike Frysinger
Subject: Re: [gpsd-dev] [PATCH] Use pselect() to reduce wakeups and avoid race conditions
Date: Fri, 13 Jan 2012 22:29:40 -0500
User-agent: KMail/1.13.7 (Linux/3.2.0; KDE/4.6.5; x86_64; ; )

On Friday 13 January 2012 19:18:44 Eckhart Wörner wrote:
> Am Freitag, 13. Januar 2012, 18:35:55 schrieben Sie:
> > it's worse than that.  you cannot send any signal a 2nd time once the
> > first has been delivered.  the relationship to where the checking occurs
> > doesn't matter. at any point before the app exits, sending the same
> > signal a 2nd time will cause the kernel to immediately terminate it.
> That I don't understand. The onsig handler can run twice in rapid
> succession, can't it?

it can if gpsd is sent diff signals.  but the way signal() works is that the 
handler for that signal is reset to SIG_DFL before the user handler is called.  
this is one reason why sigaction() was introduced.

at a high level, you can think of it as:
        - gpsd calls signal(SIGHUP, onsig)
        - gpsd calls select(), gets some data, and starts to process it
                (most of code in while() loop after the select())
        - user runs `killall -HUP gpsd`
        - kernel looks up the handler for gpsd for SIGHUP and gets "onsig"
        - kernel sets the handler for gpsd for SIGHUP to SIG_DFL
        - kernel forces gpsd to run onsig (details here don't matter)
        - gpsd runs onsig and returns
        - kernel restores gpsd to wherever it was before
        - gpsd continues executing (hasn't gotten to top of while() loop
                where "signalled" is checked)
        - user runs `killall -HUP gpsd`
        - kernel looks up the handler for gpsd for SIGHUP and gets SIG_DFL
        - kernel terminates process immediately

gpsd doesn't get a chance to finish cleaning up after itself :(

i think your pselect() change indirectly fixes this because it blocks all 
signals and only allows one to get through at a time.  then gpsd either exits, 
or it restarts and resets all the signal handlers via the longjmp().

along these lines, i'm not sure your sigprocmask() call is at the right place.  
consider the code path:
        - setjmp()
        - sigprocmask(&newsigset, &oldsigset)
        - pselect()
        - SIGHUP is handled
        - longjmp()
        - sigprocmask(&newsigset, &oldsigset)

so the 2nd call to sigprocmask() will end up saving your newsigset into the 
old one.  in practice, it won't make a difference to the current code base, but 
it sounds pretty fragile.

so all in all, this should probably be cleanly fixed by changing the signal() 
calls to sigaction() and then moving all the signal init stuff (including your 
sigprocmask()) to before the setjmp() call.

> > your original changelog is too vague to figure out which race and which
> > signals you're referring to.  i'd suggest reposting with clarification
> > and real details as to what it is fixing.
> Well, the patch originated from a discussion on irc, it was only reposted
> on the mailing list for eternity.
> I guess the discussion documents the exact reasons for the patch already.

keeping it all in one place (git log) makes it a lot easier to recover than 
trying to correlate mailing lists and commits based on dates

Attachment: signature.asc
Description: This is a digitally signed message part.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]