Re: Another glibc Hurd signal code deadlock: Mach thread_suspend sporadically returning KERN_FAILURE
Thu, 20 Dec 2012 16:16:13 +0100
On Thu, Dec 20, 2012 at 03:25:35PM +0100, Thomas Schwinge wrote:
> So -- what about this failure mode of thread_suspend? Is it expected,
> and needs to be handled? Where is it coming from?
The failure is specific to GNU Mach and was introduced recently in commit
0a55db5302a78ea51a1b4e4ff3ba632f34b2f6af (Make thread_suspend honor the
TH_UNINT flag). The patch does make sense, but given what we now know,
I consider it incomplete. There is already far too much busy waiting in
the Hurd, so we should rather change thread_suspend so that it blocks
until the target thread's state changes.
Other than that, the analysis and the fix look good.
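To illustrate the difference between retrying in the caller and blocking
in thread_suspend, here is a minimal user-space sketch using POSIX
threads. It is only a model under stated assumptions: struct sim_thread,
the sim_* functions, the sim_return codes and the unint field are all
hypothetical names standing in for the kernel's thread state and
TH_UNINT. None of this is GNU Mach or glibc code.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical return codes for this model; not Mach's kern_return_t. */
enum sim_return { SIM_SUCCESS, SIM_FAILURE };

/* Hypothetical model of the target thread's suspend-relevant state. */
struct sim_thread {
    pthread_mutex_t lock;
    pthread_cond_t  state_changed;  /* broadcast whenever `unint` changes */
    bool            unint;          /* stands in for TH_UNINT */
    int             suspend_count;
};

void sim_thread_init(struct sim_thread *t)
{
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->state_changed, NULL);
    t->unint = false;
    t->suspend_count = 0;
}

/* The target calls this when entering or leaving an uninterruptible wait. */
void sim_set_unint(struct sim_thread *t, bool unint)
{
    pthread_mutex_lock(&t->lock);
    t->unint = unint;
    pthread_cond_broadcast(&t->state_changed);
    pthread_mutex_unlock(&t->lock);
}

/* Failing variant: refuses to suspend while the flag is set. */
enum sim_return sim_suspend_try(struct sim_thread *t)
{
    enum sim_return ret = SIM_FAILURE;

    pthread_mutex_lock(&t->lock);
    if (!t->unint) {
        t->suspend_count++;
        ret = SIM_SUCCESS;
    }
    pthread_mutex_unlock(&t->lock);
    return ret;
}

/* Caller-side workaround: busy-wait until the suspend succeeds. */
void sim_suspend_retry(struct sim_thread *t)
{
    while (sim_suspend_try(t) == SIM_FAILURE)
        sched_yield();
}

/* Blocking variant: sleep until the flag clears, then suspend. */
void sim_suspend_blocking(struct sim_thread *t)
{
    pthread_mutex_lock(&t->lock);
    while (t->unint)
        pthread_cond_wait(&t->state_changed, &t->lock);
    t->suspend_count++;
    pthread_mutex_unlock(&t->lock);
}
```

In this model every caller of sim_suspend_try has to carry its own retry
loop, as sim_suspend_retry does, whereas sim_suspend_blocking keeps the
wait in one place and sleeps on the condition variable instead of
spinning.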
> What I once noticed is that when I attached GDB, the process had mostly
> been paged out (down to two pages, 8 KiB, despite there being no apparent
> memory pressure -- but we know Mach is sometimes doing funny things
> regarding paging), so perhaps TH_UNINT is set during page-in or something
> like that? Hmm...
The TH_UNINT flag is often set in GNU Mach, mainly to avoid leaving an
inconsistent state behind if a thread never resumes. Legacy task
(process) swapping is a likely reason for the state you're observing.