Re: libpager deadlock
From: Sergio Lopez
Subject: Re: libpager deadlock
Date: Thu, 8 Apr 2010 13:15:20 +0200
On Thu, 8 Apr 2010 00:53:02 +0200,
Samuel Thibault <samuel.thibault@gnu.org> wrote:
> Hello,
>
> Sergio Lopez, on Wed 07 Apr 2010 12:43:15 +0200, wrote:
> > On Sat, 27 Mar 2010 00:39:19 +0100,
> > Samuel Thibault <samuel.thibault@gnu.org> wrote:
> > > From time to time, ext2fs deadlocks on the pager->interlock
> > > mutex. This is an excerpt of what I could find in the process:
> > >
> > > #2 0x08106e59 in memory_object_lock_request ()
> > > #3 0x0806fdeb in _pager_lock_object (p=0x81c97b8, offset=0,
> > >    size=827392, should_return=2, should_flush=0, lock_value=8,
> > >    sync=0) at /var/tmp/hurd-20090404/./libpager/lock-object.c:68
> > > #4 0x0806da18 in pager_sync (p=0x81c97b8, wait=0)
> > >    at /var/tmp/hurd-20090404/./libpager/pager-sync.c:31
> > > ...
> > > #9 0x0805a9ac in periodic_sync (interval=5)
> > >    at /var/tmp/hurd-20090404/./libdiskfs/sync-interval.c:119
> > >
> > > This is the periodic sync, calling memory_object_lock_request()
> > > on the pager. Note that before doing this, _pager_lock_object
> > > takes pager->interlock.
> >
> > AFAIK, m_o_lock_request is an asynchronous operation, so it should
> > not block in any case. Perhaps the cthreads package is behaving
> > weird?
>
> Above #2 0x08106e59 in memory_object_lock_request (), there is
> #0 0x080bf22c in mach_msg_trap ()
> #1 0x0808666e in mach_msg ()
>
> So it's really hung in the kernel. And indeed, even if from
> the interface it would seem like it could be asynchronous,
> the memory_object_lock_completed() call is done from the
> memory_object_lock_request function itself...
>
But even if m_o_lock_completed is called from m_o_lock_request, that
answer should come in a separate message, which arrives at another user
thread (in libpager, it is processed in lock-completed.c). So if
_pager_lock_object is called with sync=1 and is waiting for the kernel
to reply with an m_o_lock_completed, the thread should be waiting at the
"condition_wait (&p->wakeup, &p->interlock);" just a few lines below. And
condition_wait releases the interlock until it is woken up by another
thread.
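
To illustrate the point about condition_wait, here is a minimal sketch
of the pattern. It is not libpager code: it uses POSIX threads instead
of cthreads (an assumption, so that it builds outside the Hurd), and
the names demo_pager, demo_lock_completed and demo_lock_object_sync are
hypothetical stand-ins. The requester holds the interlock the whole
time except while it sleeps in the condition wait, so the completion
handler can only take the interlock if the wait really releases it.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical stand-in for the few struct pager fields involved.  */
struct demo_pager
{
  pthread_mutex_t interlock;
  pthread_cond_t wakeup;
  int pending_locks;          /* lock requests awaiting completion */
  int handler_got_interlock;  /* set by the completion handler */
};

/* Plays the role of the thread handling memory_object_lock_completed.  */
static void *
demo_lock_completed (void *arg)
{
  struct demo_pager *p = arg;

  /* The requester still holds the interlock when this thread starts,
     so this call returns only once the requester is sleeping in
     pthread_cond_wait, which has released the mutex.  */
  pthread_mutex_lock (&p->interlock);
  p->handler_got_interlock = 1;
  p->pending_locks--;
  pthread_cond_broadcast (&p->wakeup);
  pthread_mutex_unlock (&p->interlock);
  return NULL;
}

/* Plays the role of _pager_lock_object called with sync=1.
   Returns 1 if the completion handler managed to take the interlock
   while this thread was waiting.  */
int
demo_lock_object_sync (void)
{
  struct demo_pager p;
  pthread_t handler;
  int result;

  pthread_mutex_init (&p.interlock, NULL);
  pthread_cond_init (&p.wakeup, NULL);
  p.pending_locks = 0;
  p.handler_got_interlock = 0;

  pthread_mutex_lock (&p.interlock);
  p.pending_locks++;

  /* Stand-in for sending the lock request: the reply is handled
     asynchronously by another thread.  */
  result = pthread_create (&handler, NULL, demo_lock_completed, &p);
  assert (result == 0);

  /* Sleeping here drops the interlock; it is re-taken before return.  */
  while (p.pending_locks > 0)
    pthread_cond_wait (&p.wakeup, &p.interlock);

  result = p.handler_got_interlock;
  pthread_mutex_unlock (&p.interlock);

  pthread_join (handler, NULL);
  pthread_cond_destroy (&p.wakeup);
  pthread_mutex_destroy (&p.interlock);
  return result;
}
```

Calling demo_lock_object_sync () from a small main, linked with
-pthread, returns instead of hanging. That is the behaviour I would
expect from the sync=1 path as long as the completion message gets
delivered.
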
If a thread is stalled in mach_msg_trap(), that means the kernel can't
enqueue the message for some reason (and this is very, very bad). Is it
possible that ext2fs had a huge number of threads at that moment?