[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contex
Re: [PATCH v2 29/36] blockdev: qmp_x_blockdev_reopen: acquire all contexts
Fri, 5 Feb 2021 17:36:51 +0100
Am 05.02.2021 um 17:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 05.02.2021 19:01, Kevin Wolf wrote:
> > Am 27.11.2020 um 15:45 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > During reopen we may add backing bs from other aio context, which may
> > > lead to changing original context of top bs.
> > >
> > > We are going to move graph modification to prepare stage. So, it will
> > > be possible that bdrv_flush() in bdrv_reopen_prepare called on bs in
> > > non-original aio context, which we didn't aquire which leads to crash.
> > >
> > > More correct would be to acquire all aio context we are going to work
> > > with. And the simplest ways is to just acquire all of them. It may be
> > > optimized later if needed.
> > >
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <firstname.lastname@example.org>
> > I'm afraid it's not as easy. Holding the lock of more than one
> > AioContext is always a bit risky with respect to deadlocks.
> > For example, changing the AioContext of a node with
> > bdrv_set_aio_context_ignore() has explicit rules that are now violated:
> > * The caller must own the AioContext lock for the old AioContext of bs,
> > but it
> > * must not own the AioContext lock for new_context (unless new_context is
> > the
> > * same as the current context of bs).
> > Draining while holding all AioContext locks is suspicious, too. I think
> > I have seen deadlocks before, which is why bdrv_drain_all_*() are
> > careful to only ever lock a single AioContext at a time.
> That's not good :\ Hmm, probably we just should flush everything
> before all graph modifications.
Would that have to be a separate phase before prepare then?
I suppose the same problem exists with drv->bdrv_reopen_prepare, which
might be called in a different state (both graph structure and
AioContext) than before. I'll have to see the patch first that reorders
things, but this callback has always had the problem that sometimes it
wants the old state and sometimes it wants the new state...