qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 04/19] replay: don't drain/flush bdrv queue w


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v7 04/19] replay: don't drain/flush bdrv queue while RR is working
Date: Fri, 30 Nov 2018 11:01:57 +0100
User-agent: Mutt/1.10.1 (2018-07-13)

Am 30.11.2018 um 08:55 hat Pavel Dovgalyuk geschrieben:
> > From: Kevin Wolf [mailto:address@hidden
> > Am 10.10.2018 um 15:33 hat Pavel Dovgalyuk geschrieben:
> > > In record/replay mode bdrv queue is controlled by replay mechanism.
> > > It does not allow saving or loading the snapshots
> > > when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
> > > queue, but flushing the queue is still impossible there,
> > > because it may cause deadlocks in replay mode.
> > > This patch disables bdrv_drain_all and bdrv_flush_all in
> > > record/replay mode.
> > >
> > > Signed-off-by: Pavel Dovgalyuk <address@hidden>
> > 
> > Disabling bdrv_flush_all() shouldn't make a big difference in replay
> > mode, so I agree with that.
> > 
> > But does the bdrv_drain_all() change mean that bdrv_drain_all_begin()
> > can return while there are still requests in flight? This sounds like
> > something that could break some existing code, because the meaning of
> > the function is that if it returns success, no requests are in flight
> > any more.
> > 
> > What would move the request forward in the replay case once it has been
> > created? Does this also involve some user interaction? Or may we process
> > the next event in a drain callback of blkreplay or something?
> 
> You are right - there are some operation that can't be performed at any
> time during the replay, because of unfinished block requests.
> 
> One of these is creating the VM snapshot. I've added a check in another
> patch, which prevents snapshotting while the event queue (which holds
> the block requests) is not empty.
> 
> There could also be other things, that will be fixed when we try to use it.
> 
> Moving replay forward until the queue is empty is not a good idea, because
> step-by-step debugging should be available and work without skipping the 
> instructions.

So if the idea is that in replay mode, bdrv_drain_all_begin() is only
called when there are no requests in flight anyway (because callers
catch this situation earler), can we make this an assertion that the
block device is already quiesced?

Skipping drain without knowing that there are no requests in flight
feels simply wrong, and I'm almost sure it will result in bugs. On the
other hand, if we know that no requests are in flight, there's no real
point in skipping.

So I think I still don't understand the situation where skipping a drain
is the correct solution.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]