
From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
Date: Tue, 14 Sep 2010 07:51:02 -0500
On 09/14/2010 04:47 AM, Avi Kivity wrote:
On 09/13/2010 04:34 PM, Anthony Liguori wrote:
On 09/13/2010 09:13 AM, Kevin Wolf wrote:
I think the only real advantage is that we fix NFS migration, right?
That's the one that we know about, yes.

The rest is not a specific scenario, but a strong feeling that having an
image opened twice at the same time is dangerous.

We've never really had clear semantics for live migration and block drivers' life cycles. At a high level, for live migration to work, we need the following sequence:

1) src> flush all pending writes to disk
2) <barrier>
3) dst> invalidate any cached data
4) dst> start guest
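
The four steps above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `SharedDisk`, `Host`, and `migrate` are hypothetical names invented for illustration, not QEMU code, and plain sequential execution stands in for the barrier.

```python
class SharedDisk:
    """Toy stand-in for shared storage (for example an image on NFS)."""
    def __init__(self):
        self.blocks = {}


class Host:
    """Toy model of one side of the migration, with its own caches."""
    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}       # writes that have not reached the disk yet
        self.read_cache = {}  # blocks read earlier, possibly stale

    def write(self, n, data):
        self.dirty[n] = data

    def read(self, n):
        if n in self.dirty:
            return self.dirty[n]
        if n not in self.read_cache:
            self.read_cache[n] = self.disk.blocks.get(n)
        return self.read_cache[n]

    def flush(self):
        self.disk.blocks.update(self.dirty)
        self.dirty.clear()

    def invalidate(self):
        self.read_cache.clear()


def migrate(src, dst):
    src.flush()        # 1) src> flush all pending writes to disk
    # 2) <barrier>: sequential execution stands in for the barrier here
    dst.invalidate()   # 3) dst> invalidate any cached data
    return dst         # 4) dst> start guest (the caller resumes on dst)


disk = SharedDisk()
src, dst = Host(disk), Host(disk)
dst.read(0)            # dst touches the image early and caches "no data"
src.write(0, b"new")   # the guest keeps writing on src
stale = dst.read(0)    # before the handoff dst still sees its cached value
fresh = migrate(src, dst).read(0)
```

In this model `stale` is the outdated cached value and `fresh` is what the guest last wrote, which is why step 3 cannot be skipped if dst has read from the image before the guest starts.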

That's pretty complicated, compared to

1) src> close

1.5) <barrier>

2) dst> open
3) dst> start guest

You need to make sure the open happens *after* the close.

You're just using close as the way to flush all pending writes, and open as the way to invalidate any cached data.
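
A minimal sketch of the ordering requirement, assuming two local threads stand in for the source and destination hosts and a `threading.Event` stands in for the barrier. The names `source`, `destination`, and `closed` are hypothetical. Note that `close()` here only flushes Python's user-space buffer to the OS, which is enough on one machine but says nothing about NFS behaviour.

```python
import os
import tempfile
import threading

closed = threading.Event()   # the <barrier>: set only once src has closed
result = {}


def source(path):
    f = open(path, "wb")
    f.write(b"guest data")   # pending write, possibly still buffered
    f.close()                # 1) src> close (flushes the buffer to the OS)
    closed.set()             # 1.5) <barrier>


def destination(path):
    closed.wait()            # do not open before src has closed
    with open(path, "rb") as f:   # 2) dst> open
        result["data"] = f.read()
    # 3) dst> start guest would follow here


fd, path = tempfile.mkstemp()
os.close(fd)
t_dst = threading.Thread(target=destination, args=(path,))
t_src = threading.Thread(target=source, args=(path,))
t_dst.start()   # dst is started first; only the barrier orders the two
t_src.start()
t_src.join()
t_dst.join()
os.remove(path)
```

Without the `closed.wait()` call, the destination could open and read the file before the source has written anything, which is the race the barrier exists to prevent.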


Anthony Liguori

There are two failure scenarios with this model:

1. dst cannot open the image

We fix that by killing dst and continuing src (which has to re-open its images).

2. dst cannot open the image, and src cannot either

In this case, what would be gained by having an image handle open in one of the hosts, but no way to open it again? As soon as the surviving qemu exited (or crashed), the image would be lost forever.

To get (1) working correctly we need an event that tells management that all initialization has completed and the guest is ready to run (so management can terminate the source).
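
The two failure scenarios and the "ready" event could be sketched roughly as follows. Everything here is an assumption made for illustration: `hand_over`, its three callback parameters, and the event names are hypothetical and do not correspond to any existing QEMU or management interface.

```python
def hand_over(open_on_dst, reopen_on_src, notify):
    """Decide which side runs the guest after src has closed its images.

    All three callables are hypothetical hooks, not QEMU APIs.
    """
    try:
        open_on_dst()
    except OSError:
        # Scenario 1: dst cannot open the image, so fall back to src,
        # which has to re-open its images.
        try:
            reopen_on_src()
        except OSError:
            # Scenario 2: neither side can open the image.
            notify("FAILED")
            return None
        notify("SRC_RESUMED")
        return "src"
    # dst finished initialization; management may now terminate src.
    notify("DST_READY")
    return "dst"


def fail():
    raise OSError("cannot open image")


events = []
ok = hand_over(lambda: None, lambda: None, events.append)
fallback = hand_over(fail, lambda: None, events.append)
lost = hand_over(fail, fail, events.append)
```

The point of the sketch is that management only acts on an explicit notification: it terminates the source after the "ready" event and not merely because the destination process exists.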

