qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.


From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
Date: Thu, 28 Apr 2011 10:10:58 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110419 Lightning/1.0b2 Thunderbird/3.1.9

On 04/28/2011 09:57 AM, Jes Sorensen wrote:
On 04/28/11 16:46, Anthony Liguori wrote:
On 04/28/2011 09:38 AM, Jes Sorensen wrote:

Sorry but this is utterly bogus.

The snapshot support as is works fine, and the command is in the
monitor. We should expose it in QMP as well.

It went in for the monitor because it was considered an imperfect
command so we held up the QMP side because we wanted a better interface.

I am not sure who is included in the 'we' here.....

The current command does:
1) Create new image backing to current image
2) Flush outstanding I/O to old image
3) Close current image
4) Reopen newly created image
5) Go

Operations (1) and (2) are very synchronous operations.  (4) can be too.
  We really should have a bdrv_aio_snapshot() function that implements
the logic for at least (2) in an asynchronous fashion.

That sort of interface is going to affect how we expose things in QMP.
As from a QMP perspective, we're going to do something like:

a) start snapshot
b) query snapshot progress
c) receive notification of snapshot completion
d) flip over image

Sorry this is inherently broken. The management tool should not be
keeping state in this process. I agree an async interface would be nice,
but the above process is plain wrong.

The async snapshot process needs to be doing the exact same as in the
current implementation, the main difference is that it would be running
asynchronously and that QMP would be able to query the state of it.


No, the command does too many things and as such, makes it impossible for a management tool to gracefully recover.

And of course, this needs to be carefully thought through for race
conditions.  In the current command, what happens if you get a crash
between (2) and (3)?  There's no way for the management tools to know
that we didn't finish flushing writes.  How does the management tool
know that (1) didn't fail mid way through resulting in a corrupted image?

There is no issue here, you have the exact same problem if you get a
crash during d) in your example. It is the same with the existing
command, the crash is only an issue if it happens right in the middle of
the switch over. Until then, only the original image remains valid.

But the new image is always valid once it's been created as pending writes are lost no matter what in a hard power failure. That suggests the following flow:

1) Create new image with a backing file
2) Completion ACK

At this stage, the management tool should update it's internal state to point to the new image.

3) Begin switch over to new image
4) Switch over image.
5) Receive notification when it's complete

If (4) is happening asynchronously, things get kind of complicated. You can either wait for things to quiesce on their own or you can queue pending requests. It's unclear to me what the right strategy is or if there's a mixed strategy that's needed. I think it makes sense to treat 3/4 as a command with (5) being an event notification.

But combining 1-5 in a single interface creates a command that while convenient on the command line, is not usable for a robust management tool.

Regards,

Anthony Liguori


Cheers,
Jes





reply via email to

[Prev in Thread] Current Thread [Next in Thread]