
From: Anthony Liguori
Subject: Re: [Qemu-devel] qemu and qemu.git -> Migration + disk stress introduces qcow2 corruptions
Date: Thu, 10 Nov 2011 15:30:08 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.21) Gecko/20110831 Lightning/1.0b2 Thunderbird/3.1.13

On 11/10/2011 12:27 PM, Anthony Liguori wrote:
> On 11/10/2011 02:55 AM, Avi Kivity wrote:
> > If we have to delay the release for a month to get it right, we should.
> > Not that I think we have to.
>
> Adding libvirt to the discussion.
>
> What does libvirt actually do in the monitor prior to migration completing on
> the destination? The least invasive way of doing delayed open of block devices
> is probably to make -incoming create a monitor and run a main loop before the
> block devices (and full device model) are initialized. Since this isolates the
> changes strictly to migration, I'd feel okay doing this for 1.0 (although it
> might need to be in the stable branch).

This won't work. libvirt needs things to be initialized. Plus, once loadvm gets to loading the device model, the device model (and the BlockDriverStates) must already be fully initialized.

I think I've convinced myself that without proper clustered shared storage, cache=none is a hard requirement. That goes for iSCSI and NFS. I don't see a way to do migration safely with NFS and there's no way to really solve the page cache problem with iSCSI.

Even with the reopen, the destination is racing against the close on the source. If you look at Daniel's description of what libvirt is doing and then compare that to Juan's patches, the race is over whether the source gets closed before the reopen happens. cache=none seems to be the only way to solve this.
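
To make the ordering problem concrete, here is a toy model of it. It is only a sketch: the `SharedStorage`, `Host`, and `migrate` names are invented for illustration, the cache is a plain dictionary rather than a real page cache, and nothing here reflects how QEMU, libvirt, NFS, or iSCSI are actually implemented. It only shows that a cached destination read that lands before the source's close can stay stale, while uncached access does not depend on that ordering.

```python
class SharedStorage:
    """Hypothetical shared backing store, modelled as block -> data."""
    def __init__(self):
        self.blocks = {}


class Host:
    """Toy host. With use_cache=True, I/O goes through a private
    write-back cache; with use_cache=False it hits storage directly
    (a stand-in for cache=none)."""
    def __init__(self, storage, use_cache):
        self.storage = storage
        self.use_cache = use_cache
        self.cache = {}
        self.dirty = set()

    def write(self, block, data):
        if self.use_cache:
            self.cache[block] = data
            self.dirty.add(block)
        else:
            self.storage.blocks[block] = data

    def read(self, block):
        if not self.use_cache:
            return self.storage.blocks.get(block)
        if block not in self.cache:
            # First access populates the private cache from storage.
            self.cache[block] = self.storage.blocks.get(block)
        return self.cache[block]

    def close(self):
        # Closing writes back whatever is still dirty in the cache.
        for block in self.dirty:
            self.storage.blocks[block] = self.cache[block]
        self.dirty.clear()


def migrate(use_cache, dest_reads_before_source_close):
    """Return what the destination sees for block 0 after migration."""
    storage = SharedStorage()
    storage.blocks[0] = "old"
    source = Host(storage, use_cache)
    dest = Host(storage, use_cache)

    source.write(0, "new")
    if dest_reads_before_source_close:
        dest.read(0)      # destination (re)opens the image too early
        source.close()
    else:
        source.close()
        dest.read(0)
    return dest.read(0)   # what the guest would see once it is running
```

In this model only the combination of caching and an early destination read returns the old data; every other combination returns the new data.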

Live migration with qcow2 or any other image format is just not going to work right now, even with proper clustered storage. I think adding a block-level flush-cache interface and letting each block device decide how to implement it is the best approach.
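
A rough sketch of the shape such an interface could take is below. All names are hypothetical (`BlockDriver`, `invalidate_cache`, `flush_caches`, and the `metadata_cache` key are made up for this example and are not QEMU's block layer API). The only point is the dispatch: a generic entry point calls a per-driver hook, and each driver decides what dropping its cached state means.

```python
class BlockDriver:
    """Hypothetical base driver: by default there is nothing cached to drop."""
    def invalidate_cache(self, state):
        return "nothing-cached"


class RawDirectDriver(BlockDriver):
    """Stand-in for a driver that keeps no private state (inherits the no-op)."""


class ImageFormatDriver(BlockDriver):
    """Stand-in for an image format that caches metadata in memory."""
    def invalidate_cache(self, state):
        # Drop in-memory metadata so it gets re-read from the image later.
        state["metadata_cache"].clear()
        return "metadata-dropped"


def flush_caches(devices):
    """Generic entry point: ask every device's driver to drop its caches.

    `devices` maps a device name to a (driver, state) pair.
    """
    results = {}
    for name, (driver, state) in devices.items():
        results[name] = driver.invalidate_cache(state)
    return results
```

For example, calling `flush_caches` on one `RawDirectDriver` device and one `ImageFormatDriver` device leaves the first untouched and empties the second's `metadata_cache`.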

Regards,

Anthony Liguori

> I know a monitor can run like this as I've done it before but some of the
> commands will not behave as expected so it's pretty important to be comfortable
> with what commands are actually being used in this mode.
>
> Regards,
>
> Anthony Liguori



