qemu-devel

Re: [Qemu-devel] [PATCH 2.5 v5 0/11] dataplane snapshot fixes


From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH 2.5 v5 0/11] dataplane snapshot fixes
Date: Fri, 6 Nov 2015 19:19:33 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 11/06/2015 07:05 PM, Eric Blake wrote:
> On 11/06/2015 08:54 AM, Stefan Hajnoczi wrote:
>> On Wed, Nov 04, 2015 at 08:19:31PM +0300, Denis V. Lunev wrote:
>>> with test
>>>     while /bin/true ; do
>>>         virsh snapshot-create rhel7
>>>         sleep 10
>>>         virsh snapshot-delete rhel7 --current
>>>     done
>>> with enabled iothreads on a running VM leads to a lot of troubles: hangs,
>>> asserts, errors.
> That is a case of using libvirt to trigger internal snapshots...
>
>> The HMP monitor is legacy and also not used by modern libvirt.
> ...and libvirt is forced to use HMP for internal snapshots, since we
> _still_ haven't exposed internal snapshots as a QMP command.
>
>> I think the affected use cases are restricted to savevm+dataplane and
>> HMP+dataplane.
> The fact that the commit message calls out a libvirt method of
> triggering the bug does mean that it is user-visible, and so it would
> qualify as a bug fix even during hard freeze.  But I also understand
> that taking a large complex series late in the game is not without risk;
> and it is not like this is a regression (rather, something that has
> never worked bulletproof), right?

Yes, this did not work in the past, so it is not a regression.

The problem is that apparently nobody uses iothreads in production,
or even in complex real-life tests. There is another recently merged
example of this (100% reproducible, happens on both migration and
snapshot); we hit it on the suspend operation.

commit 10a06fd65f667a972848ebbbcac11bdba931b544
Author: Pavel Butsykin <address@hidden>
Date:   Mon Oct 26 14:42:57 2015 +0300

virtio: sync the dataplane vring state to the virtqueue before virtio_save

I initially started this as a set of small changes in the savevm code
and was asked to move the code from savevm.c to the block layer.
This has been done, and yes, the series became complex after
that; it was obvious that it would be complex once the task
was to move a bunch of code from one place to another.

Anyway, from my point of view the series is not that complex.
It is just large, it does simple things that are close to copy/paste,
and there is a month to catch bugs here.

Can we still consider this for merge?

Den


