From: Paolo Bonzini
Subject: [Qemu-devel] [PULL 02/30] cpus: Fix event order on resume of stopped guest
Date: Wed, 9 May 2018 00:14:19 +0200

From: Markus Armbruster <address@hidden>
When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.
Reproducer:
1. Create a scratch image
$ dd if=/dev/zero of=scratch.img bs=1M count=100
Size doesn't actually matter.
2. Prepare blkdebug configuration:
$ cat >blkdebug.conf <<EOF
[inject-error]
event = "write_aio"
errno = "5"
EOF
Note that errno 5 is EIO.
3. Run a guest with an additional scratch disk, i.e. with additional
arguments
-drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
-device virtio-blk-pci,id=scratch,drive=scratch-drive
The blkdebug part makes all writes to the scratch drive fail with
EIO. The werror=stop pauses the guest on write errors.
4. Connect to the QMP socket e.g. like this:
$ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '
Issue QMP command 'qmp_capabilities':
QMP> { "execute": "qmp_capabilities" }
5. Boot the guest.
6. In the guest, write to the scratch disk, e.g. like this:
# dd if=/dev/zero of=/dev/vdb count=1
Do double-check the device specified with of= is actually the
scratch device!
7. Issue QMP command 'cont':
QMP> { "execute": "cont" }
After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event. Good.
After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP. Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.
The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".
The culprit is vm_prepare_start().
    /* Ensure that a STOP/RESUME pair of events is emitted if a
     * vmstop request was pending. The BLOCK_IO_ERROR event, for
     * example, according to documentation is always followed by
     * the STOP event.
     */
    if (runstate_is_running()) {
        qapi_event_send_stop(&error_abort);
        res = -1;
    } else {
        replay_enable_events();
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
    }

    /* We are sending this now, but the CPUs will be resumed shortly later */
    qapi_event_send_resume(&error_abort);
    return res;
When resuming a stopped guest, we take the else branch before we get
to sending RESUME. vm_state_notify() runs virtio_vmstate_change(),
among other things. This restarts I/O, triggering the BLOCK_IO_ERROR
event.
Reshuffle vm_prepare_start() to send the RESUME event earlier.
Fixes RHBZ 1566153.
Cc: Paolo Bonzini <address@hidden>
Signed-off-by: Markus Armbruster <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Paolo Bonzini <address@hidden>
---
cpus.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/cpus.c b/cpus.c
index 5bcd3ec..be3a4eb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2043,7 +2043,6 @@ int vm_stop(RunState state)
 int vm_prepare_start(void)
 {
     RunState requested;
-    int res = 0;
 
     qemu_vmstop_requested(&requested);
     if (runstate_is_running() && requested == RUN_STATE__MAX) {
@@ -2057,17 +2056,18 @@ int vm_prepare_start(void)
      */
     if (runstate_is_running()) {
         qapi_event_send_stop(&error_abort);
-        res = -1;
-    } else {
-        replay_enable_events();
-        cpu_enable_ticks();
-        runstate_set(RUN_STATE_RUNNING);
-        vm_state_notify(1, RUN_STATE_RUNNING);
+        qapi_event_send_resume(&error_abort);
+        return -1;
     }
 
     /* We are sending this now, but the CPUs will be resumed shortly later */
     qapi_event_send_resume(&error_abort);
-    return res;
+
+    replay_enable_events();
+    cpu_enable_ticks();
+    runstate_set(RUN_STATE_RUNNING);
+    vm_state_notify(1, RUN_STATE_RUNNING);
+    return 0;
 }
 
 void vm_start(void)
--
1.8.3.1