From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
Date: Tue, 27 Oct 2015 21:23:09 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 10/27/2015 09:12 PM, Paolo Bonzini wrote:

On 27/10/2015 15:09, Denis V. Lunev wrote:
The aio_context should be locked in the same way as is done for QMP
snapshot creation; otherwise a number of problems are possible
if native AIO mode is enabled for the disk.

- the command can hang (HMP thread) with missed wakeup (the operation is
   actually complete)
     io_submit
     ioq_submit
     laio_submit
     raw_aio_submit
     raw_aio_readv
     bdrv_co_io_em
     bdrv_co_readv_em
     bdrv_aligned_preadv
     bdrv_co_do_preadv
     bdrv_co_do_readv
     bdrv_co_readv
     qcow2_co_readv
     bdrv_aligned_preadv
     bdrv_co_do_pwritev
     bdrv_rw_co_entry

- QEMU can assert in coroutine re-enter
     __GI_abort
     qemu_coroutine_enter
     bdrv_co_io_em_complete
     qemu_laio_process_completion
     qemu_laio_completion_bh
     aio_bh_poll
     aio_dispatch
     aio_poll
     iothread_run

The AioContext lock is recursive, so nested locking should not be a problem.

Signed-off-by: Denis V. Lunev <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
CC: Paolo Bonzini <address@hidden>
CC: Juan Quintela <address@hidden>
CC: Amit Shah <address@hidden>
---
  block/snapshot.c   | 5 +++++
  migration/savevm.c | 7 +++++++
  2 files changed, 12 insertions(+)

diff --git a/block/snapshot.c b/block/snapshot.c
index 89500f2..f6fa17a 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
  {
      int ret;
      Error *local_err = NULL;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
+
+    aio_context_acquire(aio_context);

      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
      if (ret == -ENOENT || ret == -EINVAL) {
@@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
      }
+    aio_context_release(aio_context);

Why here and not in hmp_delvm, for consistency?

The call from hmp_savevm is already protected.

Thanks for fixing the bug!

Paolo

The situation is more complicated. There are several disks in a VM.
One disk is used for saving the state (protected in savevm),
and several other disks are touched via
static int del_existing_snapshots(Monitor *mon, const char *name)
{
    /* ... declarations elided ... */
    while ((bs = bdrv_next(bs))) {
        if (bdrv_can_snapshot(bs) &&
            bdrv_snapshot_find(bs, snapshot, name) >= 0) {
            bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
            /* ... error handling elided ... */
        }
    }
    /* ... */
}

in savevm, and via similar-looking code in delvm, where a similar loop
is implemented differently.
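
To illustrate why taking the lock inside the helper covers every disk the
loop touches, here is a minimal standalone sketch. It is not QEMU code: the
DemoContext and DemoDisk types and all demo_* functions are hypothetical
stand-ins, and a POSIX recursive mutex is assumed as a substitute for the
AioContext lock. It only models the locking shape: the caller already holds
the context of one disk, and the helper acquires each disk's own context
around the delete.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an AioContext: only a recursive lock is modelled. */
typedef struct DemoContext {
    pthread_mutex_t lock;
    int depth;              /* nesting depth held by the current owner */
} DemoContext;

/* Hypothetical stand-in for a disk bound to one context. */
typedef struct DemoDisk {
    DemoContext *ctx;
    int has_snapshot;
} DemoDisk;

static void demo_context_init(DemoContext *ctx)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Recursive: the owning thread may lock again without deadlocking. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&ctx->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    ctx->depth = 0;
}

static void demo_context_acquire(DemoContext *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->depth++;
}

static void demo_context_release(DemoContext *ctx)
{
    assert(ctx->depth > 0);
    ctx->depth--;
    pthread_mutex_unlock(&ctx->lock);
}

/* Same shape as the patched helper: acquire, delete, release. */
static void demo_snapshot_delete(DemoDisk *disk)
{
    demo_context_acquire(disk->ctx);
    disk->has_snapshot = 0;
    demo_context_release(disk->ctx);
}

/* Same shape as the loop over all disks; returns how many were deleted. */
static int demo_delete_all(DemoDisk *disks, size_t n)
{
    int deleted = 0;

    for (size_t i = 0; i < n; i++) {
        if (disks[i].has_snapshot) {
            demo_snapshot_delete(&disks[i]);
            deleted++;
        }
    }
    return deleted;
}
```

If a caller acquires the context of the first disk (as savevm does for the
state disk) and then calls demo_delete_all(), the nested acquire on that
context simply raises the depth to 2 and drops it back to 1, while the
contexts of the other disks are taken and released once each. With the lock
taken only in the caller, those other disks would be touched unprotected.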

To me, this patchset looks like the minimal change that kludges around the
situation well enough.

The true fix would be to drop this code in favour of blockdev
transactions; at least, that is my opinion. I cannot do that at
this stage, though, since it would take a lot of time.

Den


