[Qemu-devel] [PATCH v3 00/19] Fix some jobs/drain/aio_poll related hangs

From: Kevin Wolf
Subject: [Qemu-devel] [PATCH v3 00/19] Fix some jobs/drain/aio_poll related hangs
Date: Thu, 20 Sep 2018 18:19:39 +0200

Especially the combination of iothreads, block jobs and drain tends to
lead to hangs currently. This series fixes a few of these bugs, although
there are more of them, to be addressed in separate patches.

The primary goal of this series is to fix the scenario from:

A simplified reproducer of the reported problem looks like this (two concurrent
commit block jobs for disks in an iothread):

$qemu -qmp stdio \
    -object iothread,id=iothread1 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x6,iothread=iothread1 \
    -device scsi-hd,drive=drive_image1,id=image1,bootindex=1 \
    -device scsi-hd,drive=drive_image2,id=image2,bootindex=2


{ "execute": "block-commit", "arguments": { "device": 
{ "execute": "block-commit", "arguments": { "device": 


v3:
- Patch 3 ('aio-wait: Increase num_waiters even in home thread'):
  Hoist atomic_inc/dec outside the if [Fam, Paolo]
- Patch 10 ('block-backend: Fix potential double blk_delete()'):
  Assert in blk_unref() that drain doesn't resurrect the BB [Paolo]
- Patch 11 ('block-backend: Decrease in_flight only after callback'):
  Removed bdrv_ref/unref pair [Paolo]
- v2 Patch 12 ('mirror: Fix potential use-after-free in active'):
  Dropped. It just papered over another bug that is fixed later.
- v3 Patch 17 ('test-bdrv-drain: Fix outdated comments'):
  New patch with comment improvements [Max]
- v3 Patch 18 ('block: Use a single global AioWait'):
  v3 Patch 19 ('test-bdrv-drain: Test draining job source child and
  parent'):
  New patches to fix an additional hang that was caused by notifying the
  wrong AioWait
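
As an illustration of the Patch 3 item above (hoisting atomic_inc/dec
outside the if), here is a minimal standalone sketch. It is not the QEMU
code: Waiter, waiter_kick(), wait_while() and demo_cond() are hypothetical
names, the event loop is replaced by a plain loop, and the demo is
single-threaded. It assumes that a kick is skipped when no waiter is
registered, so the counter has to be raised on both the home-thread and
the non-home-thread path for a kick to be delivered.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an AioWait-like object; not the real struct. */
typedef struct {
    atomic_uint num_waiters;   /* callers currently inside wait_while() */
    unsigned kicks_delivered;  /* wakeups sent (single-threaded demo only) */
} Waiter;

typedef bool (*CondFn)(void *opaque);

static void waiter_init(Waiter *w)
{
    atomic_init(&w->num_waiters, 0);
    w->kicks_delivered = 0;
}

/* Wake waiters, but skip the work when nobody is registered. */
static bool waiter_kick(Waiter *w)
{
    if (atomic_load(&w->num_waiters) == 0) {
        return false;          /* the kick is dropped */
    }
    w->kicks_delivered++;
    return true;
}

/*
 * Poll until cond() returns false. The counter is raised before the
 * home-thread check and lowered after it, so both branches register
 * the caller as a waiter.
 */
static bool wait_while(Waiter *w, bool in_home_thread, CondFn cond,
                       void *opaque)
{
    bool waited = false;

    atomic_fetch_add(&w->num_waiters, 1);
    if (in_home_thread) {
        while (cond(opaque)) {
            waited = true;     /* real code would run the event loop here */
        }
    } else {
        while (cond(opaque)) {
            waited = true;     /* real code would also handle locking here */
        }
    }
    unsigned prev = atomic_fetch_sub(&w->num_waiters, 1);
    assert(prev > 0);          /* the counter must never underflow */
    (void)prev;

    return waited;
}

/* Demo condition: kicks the waiter from inside the wait, then counts down. */
typedef struct {
    Waiter *w;
    int remaining;
} Demo;

static bool demo_cond(void *opaque)
{
    Demo *d = opaque;

    waiter_kick(d->w);
    return d->remaining-- > 0;
}
```

If the increment lived only in the non-home-thread branch, every kick
issued from demo_cond() during a home-thread wait would be dropped in this
model, because the counter would still be zero.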

v2:
- Rebased on top of mreitz/block (including fixes for new bugs: patch 1 and 16)
- Patch 12: Added missing bdrv_unref() calls in error path [Fam]

Kevin Wolf (19):
  job: Fix missing locking due to mismerge
  blockjob: Wake up BDS when job becomes idle
  aio-wait: Increase num_waiters even in home thread
  test-bdrv-drain: Drain with block jobs in an I/O thread
  test-blockjob: Acquire AioContext around job_cancel_sync()
  job: Use AIO_WAIT_WHILE() in job_finish_sync()
  test-bdrv-drain: Test AIO_WAIT_WHILE() in completion callback
  block: Add missing locking in bdrv_co_drain_bh_cb()
  block-backend: Add .drained_poll callback
  block-backend: Fix potential double blk_delete()
  block-backend: Decrease in_flight only after callback
  blockjob: Lie better in child_job_drained_poll()
  block: Remove aio_poll() in bdrv_drain_poll variants
  test-bdrv-drain: Test nested poll in bdrv_drain_poll_top_level()
  job: Avoid deadlocks in job_completed_txn_abort()
  test-bdrv-drain: AIO_WAIT_WHILE() in job .commit/.abort
  test-bdrv-drain: Fix outdated comments
  block: Use a single global AioWait
  test-bdrv-drain: Test draining job source child and parent

 include/block/aio-wait.h  |  17 ++-
 include/block/block.h     |   6 +-
 include/block/block_int.h |   3 -
 include/block/blockjob.h  |   3 +
 include/qemu/coroutine.h  |   5 +
 include/qemu/job.h        |  12 ++
 block.c                   |   5 -
 block/block-backend.c     |  31 +++--
 block/io.c                |  30 ++---
 blockjob.c                |   9 +-
 job.c                     |  49 +++++---
 tests/test-bdrv-drain.c   | 292 +++++++++++++++++++++++++++++++++++++++++++---
 tests/test-blockjob.c     |   6 +
 util/aio-wait.c           |  11 +-
 util/qemu-coroutine.c     |   5 +
 15 files changed, 402 insertions(+), 82 deletions(-)

