[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-block] [PATCH v5 00/13] block: Rework bdrv_close_all()

From: Max Reitz
Subject: [Qemu-block] [PATCH v5 00/13] block: Rework bdrv_close_all()
Date: Tue, 3 Mar 2015 15:12:58 -0500

Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
which can lead to data corruption (see the iotest added in the final
patch of this series) and is most certainly very ugly.

This series reworks bdrv_close_all() to instead eject the BDS trees from
all BlockBackends and then close the monitor-owned BDS trees, which are
the only BDSs without a BB. In effect, all BDSs are closed just by
getting closed automatically due to their reference count becoming 0.

The benefit over the approach taken in v1 and v2 is that in device
models we often cannot simply drop the reference to a BB because there
may be some user which we forgot about. By ejecting the BDS trees from
the BB, the BB itself becomes unusable, but in a clean way (it will
return errors when accessed, but nothing will crash). Also, it is much
simpler (no reference tracking necessary).

The only disadvantage (I can see) is that the BBs are leaked; but this
does not matter because the qemu process is about to exit anyway.

This series depends on v2 of my series
"blockdev: BlockBackend and media" (or any later version), and on v2 of
my patch "virtio-scsi: Allocate op blocker reason before blocking".
(actually, I rebased it on my local v3 of the "BB and media" series, but
I did not send that one out yet, so I'm writing "v2" for the time being)

- Patch 5: Because of above mentioned virtio-scsi patch, the helper
  functions for setting up and destroying op blockers on a virtio-scsi
  device are no longer necessary because s->blocker is allocated all the
- Patch 9: See justification below. [Kevin]
- Patch 10 (was patch 9 in v4): Dropped the modification of
  bdrv_close_all() (moved to patch 12)
- Patch 11 (split off patch 10 from v4): Moved the modification of
  bdrv_close_all() to patch 12 (just like for patch 10)
- Patch 12 (Heavy modification and extension of a part of patch 10 from
  v4): If we don't force-bdrv_close() all BDSs anymore, there may be
  block jobs running; so we have to cancel all block jobs manually.
  However, in order to do that effectively and in a simple way, I want
  the list of BDSs to be completely empty afterwards. For that to
  happen, however, we need to eject all BDSs from the BBs and drop the
  monitor references to bare BDSs first. Therefore, all these changes
  have to be in a single patch (because they depend on each other), and
  that patch is this patch. [Kevin]

Justification for patch 9:

Adding yet another list of BlockDriverStates is ugly, I know. But we
need something to track all the block jobs, and these are the

A. Have a list of all block jobs. Feasible, but this would involve a bit
   more code changes.
B. Only consider the BDSs block jobs may run on; these are the root BDSs
   (bdrv_states) and the named BDSs (graph_bdrv_states). Is a bit uglier
   in that we need to iterate through both lists, but the benefit of
   this approach is that we don't have to add yet another BDS list.
   Also, this approach is a bit faster because we don't have to iterate
   through all the BDS left open, but only through the ones which might
   have block jobs running on them.
   However, the way patch 12 is implemented, bdrv_close_all() (which
   isn't performance-critical anyway...) only iterates through the list
   of all BDSs once the references from the BBs and the monitor are
   gone, so the only BDSs left are the ones on which block jobs are
   running, and the ones which are referenced by those (recursively).
   Therefore, we will not have a performance problem here.
C. Have a list of all BDSs. This is mostly ugly because it means having
   yet another list of BDSs, but other than that, it's okay.

So all these alternatives seem pretty equal, with A maybe being the most
efficient one.

Why did I go for C? As Kevin said, we probably want to have a way to
assert that all BDSs are closed at the end of bdrv_close_all(). Besides
adding a counter for the number of BDSs (which would have been as simple
as it is ugly), this is the only way (I see) to do it. So we get that
for free.

But after this series, there are four lists for BDSs: bdrv_states,
graph_bdrv_states, all_bdrv_states, and monitor_bdrv_states. Urgh.

But the follow-up series to this series removes bdrv_states because they
are covered by the list of BlockBackends. Maybe I could have repurposed
that list to be what all_bdrv_states is right now, but I would find that
confusing. So we're down to three.

And finally, there have been plans for quite a while to give node-names
to all BDSs; this would make graph_bdrv_states and all_bdrv_states just
the same. So with that, we'd be down to two.

So, because using option C allows us to make sure there are no BDSs left
open at the end of bdrv_close_all(), because it's just as good as
solution A and B from a technical perspective, and because it is not
that ugly after all, I chose to go for that.

git-backport-diff against v4:

[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/13:[----] [--] 'iotests: Move _filter_nbd into common.filter'
002/13:[----] [--] 'iotests: Make redirecting qemu's stderr optional'
003/13:[----] [--] 'iotests: Add test for eject under NBD server'
004/13:[----] [--] 'quorum: Fix close path'
005/13:[0032] [FC] 'block: Move BDS close notifiers into BB'
006/13:[----] [-C] 'block: Use blk_remove_bs() in blk_delete()'
007/13:[----] [-C] 'blockdev: Use blk_remove_bs() in do_drive_del()'
008/13:[----] [--] 'block: Make bdrv_close() static'
009/13:[down] 'block: Add list of all BlockDriverStates'
010/13:[0003] [FC] 'blockdev: Keep track of monitor-owned BDS'
011/13:[down] 'block: Add blk_remove_all_bs()'
012/13:[down] 'block: Rewrite bdrv_close_all()'
013/13:[----] [-C] 'iotests: Add test for multiple BB on BDS tree'

*** BLURB HERE ***

Max Reitz (13):
  iotests: Move _filter_nbd into common.filter
  iotests: Make redirecting qemu's stderr optional
  iotests: Add test for eject under NBD server
  quorum: Fix close path
  block: Move BDS close notifiers into BB
  block: Use blk_remove_bs() in blk_delete()
  blockdev: Use blk_remove_bs() in do_drive_del()
  block: Make bdrv_close() static
  block: Add list of all BlockDriverStates
  blockdev: Keep track of monitor-owned BDS
  block: Add blk_remove_all_bs()
  block: Rewrite bdrv_close_all()
  iotests: Add test for multiple BB on BDS tree

 block.c                                | 49 ++++++++++++------
 block/block-backend.c                  | 41 ++++++++++++----
 block/quorum.c                         |  3 +-
 blockdev-nbd.c                         | 36 +-------------
 blockdev.c                             | 24 +++++++--
 hw/block/dataplane/virtio-blk.c        | 77 ++++++++++++++++++++++-------
 hw/scsi/virtio-scsi.c                  | 59 ++++++++++++++++++++++
 include/block/block.h                  |  2 -
 include/block/block_int.h              |  8 ++-
 include/hw/virtio/virtio-scsi.h        | 10 ++++
 include/sysemu/block-backend.h         |  4 +-
 nbd.c                                  | 13 +++++
 stubs/Makefile.objs                    |  1 +
 stubs/blockdev-close-all-bdrv-states.c |  5 ++
 tests/qemu-iotests/083                 | 13 +----
 tests/qemu-iotests/083.out             | 10 ----
 tests/qemu-iotests/096                 | 90 ++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/096.out             | 16 ++++++
 tests/qemu-iotests/117                 | 86 ++++++++++++++++++++++++++++++++
 tests/qemu-iotests/117.out             | 14 ++++++
 tests/qemu-iotests/common.filter       | 12 +++++
 tests/qemu-iotests/common.qemu         | 12 ++++-
 tests/qemu-iotests/group               |  2 +
 23 files changed, 474 insertions(+), 113 deletions(-)
 create mode 100644 stubs/blockdev-close-all-bdrv-states.c
 create mode 100755 tests/qemu-iotests/096
 create mode 100644 tests/qemu-iotests/096.out
 create mode 100755 tests/qemu-iotests/117
 create mode 100644 tests/qemu-iotests/117.out


reply via email to

[Prev in Thread] Current Thread [Next in Thread]