Re: [Qemu-devel] [PATCH 28/34] block: Introduce bs->explicit_options

From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH 28/34] block: Introduce bs->explicit_options
Date: Fri, 15 May 2015 19:47:01 +0200
On 08.05.2015 19:22, Kevin Wolf wrote:
bs->options doesn't only contain options that the user explicitly
requested, but also options that were derived from flags, the filename or
inherited from the parent node.

For reopen, it is important to know the difference because reopening the
parent can change inherited values in child nodes, but it shouldn't
change any options that were explicitly specified for the child.

Signed-off-by: Kevin Wolf <address@hidden>
  block.c                   | 20 ++++++++++++++++++--
  include/block/block.h     |  1 +
  include/block/block_int.h |  1 +
  3 files changed, 20 insertions(+), 2 deletions(-)

This patch is not that easy to review: bs->explicit_options must really not include any derived options, so we have to make sure that no @options QDict given to bdrv_open_inherit() contains any derived options:

1) In bdrv_open_backing_file():
The QDict is generated by extracting the sub-QDict from @parent_options with the prefix "#{bdref_key}.". Maybe good, we'll have to find out. "driver" is put into the QDict if not explicitly overridden and if provided by the BDS format driver as the backing file format: Okay.
So is @parent_options good?

1a) bdrv_open_backing_file() call from bdrv_open_inherit():
@options is a QDict which we assume (our induction hypothesis) to be void of derived options when bdrv_open_inherit() was called. @bdref_key is "backing". Which calls can modify @options on the way to the bdrv_open_backing_file() call?
1a I) child_role->inherit_options(): None so far. But this is very deeply nested, and there is no notice of this whatsoever. It's fine, but it's not good.
1a II) bdrv_fill_options(): If @filename is a JSON filename, anything can happen, but this is okay because these are user-supplied options (or are they not? see note below). "filename" and "driver" can be set, but they do not match /^backing\./, so it's good.
1a III) bdrv_backing_options(): Special case of 1a I.
1a IV) bdrv_open_image(): Only removes entries from @options, so it's good.
1a V) Once again, "driver" may be set; it's fine (see 1a II).
1a VI) bdrv_open_common() only removes entries from @options, too, so it's good as well.
Therefore, no auto-generated options matching /^backing\./ are passed to bdrv_open_backing_file() here.

1b) bdrv_open_backing_file() call from mirror_complete(): @parent_options is NULL, everything's good.

Thus, the bdrv_open_inherit() call from bdrv_open_backing_file() is safe.
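
To make the prefix extraction in point 1 concrete, here is a minimal sketch in plain C. It does not use QEMU's QDict API: OptDict, opt_put(), opt_get() and opt_extract_sub() are hypothetical stand-ins, the option values are made up, and the move-and-strip behaviour is my assumption of what the extraction does. It only illustrates that entries whose key starts with "backing." end up in the child's dictionary with the prefix stripped, while everything else stays with the parent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a flat QDict: a fixed-size list of string pairs. */
#define MAX_OPTS 16

typedef struct {
    const char *key;
    const char *value;
} Opt;

typedef struct {
    Opt entries[MAX_OPTS];
    int count;
} OptDict;

/* Append an entry; the sketch does not handle duplicate keys. */
static void opt_put(OptDict *d, const char *key, const char *value)
{
    assert(d->count < MAX_OPTS);
    d->entries[d->count].key = key;
    d->entries[d->count].value = value;
    d->count++;
}

/* Return the value stored for @key, or NULL if the key is absent. */
static const char *opt_get(const OptDict *d, const char *key)
{
    for (int i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            return d->entries[i].value;
        }
    }
    return NULL;
}

/*
 * Move every entry whose key starts with @prefix from @src into @dst,
 * stripping the prefix from the key. Entries that do not match stay in @src.
 * @src and @dst must be different dictionaries.
 */
static void opt_extract_sub(OptDict *src, OptDict *dst, const char *prefix)
{
    size_t plen = strlen(prefix);
    int kept = 0;

    for (int i = 0; i < src->count; i++) {
        Opt e = src->entries[i];
        if (strncmp(e.key, prefix, plen) == 0) {
            opt_put(dst, e.key + plen, e.value);
        } else {
            src->entries[kept++] = e;
        }
    }
    src->count = kept;
}
```

With a parent dictionary holding "driver", "backing.driver" and "backing.file.filename", extracting the prefix "backing." leaves only "driver" in the parent and gives the child "driver" and "file.filename". This is why only keys with a matching dotted prefix matter for what a child sees.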

2) In bdrv_open_image():
The QDict is generated by extracting the sub-QDict from @options with the prefix "#{bdref_key}.". Whether that's okay again depends on the callers of bdrv_open_image():

2a) bdrv_open_image() call from bdrv_open_inherit():
Same problem as for 1a, @bdref_key is "file". No option set that matches /^file\./.

2b) bdrv_open_image() call from blkdebug_open():
@options is a QDict which is both supplied to blkdebug_open() and may be filled from a config file (the latter of which is definitely okay); all blkdebug-specific options have been removed (none of which match /^image\./ anyway). @bdref_key is "image". The question is whether any option can be autogenerated for blkdebug_open() which matches /^image\./:
2b I) drv->bdrv_file_open() call in bdrv_open_common(): Basically the same problem as for 1a. No option set that matches /^image\./ either, so it's good.

2c) First bdrv_open_image() call from blkverify_open():
@options is a QDict which is supplied to blkverify_open() where all blkverify-specific options (x-raw and x-image) have been removed. @bdref_key is "raw". Essentially the same as 2b, no option set that matches /^raw\./ before the drv->bdrv_file_open() call, so it's good.

2d) Second bdrv_open_image() call from blkverify_open():
Same as 2c with @bdref_key being "test". No option set that matches /^test\./ before the drv->bdrv_file_open() call, so it's good.

2e) bdrv_open_image() call from quorum_open():
@options is a QDict which is supplied to quorum_open() where all quorum-specific options have been removed. @bdref_key is "children.%d". Same as 2b, 2c, 2d: No option set that matches /^children\.\d+\./ before the drv->bdrv_file_open() call in bdrv_open_common(), so it's good.

2f) bdrv_open_image() call from vmdk_parse_extents():
@options is supplied directly to this function; it originally comes directly from vmdk_open(), without any modification. vmdk_open() is called as drv->bdrv_open() in bdrv_open_common(); so it's again essentially the same as 2b to 2e, with @bdref_key being "extents.%d". No option set that matches /^extents\.\d+\./ before said call, so it's good.

That's all. As a recurring pattern, we can see that as long as nothing in bdrv_open_inherit() sets any option with a dot in it, we're good.
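
The recurring pattern can be phrased as a simple check. The following is only a sketch of the invariant, not code from the patch: any_key_targets_child() is a hypothetical helper, and it works on a plain NULL-terminated array of key strings rather than on a QDict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if any key in the NULL-terminated array @keys contains a dot,
 * i.e. could be handed down to a child node by prefix extraction.
 * Keys set automatically inside the open path should never make this true.
 */
static bool any_key_targets_child(const char *const *keys)
{
    assert(keys != NULL);

    for (size_t i = 0; keys[i] != NULL; i++) {
        if (strchr(keys[i], '.') != NULL) {
            return true;
        }
    }
    return false;
}
```

Automatically set keys such as "driver" and "filename" pass this check, whereas something like "file.filename" would not, which is exactly the situation in 3a below.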

3) In bdrv_open():
The QDict is just the one given to bdrv_open().

3a) bdrv_open() call from bdrv_append_temp_snapshot():
"file.driver" and "file.filename" are set, and these are the only options in the whole QDict. Well... I'd argue that these options are derived from the flags (BDRV_O_SNAPSHOT, to be exact), but I guess I can turn a blind eye to this case.

3b) bdrv_open() call from blk_new_open():
@options is just the one given to blk_new_open().
3b I) blk_new_open() call from blockdev_init(): Some options are removed from @bs_opts, "driver" is set if "format" is set (and "driver" isn't), okay. No further modifications, good. (@bs_opts in turn comes from drive_new() and blockdev_add(), therefore those really are options coming from the user.)
3b II) blk_new_open() call from blk_connect() (Xen): @options is empty or will contain "driver" if that's enforced by Xen. Basically as correct as setting "driver" based on the backing file format for backing files, so it's good.
3b III) blk_new_open() call from img_open() (qemu-img): @options is empty or will contain the driver set by the user. Good.
3b IV) First blk_new_open() call from img_rebase() (qemu-img): Same as above. Good.
3b V) Second blk_new_open() call from img_rebase() (qemu-img): Same as above. Good.
3b VI) blk_new_open() call from openfile() (qemu-io): @opts is generated from user-supplied options. Good.
3b VII) blk_new_open() call from main() (qemu-nbd): Same as in qemu-img. Good.

3c, 3d, 3e, 3f, 3g, 3h, 3i, 3j, 3k, 3l, 3m, 3n, 3o, 3p, 3q) bdrv_open() call from qcow_create(), qcow2_create2(), qcow2_create2(), qed_create(), sd_prealloc(), sd_create(), vdi_create(), vhdx_create(), vmdk_create_extent(), vmdk_create(), vpc_create(), enable_write_target() (vvfat), bdrv_image_create(), qmp_bdrv_open_encrypted(), and qmp_drive_backup():
@options is NULL. Good.

3r, 3s) bdrv_open() call from external_snapshot_prepare() and qmp_drive_mirror():
@options only contains a node-name, and that one is user-supplied. Good.

In total, 3a looks a bit fishy, but I guess it's alright.

Concluding, I can say that setting bs->explicit_options at that point will not result in automatically derived options being included there (except for 3a). A problem I do see is that, as can be seen above, deriving this conclusion is not trivial, and neither is keeping it true. We have to make sure that bdrv_open_inherit() will never set any option in @options which contains a dot, and neither may any of the functions it calls (do we need appropriate documentation for child_role->inherit_options()?).

As noted above in point 1a II, @filename may be a JSON filename in bdrv_open_inherit(). I think these would be user-supplied options, so they should be put into bs->explicit_options, too. If they are not, 1a II is invalid and we have to make sure that none of the options supplied there can end up in any bs->explicit_options of any child BDS.

Also note that above I did not check whether bs->explicit_options will contain all user-specified options. I only made sure that it doesn't contain automatically derived options. But as long as blkdebug doesn't absorb options like "image.filename", it should be fine (no options prefixed with a bdref_key matching a BDS child role we are still intending to open may be removed, but I don't think that's the case, ever).

Oh, and also we have to make sure that setting reopen_state->bs->explicit_options does not result in derived options being set. It's generated from @options given to bdrv_reopen_queue_child(), joined with bs->explicit_options (induction hypothesis: bs->explicit_options is good). Is @options good, too? So far yes, because it's always empty (except for qemu-io, where it comes directly from the user).
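
For reference, this is how I read the merge order on reopen, written as a self-contained sketch. It is not the patch's code: OptDict and the opt_*() and reopen_*() helpers are hypothetical stand-ins for QDict and the join step, and the keys used with it ("opt-a" and so on) are placeholders. It assumes a join that never overwrites an entry already present in the destination, matching the "don't overwrite" precedence comment in the hunk below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPTS 16

typedef struct {
    const char *key;
    const char *value;
} Opt;

/* Hypothetical stand-in for a flat QDict. */
typedef struct {
    Opt entries[MAX_OPTS];
    int count;
} OptDict;

static const char *opt_get(const OptDict *d, const char *key)
{
    for (int i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            return d->entries[i].value;
        }
    }
    return NULL;
}

/* Set @key to @value, replacing an existing entry with the same key. */
static void opt_put(OptDict *d, const char *key, const char *value)
{
    for (int i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            d->entries[i].value = value;
            return;
        }
    }
    assert(d->count < MAX_OPTS);
    d->entries[d->count].key = key;
    d->entries[d->count].value = value;
    d->count++;
}

/* Copy entries from @src into @dst, keeping @dst's value on conflicts. */
static void opt_join_keep(OptDict *dst, const OptDict *src)
{
    for (int i = 0; i < src->count; i++) {
        if (opt_get(dst, src->entries[i].key) == NULL) {
            opt_put(dst, src->entries[i].key, src->entries[i].value);
        }
    }
}

/* Explicit options after reopen: passed-in options win, old explicit fill gaps. */
static OptDict reopen_explicit_options(const OptDict *passed,
                                       const OptDict *old_explicit)
{
    OptDict result = *passed;
    opt_join_keep(&result, old_explicit);
    return result;
}

/* Effective options: explicit, then inherited, then old effective values. */
static OptDict reopen_effective_options(const OptDict *explicit_opts,
                                        const OptDict *inherited,
                                        const OptDict *old_effective)
{
    OptDict result = *explicit_opts;
    opt_join_keep(&result, inherited);
    opt_join_keep(&result, old_effective);
    return result;
}
```

The point of keeping the explicit dictionary separate is visible in the second function: an inherited value can only fill a key that is absent from the explicit set, so a derived option that sneaks into the explicit set would wrongly block inheritance from then on.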

diff --git a/block.c b/block.c
index 9259b42..d76e385 100644
--- a/block.c
+++ b/block.c
@@ -1408,6 +1408,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
      if (options == NULL) {
          options = qdict_new();
+    bs->explicit_options = qdict_clone_shallow(options);
if (child_role) {
          bs->inherits_from = parent;
@@ -1559,6 +1560,7 @@ fail:
      if (file != NULL) {
+    QDECREF(bs->explicit_options);
      bs->options = NULL;
@@ -1634,7 +1636,7 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
BlockReopenQueueEntry *bs_entry;
      BdrvChild *child;
-    QDict *old_options;
+    QDict *old_options, *explicit_options;
if (bs_queue == NULL) {
          bs_queue = g_new0(BlockReopenQueue, 1);
@@ -1649,11 +1651,18 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
       * Precedence of options:
       * 1. Explicitly passed in options (highest)
       * 2. TODO Set in flags (only for top level)
-     * 3. TODO Retained from explicitly set options of bs
+     * 3. Retained from explicitly set options of bs
       * 4. Inherited from parent node
       * 5. Retained from effective options of bs
+    /* Old explicitly set values (don't overwrite by inherited value) */
+    old_options = qdict_clone_shallow(bs->explicit_options);
+    qdict_join(options, old_options, false);
+    QDECREF(old_options);
+    explicit_options = qdict_clone_shallow(options);
      /* Inherit from parent node */
      if (parent_options) {
@@ -1692,6 +1701,7 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
bs_entry->state.bs = bs;
      bs_entry->state.options = options;
+    bs_entry->state.explicit_options = explicit_options;
      bs_entry->state.flags = flags;
return bs_queue;
@@ -1886,6 +1896,9 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
/* set BDS specific flags now */
+    QDECREF(reopen_state->bs->explicit_options);
+    reopen_state->bs->explicit_options   = reopen_state->explicit_options;
      reopen_state->bs->open_flags         = reopen_state->flags;
      reopen_state->bs->enable_write_cache = !!(reopen_state->flags &
@@ -1909,6 +1922,8 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
      if (drv->bdrv_reopen_abort) {
+    QDECREF(reopen_state->explicit_options);

I think this must be done in bdrv_reopen_multiple(). Otherwise, reopen_state->explicit_options is leaked for the one BDS where bdrv_reopen_prepare() failed.
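
A small model of the failure path may make the concern clearer. This is a sketch under assumptions, not QEMU code: ReopenEntry, options_new(), options_unref() and the two-entry queue are hypothetical, and a heap byte stands in for a QDict reference. It assumes, as I read bdrv_reopen_multiple(), that the abort hook only runs for entries whose prepare step succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one reopen queue entry. */
typedef struct {
    void *explicit_options;  /* stands in for a QDict reference */
    bool prepared;
    bool fail_prepare;       /* knob to simulate a failing prepare step */
} ReopenEntry;

static int live_refs;        /* option objects currently allocated */

static void *options_new(void)
{
    void *p = malloc(1);
    assert(p != NULL);
    live_refs++;
    return p;
}

static void options_unref(void **p)
{
    if (*p != NULL) {
        free(*p);
        *p = NULL;
        live_refs--;
    }
}

/*
 * Prepare all entries; on failure, clean up. With @free_in_cleanup false,
 * only the abort step (prepared entries) drops explicit_options. With it
 * true, the cleanup loop drops it for every entry. The success path, where
 * commit would take ownership, is not modelled.
 */
static int reopen_multiple(ReopenEntry *entries, int n, bool free_in_cleanup)
{
    int ret = 0;

    for (int i = 0; i < n; i++) {
        if (entries[i].fail_prepare) {
            ret = -1;
            break;
        }
        entries[i].prepared = true;
    }

    for (int i = 0; i < n; i++) {
        if (ret < 0 && !free_in_cleanup && entries[i].prepared) {
            options_unref(&entries[i].explicit_options);  /* abort step */
        }
        if (ret < 0 && free_in_cleanup) {
            options_unref(&entries[i].explicit_options);  /* cleanup loop */
        }
    }
    return ret;
}

/* Simulate a two-entry queue whose second prepare fails; return leaked refs. */
static int leaked_refs_after_failure(bool free_in_cleanup)
{
    int before = live_refs;
    ReopenEntry q[2] = {
        { options_new(), false, false },
        { options_new(), false, true },
    };
    int ret = reopen_multiple(q, 2, free_in_cleanup);
    int leaked = live_refs - before;

    assert(ret < 0);
    /* Release whatever is left so the simulation itself does not leak. */
    options_unref(&q[0].explicit_options);
    options_unref(&q[1].explicit_options);
    return leaked;
}
```

In this model, leaked_refs_after_failure(false) reports one outstanding reference (the entry whose prepare failed), while leaked_refs_after_failure(true) reports none.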

@@ -1952,6 +1967,7 @@ void bdrv_close(BlockDriverState *bs)
          bs->sg = 0;
          bs->zero_beyond_eof = false;
+        QDECREF(bs->explicit_options);
          bs->options = NULL;
          bs->full_open_options = NULL;
diff --git a/include/block/block.h b/include/block/block.h
index 1287013..08bd0fe 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -147,6 +147,7 @@ typedef struct BDRVReopenState {
      BlockDriverState *bs;
      int flags;
      QDict *options;
+    QDict *explicit_options;
      void *opaque;
  } BDRVReopenState;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1cae8d4..a2e96bb 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -443,6 +443,7 @@ struct BlockDriverState {
      QLIST_HEAD(, BdrvChild) children;
QDict *options;
+    QDict *explicit_options;
      BlockdevDetectZeroesOptions detect_zeroes;
/* The error object in use for blocking operations on backing_hd */

What I'd like to have for a R-b: No leak of reopen_state->explicit_options, and an answer to the question whether options coming from a JSON filename should be part of bs->explicit_options (right now, they are for all child BDSs, but not for the top BDS, because bdrv_fill_options() is called after bs->explicit_options is set).
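
To illustrate the ordering question only (not to answer it), here is a sketch with a hypothetical KeySet type. fill_from_json_filename() is a made-up stand-in for the JSON filename expansion, and the keys it adds are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_KEYS 8

/* Hypothetical stand-in for the set of keys present in an options QDict. */
typedef struct {
    const char *keys[MAX_KEYS];
    int count;
} KeySet;

static void keyset_add(KeySet *s, const char *key)
{
    assert(s->count < MAX_KEYS);
    s->keys[s->count++] = key;
}

static bool keyset_has(const KeySet *s, const char *key)
{
    for (int i = 0; i < s->count; i++) {
        if (strcmp(s->keys[i], key) == 0) {
            return true;
        }
    }
    return false;
}

/* Made-up stand-in for expanding a JSON filename into options. */
static void fill_from_json_filename(KeySet *options)
{
    keyset_add(options, "driver");
    keyset_add(options, "file.filename");
}

/*
 * Open the top node and return its explicit options, snapshotting them
 * either before the JSON expansion (the order described above) or after.
 */
static KeySet open_top_node(bool snapshot_after_fill)
{
    KeySet options = { .count = 0 };
    KeySet explicit_options = { .count = 0 };

    keyset_add(&options, "node-name");  /* passed directly by the user */
    if (!snapshot_after_fill) {
        explicit_options = options;
    }
    fill_from_json_filename(&options);
    if (snapshot_after_fill) {
        explicit_options = options;
    }
    return explicit_options;
}
```

With the snapshot taken first, the top node's explicit set holds only "node-name", although "file.filename" is still in the options that are later handed down to the child. Which of the two orders is the intended one is the question I'd like answered.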


