|
From: | Emanuele Giuseppe Esposito |
Subject: | Re: [PATCH v4 03/25] assertions for block global state API |
Date: | Mon, 15 Nov 2021 13:27:46 +0100 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 |
@@ -5958,6 +6043,7 @@ const char *bdrv_get_parent_name(const BlockDriverState *bs)/* TODO check what callers really want: bs->node_name or blk_name() */ const char *bdrv_get_device_name(const BlockDriverState *bs) { + assert(qemu_in_main_thread()); return bdrv_get_parent_name(bs) ?: ""; }This function is invoked from qcow2_signal_corruption(), which comes generally from an I/O path. Is it safe to assert that we’re in the main thread here?Well, the question is probably rather whether this needs really be a considered a global-state function, or whether putting it in common or I/O is fine. I believe you’re right given that it invokes bdrv_get_parent_name(), it cannot be thread-safe, but then we’ll have to change qcow2_signal_corruption() so it doesn’t invoke this function.
I think that the assertion is wrong here. As long as a single caller is not under BQL, we cannot keep the function in global-state headers. It should go into block-io.h, together with bdrv_get_parent_name and bdrv_get_device_or_node_name (caller of bdrv_get_parent_name).
Since bdrv_get_parent_name only scans through the list of bs->parents, as long as we can assert that the parent list is modified only under BQL + drain, it is safe to be read and can be I/O.
[...]diff --git a/block/io.c b/block/io.c index bb0a254def..c5d7f8495e 100644 --- a/block/io.c +++ b/block/io.c[...]@@ -544,6 +546,7 @@ void bdrv_drained_end(BlockDriverState *bs)void bdrv_drained_end_no_poll(BlockDriverState *bs, int *drained_end_counter){ + assert(qemu_in_main_thread()); bdrv_do_drained_end(bs, false, NULL, false, drained_end_counter); }Why is bdrv_drained_end an I/O function and this is a GS function, even though it does just a subset?
Right this is clearly called in bdrv_drained_end. I don't know how is it possible that no test managed to trigger this assertion... This is actually invoked in both unit and qemu-iothread tests...
@@ -586,12 +589,14 @@ void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent)void coroutine_fn bdrv_co_drain(BlockDriverState *bs) { assert(qemu_in_coroutine()); + assert(qemu_in_main_thread()); bdrv_drained_begin(bs); bdrv_drained_end(bs); } void bdrv_drain(BlockDriverState *bs) { + assert(qemu_in_main_thread()); bdrv_drained_begin(bs); bdrv_drained_end(bs); }Why are these GS functions when both bdrv_drained_begin() and bdrv_drained_end() are I/O functions?I can understand making the drain_all functions GS functions, but it seems weird to say it’s an I/O function when a single BDS is drained via bdrv_drained_begin() and bdrv_drained_end(), but not via bdrv_drain(), which just does both.(I can see that there are no I/O path callers, but I still find it strange.)
The problem as you saw is on the path callers: bdrv_drain seems to be called only in main thread, while others or similar drains are also coming from I/O. I would say maybe it's a better idea to just put everything I/O, maybe (probably) there are cases not caught by the tests.
The only ones in global-state will be:- bdrv_apply_subtree_drain and bdrv_unapply_subtree_drain (called only in .attach and .detach callbacks of BdrvChildClass, run under BQL).
- bdrv_drain_all_end_quiesce (called only by bdrv_close thus under BQL).- bdrv_drain_all_begin, bdrv_drain_all_end and bdrv_drain_all (have already BQL assertions)
In general, this is the underlying problem with these assertions: if a test doesn't test a specific code path, an unneeded assertion might pass undetected.
[...]@@ -2731,6 +2742,7 @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base, int bdrv_block_status(BlockDriverState *bs, int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file){ + assert(qemu_in_main_thread()); return bdrv_block_status_above(bs, bdrv_filter_or_cow_bs(bs), offset, bytes, pnum, map, file); }Why is this a GS function as opposed to all other block-status functions? Because of the bdrv_filter_or_cow_bs() call?
This function is in block.io but has the assertion. I think it's a proper I/O but I forgot to remove the assertion in my various trial&errors.
Let me know if you disagree on anything of what I said. Thank you, Emanuele
[Prev in Thread] | Current Thread | [Next in Thread] |