From: Kevin Wolf
Subject: Re: [PATCH v7 00/31] block layer: split block APIs in global state and I/O
Date: Tue, 1 Mar 2022 17:08:10 +0100

On 11.02.2022 at 15:51, Emanuele Giuseppe Esposito wrote:
> Currently, block layer APIs like block.h contain a mix of
> functions that either run in the main loop under the BQL, or are
> thread-safe functions that run in iothreads performing I/O.
> The functions running under the BQL also take care of modifying the
> block graph, using drain and/or aio_context_acquire/release.
> This makes it very confusing to understand where each function
> runs, and what assumptions it provides with regard to thread
> safety.
> 
> We call the functions running under BQL "global state (GS) API", and
> distinguish them from the thread-safe "I/O API".
> 
> The aim of this series is to split the relevant block headers in
> global state and I/O sub-headers. The division will be done in
> this way:
> header.h will be split into header-global-state.h, header-io.h and
> header-common.h. The latter will just contain the data structures
> needed by header-global-state and header-io, and common helpers
> that are neither in GS nor in I/O. header.h will remain for legacy
> reasons and to avoid changing all includes in all QEMU C files,
> but will only include the two new headers. No function shall be
> added to header.h.
> Once we split all relevant headers, it will be much easier to see what
> uses the AioContext lock and remove it, which is the overall main
> goal of this and other series that I posted/will post.
> 
> In addition to splitting the relevant headers shown in this series,
> it is also very helpful to split the function pointers in some
> block structures, to understand what runs under the AioContext lock
> and what doesn't. This is what patches 21-27 do.
> 
> Each function in the GS API will have an assertion checking
> that it always runs under the BQL.
> I/O functions are instead thread-safe (or should be), meaning
> that they *can* run under the BQL, but also in an iothread in another
> AioContext. Therefore they do not carry any assertion, and
> need to be audited manually to verify their correctness.
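
To illustrate the pattern described above, here is a rough Python model. This is not QEMU's actual C code (the real assertions are C macros around qemu_in_main_thread()), and the names bdrv_example_gs_function()/blk_example_io_function() are invented for illustration:

```python
import threading

def qemu_in_main_thread():
    """Stand-in for QEMU's qemu_in_main_thread(): true only on the main thread."""
    return threading.current_thread() is threading.main_thread()

def global_state_code():
    # Models the assertion placed at the top of every GS API function.
    assert qemu_in_main_thread(), "GS function called outside the main loop"

def bdrv_example_gs_function():
    # A GS API function asserts up front that it runs in the main loop
    # (i.e. under the BQL).
    global_state_code()
    return "graph modified"

def blk_example_io_function():
    # An I/O API function carries no such assertion: it may run in any
    # AioContext and has to be audited for thread safety instead.
    return "I/O done"
```

Called from the main thread, the GS function succeeds; called from a worker thread, the assertion fires — which is exactly the kind of miscategorised call the series' assertions are meant to catch.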
> 
> Adding assertions has already helped find 2 bugs, as shown in
> my series "Migration: fix missing iothread locking".
> 
> Tested this series by running unit tests, qemu-iotests and qtests
> (x86_64).
> Some functions in the GS API are used everywhere but not
> properly tested. Therefore their assertion is never actually run in
> the tests, so despite my very careful auditing, I cannot exclude
> that some will trigger while actually using QEMU.
> 
> Patch 1 introduces qemu_in_main_thread(), the function used in
> all assertions. It had to be introduced first, because otherwise all
> unit tests would fail: they run in the main loop, but use the code in
> stubs/iothread.c.
> Patches 2-27 (with the exception of patches 9-10, which add an
> additional assertion) are all structured in the same way: first we
> split the header, and in the next (or the same, if small) patch we
> add assertions.
> Patches 28-31 instead take care of the block layer permission API,
> fixing some bugs where it is used in I/O functions.
> 
> This series depends on my previous series "block layer: permission API
> refactoring in preparation to the API split".
> 
> Based-on: <20220209105452.1694545-1-eesposit@redhat.com>
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Verifying the correctness of all the categorisations looked hard, so
instead I'll give you something to review now. :-)

The best way I could think of is to hack up a small script that checks
the consistency of all the macro annotations, i.e. no direct caller of
IO_CODE() may indirectly call GLOBAL_STATE_CODE() etc.

I'll attach that script so you can verify if the approach makes sense to
you and if my code is correct.
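
The core idea can be sketched in a few lines of Python. This is a simplified model only, not the attached check_calls.py: the call graph and annotation markers below are hard-coded from one chain in the error list, whereas a real checker has to extract them from the sources:

```python
# Map each function to the functions (or annotation markers) it calls
# directly. The entries mirror the bdrv_get_full_backing_filename() chain
# from the error list; blk_example_io_helper is an invented clean example.
CALL_GRAPH = {
    "bdrv_get_full_backing_filename": ["IO_CODE", "bdrv_make_absolute_filename"],
    "bdrv_make_absolute_filename": ["bdrv_dirname"],
    "bdrv_dirname": ["GLOBAL_STATE_CODE"],
    "blk_example_io_helper": ["IO_CODE"],
}

def reachable(func, seen=None):
    """Return everything (indirectly) reachable from func in the call graph."""
    seen = set() if seen is None else seen
    for callee in CALL_GRAPH.get(func, []):
        if callee not in seen:
            seen.add(callee)
            reachable(callee, seen)
    return seen

def check():
    """Flag direct callers of IO_CODE() that can reach GLOBAL_STATE_CODE()."""
    errors = []
    for func, callees in CALL_GRAPH.items():
        if "IO_CODE" in callees and "GLOBAL_STATE_CODE" in reachable(func):
            errors.append(func)
    return errors
```

Here check() reports only bdrv_get_full_backing_filename, matching the first error in the list.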

The good news is that the resulting list of errors is relatively small,
but it's not entirely empty. A good number of them look like false
positives (probably everything going through bdrv_co_drain_bh_cb()), but
there seem to be a few real ones, too:

Error: bdrv_get_full_backing_filename() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_get_full_backing_filename() -> bdrv_make_absolute_filename() -> bdrv_dirname() -> GLOBAL_STATE_CODE()

Error: blk_error_action() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       blk_error_action() -> send_qmp_error_event() -> blk_iostatus_is_enabled() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_end_no_poll() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drained_end_no_poll() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: nbd_co_do_establish_connection() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       nbd_co_do_establish_connection() -> nbd_handle_updated_info() -> bdrv_apply_auto_read_only() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_end_no_poll() is IO_CODE(), but calls IO_OR_GS_CODE() code
       bdrv_drained_end_no_poll() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin_quiesce() -> IO_OR_GS_CODE()

Error: blk_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       blk_drain() -> bdrv_ref() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_begin() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_subtree_drained_begin() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_subtree_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_end() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drained_end() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_subtree_drained_end() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_subtree_drained_end() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_apply_subtree_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_apply_subtree_drain() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_unapply_subtree_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_unapply_subtree_drain() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_co_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_co_drain() -> bdrv_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drain() -> bdrv_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Do you want to address some of these cases before we merge the series? I
expect that usually we just miscategorised the caller, so the assertion
wouldn't actually fail at runtime, but I haven't checked in detail yet.

Kevin

Attachment: check_calls.py

