qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] iotests: Test commit with iothreads and ongoing I/O


From: Eric Blake
Subject: Re: [PATCH 3/3] iotests: Test commit with iothreads and ongoing I/O
Date: Thu, 18 May 2023 16:00:31 -0500
User-agent: NeoMutt/20230517

On Wed, May 17, 2023 at 05:28:34PM +0200, Kevin Wolf wrote:
> This tests exercises graph locking, draining, and graph modifications
> with AioContext switches a lot. Amongst others, it serves as a
> regression test for bdrv_graph_wrlock() deadlocking because it is called
> with a locked AioContext and for AioContext handling in the NBD server.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

I've now confirmed the following setups with just './check
graph-changes-while-io -qcow2':

patch 3 alone => test fails with wrlock assertion (good, we're
catching the 8.0 regression where the new assertion failure is
tripping)

patch 1 and 3 => test fails differently than patch 3 alone (good,
we're exposing the fact that NBD had a pre-existing bug, regardless of
whether the added rwlock made it easier to spot)

patch 2 and 3 => test passes (good, patch 2 appears to have fixed this
particular bug, and when we are ready to revert patch 1 because we get
rid of AioContext locking we'll still be okay)

patch 1, 2, and 3 => test passes (good, we fixed the NBD bug, and have
regression testing in place for a scenario that previously wasn't
getting good testing)

As such, I'm happy to supply:

Tested-by: Eric Blake <eblake@redhat.com>


Now on to the patch itself...

> ---
>  tests/qemu-iotests/iotests.py                 |  4 ++
>  .../qemu-iotests/tests/graph-changes-while-io | 56 +++++++++++++++++--
>  .../tests/graph-changes-while-io.out          |  4 +-
>  3 files changed, 58 insertions(+), 6 deletions(-)
> 

> @@ -84,6 +84,54 @@ class TestGraphChangesWhileIO(QMPTestCase):
>  
>          bench_thr.join()
>  
> +    def test_commit_while_io(self) -> None:
> +        # Run qemu-img bench in the background
> +        bench_thr = Thread(target=do_qemu_img_bench, args=(200000, ))

TIL - you can create a 1-item tuple in Python.  It caught me
off-guard, but makes sense now that I've re-read it.

> +
> +        # While qemu-img bench is running, repeatedly commit overlay to node0
> +        while bench_thr.is_alive():
> +            result = self.qsd.qmp('block-commit', {
> +                'job-id': 'job0',
> +                'device': 'overlay',
> +            })
> +            self.assert_qmp(result, 'return', {})
> +
> +            result = self.qsd.qmp('block-job-cancel', {
> +                'device': 'job0',
> +            })
> +            self.assert_qmp(result, 'return', {})
> +
> +            cancelled = False
> +            while not cancelled:
> +                for event in self.qsd.get_qmp().get_events(wait=10.0):

The updated test took about 34 seconds on my machine; long enough that
it is rightfully not part of './check -g quick', but still reasonable
that I had no problems reproducing the issue while the test was
running.

> +                    if event['event'] != 'JOB_STATUS_CHANGE':
> +                        continue
> +                    if event['data']['status'] == 'null':
> +                        cancelled = True
> +
> +        bench_thr.join()
> +

It feels a bit odd that the test is skipped during './check -nbd', yet
it IS utilizing nbd, and the fix in patch 2 is indeed in NBD code.
But that's not a flaw in the test itself, just a limitation of what
images we need in order to set up the NBD service in a way to trigger
the problem.

Reviewed-by: Eric Blake <eblake@redhat.com>

I'm in a spot where I can quickly queue this through my NBD tree so we
can get it backported to 8.0.1; pull request coming up, provided the
full series passes a few more tests on my end.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




reply via email to

[Prev in Thread] Current Thread [Next in Thread]