qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2(RFC) 0/3] qcow2: fix parallel rewrite and discard


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v2(RFC) 0/3] qcow2: fix parallel rewrite and discard
Date: Thu, 18 Mar 2021 18:37:42 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

12.03.2021 18:24, Kevin Wolf wrote:
Am 25.02.2021 um 12:52 hat Vladimir Sementsov-Ogievskiy geschrieben:
Hi all! It occurs that nothing prevents discarding and reallocating host
cluster during data writing. This way data writing will pollute another
newly allocated cluster of data or metadata.

OK, v2 is a try to solve the problem with CoRwlock.. And it is marked
RFC, because of a lot of iotest failures.. Some of problems with v2:

1. It's a more complicated to make a test, as everything is blocking
and I can't just break write and do discard.. I have to implement
aio_discard in qemu-io and rewrite test into several portions of io
commands splitted by "sleep 1".. OK, it's not a big problem, and I've
solved it.

Right, this just demonstrates that it's doing what it's supposed to.

2. iotest 7 fails with several leaked clusters. Seems, that it depend on
the fact that discard may be done in parallel with writes. Iotest 7 does
snapshots, so I think l1 table is updated to the moment when discard is
finally unlocked.. But I didn't dig into it, it's all my assumptions.

This one looks a bit odd, but it may be related to the bug in your code
that you forgot that qcow2_cluster_discard() is not a coroutine_fn.
Later tests fail during the unlock:

qemu-img: ../util/qemu-coroutine-lock.c:358: qemu_co_rwlock_unlock: Assertion 
`qemu_in_coroutine()' failed.

#0  0x00007ff33f7d89d5 in raise () from /lib64/libc.so.6
#1  0x00007ff33f7c18a4 in abort () from /lib64/libc.so.6
#2  0x00007ff33f7c1789 in __assert_fail_base.cold () from /lib64/libc.so.6
#3  0x00007ff33f7d1026 in __assert_fail () from /lib64/libc.so.6
#4  0x0000556f4ffd3c94 in qemu_co_rwlock_unlock (lock=0x556f51f63ca0) at 
../util/qemu-coroutine-lock.c:358
#5  0x0000556f4fef5e09 in qcow2_cluster_discard (bs=0x556f51f56a80, 
offset=37748736, bytes=0, type=QCOW2_DISCARD_NEVER, full_discard=false) at 
../block/qcow2-cluster.c:1970
#6  0x0000556f4ff46c23 in qcow2_snapshot_create (bs=0x556f51f56a80, 
sn_info=0x7fff89fb9a30) at ../block/qcow2-snapshot.c:736
#7  0x0000556f4ff0d7b6 in bdrv_snapshot_create (bs=0x556f51f56a80, 
sn_info=0x7fff89fb9a30) at ../block/snapshot.c:227
#8  0x0000556f4fe85526 in img_snapshot (argc=4, argv=0x7fff89fb9d30) at 
../qemu-img.c:3337
#9  0x0000556f4fe8a227 in main (argc=4, argv=0x7fff89fb9d30) at 
../qemu-img.c:5375

3. iotest 13 (and I think a lot more iotests) crashes on
assert(!to->locks_held); .. So with this assertion we can't keep rwlock
locked during data writing...

   #3  in __assert_fail () from /lib64/libc.so.6
   #4  in qemu_aio_coroutine_enter (ctx=0x55762120b700, co=0x55762121d700)
       at ../util/qemu-coroutine.c:158
   #5  in aio_co_enter (ctx=0x55762120b700, co=0x55762121d700) at 
../util/async.c:628
   #6  in aio_co_wake (co=0x55762121d700) at ../util/async.c:612
   #7  in thread_pool_co_cb (opaque=0x7f17950daab0, ret=0) at 
../util/thread-pool.c:279
   #8  in thread_pool_completion_bh (opaque=0x5576211e5070) at 
../util/thread-pool.c:188
   #9  in aio_bh_call (bh=0x557621205df0) at ../util/async.c:136
   #10 in aio_bh_poll (ctx=0x55762120b700) at ../util/async.c:164
   #11 in aio_poll (ctx=0x55762120b700, blocking=true) at 
../util/aio-posix.c:659
   #12 in blk_prw (blk=0x557621205790, offset=4303351808,
       buf=0x55762123e000 '\364' <repeats 199 times>, <incomplete sequence 
\364>..., bytes=12288,
       co_entry=0x557620d9dc97 <blk_write_entry>, flags=0) at 
../block/block-backend.c:1335
   #13 in blk_pwrite (blk=0x557621205790, offset=4303351808, buf=0x55762123e000,
       count=12288, flags=0) at ../block/block-backend.c:1501

This is another bug in your code: A taken lock belongs to its coroutine.
You can't lock in one coroutine and unlock in another.

The changes you made to the write code seem unnecessarily complicated
anyway: Why not just qemu_co_rwlock_rdlock() right at the start of
qcow2_co_pwritev_part() and unlock at its end, without taking and
dropping the lock repeatedly? It makes both the locking more obviously
correct and also fixes the bug (013 passes with this change).

So now I think that v1 is simpler.. It's more complicated (but not too
much) in code. But it keeps discards and data writes non-blocking each
other and avoids yields in critical sections.

I think an approach with additional data structures is almost certainly
more complex and harder to maintain (and as the review from Max showed,
also to understand). I wouldn't give up yet on the CoRwlock based
approach, it's almost trivial code in comparison.

True, making qcow2_cluster_discard() a coroutine_fn requires a
preparational patch that is less trivial, but at least this can be seen
as something that we would want to do sooner or later anyway.


Actually, recount of cluster may become not only because of guest discard. So 
correct thing is rwlock in update_refcount(), when we want to decrease refcount 
from 1 to 0.. And for this update_refcount() should be moved to coroutine, and 
better the whole qcow2 driver..


--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]