qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 3/3] block/qcow2-bitmap: rewrite bitmap reopenin


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-block] [PATCH 3/3] block/qcow2-bitmap: rewrite bitmap reopening logic
Date: Wed, 29 May 2019 15:58:59 +0000

29.05.2019 18:33, Max Reitz wrote:
> On 23.05.19 17:47, Vladimir Sementsov-Ogievskiy wrote:
>> Current logic
>> =============
>>
>> Reopen rw -> ro:
>>
>> Store bitmaps and release BdrvDirtyBitmap's.
>>
>> Reopen ro -> rw:
>>
>> Load bitmap list
>> Skip bitmaps which for which we don't have BdrvDirtyBitmap [this is
>>     the worst thing]
>> A kind of fail with error message if we see not readonly bitmap
>>
>> This leads to at least the following bug:
>>
>> Assume we have bitmap0.
>> Create external snapshot
>>    bitmap0 is stored and therefore removed
>> Commit snapshot
>>    now we have no bitmaps
>> Do some writes from guest (*)
>>    they are not marked in bitmap
>> Shutdown
>> Start
>>    bitmap0 is loaded as valid, but it is actually broken! It misses
>>    writes (*)
>> Incremental backup
>>    it will be inconsistent
>>
>> New logic
>> =========
>>
>> Reopen rw -> ro:
>>
>> Store bitmaps and don't remove them from RAM, only mark them readonly
>> (the latter is already done, but didn't work properly because of
>> removing in storing function)
>>
>> Reopen to rw (don't filter case with reopen rw -> rw, it is supported
>> now in qcow2_reopen_bitmaps_rw)
>>
>> Load bitmap list
>>
>> Carefully consider all possible combinations of bitmaps in RAM and in
>> the image, try to fix corruptions, print corresponding error messages.
>>
>> Also, call qcow2_reopen_bitmaps_rw after the whole reopen queue
>> commited, as otherwise we can't write to the qcow2 file child on reopen
>> ro -> rw.
>>
>> Also, add corresponding test-cases to 255.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>> ---
>>   block/qcow2.h              |   5 +-
>>   include/block/block_int.h  |   2 +-
>>   block.c                    |  29 ++----
>>   block/qcow2-bitmap.c       | 197 ++++++++++++++++++++++++++-----------
>>   block/qcow2.c              |   2 +-
>>   tests/qemu-iotests/255     |   2 +
>>   tests/qemu-iotests/255.out |  35 +++++++
>>   7 files changed, 193 insertions(+), 79 deletions(-)
> 
> [...]
> 
>> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
>> index 2b84bfa007..4e565db11f 100644
>> --- a/block/qcow2-bitmap.c
>> +++ b/block/qcow2-bitmap.c
> 
> [...]
> 
>> @@ -1103,76 +1110,150 @@ Qcow2BitmapInfoList 
>> *qcow2_get_bitmap_info_list(BlockDriverState *bs,
>>       return list;
>>   }
>>   
>> -int qcow2_reopen_bitmaps_rw_hint(BlockDriverState *bs, bool *header_updated,
>> -                                 Error **errp)
>> +/*
>> + * qcow2_reopen_bitmaps_rw
>> + *
>> + * The function is called in bdrv_reopen_multiple after all calls to
>> + * bdrv_reopen_commit, when final bs is writable. It is done so, as we need
>> + * write access to child bs, and with current reopening architecture, when
>> + * reopen ro -> rw it is possible only after all reopen commits.
>> + *
>> + * So, we can't fail here.
> 
> Why?  Because the current design forbids it?
> 
> Then the current design seems flawed to me.
> 
> [CC-ing Berto]
> 
> Without having looked too close into this, it seems from your commit
> message that it is possible to run into unrecoverable fatal errors here.
>   We cannot just ignore that on the basis that the current design cannot
> deal with that.
> 
> It just appears wrong to me to update any flags in the image in the
> .commit() part of a transaction.  We should try to do so in .prepare().
>   If it works, great, we’re set for .commit().  If it doesn’t, welp, time
> for .abort() and pretend the image is still read-only (even though it
> kind of may be half-prepared for writing).
> 
> If we fail to set IN_USE in .commit(), that’s a fatal error in my opinion.
> 
> After some chatting with John, I’ve come to the following belief:
> 
> When switching a node from RO to R/W, it must be able to write to its
> children in .prepare() -- precisely because performing this switch may
> need some metadata change.  If we cannot do this change, then we cannot
> switch the node to R/W.  So it’s clear to me that this needs to be done
> in .prepare().
> 
> So I think a node’s children must be R/W before we can .prepare() the
> node.  That means we need to .commit() them.  That means we cannot have
> a single transaction that switches a whole tree to be R/W, but it must
> consist of one transaction per level.
> 
> If something fails, we need to roll back all the previous layers.  Well,
> it depends.
> 
> If we switch to R/W (and something fails), then we need to try to revert
> the children we have already made R/W to be RO.  If that doesn’t work,
> welp, too bad, but not fatal.

And, than, why not do full reopen rw in .prepare, and just organize
recursion so that children reopened rw before parent? Than we again have
one transaction for the tree, but abort may fail to rollback it in worst case.
But we cant avoid it anyway, with one transaction or with transactions per 
level..

So, just move all code from .commit to .prepare for reopen-rw and try to 
roll-back
it in .abort?

Or do you have an idea for a patch?

> 
> Switching to RO goes the other way around (parents to children), so if
> we encounter an error there, we cannot make that node’s children RO,
> too.  We could try to revert the whole change and make the whole tree
> R/W again, or we just don’t.  I think “just don’t” is reasonable.
> 
> Max
> 
>>      On the other hand, we may face different kinds of
>> + * corruptions and/or just can't write IN_USE flags to the image due to EIO.
>> + *
>> + * Try to handle as many cases as we can.
>> + */
>> +void qcow2_reopen_bitmaps_rw(BlockDriverState *bs)
> 


-- 
Best regards,
Vladimir

reply via email to

[Prev in Thread] Current Thread [Next in Thread]