qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 10/22] quorum: Implement .bdrv_recurse_can_replace()


From: Max Reitz
Subject: Re: [PATCH 10/22] quorum: Implement .bdrv_recurse_can_replace()
Date: Thu, 26 Sep 2019 13:17:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0

On 25.09.19 16:12, Vladimir Sementsov-Ogievskiy wrote:
> 20.09.2019 18:27, Max Reitz wrote:
>> Signed-off-by: Max Reitz <address@hidden>
>> ---
>>   block/quorum.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 62 insertions(+)
>>
>> diff --git a/block/quorum.c b/block/quorum.c
>> index 207054a64e..81b57dbae2 100644
>> --- a/block/quorum.c
>> +++ b/block/quorum.c
>> @@ -825,6 +825,67 @@ static bool 
>> quorum_recurse_is_first_non_filter(BlockDriverState *bs,
>>       return false;
>>   }
>>   
>> +static bool quorum_recurse_can_replace(BlockDriverState *bs,
>> +                                       BlockDriverState *to_replace)
>> +{
>> +    BDRVQuorumState *s = bs->opaque;
>> +    int i;
>> +
>> +    for (i = 0; i < s->num_children; i++) {
>> +        /*
>> +         * We have no idea whether our children show the same data as
>> +         * this node (@bs).  It is actually highly likely that
>> +         * @to_replace does not, because replacing a broken child is
>> +         * one of the main use cases here.
>> +         *
>> +         * We do know that the new BDS will match @bs, so replacing
>> +         * any of our children by it will be safe.  It cannot change
>> +         * the data this quorum node presents to its parents.
>> +         *
>> +         * However, replacing @to_replace by @bs in any of our
>> +         * children's chains may change visible data somewhere in
>> +         * there.  We therefore cannot recurse down those chains with
>> +         * bdrv_recurse_can_replace().
>> +         * (More formally, bdrv_recurse_can_replace() requires that
>> +         * @to_replace will be replaced by something matching the @bs
>> +         * passed to it.  We cannot guarantee that.)
>> +         *
>> +         * Thus, we can only check whether any of our immediate
>> +         * children matches @to_replace.
>> +         *
>> +         * (In the future, we might add a function to recurse down a
>> +         * chain that checks that nothing there cares about a change
>> +         * in data from the respective child in question.  For
>> +         * example, most filters do not care when their child's data
>> +         * suddenly changes, as long as their parents do not care.)
>> +         */
>> +        if (s->children[i].child->bs == to_replace) {
>> +            Error *local_err = NULL;
>> +
>> +            /*
>> +             * We now have to ensure that there is no other parent
>> +             * that cares about replacing this child by a node with
>> +             * potentially different data.
>> +             */
>> +            s->children[i].to_be_replaced = true;
>> +            bdrv_child_refresh_perms(bs, s->children[i].child, &local_err);
>> +
> 
> So we are trying to answer on a question "is it ok to replace" it, by 
> cheating on
> permission system... Possibly, it's a problem of general design, and instead 
> of
>   examining one subtree, we should ask all parents of to_replace node, are 
> they
> OK with such replacement..

I’m not sure whether it’s cheating.

We want to replace some node.  A parent should be A-OK with that as long
as it hasn’t frozen its child link, and as long as it doesn’t care about
data changes (it should not have taken CONSISTENT_READ, and it must have
shared WRITE).

The only actual problem we have is that currently basically everything
takes CONSISTENT_READ (which is completely fine), but the only thing
that doesn’t is the mirror_top_bs, and that has exactly the problem of
“I can only get away without CONSISTENT_READ if it was me who unshared it”.

But that’s a different problem.  I don’t think this is cheating.

> Another idea is that it's strange to check permissions somewhere else than in 
> generic
> permission check functions. But I've no idea how to handle it in permission 
> system.

I don’t check the permissions, though.  I let quorum take what it needs
to allow changing one of its children.

What is a problem is that I should keep the permissions tightened until
the node is actually replaced and only then release them.  But that
turned out to be a huge mess so I resorted to just double-checking
before mirror actually completes.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]