qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 22/42] block: Fix bdrv_get_allocated_file_siz


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v6 22/42] block: Fix bdrv_get_allocated_file_size's fallback
Date: Wed, 11 Sep 2019 13:00:44 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 11.09.19 12:31, Kevin Wolf wrote:
> Am 11.09.2019 um 12:00 hat Max Reitz geschrieben:
>> On 11.09.19 10:27, Kevin Wolf wrote:
>>> Am 11.09.2019 um 09:37 hat Max Reitz geschrieben:
>>>> On 11.09.19 08:55, Kevin Wolf wrote:
>>>>> Well, by default the primary child, which should cover like 90% of the
>>>>> drivers?
>>>>
>>>> Hm, yes.
>>>>
>>>> But I still think that the drivers that do not want to count every
>>>> single non-COW child are the exception.
>>>
>>> They are, but drivers that want to count more than their primary node
>>> are exceptions, too. And I think you're more likely to remember adding
>>> the callback when you want to have a certain feature, not when you don't
>>> want to have it.
>>>
>>> I really think we're likely to forget adding the callback where we need
>>> to disable the feature.
>>
>> Well, I mean, we did forget adding it for qcow2.
> 
> I'm afraid I have to agree. So the conclusion is that we won't get it
> right anyway?
> 
>>> I can see two options that should address both of our views:
>>>
>>> 1. Just don't have a fallback at all, make the callback mandatory and
>>>    provide implementations in block.c that can be referred to in
>>>    BlockDriver. Not specifying the callback causes an assertion failure,
>>>    so we'd hopefully notice it quite early (assuming that we run either
>>>    'qemu-img info' or 'query-block' on a configuration with the block
>>>    driver, but I think that's faily safe to assume).
>>
>> Hm.  Seems a bit much, but if we can’t agree on what’s a good general
>> implementation that works for everything, this is probably the only
>> thing that would actually keep us from forgetting to add special cases.
>>
>> Though I actually don’t know.  I’d probably add two globally available
>> helpers, one that returns the sum of everything but the backing node,
>> and one that just returns the primary node.
> 
> Yes, I think this is the same as I meant by "provide implementations in
> block.c".
> 
>> Now if I were to make qcow2 use the primary node helper function, would
>> we have remembered changing it once we added a data file?
>>
>> Hmm.  Maybe not, but it should be OK to just make everything use the sum
>> helper, except the drivers that want the primary node.  That should work
>> for all cases.  (I think that whenever a format driver suddenly gains
>> more child nodes, we probably will want to count them.  OTOH, everything
>> that has nodes that shouldn’t be counted probably always wants to use
>> the primary node helper function from the start.)
> 
> The job filter nodes have only one child currently, which should be
> counted. We'll add other children that shouldn't be counted only later.
> 
> But we already have an idea of what possible extensions look like, so we
> can probably choose the right function from the start.

Yep.

>>> 2. Make the 90% solution a 100% solution: Allow drivers to have multiple
>>>    storage children (for vmdk) and then have the fallback add up the
>>>    primary child plus all storage children. This is what I suggested as
>>>    the documented semantics in my initial reply to this patch (that you
>>>    chose not to answer).
>>
>> I didn’t answer that because I didn’t disagree.
>>
>>>    Adding the size of storage children covers qcow2 and vmdk.
>>
>> That’s of course exactly what we’re trying to do, but the question is,
>> how do we figure out that storage children?  Make it a per-BdrvChild
>> attribute?  That seems rather heavy-handed, because I think we’d need it
>> only here.
> 
> Well, you added bdrv_storage_child().I'd argue this interface is wrong

Yes, it probably is.

> because it assumes that only one storage child exists. You just didn't
> implement it for vmdk so that the problem didn't become apparent. It
> would have to return a list rather than a single child. So fixing the
> interface and then using it is what I was thinking.
> 
> Now that you mention a per-BdrvChild attribute, however, I start to
> wonder if the distinction between COW children, filter children, storage
> children, metadata children, etc. isn't really what BdrvChildRole was
> supposed to represent?

That’s a good point.

> Maybe we want to split off child_storage from child_file, though it's
> not strictly necessary for this specific case because we want to treat
> both metadata and storage nodes the same. But it could be useful for
> other users of bdrv_storage_child(), if there are any.

Possible.  Maybe it turns out that at least for this series I don’t need
bdrv_storage_child() at all.

>>>    As the job filter won't declare the target or any other involved
>>>    nodes their storage nodes (I hope), this will do the right thing for
>>>    them, too.
>>>
>>>    For quorum and blkverify both ways could be justifiable. I think they
>>>    probably shouldn't declare their children as storage nodes. They are
>>>    more like filters that don't have a single filtered node. So some
>>>    kind of almost-filters.
>>
>> I don’t think quorum is a filter, and blkverify can only be justified to
>> be a filter because it quits qemu when there is a mismatch.
>>
>> The better example is replication, but that has a clear filtered child
>> (the primary node).
>>
>>
>> So all in all I think it’s best to make the callback mandatory and add
>> two global helper functions.  That’s simple enough and should prevent
>> us from making mistakes by forgetting to adjust something in the
>> future.
> 
> Yes, that should work.
> 
> We should probably still figure out what the relationship between the
> child access functions and child roles is, even if we don't need it for
> this solution. But it feels like an important part of the design.

Hm.  It feels like something that should be done before this series,
actually.

So I think we should add at least a child role per child access function
so that they match?  And then maybe in bdrv_attach_child() assert that a
BDS never has more than one primary or filtered child (a filtered child
acts as a primary child, too), or more than one COW child.  (And that
these are always in bs->file or bs->backing so the child access
functions do work.)

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]