qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-block] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE
Date: Wed, 14 Aug 2019 06:27:10 +0000



13 авг. 2019 г. 19:32 пользователь Vladimir Sementsov-Ogievskiy <address@hidden> написал:

13.08.2019 19:08, Kevin Wolf wrote:
> Am 13.08.2019 um 17:54 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 13.08.2019 18:41, Kevin Wolf wrote:
>>> Am 13.08.2019 um 16:43 hat Max Reitz geschrieben:
>>>> On 13.08.19 13:04, Kevin Wolf wrote:
>>>>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
>>>>>> returned file. But is it correct behavior at all? If returned file
>>>>>> itself has a backing file, we may report as totally unallocated and
>>>>>> area which actually has data in bottom backing file.
>>>>>>
>>>>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
>>>>>> by following commit with a test. Let's make raw-format behave more
>>>>>> correctly returning BDRV_BLOCK_DATA.
>>>>>>
>>>>>> Suggested-by: Max Reitz <address@hidden>
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>>>
>>>>> After some reading, I think I came to the conclusion that RAW is the
>>>>> correct thing to do. There is indeed a problem, but this patch is trying
>>>>> to fix it in the wrong place.
>>>>>
>>>>> In the case where the backing file contains some data, and we have a
>>>>> 'raw' node above the qcow2 overlay node, the content of the respective
>>>>> block is not defined by the queried backing file layer, so it is
>>>>> completely correct that bdrv_is_allocated() returns false,like it would
>>>>> if you queried the qcow2 layer directly.
>>>>
>>>> I disagree.  The queried backing file layer is the raw node.  As I said,
>>>> in my opinion raw nodes are not filter nodes, neither in behavior (they
>>>> have an offset option), nor in how they are generally used (as a format).
>>>>
>>>> The raw format does not support backing files.  Therefore, everything on
>>>> a raw node is allocated.
>>>>
>>>> (That is, like, my opinion.)
>>>>
>>>>>                                            If it returned true, we would
>>>>> copy everything, which isn't right either (the test cases should may add
>>>>> the qemu-img map output of the target so this becomes visible).
>>>>
>>>> It is right.
>>>
>>> So we don't even agree what mirroring the raw node should even mean.
>>>
>>> I can the see your point when you say that the raw node has no backing
>>> file, so everything should be copied. But I can also see the point that
>>> the raw node can really just be used as a filter that limits the data
>>> exposed from the qcow2 layer, and you want to keep the copy a COW
>>> overlay over the same backing file.
>>>
>>> Both are valid use cases in principle and there is no single right or
>>> wrong.
>>>
>>> We don't currently support the latter use case because we have only
>>> sync=full or sync=top, but if you could specify a base node instead, we
>>> could probably suport the case without all of the special-casing filter
>>> nodes and backing file childs.
>>>
>>> You would call bdrv_co_block_status_above() with the right base node and
>>> it would just recurse whereever the data is stored, be it bs->backing,
>>> bs->file or even driver-specific children. This would allow you to find
>>> out whether some block in the top node came from the base node that
>>> we're going to keep. If yes, skip it; if no, copy it.
>>>
>>> This way we wouldn't have to decide whether raw is a filter or not,
>>> because it wouldn't make a difference. The behaviour would only depend
>>> on the base node given by the user. If you specified the top-level qcow2
>>> file as the base, you get your full copy;
>>
>> ahm, full-copy = base is NULL..
>
> Oops, yes, of course. Using the top-level node would create an empty
> "copy".
>
>>> if you specified the backing
>>> qcow2, you get the partial copy where the target still uses the same
>>> backing file.
>>>
>>> (Hm... It would only actually work if the offsets stay the same in the
>>> chain, which is true for backing file children, but not necessarily for
>>> other children.
>>
>> Don't follow, what you mean by offsets stay the same and what is wrong
>> with it?
>
> Say we have this graph:
>
> raw,offset=65536
>      |
>      v
>    qcow2-----+
>      |       |
>      v       v
>    file     base
>
> Now you can't just mirror the raw node into a target.qcow2 that shares
> base as the backing file, because the offsets will be wrong. In order to
> use such a copy correctly, you'd have to use a raw node again in the
> backing chain:
>
> target.qcow2----+
>      |           |
>      v           v
>    file      raw,offset=65536
>                  |
>                  v
>                base
>
> So the case where offsets differ between the top and the base node isn't
> trivial.

Understand, but for me it don't look like the thing that behaves in unexpected
for user way, on the contrary, it seems obvious that it will not work, as user
understand what is backing file (offsets are backed by corresponding offsets)

>
> (If this case isn't complicated enough yet, imagine passing file as the
> base node instead... It just can't work.)
>

Yes, if we chose the way you proposed we have a possibility to use any node as a base,
but again this does something completely different from usual "top" or "based" mode..
So it's just a new possibility, may be useless, it don't break up the concept.


I like the idea of generic block_status_above that you propose, but still there should
a decision of what exactly DATA, ZERO and ALLOCATED means. Ok, assume we don't need ALLOCATED..
Then we don't need DATA too?

And how block_status_above would work? If node don't report ZERO and don't report DATA but report
*file pointer == base, we should stop, and don't go to *file, it seems obvious..

But what if node reports ZERO together with *file pointer == base? Should consider this region
belonging to base and UNALLOCATED and not copy it, or we should consider it ALLOCATED because
current node reports it ZERO? So, criteria of "ALLOCATED" or "where should we stop in recursion"
is not obvious.. Do you have one?


And now I see a drawback of full recursion as Kevin proposes: it's slower when we are interested in allocation. We don't need to recurse through file* backing chain, if *file is not backing, as we already know that data is allocated.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]