Re: [PATCH v7 14/47] stream: Deal with filters
From: Max Reitz
Subject: Re: [PATCH v7 14/47] stream: Deal with filters
Date: Thu, 16 Jul 2020 16:59:15 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
On 10.07.20 19:41, Andrey Shinkevich wrote:
> On 10.07.2020 18:24, Max Reitz wrote:
>> On 09.07.20 16:52, Andrey Shinkevich wrote:
>>> On 25.06.2020 18:21, Max Reitz wrote:
>>>> Because of the (not so recent anymore) changes that make the stream job
>>>> independent of the base node and instead track the node above it, we
>>>> have to split that "bottom" node into two cases: The bottom COW node,
>>>> and the node directly above the base node (which may be an R/W filter
>>>> or the bottom COW node).
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
qapi/block-core.json |  4 +++
block/stream.c       | 63 ++++++++++++++++++++++++++++++++------------
blockdev.c           |  4 ++-
3 files changed, 53 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index b20332e592..df87855429 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -2486,6 +2486,10 @@
>>>>   # On successful completion the image file is updated to drop the backing file
>>>>   # and the BLOCK_JOB_COMPLETED event is emitted.
>>>>   #
>>>> +# In case @device is a filter node, block-stream modifies the first non-filter
>>>> +# overlay node below it to point to base's backing node (or NULL if @base was
>>>> +# not specified) instead of modifying @device itself.
>>>> +#
>>>>   # @job-id: identifier for the newly-created block job.  If
>>>>   #          omitted, the device name will be used. (Since 2.7)
>>>>   #
>>>> diff --git a/block/stream.c b/block/stream.c
>>>> index aa2e7af98e..b9c1141656 100644
>>>> --- a/block/stream.c
>>>> +++ b/block/stream.c
>>>> @@ -31,7 +31,8 @@ enum {
>>>>   typedef struct StreamBlockJob {
>>>>       BlockJob common;
>>>> -    BlockDriverState *bottom;
>>>> +    BlockDriverState *base_overlay; /* COW overlay (stream from this) */
>>>> +    BlockDriverState *above_base;   /* Node directly above the base */
>>> Keeping the base_overlay is enough to complete the stream job.
>> Depends on the definition. If we decide it isn’t enough, then it isn’t
>> enough.
>>
>>> The above_base may disappear during the job and we can't rely on it.
>> In this version of this series, it may not, because the chain is frozen.
>> So the above_base cannot disappear.
>
> Once we insert a filter above the top bs of the stream job, the parallel
> jobs in the iotests #030 will fail with a 'frozen link' error. That is
> because the independent parallel stream or commit jobs insert/remove
> their filters asynchronously.
I’m not sure whether that’s a problem with this series specifically.
>> We can discuss whether we should allow it to disappear, but I think not.
>>
>> The problem is, we need something to set as the backing file after
>> streaming. How do we figure out what that should be? My proposal is we
>> keep above_base and use its immediate child.
>
> We can do the same with the base_overlay.
>
> If the backing node turns out to be a filter, the proper backing child
> will be set after the filter is removed. So, we shouldn't care.
And what if the user manually added some filter above the base (i.e.
below base_overlay) that they want to keep after the job?
>> If we don’t keep above_base, then we’re basically left guessing as to
>> what should be the backing file after the stream job.
>>
>>>> BlockdevOnError on_error;
>>>> char *backing_file_str;
>>>> bool bs_read_only;
>>>> @@ -53,7 +54,7 @@ static void stream_abort(Job *job)
>>>> if (s->chain_frozen) {
>>>> BlockJob *bjob = &s->common;
>>>> - bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->bottom);
>>>> + bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->above_base);
>>>> }
>>>> }
>>>> @@ -62,14 +63,15 @@ static int stream_prepare(Job *job)
>>>>       StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
>>>>       BlockJob *bjob = &s->common;
>>>>       BlockDriverState *bs = blk_bs(bjob->blk);
>>>> -    BlockDriverState *base = backing_bs(s->bottom);
>>>> +    BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
>>>> +    BlockDriverState *base = bdrv_filter_or_cow_bs(s->above_base);
>>> The initial base node may be the top node of a concurrent commit job
>>> and may disappear.
>> Then it would just be replaced by another node, though, so above_base
>> keeps a child. The @base here is not necessarily the initial @base, and
>> that’s intentional.
>
> Not really. In my example, above_base becomes a dangling pointer
> because, after the commit job finishes, its filter, which belongs to
> the commit job's frozen chain, will be deleted. If we freeze the link
> to above_base for this job, the iotests #030 will not pass.
So it doesn’t become a dangling pointer, because it’s frozen.
030 passes after this series, so I’m not sure whether I can consider
that problem part of this series.
I think if adding a filter node becomes a problem, we have to consider
relaxing the restrictions when we do that, not now.
>>> base = bdrv_filter_or_cow_bs(s->base_overlay) is more reliable.
>> But also wrong. The point of keeping above_base around is to get its
>> child here to use that child as the new backing child of the top node.
>>
>>>> Error *local_err = NULL;
>>>> int ret = 0;
>>>> - bdrv_unfreeze_backing_chain(bs, s->bottom);
>>>> + bdrv_unfreeze_backing_chain(bs, s->above_base);
>>>> s->chain_frozen = false;
>>>> - if (bs->backing) {
>>>> + if (bdrv_cow_child(unfiltered_bs)) {
>>>> const char *base_id = NULL, *base_fmt = NULL;
>>>> if (base) {
>>>> base_id = s->backing_file_str;
>>>> @@ -77,8 +79,8 @@ static int stream_prepare(Job *job)
>>>>               base_fmt = base->drv->format_name;
>>>>           }
>>>>       }
>>>> -    bdrv_set_backing_hd(bs, base, &local_err);
>>>> -    ret = bdrv_change_backing_file(bs, base_id, base_fmt);
>>>> +    bdrv_set_backing_hd(unfiltered_bs, base, &local_err);
>>>> +    ret = bdrv_change_backing_file(unfiltered_bs, base_id, base_fmt);
>>>>       if (local_err) {
>>>>           error_report_err(local_err);
>>>>           return -EPERM;
>>>> @@ -109,14 +111,15 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
>>>>       StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
>>>>       BlockBackend *blk = s->common.blk;
>>>>       BlockDriverState *bs = blk_bs(blk);
>>>> -    bool enable_cor = !backing_bs(s->bottom);
>>>> +    BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
>>>> +    bool enable_cor = !bdrv_cow_child(s->base_overlay);
>>>>       int64_t len;
>>>>       int64_t offset = 0;
>>>>       uint64_t delay_ns = 0;
>>>>       int error = 0;
>>>>       int64_t n = 0; /* bytes */
>>>> -    if (bs == s->bottom) {
>>>> +    if (unfiltered_bs == s->base_overlay) {
>>>>           /* Nothing to stream */
>>>>           return 0;
>>>>       }
>>>> @@ -150,13 +153,14 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
>>>>               copy = false;
>>>> -            ret = bdrv_is_allocated(bs, offset, STREAM_CHUNK, &n);
>>>> +            ret = bdrv_is_allocated(unfiltered_bs, offset, STREAM_CHUNK, &n);
>>>>               if (ret == 1) {
>>>>                   /* Allocated in the top, no need to copy. */
>>>>               } else if (ret >= 0) {
>>>>                   /* Copy if allocated in the intermediate images.  Limit to the
>>>>                    * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE). */
>>>> -                ret = bdrv_is_allocated_above(backing_bs(bs), s->bottom, true,
>>>> +                ret = bdrv_is_allocated_above(bdrv_cow_bs(unfiltered_bs),
>>>> +                                              s->base_overlay, true,
>>>>                                               offset, n, &n);
>>>>                   /* Finish early if end of backing file has been reached */
>>>>                   if (ret == 0 && n == 0) {
>>>> @@ -223,9 +227,29 @@ void stream_start(const char *job_id, BlockDriverState *bs,
>>>>       BlockDriverState *iter;
>>>>       bool bs_read_only;
>>>>       int basic_flags = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
>>>> -    BlockDriverState *bottom = bdrv_find_overlay(bs, base);
>>>> +    BlockDriverState *base_overlay = bdrv_find_overlay(bs, base);
>>>> +    BlockDriverState *above_base;
>>>> -    if (bdrv_freeze_backing_chain(bs, bottom, errp) < 0) {
>>>> +    if (!base_overlay) {
>>>> +        error_setg(errp, "'%s' is not in the backing chain of '%s'",
>>>> +                   base->node_name, bs->node_name);
>>> Sorry, I am not clear about the error message.
>>>
>>> In this case, there is no intermediate COW node, but the base, if not
>>> NULL, is in the backing chain of bs, isn't it?
>>>
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * Find the node directly above @base.  @base_overlay is a COW overlay, so
>>>> +     * it must have a bdrv_cow_child(), but it is the immediate overlay of
>>>> +     * @base, so between the two there can only be filters.
>>>> +     */
>>>> +    above_base = base_overlay;
>>>> +    if (bdrv_cow_bs(above_base) != base) {
>>>> +        above_base = bdrv_cow_bs(above_base);
>>>> +        while (bdrv_filter_bs(above_base) != base) {
>>>> +            above_base = bdrv_filter_bs(above_base);
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (bdrv_freeze_backing_chain(bs, above_base, errp) < 0) {
>>> When a concurrent stream job tries to freeze or remove the above_base
>>> node, we will encounter the frozen node error. The above_base node is
>>> a part of the concurrent job's frozen chain.
>> Correct.
>>
>>>> return;
>>>> }
>>>> @@ -255,14 +279,19 @@ void stream_start(const char *job_id, BlockDriverState *bs,
>>>>        * and resizes.  Reassign the base node pointer because the backing BS of the
>>>>        * bottom node might change after the call to bdrv_reopen_set_read_only()
>>>>        * due to parallel block jobs running.
>>>> +     * above_base node might change after the call to
>>> Yes, if not frozen.
>>>> +     * bdrv_reopen_set_read_only() due to parallel block jobs running.
>>>>        */
>>>> -    base = backing_bs(bottom);
>>>> -    for (iter = backing_bs(bs); iter && iter != base; iter = backing_bs(iter)) {
>>>> +    base = bdrv_filter_or_cow_bs(above_base);
>>>> +    for (iter = bdrv_filter_or_cow_bs(bs); iter != base;
>>>> +         iter = bdrv_filter_or_cow_bs(iter))
>>>> +    {
>>>>           block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
>>>>                              basic_flags, &error_abort);
>>>>       }
>>>> -    s->bottom = bottom;
>>>> +    s->base_overlay = base_overlay;
>>>> +    s->above_base = above_base;
>>> Generally, being the filter of a concurrent job, the above_base node
>>> may be deleted at any time and we will be left with a dangling
>>> pointer. That may happen even earlier if above_base is not frozen.
>>> If it is frozen, as it is here, we may get the frozen link error
>>> instead.
>> I’m not sure what you mean here. Freezing it was absolutely
>> intentional. A dangling pointer would be a problem, but that’s why it’s
>> frozen, so it stays around and can’t be deleted any time.
>>
>> Max
>
> The nodes we freeze should all belong to the context of the relevant
> job:
>
> filter->top_node->intermediate_node(s)
>
> We would not include the base, or any filter above it, in the frozen
> chain because they belong to a different job's context.
They aren’t really, because we need to know the backing node of @device
after the job.
> Once 'this' job is completed, we set the current backing child of the
> base_overlay and need not care what it is. If it is another job's
> filter, it will be replaced with the proper node afterwards.
But what if there are filters above the base that the user wants to keep
after the job?
Max