From: Andrey Shinkevich
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH RFC 1/1] Stream block job involves copy-on-read filter
Date: Thu, 14 Feb 2019 13:43:11 +0000


On 12/02/2019 14:35, Alberto Garcia wrote:
> On Mon 11 Feb 2019 05:58:05 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>>>> The problem is in the concept of the "base" node. The code is
>>>> written as if base does not change during the block job. However,
>>>> the job doesn't own base, and there is no guarantee that it won't
>>>> change during the job.
>>>
>>> But if that's the case then we have a problem already, because 'base'
>>> is a member of StreamBlockJob and is used in the existing
>>> stream_run() code.
>>
>> I think it should be possible to reproduce, using block-commit (which
>> already has filter) with block-stream in parallel, we'll try.
> 
> It's not possible to run block-stream and block-commit in parallel on
> the same nodes. See iotest 030 for a few examples.
> 
> So unless there's a bug it should be safe.
> 
>>> So if there's a way to make 'base' disappear during the job (how?)
>>> then we could protect it with block_job_add_bdrv().
>>
>> I'm not sure this is correct. What is the reason for stream to own
>> base? It's not really interested in it.
> 
> stream does not need to write or modify base, but it does need to keep a
> reference to it in order to know where to stop copying data.
> 
> As I said earlier base is a member of StreamBlockJob, so it should not
> disappear during the job.
> 
> Berto
> 

No doubt that a reference to the base node is needed as a limit for the
COR work. The base is still in the game. The issue arises when the
BlockDriverState of the COR-filter comes into play instead of the base:
it is inherited from a parallel job. Based on the test case
test_stream_parallel from qemu-iotests/030, it works like this:
1. Job#1 is started and inserts a COR-filter above the top node.
2. Context switch.
3. Job#2 is started and inherits the COR-filter as the base.
4. Context switch.
5. Job#1 comes to the end and removes its COR-filter referenced by
    job#2.
6. Context switch.
7. Job#2 comes to the end and uses the deleted COR-filter node.
8. We are in trouble.

Or another scenario:
1-4. As above
5. Job#2 comes to the end first and keeps the COR-filter as the
    backing node.
6. The assert() fails as the referenced COR-filter was not deleted
    on exit.
7. The game is over with a poor score.

If we just keep the intermediate bottom node instead of the base, we
will have a similar issue. On the job exit, we change the backing file.
If we call backing_bs(bottom_node_bs), we will get the COR-filter
node instead, as it has not yet been removed by the unfinished
job#1 (see the second scenario above).

If we include the base node into the job node list with some protecting
flags, i.e. block_job_add_bdrv(&s->common, "base", base, ..), we can
end up with a failure in bdrv_check_update_perm(), called by
bdrv_root_attach_child(), because the previous job#1 holds a root
reference to the same node with conflicting permissions. So, the next
job will fail to start and the test cases won't pass.

If we set the following flags, there will be no failure:
block_job_add_bdrv(&s->common, "base", base, 0, BLK_PERM_CONSISTENT_READ
| BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD, &error_abort);
But what would we gain? We will have the same issues as above.

In many other cases, when the filter has not been identified, we can
end up with a broken chain when calling backing_bs(bs=filter), as the
backing node of the filter is a NULL pointer.

That's why I implemented the skip_filter() function in this series as
a solution, or at least a workaround. May we proceed with that?
Is there any better idea?

-- 
With the best regards,
Andrey Shinkevich
