From: Andrey Shinkevich
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH RFC 1/1] Stream block job involves copy-on-read filter
Date: Wed, 20 Feb 2019 09:10:11 +0000


On 18/02/2019 13:08, Vladimir Sementsov-Ogievskiy wrote:
> 14.02.2019 16:43, Andrey Shinkevich wrote:
>>
>>
>> On 12/02/2019 14:35, Alberto Garcia wrote:
>>> On Mon 11 Feb 2019 05:58:05 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> The problem is in the concept of the "base" node. The code is written
>>>>>> on the assumption that base does not change during the block job.
>>>>>> However, the job doesn't own base, and there is no guarantee that it
>>>>>> will not change during the job.
>>>>>
>>>>> But if that's the case then we have a problem already, because 'base'
>>>>> is a member of StreamBlockJob and is used in the existing
>>>>> stream_run() code.
>>>>
>>>> I think it should be possible to reproduce this by running block-commit
>>>> (which already has a filter) in parallel with block-stream; we'll try.
>>>
>>> It's not possible to run block-stream and block-commit in parallel on
>>> the same nodes. See iotest 030 for a few examples.
>>>
>>> So unless there's a bug it should be safe.
>>>
>>>>> So if there's a way to make 'base' disappear during the job (how?)
>>>>> then we could protect it with block_job_add_bdrv().
>>>>
>>>> I'm not sure this is correct. What is the reason for stream to own
>>>> base? It's not really interested in it.
>>>
>>> stream does not need to write to or modify base, but it does need to keep
>>> a reference to it in order to know where to stop copying data.
>>>
>>> As I said earlier base is a member of StreamBlockJob, so it should not
>>> disappear during the job.
>>>
>>> Berto
>>>
>>
>> No doubt, a reference to the base node is needed as a limit for the
>> COR work; the base is still in the game. Actually, we encounter an issue
>> when the BlockDriverState of the COR-filter comes into play instead of
>> the base: it is inherited from a parallel job. Based on the case
>> test_stream_parallel from qemu-iotests/030, it works like this:
>> 1. Job#1 is started and inserts a COR-filter above the top node.
>> 2. Context switch.
>> 3. Job#2 is started and inherits the COR-filter as the base.
>> 4. Context switch.
>> 5. Job#1 comes to the end and removes its COR-filter, which is still
>>       referenced by job#2.
>> 6. Context switch.
>> 7. Job#2 comes to the end and uses the deleted COR-filter node.
>> 8. We are in trouble.
>>
>> Or another scenario:
>> 1-4. As above
>> 5. Job#2 comes to the end first and keeps the COR-filter as the
>>       backing node.
>> 6. The assert() fails as the referenced COR-filter was not deleted
>>       on exit.
>> 7. The game is over with a poor score.
>>
>> If we just keep the intermediate bottom node instead of the base, we
>> will have a similar issue. On job exit, we change the backing file.
>> If we call backing_bs(bottom_node_bs), we will keep the COR-filter
>> node instead, whereas it has not yet been removed by the unfinished
>> job#1 (see the second scenario above).
> 
> Honestly, I don't see any problem with it.
> 
>>
>> If we include the base node in the job node list with some protecting
>> flags, that is block_job_add_bdrv(&s->common, "base", base, ..), we can
>> end up with a failure of bdrv_check_update_perm(), called by
>> bdrv_root_attach_child(), because the previous job#1 holds a root
>> reference to the same node with different permissions. So the next job
>> will fail to start and the test cases won't pass.
> 
> You mean the 030 test case about parallel stream jobs, which starts
> streams sharing one node as the base of one job and the top of another.
> And if we lock the base in the stream job, the test fails. I think we just
> need to update the test by inserting additional empty qcow2 nodes below
> the shared ones and making the newly inserted nodes the top nodes of the
> stream jobs, so that there is no more node-sharing between the jobs.
> 
>>
>> If we set the following flags, there will be no failure:
>> block_job_add_bdrv(&s->common, "base", base, 0, BLK_PERM_CONSISTENT_READ
>> | BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD, &error_abort);
>> But what will we gain by that? We will have the same issues as above.
> 
> We need flags ..., BLK_PERM_GRAPH_MOD, BLK_PERM_ALL & ~BLK_PERM_GRAPH_MOD, ...
> for the base node.
> 
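If I understand correctly, that would amount to roughly the following call
(an illustration of the suggested permission pair, not necessarily the final
patch code): the job takes BLK_PERM_GRAPH_MOD on the base and shares all the
other permissions, so nothing can detach or replace the base node behind the
job's back:

    block_job_add_bdrv(&s->common, "base", base,
                       BLK_PERM_GRAPH_MOD,                 /* perm we take   */
                       BLK_PERM_ALL & ~BLK_PERM_GRAPH_MOD, /* perms we share */
                       &error_abort);
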
>>
>> In many other cases, when the filter has not been identified, we can
>> get a broken chain when calling backing_bs(bs=filter), as the backing
>> node of the filter is a null pointer.
>>
>> That's why I implemented the skip_filter() function in this series as
>> some sort of solution or workaround. May we proceed with that?
>> Is there any better idea?
>>
> 
> I think now that the best way is to lock the base node, as it is the
> simplest one. And it corresponds to the current code, which actually keeps
> an illegal pointer to the base node without any access rights.
> 
> 
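Just for the record, the skip_filter() workaround mentioned above boils down
to a small helper along these lines (a rough sketch of the idea, not the
exact patch code; it assumes the COR-filter keeps its child in bs->file):

    static BlockDriverState *skip_filter(BlockDriverState *bs)
    {
        /* Step over implicit filter nodes, such as the COR-filter, so that
         * backing-chain walks land on the first non-filter node instead of
         * dereferencing the filter's missing backing pointer. */
        while (bs && bs->drv && bs->drv->is_filter && bs->file) {
            bs = bs->file->bs;
        }
        return bs;
    }

But I agree that locking the base node is the simpler way.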

Well, I am about to implement the next version of the series with the base
BlockDriverState included in the job node list, and I will insert extra
images between those shared by the parallel jobs in the test case
TestParallelOps of iotest 030.

-- 
With the best regards,
Andrey Shinkevich
