qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] KVM Forum block no[td]es


From: Max Reitz
Subject: Re: [Qemu-devel] KVM Forum block no[td]es
Date: Wed, 14 Nov 2018 18:24:10 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0

On 13.11.18 16:12, Alberto Garcia wrote:
> On Sun 11 Nov 2018 11:25:00 PM CET, Max Reitz wrote:
> 
>> Permission system
>> =================
>>
>> GRAPH_MOD
>> ---------
>>
>> We need some way for the commit job to prevent graph changes on its
>> chain while it is running.  Our current blocker doesn’t do the job,
>> however.  What to do?
>>
>> - We have no idea how to make a *permission* work.  Maybe the biggest
>>   problem is that it just doesn’t work as a permission, because the
>>   commit job doesn’t own the BdrvChildren that would need to be
>>   blocked (namely the @backing BdrvChild).
> 
> What I'm doing at the moment in my blockdev-reopen series is check
> whether all parents of the node I want to change share the GRAPH_MOD
> permission. If any of them doesn't then the operation fails.
> 
> This can be checked by calling bdrv_get_cumulative_perm() or simply
> bdrv_check_update_perm(..., BLK_PERM_GRAPH_MOD, ...).

Sure.

> It solves the problem because e.g. the stream block job only allows
> BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED, so no graph
> modifications allowed.

But the problem is that the commit job needs to unshare the permission
on all backing links between base and top, so none of the backing links
can be broken.  But it cannot do that because those backing links do not
belong to it (but to the respective overlay nodes).

>>   (We never quite knew what “taking the GRAPH_PERMISSION” or
>>   “unsharing the GRPAH_MOD permission” was supposed to mean.  Figuring
>>   that out always took like half an our in any face-to-face meeting,
>>   and then we decided it was pretty much useless for any case we had
>>   at hand.)
> 
> Yeah it's a bit unclear. It can mean "none of this node's children can
> be replaced / removed", which is what I'm using it for at the moment.

Yes, but that's just not useful.  I don't think we have any case where
something would be annoyed by having its immediate child replaced (other
than the visible data changing, potentially).  Usually when we say
something (e.g. a block job) wants to prevent graph modification, it's
about changing edges that exist independently of the job (such as a
backing chain).

>> Reopen
>> ------
>>
>> How should permissions be handled while the reopen is under way?
>> Maybe we should take the union of @perm before and after, and the
>> intersection of @shared before and after?
> 
> Do you have an example of a case in which you're reopening a node and
> the change of permissions causes a problem?

I do not.  So currently we switch from the permissions before to the
ones after, right?  Which should be safe because that switch is a
transaction that is integrated into reopen.

However, I suppose it's possible the protocol layer gives us some issues
again.  It cannot switch the locks before commit, but committing may
fail.  What to do then?

It would be safer if we took the union/intersection in
bdrv_reopen_prepare() and then released the unnecessary flags in
bdrv_reopen_commit().  We can still get errors (as discussed), but those
can be safely ignored.  (They're just annoying, but not harmful.)

>> - Is it possible that changing an option may require taking an
>>   intermediate permission that is required neither before nor after
>>   the reopen process?
>>   Changing a child link comes to mind (like changing a child from one
>>   BDS to another, where the visible data changes, which would mean we
>>   may want to e.g. unshare CONSISTENT_READ during the reopen).
>>   However:
>>   1. It is unfeasible to unshare that for all child changes.
>>      Effectively everything requires CONSISTENT_READ, and for good
>>      reason.
>>   2. Why would a user even change a BDS to something of a different
>>      content?
>>   3. Anything that currently allows you to change a child node assumes
>>      that the user always changes it to something of the same content
>>      (some take extra care to verify this, like mirror, which makes
>>      sure that @replaces and the target are connected, and there are
>>      only filter nodes in between).
> 
> I don't think the user wants to change a BDS to something of a different
> content, but it may happen that QEMU doesn't have a way to verify
> whether the content is the same or not.
> 
> I think we have one such case already with 'blockdev-snapshot', in which
> you add a BDS with blockdev-add and then add its contents on top of an
> existing BDS.

Yeah.  The new overlay is expected to be empty so when you replace the
now-snapshot with it, the parent nodes don't see any change.  If it's
not empty...  Well, your problem.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]