qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v3] block/file-posix: do not fail on unlock byte


From: Max Reitz
Subject: Re: [Qemu-block] [PATCH v3] block/file-posix: do not fail on unlock bytes
Date: Fri, 29 Mar 2019 18:58:17 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 29.03.19 18:54, Vladimir Sementsov-Ogievskiy wrote:
> 29.03.2019 20:39, Max Reitz wrote:
>> On 29.03.19 18:31, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.03.2019 20:15, Max Reitz wrote:
>>>> On 29.03.19 12:04, Vladimir Sementsov-Ogievskiy wrote:
>>>>> bdrv_replace_child() calls bdrv_check_perm() with error_abort on
>>>>> loosening permissions. However file-locking operations may fail even
>>>>> in this case, for example on NFS. And this leads to Qemu crash.
>>>>>
>>>>> Let's avoid such errors. Note, that we ignore such things anyway on
>>>>> permission update commit and abort.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>>> ---
>>>>>    block/file-posix.c | 12 ++++++++++++
>>>>>    1 file changed, 12 insertions(+)
>>>>>
>>>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>>>> index db4cccbe51..1cf4ee49eb 100644
>>>>> --- a/block/file-posix.c
>>>>> +++ b/block/file-posix.c
>>>>> @@ -815,6 +815,18 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
>>>>>    
>>>>>        switch (op) {
>>>>>        case RAW_PL_PREPARE:
>>>>> +        if ((s->perm | new_perm) == s->perm &&
>>>>> +            (s->shared_perm & new_shared) == s->shared_perm)
>>>>> +        {
>>>>> +            /*
>>>>> +             * We are going to unlock bytes, it should not fail. If it 
>>>>> fail due
>>>>> +             * to some fs-dependent permission-unrelated reasons (which 
>>>>> occurs
>>>>> +             * sometimes on NFS and leads to abort in 
>>>>> bdrv_replace_child) we
>>>>> +             * can't prevent such errors by any check here. And we 
>>>>> ignore them
>>>>> +             * anyway in ABORT and COMMIT.
>>>>> +             */
>>>>> +            return 0;
>>>>> +        }
>>>>>            ret = raw_apply_lock_bytes(s, s->fd, s->perm | new_perm,
>>>>>                                       ~s->shared_perm | ~new_shared,
>>>>>                                       false, errp);
>>>>
>>>> Help me understand the exact issue, please.  I understand that there are
>>>> operations like bdrv_replace_child() that pass &error_abort to
>>>> bdrv_check_perm() because they just loosen the permissions, so it should
>>>> not fail.
>>>>
>>>> However, if the whole effect really would be to loosen permissions,
>>>> raw_apply_lock_bytes() wouldn't have failed here in PREPARE anyway:
>>>> @unlock is passed as false, so no bytes will be unlocked.  And if
>>>> permissions are just loosened (as your condition checks), it should not
>>>> lock any bytes.
>>>
>>> but it does qemu_lock_fd unconditionally on any locked byte (it may be 
>>> loosening,
>>> but not of all bytes).
>>
>> It only locks bytes that aren't already locked ("!(locked_perm & bit)").
>>   So if (s->perm | new_perm) & ~s->locked_perm == 0 (same for shared), it
> 
> Oops, missed that. We don't have this code in rhel base..
> 
>> won't lock anything.  That's the same as s->perm | new_perm == s->perm
>> and s->perm == s->locked_perm.  So if this patch prevents qemu_lock_fd()
>> errors in raw_apply_lock_bytes(), s->perm must be different from
>> s->locked_perm.
>>
>>> and fcntl is called, which may fail.
>>
>> Hm, yeah, and as Kevin said, raw_check_lock_bytes() if new_perm and
>> new_shared are already covered (though this could be fixed by passing
>> new_perm & ~s->perm and... new_shared | ~s->shared_perm? to
>> raw_check_lock_bytes).
>>
>> It just seems to me that we sometimes do end up in a position where
>> s->perm and s->locked_perm (or ~s->shared_perm and
>> s->locked_shared_perm) do not align, which should not be happening: This
>> would mean that the block layer thinks the permission locks are taken,
>> but file-posix disagrees.
> 
> So if it is it's possibly not my case, as all my crashes are on old code, 
> which
> don't have this additional checking  in raw_apply_lock_bytes

OK, that solves the mystery. :-)

Now I wonder whether it would be better to make raw_check_lock_bytes()
skip the bytes that have been locked already instead of skipping both
calls altogether.

OTOH, there should be no functional difference, so why not just keep the
patch that's already here...

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]