qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Safely reopening image files by stashing fds


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Safely reopening image files by stashing fds
Date: Tue, 9 Aug 2011 11:50:43 +0100

On Tue, Aug 9, 2011 at 11:35 AM, Kevin Wolf <address@hidden> wrote:
> Am 09.08.2011 12:25, schrieb Stefan Hajnoczi:
>> On Mon, Aug 8, 2011 at 4:16 PM, Kevin Wolf <address@hidden> wrote:
>>> Am 08.08.2011 16:49, schrieb Stefan Hajnoczi:
>>>> On Fri, Aug 5, 2011 at 10:48 AM, Kevin Wolf <address@hidden> wrote:
>>>>> Am 05.08.2011 11:29, schrieb Stefan Hajnoczi:
>>>>>> On Fri, Aug 5, 2011 at 10:07 AM, Kevin Wolf <address@hidden> wrote:
>>>>>>> Am 05.08.2011 10:40, schrieb Stefan Hajnoczi:
>>>>>>>> We've discussed safe methods for reopening image files (e.g. useful for
>>>>>>>> changing the hostcache parameter).  The problem is that closing the 
>>>>>>>> file first
>>>>>>>> and then opening it again exposes us to the error case where the open 
>>>>>>>> fails.
>>>>>>>> At that point we cannot get to the file anymore and our options are to
>>>>>>>> terminate QEMU, pause the VM, or offline the block device.
>>>>>>>>
>>>>>>>> This window of vulnerability can be eliminated by keeping the file 
>>>>>>>> descriptor
>>>>>>>> around and falling back to it should the open fail.
>>>>>>>>
>>>>>>>> The challenge for the file descriptor approach is that image formats, 
>>>>>>>> like
>>>>>>>> VMDK, can span multiple files.  Therefore the solution is not as 
>>>>>>>> simple as
>>>>>>>> stashing a single file descriptor and reopening from it.
>>>>>>>
>>>>>>> So far I agree. The rest I believe is wrong because you can't assume
>>>>>>> that every backend uses file descriptors. The qemu block layer is based
>>>>>>> on BlockDriverStates, not fds. They are a concept that should be hidden
>>>>>>> in raw-posix.
>>>>>>>
>>>>>>> I think something like this could do:
>>>>>>>
>>>>>>> struct BDRVReopenState {
>>>>>>>    BlockDriverState *bs;
>>>>>>>    /* can be extended by block drivers */
>>>>>>> };
>>>>>>>
>>>>>>> .bdrv_reopen(BlockDriverState *bs, BDRVReopenState **reopen_state, int
>>>>>>> flags);
>>>>>>> .bdrv_reopen_commit(BDRVReopenState *reopen_state);
>>>>>>> .bdrv_reopen_abort(BDRVReopenState *reopen_state);
>>>>>>>
>>>>>>> raw-posix would store the old file descriptor in its reopen_state. On
>>>>>>> commit, it closes the old descriptors, on abort it reverts to the old
>>>>>>> one and closes the newly opened one.
>>>>>>>
>>>>>>> Makes things a bit more complicated than the simple bdrv_reopen I had in
>>>>>>> mind before, but it allows VMDK to get an all-or-nothing semantics.
>>>>>>
>>>>>> Can you show how bdrv_reopen() would use these new interfaces?  I'm
>>>>>> not 100% clear on the idea.
>>>>>
>>>>> Well, you wouldn't only call bdrv_reopen, but also either
>>>>> bdrv_reopen_commit/abort (for the top-level caller we can have a wrapper
>>>>> function that does both, but that's syntactic sugar).
>>>>>
>>>>> For example we would have:
>>>>>
>>>>> int vmdk_reopen()
>>>>
>>>> .bdrv_reopen() is a confusing name for this operation because it does
>>>> not reopen anything.  bdrv_prepare_reopen() might be clearer.
>>>
>>> Makes sense.
>>>
>>>>
>>>>> {
>>>>>    *((VMDKReopenState**) rs) = malloc();
>>>>>
>>>>>    foreach (extent in s->extents) {
>>>>>        ret = bdrv_reopen(extent->file, &extent->reopen_state)
>>>>>        if (ret < 0)
>>>>>            goto fail;
>>>>>    }
>>>>>    return 0;
>>>>>
>>>>> fail:
>>>>>    foreach (extent in rs->already_reopened) {
>>>>>        bdrv_reopen_abort(extent->reopen_state);
>>>>>    }
>>>>>    return ret;
>>>>> }
>>>>
>>>>> void vmdk_reopen_commit()
>>>>> {
>>>>>    foreach (extent in s->extents) {
>>>>>        bdrv_reopen_commit(extent->reopen_state);
>>>>>    }
>>>>>    free(rs);
>>>>> }
>>>>>
>>>>> void vmdk_reopen_abort()
>>>>> {
>>>>>    foreach (extent in s->extents) {
>>>>>        bdrv_reopen_abort(extent->reopen_state);
>>>>>    }
>>>>>    free(rs);
>>>>> }
>>>>
>>>> Does the caller invoke bdrv_close(bs) after bdrv_prepare_reopen(bs,
>>>> &rs)?
>>>
>>> No. Closing the old backend would be part of bdrv_reopen_commit.
>>>
>>> Do you have a use case where it would be helpful if the caller invoked
>>> bdrv_close?
>>
>> When the caller does bdrv_close() two BlockDriverStates are never open
>> for the same image file.  I thought this was a property we wanted.
>>
>> Also, in the block_set_hostcache case we need to reopen without
>> switching to a new BlockDriverState instance.  That means the reopen
>> needs to be in-place with respect to the BlockDriverState *bs pointer.
>>  We cannot create a new instance.
>
> Yes, but where do you even get the second BlockDriverState from?
>
> My prototype only returns an int, not a new BlockDriverState. Until
> bdrv_reopen_commit() it would refer to the old file descriptors etc. and
> after bdrv_reopen_commit() the very same BlockDriverState would refer to
> the new ones.

It seems I don't understand the API.  I thought it was:

do_block_set_hostcache()
{
    bdrv_prepare_reopen(bs, &rs);
    ...open new file and check everything is okay...
    if (ret == 0) {
        bdrv_reopen_commit(rs);
    } else {
        bdrv_reopen_abort(rs);
    }
    return ret;
}

If the caller isn't opening the new file then what's the point of
giving the caller control over prepare, commit, and abort?

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]