Re: [Qemu-devel] live block copy/stream/snapshot discussion
From: Kevin Wolf
Subject: Re: [Qemu-devel] live block copy/stream/snapshot discussion
Date: Tue, 12 Jul 2011 18:10:26 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc15 Thunderbird/3.1.10

On 12.07.2011 17:45, Stefan Hajnoczi wrote:
>>>> Image streaming API
>>>> ===================
>>>>
>>>> For leaf images with copy-on-read semantics, the stream commands allow the
>>>> user to populate local blocks by manually streaming them from the backing
>>>> image.
>>>> Once all blocks have been streamed, the dependency on the original backing
>>>> image can be removed. Therefore, stream commands can be used to implement
>>>> post-copy live block migration and rapid deployment.
>>>>
>>>> The block_stream command can be used to stream a single cluster, to
>>>> start streaming the entire device, and to cancel an active stream. It
>>>> is easiest to allow the block_stream command to manage streaming for the
>>>> entire device, but a management tool could use single cluster mode to
>>>> throttle the I/O rate.
>>
>> As discussed earlier, having the management send requests for each
>> single cluster doesn't make any sense at all. It wouldn't only throttle
>> the I/O rate but bring it down to a level that makes it unusable. What
>> you really want is to allow the management to give us a range (offset +
>> length) that qemu should stream.
>
> I feel that an iteration interface is problematic whether the
> management tool or QEMU decides what to stream. Let's have just the
> background streaming operation.
>
> The problem with byte ranges is two-fold. The management tool doesn't
> know which regions of the image are allocated so it may do a lot of
> nop calls to already-allocated regions with no intelligence as to
> where the next sensible offset for streaming is. Secondly, because
> the progress and performance of image streaming depend largely on
> whether or not clusters are allocated (it is very fast when a cluster
> is already allocated and we have no work to do), offsets are bad
> indicators of progress to the user. I think it's best not to expose
> these details to the management tool at all.
>
> The only reason for the iteration interface was to punt I/O throttling
> to the management tool. I think it would be easier to just throttle
> inside the streaming function.
>
> Kevin: Are you happy with dropping the iteration interface?
> Adam: Is there a libvirt requirement for iteration or could we support
> background copy only?
Okay, works for me.
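
A rough sketch of what "throttle inside the streaming function" could look
like, in Python rather than QEMU code. Every name here (stream_all,
copy_cluster, CLUSTER_SIZE, the rate limit parameter) is hypothetical and
only illustrates the idea: already-allocated clusters cost nothing, so the
delay is applied only to clusters that are actually copied.

```python
import time

CLUSTER_SIZE = 64 * 1024  # assumed cluster size, for illustration only


def stream_all(allocated, copy_cluster, max_bytes_per_sec, sleep=time.sleep):
    """Populate every unallocated cluster, throttling only real copies.

    allocated: list of bools, True if the cluster is already local.
    copy_cluster: callback that copies one cluster from the backing file.
    Returns the number of bytes actually copied.
    """
    copied = 0
    for index, is_local in enumerate(allocated):
        if is_local:
            continue  # nothing to do, so no I/O and no delay
        copy_cluster(index)
        allocated[index] = True
        copied += CLUSTER_SIZE
        # Crude rate limit: wait as long as one cluster "costs" at the cap.
        sleep(CLUSTER_SIZE / max_bytes_per_sec)
    return copied
```

Note that the loop index says little about the remaining work, which is the
point made above about offsets being poor progress indicators.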
>>>> The command synopses are as follows:
>>>>
>>>> block_stream
>>>> ------------
>>>>
>>>> Copy data from a backing file into a block device.
>>>>
>>>> If the optional 'all' argument is true, this operation is performed in the
>>>> background until the entire backing file has been copied. The status of
>>>> ongoing block_stream operations can be checked with query-block-stream.
>>
>> Not sure if it's a good idea to use a bool argument to turn a command
>> into its opposite. I think having a separate command for stopping would
>> be cleaner. Something for the QMP folks to decide, though.
>
> git branch new_branch
> git branch -D new_branch
>
> Makes sense to me :)
I don't think you should compare a command line option to a programming
interface. Having a git_create_branch(const char *name, bool delete)
would really look strange. Anyway, probably a matter of taste.
A hint that separate commands would make sense is that the stop command
won't need the other arguments that the start command gets ('all' and
'base').
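
To illustrate the argument list point, here is a small Python sketch that
builds the two requests as separate commands. The "execute"/"arguments"
envelope is how QMP requests are written; the command name
block_stream_stop and both helper functions are made up for this example
and are not an agreed interface.

```python
def block_stream_start(device, stream_all=True, base=None):
    """Hypothetical 'start' request: carries everything streaming needs."""
    args = {"device": device, "all": stream_all}
    if base is not None:
        args["base"] = base
    return {"execute": "block_stream", "arguments": args}


def block_stream_stop(device):
    """Hypothetical 'stop' request: only the device is meaningful."""
    return {"execute": "block_stream_stop", "arguments": {"device": device}}
```

With a single command and a 'stop' flag, the caller could still pass 'all'
or 'base' together with stop=true, and the meaning of that combination
would have to be defined somewhere.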
>>>> Arguments:
>>>>
>>>> - all: copy entire device (json-bool, optional)
>>>> - stop: stop copying to device (json-bool, optional)
>>>> - device: device name (json-string)
>>>
>>> It must be possible to specify the backing file that will be
>>> active after streaming finishes (data from that file will not
>>> be streamed into the active file, of course).
>>
>> Yes, I think the common base image belongs here.
>
> Right. We need to specify it by filename:
>
> - base: filename of base file (json-string, optional)
>
> Sectors are not copied from the base file and its backing file
> chain. The following describes this feature:
> Before: base <- sn1 <- sn2 <- sn3 <- vm.img
> After: base <- vm.img
Does this imply that a rebase -u always happens after completion?
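
To make the question concrete, a toy Python model of the chain in the
example above. It is only a sketch under the assumption that streaming
copies data from every image between 'base' and the active image, and that
the active image is then pointed at 'base' (the metadata-only step that
qemu-img rebase -u performs). The function name is hypothetical.

```python
def stream_into_active(chain, base=None):
    """Return (new_chain, streamed_from) for a backing chain.

    chain is ordered from the oldest image to the active one, e.g.
    ["base", "sn1", "sn2", "sn3", "vm.img"].
    """
    active = chain[-1]
    if base is None:
        # No base given: everything is streamed, no backing file remains.
        return [active], chain[:-1]
    if base == active:
        raise ValueError("base must be below the active image")
    keep_until = chain.index(base)  # raises ValueError for an unknown base
    kept = chain[:keep_until + 1]   # base and its own backing files stay
    streamed_from = chain[keep_until + 1:-1]
    return kept + [active], streamed_from
```

For the chain above, stream_into_active(chain, "base") yields the new chain
["base", "vm.img"], with data streamed from sn1, sn2 and sn3.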
>> With all = false, where does the streaming begin?
>
> Streaming begins at the start of the image.
>
>> Do you have something like the "current streaming offset" in the state of
>> each BlockDriverState?
>
> Yes, there is a StreamState for each block device that has an
> in-progress operation. The progress is saved between block_stream
> (without -a) invocations so the caller does not need to specify the
> streaming offset as an argument.
>
> Thanks for pointing out these weaknesses in the documentation. It
> should really be explained fully.
I think we also need to describe error cases. For example, what happens
if you try to start streaming while it's already in progress?
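
One possible answer, sketched in Python purely as an assumption and not as
a description of any existing behaviour: keep track of devices with an
active stream and reject a second start with an explicit error. The class
and exception names are hypothetical.

```python
class StreamAlreadyActive(Exception):
    """Hypothetical error for a second start on the same device."""


class StreamNotActive(Exception):
    """Hypothetical error for stopping a device that is not streaming."""


class StreamRegistry:
    """Tracks which devices currently have a streaming operation."""

    def __init__(self):
        self._active = set()

    def start(self, device):
        if device in self._active:
            raise StreamAlreadyActive(device)
        self._active.add(device)

    def stop(self, device):
        # Treating a stop on an idle device as an error is an assumption.
        if device not in self._active:
            raise StreamNotActive(device)
        self._active.remove(device)
```

Whatever behaviour is chosen, the documentation should name the error the
caller gets in each of these cases.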
>>>> Return:
>>>>
>>>> - device: device name (json-string)
>>>> - len: size of the device, in bytes (json-int)
>>>> - offset: ending offset of the completed I/O, in bytes (json-int)
>>
>> So you only get the reply when the request has completed? With the
>> current monitor, this means that QMP is blocked while we stream, doesn't
>> it? How are you supposed to send the stop command then?
>
> Incomplete documentation again, sorry. The block_stream command
> behaves as follows:
>
> 1. block_stream all returns immediately and the BLOCK_STREAM_COMPLETED
> event is raised when streaming completes either successfully or with
> an error.
>
> 2. block_stream stop returns when the in-progress streaming operation
> has been safely stopped.
>
> 3. block_stream returns when one iteration of streaming has completed.
>
>> Two of three examples below have an empty return value instead, so they
>> do not comply with this specification.
>
> I will update the documentation; the non-all invocations do not return
> anything.
Okay, then I don't understand what the 'offset' return value means. The
text says "offset of the completed I/O". If all=true immediately
returns, shouldn't it always be 0?
>> I find it rather disturbing that a command like 'change' has made it
>> into QMP... Anyway, I don't think this is really what we need.
>>
>> We have two switches to do. The first one happens before starting the
>> copy: Creating the copy, with the source as its backing file, and
>> switching to that. The monitor command to achieve this is snapshot_blkdev.
>
> I don't think that creating image files in QEMU is going to work when
> running KVM with libvirt (SELinux). The QEMU process does not have
> the ability to create new image files. It needs at least a file
> descriptor to an empty file or maybe a file that has been created
> using qemu-img like I showed above.
Independent problem. We're really creating an external snapshot here, so
we should use the function for external snapshots. libvirt can
pre-create an empty image file, so that qemu will write the image format
data into it, but we have discussed this before.
Kevin