From: Jan Kiszka
Subject: Re: [Qemu-devel] [PATCH 00/23] block migration: Fixes, cleanups and speedups
Date: Mon, 30 Nov 2009 20:44:51 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

Pierre Riteau wrote:
> On 30 Nov 2009, at 20:25, Jan Kiszka wrote:
> 
>> Pierre Riteau wrote:
>>> On 30 Nov 2009, at 19:34, Anthony Liguori wrote:
>>>
>>>> Jan Kiszka wrote:
>>>>> This series is a larger rework of the block migration support qemu
>>>>> recently gained. Besides lots of code refactorings, the major changes
>>>>> are:
>>>>> - Faster restore due to larger block sizes (even if the target disk is
>>>>>  unallocated)
>>>>> - Off-by-one fixes in the block dirty tracking code
>>>>> - Allow for multiple migrations (after cancellation or if migrating
>>>>>  into a backup image)
>>>>> - Proper error handling
>>>>> - Progress reporting fixes: report to monitor instead of stdout, report
>>>>>  sum of multiple disks
>>>>> - Report disk migration progress via 'info migrate'
>>>>> - Progress report during restore
>>>>>
>>>>> One patch is directly taken from Pierre Riteau's queue [1], who happened
>>>>> to work on the same topic during the last days; two more are derived
>>>>> from his commits.
>>>>>
>>>>> These patches make block migration usable for us. Still, there are two
>>>>> more major improvements on my wish/todo list:
>>>>> - Respect specified maximum migration downtime (will require tracking
>>>>>  of the number of dirty blocks + some coordination with ram migration)
>>>>> - Do not transfer unallocated disk space (also for raw images, i.e. add
>>>>>  bdrv_is_allocated support for the latter; see the sketch below)
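
For illustration, skipping unallocated extents could use the existing
bdrv_is_allocated() interface roughly like this. This is a sketch only: the
sender loop, its variables (sector, total_sectors, chunk_sectors) and
send_chunk() are hypothetical, and raw images would first need
bdrv_is_allocated() support as noted above:

    int n;  /* filled in with the number of sectors of equal status */

    while (sector < total_sectors) {
        if (!bdrv_is_allocated(bs, sector, chunk_sectors, &n)) {
            sector += n;            /* hole: nothing to transfer, skip it */
            continue;
        }
        send_chunk(bs, sector, n);  /* hypothetical transfer helper */
        sector += n;
    }
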
>>>>>
>>>>> In an off-list chat, Liran additionally brought up the topic that RAM
>>>>> migration should not start too early so that we avoid re-transmitting
>>>>> dirty pages over and over again while the disk image is slowly beamed
>>>>> over.
>>>>>
>>>>> I hope we can join our efforts to resolve the open topics quickly, the
>>>>> critical ones ideally before the merge window closes.
>>>>>
>>>> That really needs to happen no later than the end of this week.
>>>>
>>>> So Pierre/Liran, what do you think about Jan's series?
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>>
>>> I'm currently testing these patches. Here are a few issues I noticed, 
>>> before I forget about them.
>>>
>>> - "migrate -d -b tcp:dest:port" works, but "migrate -b -d tcp:dest:port" 
>>> doesn't, although "help migrate" doesn't really specify ordering as 
>>> important. But anyway I think Liran is working on a new version of the 
>>> command.
>> Saw that too. I think the monitor commands simply do very primitive
>> option parsing so far. Should be addressed if the final format comes
>> with this issue as well.
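
For illustration, order-independent flag handling amounts to a small loop
like the following; this is only a sketch, not the actual monitor parser:

    /* Accept -d and -b in either order in "migrate [-d] [-b] uri". */
    int detach = 0, blk = 0;
    const char *p = param;              /* hypothetical argument string */

    while (*p == '-') {
        if (strncmp(p, "-d", 2) == 0)      detach = 1;
        else if (strncmp(p, "-b", 2) == 0) blk = 1;
        else break;                        /* unknown flag */
        p += 2;
        while (*p == ' ') p++;
    }
    /* p now points at the migration URI, e.g. "tcp:dest:port" */
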
>>
>>> - We use bdrv_aio_readv() to read blocks from the disk. This function
>>> increments rd_bytes and rd_ops, which are reported by "info blockstats". I
>>> don't think these read operations should appear as VM activity, especially
>>> if this interface is used by libvirt to report VM stats (and draw graphs in
>>> virt-manager, etc.). Same for write stats.
>> Ack.
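
One conceivable fix, sketched below, is to let the migration code call into
the block driver directly instead of going through the accounting wrapper.
blk_mig_readv() is a hypothetical helper, and bypassing bdrv_aio_readv()
also skips its request checking, so this only illustrates the idea:

    static BlockDriverAIOCB *blk_mig_readv(BlockDriverState *bs,
                                           int64_t sector_num,
                                           QEMUIOVector *qiov, int nb_sectors,
                                           BlockDriverCompletionFunc *cb,
                                           void *opaque)
    {
        /* Call the driver directly so rd_bytes/rd_ops stay untouched. */
        return bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
                                       cb, opaque);
    }
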
>>
>>> - We may need to call bdrv_reset_dirty() _before_ sending the data, to be 
>>> sure the block is not rewritten in the meantime (maybe it's an issue only 
>>> with kvm?)
>> Can you elaborate? Even in the case of a multi-threaded qemu, the iomutex
>> should protect us here.
> 
> I only said that because I remember seeing this kind of behavior, but with 
> ram migration on kvm.
> As I'm not familiar with the I/O emulation in qemu, if you say that it's OK, 
> no problem.

RAM is different, as RAM accesses need not be synchronized across the vcpus
and the iothread.
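
To make the ordering question concrete, a condensed sketch of the sender
path (not the actual code) with the two possible orders:

    /* Safe order: clear the dirty bit before reading.  A guest write that
     * races with the read re-dirties the sector, so a later iteration
     * resends it. */
    bdrv_reset_dirty(bs, sector, nr_sectors);
    bdrv_aio_readv(bs, sector, &qiov, nr_sectors, blk_mig_read_cb, blk);

    /* Reversed order: read first, reset afterwards.  A guest write landing
     * between the two calls would be wiped from the dirty map and never
     * resent.  With the global iomutex held across both calls this window
     * does not exist, which is the point made above. */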

> 
> By multi-threaded, are you talking about the IO thread feature?

Yes (which also includes per-vcpu threads).

> 
>>> - I seem to remember that disk images with 0 size are now possible. I'm 
>>> afraid we will hit a divide by zero in this case: "progress = 
>>> completed_sector_sum * 100 / block_mig_state.total_sector_sum;"
>> Although I don't see their use, it should be handled gracefully, likely
>> by skipping such disks.
> 
> From a patch by Stefan Weil a few weeks ago:
> 
>> Images with disk size 0 may be used for
>> VM snapshots, but not to save normal block data.
>>
>> It is possible to create such images using
>> qemu-img, but opening them later fails.
>>
>> So even "qemu-img info image.qcow2" is not
>> possible for an image created with
>> "qemu-img create -f qcow2 image.qcow2 0".
> 
> I'm not sure if that concerns us...
> 

Good point. Then my add-on patch is definitely required.
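
For reference, the guard can be as simple as the following sketch (variable
names as in the quoted expression):

    /* Avoid the division by zero when no sectors are to be migrated. */
    if (block_mig_state.total_sector_sum != 0) {
        progress = completed_sector_sum * 100 /
                   block_mig_state.total_sector_sum;
    } else {
        progress = 100;     /* nothing to transfer, report completion */
    }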

Jan
