qemu-devel

Re: starting to look at qemu savevm performance, a first regression dete


From: Claudio Fontana
Subject: Re: starting to look at qemu savevm performance, a first regression detected
Date: Mon, 7 Mar 2022 13:26:08 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0

On 3/7/22 1:20 PM, Daniel P. Berrangé wrote:
> On Mon, Mar 07, 2022 at 01:09:55PM +0100, Claudio Fontana wrote:
>> On 3/7/22 1:00 PM, Daniel P. Berrangé wrote:
>>> On Mon, Mar 07, 2022 at 12:19:22PM +0100, Claudio Fontana wrote:
>>>> On 3/7/22 10:51 AM, Daniel P. Berrangé wrote:
>>>>> On Mon, Mar 07, 2022 at 10:44:56AM +0100, Claudio Fontana wrote:
>>>>>> Hello Daniel,
>>>>>>
>>>>>> On 3/7/22 10:27 AM, Daniel P. Berrangé wrote:
>>>>>>> On Sat, Mar 05, 2022 at 02:19:39PM +0100, Claudio Fontana wrote:
>>>>>>>>
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> I have been looking at some reports of bad qemu savevm performance in
>>>>>>>> large VMs (around 20+ GB),
>>>>>>>> when used in libvirt commands like:
>>>>>>>>
>>>>>>>>
>>>>>>>> virsh save domain /dev/null
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I have written a simple test to run in a Linux centos7-minimal-2009 
>>>>>>>> guest, which allocates and touches 20G mem.
>>>>>>>>
>>>>>>>> With any qemu version since around 2020, I am not seeing more than 580
>>>>>>>> MB/s even in the most ideal of situations.
>>>>>>>>
>>>>>>>> This drops to around 122 MB/s after commit
>>>>>>>> cbde7be900d2a2279cbc4becb91d1ddd6a014def.
>>>>>>>>
>>>>>>>> Here is the bisection for this particular drop in throughput:
>>>>>>>>
>>>>>>>> commit cbde7be900d2a2279cbc4becb91d1ddd6a014def (HEAD, refs/bisect/bad)
>>>>>>>> Author: Daniel P. Berrangé <berrange@redhat.com>
>>>>>>>> Date:   Fri Feb 19 18:40:12 2021 +0000
>>>>>>>>
>>>>>>>>     migrate: remove QMP/HMP commands for speed, downtime and cache size
>>>>>>>>     
>>>>>>>>     The generic 'migrate_set_parameters' command handle all types of 
>>>>>>>> param.
>>>>>>>>     
>>>>>>>>     Only the QMP commands were documented in the deprecations page, 
>>>>>>>> but the
>>>>>>>>     rationale for deprecating applies equally to HMP, and the 
>>>>>>>> replacements
>>>>>>>>     exist. Furthermore the HMP commands are just shims to the QMP 
>>>>>>>> commands,
>>>>>>>>     so removing the latter breaks the former unless they get 
>>>>>>>> re-implemented.
>>>>>>>>     
>>>>>>>>     Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>>>>>>     Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>>>>>>>
>>>>>>> That doesn't make a whole lot of sense as a bisect result.
>>>>>>> How reliable is that bisect end point ? Have you bisected
>>>>>>> to that point more than once ?
>>>>>>
>>>>>> I did run through the bisect itself only once, so I'll double check that.
>>>>>> The results seem to be reproducible almost to the second though, a 
>>>>>> savevm that took 35 seconds before the commit takes 2m 48 seconds after.
>>>>>>
>>>>>> For this test I am using libvirt v6.0.0.
>>>
>>> I've just noticed this.  That version of libvirt is 2 years old and
>>> doesn't have full support for migrate_set_parameters.
>>>
>>>
>>>> 2022-03-07 10:47:20.145+0000: 134386: info : qemuMonitorIOWrite:452 : 
>>>> QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 
>>>> buf={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-19"}^M
>>>>  len=93 ret=93 errno=0
>>>> 2022-03-07 10:47:20.146+0000: 134386: info : 
>>>> qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: 
>>>> mon=0x7fa4380028a0 reply={"id": "libvirt-19", "error": {"class": 
>>>> "CommandNotFound", "desc": "The command migrate_set_speed has not been 
>>>> found"}}
>>>> 2022-03-07 10:47:20.147+0000: 134391: error : 
>>>> qemuMonitorJSONCheckError:412 : internal error: unable to execute QEMU 
>>>> command 'migrate_set_speed': The command migrate_set_speed has not been 
>>>> found
>>>
>>> We see the migrate_set_speed failing and libvirt obviously ignores that
>>> failure.
>>>
>>> In current libvirt migrate_set_speed is not used as it properly
>>> handles migrate_set_parameters AFAICT.
>>>
>>> I think you just need to upgrade libvirt if you want to use this
>>> newer QEMU version
>>>
>>> Regards,
>>> Daniel
>>>
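For reference, the request that fails in the log above could be expressed with the generic command roughly as follows. This is only a sketch: it assumes the QMP spelling `migrate-set-parameters` with a `max-bandwidth` argument in bytes per second, and the helper name `build_set_bandwidth` and the reuse of the id "libvirt-19" are illustrative, not what a newer libvirt actually sends.

```python
import json


def build_set_bandwidth(max_bandwidth, request_id):
    """Build a QMP request that sets the migration bandwidth limit.

    Assumption: QMP 'migrate-set-parameters' accepts 'max-bandwidth'
    (bytes per second) in place of the removed 'migrate_set_speed'.
    """
    return {
        "execute": "migrate-set-parameters",
        "arguments": {"max-bandwidth": max_bandwidth},
        "id": request_id,
    }


# Same value that libvirt v6.0.0 tried to pass to migrate_set_speed in the log.
request = build_set_bandwidth(9223372036853727232, "libvirt-19")

# Serialize to the JSON text that would be written to the monitor socket.
wire = json.dumps(request)
print(wire)
```

With a QEMU that has dropped `migrate_set_speed`, a client that only knows the old command gets `CommandNotFound`, so no bandwidth setting is applied at all, which is what the log shows.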
>>
>> Got it, this explains it, sorry for the noise on this.
>>
>> I'll continue to investigate the general issue of low throughput with virsh 
>> save / qemu savevm .
> 
> BTW, consider measuring with the --bypass-cache flag to virsh save.
> This causes libvirt to use an I/O helper that uses O_DIRECT when
> saving the image. This should give more predictable results by
> avoiding the influence of the host I/O cache, which can be in a different
> state of usage each time you measure. It was also intended that
> by avoiding hitting the cache, saving the memory image of a large VM
> will not push other useful stuff out of the host I/O cache, which can
> negatively impact other running VMs.
> 
> Also, it is possible to configure compression on the libvirt side,
> which may be useful if you have spare CPU cycles but your storage
> is slow. See 'save_image_format' in /etc/libvirt/qemu.conf.
> 
> With regards,
> Daniel
> 

Hi Daniel, thanks for this useful information.

Regarding slow storage: for these tests I am saving to /dev/null so that
storage does not have to be taken into account (and unfortunately I am
still getting low bandwidth), so I guess compression is out of the question.
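
As a rough cross-check of the figures quoted earlier in this thread, the sketch below converts the two savevm durations into average throughput. It assumes that exactly the 20 GiB touched by the test is written out and nothing else, which is a simplification; the function name is only illustrative.

```python
def throughput_mib_per_s(size_gib, seconds):
    """Average throughput in MiB/s for size_gib GiB moved in the given seconds."""
    return size_gib * 1024 / seconds


before = throughput_mib_per_s(20, 35)          # 35 s before the commit
after = throughput_mib_per_s(20, 2 * 60 + 48)  # 2 m 48 s after the commit

print(f"before: {before:.0f} MiB/s, after: {after:.0f} MiB/s")
# prints "before: 585 MiB/s, after: 122 MiB/s"
```

Under that assumption the durations are in line with the roughly 580 and 122 figures reported above, so the numbers are at least self-consistent.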

Thanks!

Claudio


