Re: [PATCH v2 0/4] Dirty ring and auto converge optimization

qemu-devel

From:	Chongyun Wu
Subject:	Re: [PATCH v2 0/4] Dirty ring and auto converge optimization
Date:	Sat, 2 Apr 2022 10:13:36 +0800
User-agent:	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0

Thanks for the review.

On 4/1/2022 9:13 PM, Peter Xu wrote:

Chongyun,

On Mon, Mar 28, 2022 at 09:32:10AM +0800, wucy11@chinatelecom.cn wrote:

From: Chongyun Wu <wucy11@chinatelecom.cn>

v2:
-patch 1: remove patch_1

v1:
-rebase to qemu/master

Overview
============
This patch series optimizes the performance of live
migration when the dirty ring and auto-converge are used.

The optimization covers the following aspects:
1. Dynamically adjust the dirty ring collection thread to
reduce the occurrence of ring full, thereby reducing the
impact on customers, improving the efficiency of dirty
page collection, and thus improving the migration efficiency.

2. When collecting dirty pages from KVM,
kvm_cpu_synchronize_kick_all is not called while the vCPUs
are throttled; it is called only once, before the virtual
machine is suspended. kvm_cpu_synchronize_kick_all becomes
very time-consuming when the CPU is throttled, and few dirty
pages are produced in that state, so a single call before
suspending the virtual machine is enough to ensure that no
dirty pages are lost while keeping the migration efficient.

3. Based on the characteristic of collecting dirty pages
in the dirty ring, a new dirty page rate calculation method
is proposed to obtain a more accurate dirty page rate.

4. Use the more accurate dirty page rate, together with the
current system bandwidth and parameters, to calculate the
throttle needed to complete the migration, instead of the
current time-consuming trial-and-error search for a
suitable throttle, greatly reducing migration time.
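
To make item 4 above more concrete, here is a minimal sketch of how a
throttle could be computed directly from a measured dirty page rate and
the available bandwidth. It is not the code from the patches. The
function name, the 1..99 bounds, and the assumption that the dirty rate
scales linearly with vCPU run time are all illustrative choices.

```c
#include <stdint.h>

/* Hypothetical bounds for this sketch; not taken from the patches. */
#define THROTTLE_PCT_MIN 1
#define THROTTLE_PCT_MAX 99

/*
 * Return the smallest throttle percentage at which the throttled dirty
 * rate no longer exceeds the available bandwidth, or 0 if no throttling
 * is needed.
 *
 * Assumption: throttling a vCPU by p percent reduces the dirty rate by
 * p percent. Inputs are assumed small enough that excess * 100 does not
 * overflow 64 bits.
 */
static int estimate_throttle_pct(uint64_t dirty_bytes_per_sec,
                                 uint64_t bw_bytes_per_sec)
{
    uint64_t excess;
    uint64_t pct;

    if (dirty_bytes_per_sec <= bw_bytes_per_sec) {
        return 0; /* the migration already converges without throttling */
    }

    /* pct = ceil(100 * (dirty - bandwidth) / dirty) */
    excess = dirty_bytes_per_sec - bw_bytes_per_sec;
    pct = (excess * 100 + dirty_bytes_per_sec - 1) / dirty_bytes_per_sec;

    if (pct < THROTTLE_PCT_MIN) {
        pct = THROTTLE_PCT_MIN;
    }
    if (pct > THROTTLE_PCT_MAX) {
        pct = THROTTLE_PCT_MAX;
    }
    return (int)pct;
}
```

With these assumptions, a guest dirtying 3000 bytes/s over a 1000 bytes/s
link would be throttled by 67 percent in one step, rather than reaching
that value through repeated increments.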


Thanks for the patches.

I'm curious what's the relationship between this series and Yong's?

I personally think the relationship is complementary. Yong's series can
throttle per vCPU. When the memory pressure comes from threads on only
certain vCPUs, the restrictions on the other vCPUs are very small, so
the impact on customers during the migration is smaller. The
auto-converge optimization in the last two patches of this series
handles scenarios where the memory pressure is spread evenly across the
vCPUs. Each has its own advantages, and customers can choose the
appropriate mode for their own application scenarios. The first two
patches are for the dirty ring, and they improve performance in both the
auto-converge mode and Yong's mode.


If talking about throttling, I do think the old auto-converge was kind of
inefficient compared to the new per-vcpu ways of throttling, at least in
either granularity or read tolerance (e.g., the dirty ring based solution
will not block vcpu readers even if the thread is heavily throttled).

Yes, I agree with that. Through research on the dirty ring and a lot of
testing, some points that may limit the advantages of the dirty ring
were found, so some optimizations have been made, and testing and
verification showed them to be effective.

In this patch series, only the last two patches are optimizations for
auto-converge. The first two patches apply to every situation where the
dirty ring is used, including Yong's, and do not conflict with his
series. Among them, "kvm: Dynamically adjust the rate of dirty ring
reaper thread" is proposed to take advantage of the dirty ring. When the
memory pressure is high, speeding up the rate at which the reaper thread
collects dirty pages effectively addresses the problem that frequent
ring-full events cause frequent guest exits and degrade guestperf
performance. By the time the migration thread migrates data, most dirty
pages have already been synchronized, so the migration thread spends
less time synchronizing dirty pages from the dirty ring, which also
speeds up the migration. These two patches will make Yong's test results
better, and the two optimization points are different.
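
As a rough illustration of the reaper idea described above, the sketch
below adapts the sleep interval of a collection thread to the number of
ring-full events seen since the previous pass. It is not the logic of
the actual patch. The function name, the halve/grow-by-a-quarter policy,
and the 10 ms and 1000 ms bounds are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical bounds for this sketch; not taken from the patch. */
#define REAP_INTERVAL_MIN_MS 10
#define REAP_INTERVAL_MAX_MS 1000

/*
 * Pick the next sleep interval of a dirty ring collection thread.
 *
 * If any ring filled up since the last pass, collect twice as often so
 * that vCPUs exit to userspace less frequently. Otherwise back off
 * gradually to avoid wasting host CPU time when the guest is quiet.
 */
static uint32_t next_reap_interval_ms(uint32_t cur_ms,
                                      uint32_t ring_full_events)
{
    uint32_t next;

    /* Clamp the input first so the arithmetic below cannot overflow. */
    if (cur_ms > REAP_INTERVAL_MAX_MS) {
        cur_ms = REAP_INTERVAL_MAX_MS;
    }

    if (ring_full_events > 0) {
        next = cur_ms / 2;          /* high pressure: reap more often */
    } else {
        next = cur_ms + cur_ms / 4; /* low pressure: slowly back off */
    }

    if (next < REAP_INTERVAL_MIN_MS) {
        next = REAP_INTERVAL_MIN_MS;
    }
    if (next > REAP_INTERVAL_MAX_MS) {
        next = REAP_INTERVAL_MAX_MS;
    }
    return next;
}
```

A collection loop would call this once per pass with the ring-full count
it observed and sleep for the returned number of milliseconds.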

We've got quite a few techniques taking care of migration convergence
issues (didn't mention postcopy yet..), and I'm wondering whether at some
point we should be more focused and make a chosen one better, rather than
building different blocks servicing the same purpose.

I'm sorry, maybe I should split this patch series to avoid
misunderstandings. These patches and Yong's should be complementary, and
two of them can also help Yong's series get some performance
improvements.


Thanks,


--
Best Regards,
Chongyun Wu
