From: Xiao Guangrong
Subject: Re: [Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled
Date: Thu, 11 Jul 2013 18:39:39 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130510 Thunderbird/17.0.6
Hi,
Could you please test this patch?
From 48df7db2ec2721e35d024a8d9850dbb34b557c1c Mon Sep 17 00:00:00 2001
From: Xiao Guangrong <address@hidden>
Date: Thu, 6 Sep 2012 16:56:01 +0800
Subject: [PATCH 10/11] using huge page on fast page fault path
---
arch/x86/kvm/mmu.c | 27 ++++++++++++++++++++-------
1 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6945ef4..7d177c7 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2663,6 +2663,13 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
return -EFAULT;
}
+static bool pfn_can_adjust(pfn_t pfn, int level)
+{
+ return !is_error_pfn(pfn) && !kvm_is_mmio_pfn(pfn) &&
+ level == PT_PAGE_TABLE_LEVEL &&
+ PageTransCompound(pfn_to_page(pfn));
+}
+
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
gfn_t *gfnp, pfn_t *pfnp, int *levelp)
{
@@ -2676,10 +2683,8 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
* PT_PAGE_TABLE_LEVEL and there would be no adjustment done
* here.
*/
- if (!is_error_pfn(pfn) && !kvm_is_mmio_pfn(pfn) &&
- level == PT_PAGE_TABLE_LEVEL &&
- PageTransCompound(pfn_to_page(pfn)) &&
- !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) {
+ if (pfn_can_adjust(pfn, level) &&
+ !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) {
unsigned long mask;
/*
* mmu_notifier_retry was successful and we hold the
@@ -2768,7 +2773,7 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 spte)
* - false: let the real page fault path to fix it.
*/
static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
- u32 error_code)
+ u32 error_code, bool force_pt_level)
{
struct kvm_shadow_walk_iterator iterator;
bool ret = false;
@@ -2795,6 +2800,14 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
goto exit;
/*
+ * Let the real page fault path change the mapping if large
+ * mapping is allowed, for example, the memslot dirty log is
+ * disabled.
+ */
+ if (!force_pt_level && pfn_can_adjust(spte_to_pfn(spte), level))
+ goto exit;
+
+ /*
* Check if it is a spurious fault caused by TLB lazily flushed.
*
* Need not check the access of upper level table entries since
@@ -2854,7 +2867,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
} else
level = PT_PAGE_TABLE_LEVEL;
- if (fast_page_fault(vcpu, v, level, error_code))
+ if (fast_page_fault(vcpu, v, level, error_code, force_pt_level))
return 0;
mmu_seq = vcpu->kvm->mmu_notifier_seq;
@@ -3323,7 +3336,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
} else
level = PT_PAGE_TABLE_LEVEL;
- if (fast_page_fault(vcpu, gpa, level, error_code))
+ if (fast_page_fault(vcpu, gpa, level, error_code, force_pt_level))
return 0;
mmu_seq = vcpu->kvm->mmu_notifier_seq;
--
1.7.7.6
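
For reference, the idea of the new check: when huge mappings are allowed for the slot (force_pt_level is false) and the faulting pfn belongs to a transparent huge page, fast_page_fault() now gives up and lets the real page fault path handle the fault, so large mappings can be re-established (for example after dirty logging is turned off once migration finishes). Below is a minimal, self-contained C sketch of just that decision logic; the struct, its fields, and the helper names are simplified stand-ins for the kernel state, not the real KVM code:

/*
 * Illustration only: a stand-alone model of the decision added to
 * fast_page_fault() by the patch above.  Everything here is a
 * simplified stand-in, not kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

#define PT_PAGE_TABLE_LEVEL 1	/* 4K mapping level */

struct fault_info {
	bool error_pfn;		/* pfn came from a failed host page lookup */
	bool mmio_pfn;		/* pfn backs an MMIO region */
	bool trans_compound;	/* pfn belongs to a transparent huge page */
	int  level;		/* level the fault is being handled at */
	bool force_pt_level;	/* 4K forced, e.g. dirty logging enabled */
};

/* Mirrors pfn_can_adjust(): could this 4K-mapped pfn be remapped as huge? */
static bool pfn_can_adjust(const struct fault_info *f)
{
	return !f->error_pfn && !f->mmio_pfn &&
	       f->level == PT_PAGE_TABLE_LEVEL && f->trans_compound;
}

/*
 * Mirrors the new check in fast_page_fault(): return true when the fast
 * path should bail out so the real fault path can build a huge mapping.
 */
static bool defer_to_real_fault_path(const struct fault_info *f)
{
	return !f->force_pt_level && pfn_can_adjust(f);
}

int main(void)
{
	/* After migration: dirty logging is off again, the page is a THP. */
	struct fault_info f = {
		.level = PT_PAGE_TABLE_LEVEL,
		.trans_compound = true,
		.force_pt_level = false,
	};

	printf("defer to real fault path: %s\n",
	       defer_to_real_fault_path(&f) ? "yes" : "no");	/* prints "yes" */
	return 0;
}

Without this check the fast path fixes the fault at 4K level and the mapping stays small; with it, as the comment in the patch says, the real page fault path gets a chance to change the mapping when a large mapping is allowed.
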
On 07/11/2013 05:36 PM, Zhanghaoyu (A) wrote:
> hi all,
>
> I met a problem similar to the ones reported below while performing live
> migration or save-restore tests on the kvm platform (qemu: 1.4.0, host:
> suse11sp2, guest: suse11sp2), running a tele-communication software suite
> in the guest:
> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> https://bugzilla.kernel.org/show_bug.cgi?id=58771
>
> After live migration or virsh restore [savefile], one process's CPU
> utilization went up by about 30%, resulting in throughput degradation of
> this process.
>
> oprofile report on this process in the guest:
> pre live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples % app name symbol name
> 248 12.3016 no-vmlinux (no symbols)
> 78 3.8690 libc.so.6 memset
> 68 3.3730 libc.so.6 memcpy
> 30 1.4881 cscf.scu SipMmBufMemAlloc
> 29 1.4385 libpthread.so.0 pthread_mutex_lock
> 26 1.2897 cscf.scu SipApiGetNextIe
> 25 1.2401 cscf.scu DBFI_DATA_Search
> 20 0.9921 libpthread.so.0 __pthread_mutex_unlock_usercnt
> 16 0.7937 cscf.scu DLM_FreeSlice
> 16 0.7937 cscf.scu receivemessage
> 15 0.7440 cscf.scu SipSmCopyString
> 14 0.6944 cscf.scu DLM_AllocSlice
>
> post live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples % app name symbol name
> 1586 42.2370 libc.so.6 memcpy
> 271 7.2170 no-vmlinux (no symbols)
> 83 2.2104 libc.so.6 memset
> 41 1.0919 libpthread.so.0 __pthread_mutex_unlock_usercnt
> 35 0.9321 cscf.scu SipMmBufMemAlloc
> 29 0.7723 cscf.scu DLM_AllocSlice
> 28 0.7457 libpthread.so.0 pthread_mutex_lock
> 23 0.6125 cscf.scu SipApiGetNextIe
> 17 0.4527 cscf.scu SipSmCopyString
> 16 0.4261 cscf.scu receivemessage
> 15 0.3995 cscf.scu SipcMsgStatHandle
> 14 0.3728 cscf.scu Urilex
> 12 0.3196 cscf.scu DBFI_DATA_Search
> 12 0.3196 cscf.scu SipDsmGetHdrBitValInner
> 12 0.3196 cscf.scu SipSmGetDataFromRefString
>
> So, memcpy costs many more CPU cycles after live migration. When I restarted
> the process, the problem disappeared. save-restore shows the same problem.
>
> perf counter stats for the vcpu thread in the host,
> pre live migration:
> Performance counter stats for thread id '21082':
>
>             0 page-faults
>             0 minor-faults
>             0 major-faults
>         31616 cs
>           506 migrations
>             0 alignment-faults
>             0 emulation-faults
>    5075957539 L1-dcache-loads                                            [21.32%]
>     324685106 L1-dcache-load-misses      #  6.40% of all L1-dcache hits  [21.85%]
>    3681777120 L1-dcache-stores                                           [21.65%]
>      65251823 L1-dcache-store-misses     #  1.77%                        [22.78%]
>             0 L1-dcache-prefetches                                       [22.84%]
>             0 L1-dcache-prefetch-misses                                  [22.32%]
>    9321652613 L1-icache-loads                                            [22.60%]
>    1353418869 L1-icache-load-misses      # 14.52% of all L1-icache hits  [21.92%]
>     169126969 LLC-loads                                                  [21.87%]
>      12583605 LLC-load-misses            #  7.44% of all LL-cache hits   [ 5.84%]
>     132853447 LLC-stores                                                 [ 6.61%]
>      10601171 LLC-store-misses           #  7.9%                         [ 5.01%]
>      25309497 LLC-prefetches             # 30%                           [ 4.96%]
>       7723198 LLC-prefetch-misses                                        [ 6.04%]
>    4954075817 dTLB-loads                                                 [11.56%]
>      26753106 dTLB-load-misses           #  0.54% of all dTLB cache hits [16.80%]
>    3553702874 dTLB-stores                                                [22.37%]
>       4720313 dTLB-store-misses          #  0.13%                        [21.46%]
> <not counted> dTLB-prefetches
> <not counted> dTLB-prefetch-misses
>
> 60.000920666 seconds time elapsed
>
> post live migration:
> Performance counter stats for thread id '1579':
>
>             0 page-faults                                                [100.00%]
>             0 minor-faults                                               [100.00%]
>             0 major-faults                                               [100.00%]
>         34979 cs                                                         [100.00%]
>           441 migrations                                                 [100.00%]
>             0 alignment-faults                                           [100.00%]
>             0 emulation-faults
>    6903585501 L1-dcache-loads                                            [22.06%]
>     525939560 L1-dcache-load-misses      #  7.62% of all L1-dcache hits  [21.97%]
>    5042552685 L1-dcache-stores                                           [22.20%]
>      94493742 L1-dcache-store-misses     #  1.8%                         [22.06%]
>             0 L1-dcache-prefetches                                       [22.39%]
>             0 L1-dcache-prefetch-misses                                  [22.47%]
>   13022953030 L1-icache-loads                                            [22.25%]
>    1957161101 L1-icache-load-misses      # 15.03% of all L1-icache hits  [22.47%]
>     348479792 LLC-loads                                                  [22.27%]
>      80662778 LLC-load-misses            # 23.15% of all LL-cache hits   [ 5.64%]
>     198745620 LLC-stores                                                 [ 5.63%]
>      14236497 LLC-store-misses           #  7.1%                         [ 5.41%]
>      20757435 LLC-prefetches                                             [ 5.42%]
>       5361819 LLC-prefetch-misses        # 25%                           [ 5.69%]
>    7235715124 dTLB-loads                                                 [11.26%]
>      49895163 dTLB-load-misses           #  0.69% of all dTLB cache hits [16.96%]
>    5168276218 dTLB-stores                                                [22.44%]
>       6765983 dTLB-store-misses          #  0.13%                        [22.24%]
> <not counted> dTLB-prefetches
> <not counted> dTLB-prefetch-misses
>
> The "LLC-load-misses" went up by about 16%. Then, I restarted the process in
> guest, the perf data back to normal,
> Performance counter stats for thread id '1579':
>
>             0 page-faults                                                [100.00%]
>             0 minor-faults                                               [100.00%]
>             0 major-faults                                               [100.00%]
>         30594 cs                                                         [100.00%]
>           327 migrations                                                 [100.00%]
>             0 alignment-faults                                           [100.00%]
>             0 emulation-faults
>    7707091948 L1-dcache-loads                                            [22.10%]
>     559829176 L1-dcache-load-misses      #  7.26% of all L1-dcache hits  [22.28%]
>    5976654983 L1-dcache-stores                                           [23.22%]
>     160436114 L1-dcache-store-misses                                     [22.80%]
>             0 L1-dcache-prefetches                                       [22.51%]
>             0 L1-dcache-prefetch-misses                                  [22.53%]
>   13798415672 L1-icache-loads                                            [22.28%]
>    2017724676 L1-icache-load-misses      # 14.62% of all L1-icache hits  [22.49%]
>     254598008 LLC-loads                                                  [22.86%]
>      16035378 LLC-load-misses            #  6.30% of all LL-cache hits   [ 5.36%]
>     307019606 LLC-stores                                                 [ 5.60%]
>      13665033 LLC-store-misses                                           [ 5.43%]
>      17715554 LLC-prefetches                                             [ 5.57%]
>       4187006 LLC-prefetch-misses                                        [ 5.44%]
>    7811502895 dTLB-loads                                                 [10.72%]
>      40547330 dTLB-load-misses           #  0.52% of all dTLB cache hits [16.31%]
>    6144202516 dTLB-stores                                                [21.58%]
>       6313363 dTLB-store-misses                                          [21.91%]
> <not counted> dTLB-prefetches
> <not counted> dTLB-prefetch-misses
>
> 60.000812523 seconds time elapsed
>
> If EPT is disabled, this problem is gone.
>
> I suspect that the kvm hypervisor is involved in this problem.
> Based on that suspicion, I want to find two adjacent versions of kvm-kmod,
> one that triggers this problem and one that does not (e.g. 2.6.39, 3.0-rc1),
> then either analyze the differences between these two versions or bisect the
> patches between them to find the key patches.
>
> Any better ideas?
>
> Thanks,
> Zhang Haoyu