From: Xiao Guangrong
Subject: Re: [Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled
Date: Thu, 11 Jul 2013 18:39:39 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130510 Thunderbird/17.0.6
Hi,
Could you please test this patch?
From 48df7db2ec2721e35d024a8d9850dbb34b557c1c Mon Sep 17 00:00:00 2001
From: Xiao Guangrong <address@hidden>
Date: Thu, 6 Sep 2012 16:56:01 +0800
Subject: [PATCH 10/11] using huge page on fast page fault path
---
arch/x86/kvm/mmu.c | 27 ++++++++++++++++++++-------
1 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6945ef4..7d177c7 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2663,6 +2663,13 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
return -EFAULT;
}
+static bool pfn_can_adjust(pfn_t pfn, int level)
+{
+ return !is_error_pfn(pfn) && !kvm_is_mmio_pfn(pfn) &&
+ level == PT_PAGE_TABLE_LEVEL &&
+ PageTransCompound(pfn_to_page(pfn));
+}
+
static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
gfn_t *gfnp, pfn_t *pfnp, int *levelp)
{
@@ -2676,10 +2683,8 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
* PT_PAGE_TABLE_LEVEL and there would be no adjustment done
* here.
*/
- if (!is_error_pfn(pfn) && !kvm_is_mmio_pfn(pfn) &&
- level == PT_PAGE_TABLE_LEVEL &&
- PageTransCompound(pfn_to_page(pfn)) &&
- !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) {
+ if (pfn_can_adjust(pfn, level) &&
+ !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) {
unsigned long mask;
/*
* mmu_notifier_retry was successful and we hold the
@@ -2768,7 +2773,7 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 spte)
* - false: let the real page fault path to fix it.
*/
static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
- u32 error_code)
+ u32 error_code, bool force_pt_level)
{
struct kvm_shadow_walk_iterator iterator;
bool ret = false;
@@ -2795,6 +2800,14 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
goto exit;
/*
+ * Let the real page fault path change the mapping if large
+ * mapping is allowed, for example, the memslot dirty log is
+ * disabled.
+ */
+ if (!force_pt_level && pfn_can_adjust(spte_to_pfn(spte), level))
+ goto exit;
+
+ /*
* Check if it is a spurious fault caused by TLB lazily flushed.
*
* Need not check the access of upper level table entries since
@@ -2854,7 +2867,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
} else
level = PT_PAGE_TABLE_LEVEL;
- if (fast_page_fault(vcpu, v, level, error_code))
+ if (fast_page_fault(vcpu, v, level, error_code, force_pt_level))
return 0;
mmu_seq = vcpu->kvm->mmu_notifier_seq;
@@ -3323,7 +3336,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
} else
level = PT_PAGE_TABLE_LEVEL;
- if (fast_page_fault(vcpu, gpa, level, error_code))
+ if (fast_page_fault(vcpu, gpa, level, error_code, force_pt_level))
return 0;
mmu_seq = vcpu->kvm->mmu_notifier_seq;
--
1.7.7.6
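
For reference, the idea of the new check: when huge mappings are allowed for the slot (force_pt_level is false) and the faulting pfn belongs to a transparent huge page, fast_page_fault() now gives up and lets the real page fault path handle the fault, so large mappings can be re-established (for example after dirty logging is turned off once migration finishes). Below is a minimal, self-contained C sketch of just that decision logic; the struct, its fields, and the helper names are simplified stand-ins for the kernel state, not the real KVM code:

/*
 * Illustration only: a stand-alone model of the decision added to
 * fast_page_fault() by the patch above.  Everything here is a
 * simplified stand-in, not kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

#define PT_PAGE_TABLE_LEVEL 1	/* 4K mapping level */

struct fault_info {
	bool error_pfn;		/* pfn came from a failed host page lookup */
	bool mmio_pfn;		/* pfn backs an MMIO region */
	bool trans_compound;	/* pfn belongs to a transparent huge page */
	int  level;		/* level the fault is being handled at */
	bool force_pt_level;	/* 4K forced, e.g. dirty logging enabled */
};

/* Mirrors pfn_can_adjust(): could this 4K-mapped pfn be remapped as huge? */
static bool pfn_can_adjust(const struct fault_info *f)
{
	return !f->error_pfn && !f->mmio_pfn &&
	       f->level == PT_PAGE_TABLE_LEVEL && f->trans_compound;
}

/*
 * Mirrors the new check in fast_page_fault(): return true when the fast
 * path should bail out so the real fault path can build a huge mapping.
 */
static bool defer_to_real_fault_path(const struct fault_info *f)
{
	return !f->force_pt_level && pfn_can_adjust(f);
}

int main(void)
{
	/* After migration: dirty logging is off again, the page is a THP. */
	struct fault_info f = {
		.level = PT_PAGE_TABLE_LEVEL,
		.trans_compound = true,
		.force_pt_level = false,
	};

	printf("defer to real fault path: %s\n",
	       defer_to_real_fault_path(&f) ? "yes" : "no");	/* prints "yes" */
	return 0;
}

Without this check the fast path fixes the fault at 4K level and the mapping stays small; with it, as the comment in the patch says, the real page fault path gets a chance to change the mapping when a large mapping is allowed.
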
On 07/11/2013 05:36 PM, Zhanghaoyu (A) wrote:
> hi all,
>
> I met a problem similar to the ones reported below while performing live
> migration or save-restore tests on the kvm platform (qemu: 1.4.0, host:
> suse11sp2, guest: suse11sp2), running a tele-communication software suite
> in the guest:
> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> https://bugzilla.kernel.org/show_bug.cgi?id=58771
>
> After live migration or virsh restore [savefile], one process's CPU
> utilization went up by about 30%, resulting in throughput degradation of
> this process.
>
> oprofile report on this process in the guest:
> pre live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples % app name symbol name
> 248 12.3016 no-vmlinux (no symbols)
> 78 3.8690 libc.so.6 memset
> 68 3.3730 libc.so.6 memcpy
> 30 1.4881 cscf.scu SipMmBufMemAlloc
> 29 1.4385 libpthread.so.0 pthread_mutex_lock
> 26 1.2897 cscf.scu SipApiGetNextIe
> 25 1.2401 cscf.scu DBFI_DATA_Search
> 20 0.9921 libpthread.so.0 __pthread_mutex_unlock_usercnt
> 16 0.7937 cscf.scu DLM_FreeSlice
> 16 0.7937 cscf.scu receivemessage
> 15 0.7440 cscf.scu SipSmCopyString
> 14 0.6944 cscf.scu DLM_AllocSlice
>
> post live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples % app name symbol name
> 1586 42.2370 libc.so.6 memcpy
> 271 7.2170 no-vmlinux (no symbols)
> 83 2.2104 libc.so.6 memset
> 41 1.0919 libpthread.so.0 __pthread_mutex_unlock_usercnt
> 35 0.9321 cscf.scu SipMmBufMemAlloc
> 29 0.7723 cscf.scu DLM_AllocSlice
> 28 0.7457 libpthread.so.0 pthread_mutex_lock
> 23 0.6125 cscf.scu SipApiGetNextIe
> 17 0.4527 cscf.scu SipSmCopyString
> 16 0.4261 cscf.scu receivemessage
> 15 0.3995 cscf.scu SipcMsgStatHandle
> 14 0.3728 cscf.scu Urilex
> 12 0.3196 cscf.scu DBFI_DATA_Search
> 12 0.3196 cscf.scu SipDsmGetHdrBitValInner
> 12 0.3196 cscf.scu SipSmGetDataFromRefString
>
> So, memcpy costs many more CPU cycles after live migration. When I restarted
> the process, the problem disappeared. save-restore shows the same problem.
>
> perf counter stats for the vcpu thread in the host,
> pre live migration:
> Performance counter stats for thread id '21082':
>
>             0 page-faults
>             0 minor-faults
>             0 major-faults
>         31616 cs
>           506 migrations
>             0 alignment-faults
>             0 emulation-faults
>    5075957539 L1-dcache-loads                                            [21.32%]
>     324685106 L1-dcache-load-misses      #  6.40% of all L1-dcache hits  [21.85%]
>    3681777120 L1-dcache-stores                                           [21.65%]
>      65251823 L1-dcache-store-misses     #  1.77%                        [22.78%]
>             0 L1-dcache-prefetches                                       [22.84%]
>             0 L1-dcache-prefetch-misses                                  [22.32%]
>    9321652613 L1-icache-loads                                            [22.60%]
>    1353418869 L1-icache-load-misses      # 14.52% of all L1-icache hits  [21.92%]
>     169126969 LLC-loads                                                  [21.87%]
>      12583605 LLC-load-misses            #  7.44% of all LL-cache hits   [ 5.84%]
>     132853447 LLC-stores                                                 [ 6.61%]
>      10601171 LLC-store-misses           #  7.9%                         [ 5.01%]
>      25309497 LLC-prefetches             # 30%                           [ 4.96%]
>       7723198 LLC-prefetch-misses                                        [ 6.04%]
>    4954075817 dTLB-loads                                                 [11.56%]
>      26753106 dTLB-load-misses           #  0.54% of all dTLB cache hits [16.80%]
>    3553702874 dTLB-stores                                                [22.37%]
>       4720313 dTLB-store-misses          #  0.13%                        [21.46%]
> <not counted> dTLB-prefetches
> <not counted> dTLB-prefetch-misses
>
> 60.000920666 seconds time elapsed
>
> post live migration:
> Performance counter stats for thread id '1579':
>
>             0 page-faults                                                [100.00%]
>             0 minor-faults                                               [100.00%]
>             0 major-faults                                               [100.00%]
>         34979 cs                                                         [100.00%]
>           441 migrations                                                 [100.00%]
>             0 alignment-faults                                           [100.00%]
>             0 emulation-faults
>    6903585501 L1-dcache-loads                                            [22.06%]
>     525939560 L1-dcache-load-misses      #  7.62% of all L1-dcache hits  [21.97%]
>    5042552685 L1-dcache-stores                                           [22.20%]
>      94493742 L1-dcache-store-misses     #  1.8%                         [22.06%]
>             0 L1-dcache-prefetches                                       [22.39%]
>             0 L1-dcache-prefetch-misses                                  [22.47%]
>   13022953030 L1-icache-loads                                            [22.25%]
>    1957161101 L1-icache-load-misses      # 15.03% of all L1-icache hits  [22.47%]
>     348479792 LLC-loads                                                  [22.27%]
>      80662778 LLC-load-misses            # 23.15% of all LL-cache hits   [ 5.64%]
>     198745620 LLC-stores                                                 [ 5.63%]
>      14236497 LLC-store-misses           #  7.1%                         [ 5.41%]
>      20757435 LLC-prefetches                                             [ 5.42%]
>       5361819 LLC-prefetch-misses        # 25%                           [ 5.69%]
>    7235715124 dTLB-loads                                                 [11.26%]
>      49895163 dTLB-load-misses           #  0.69% of all dTLB cache hits [16.96%]
>    5168276218 dTLB-stores                                                [22.44%]
>       6765983 dTLB-store-misses          #  0.13%                        [22.24%]
> <not counted> dTLB-prefetches
> <not counted> dTLB-prefetch-misses
>
> The "LLC-load-misses" went up by about 16%. Then, I restarted the process in
> guest, the perf data back to normal,
> Performance counter stats for thread id '1579':
>
>             0 page-faults                                                [100.00%]
>             0 minor-faults                                               [100.00%]
>             0 major-faults                                               [100.00%]
>         30594 cs                                                         [100.00%]
>           327 migrations                                                 [100.00%]
>             0 alignment-faults                                           [100.00%]
>             0 emulation-faults
>    7707091948 L1-dcache-loads                                            [22.10%]
>     559829176 L1-dcache-load-misses      #  7.26% of all L1-dcache hits  [22.28%]
>    5976654983 L1-dcache-stores                                           [23.22%]
>     160436114 L1-dcache-store-misses                                     [22.80%]
>             0 L1-dcache-prefetches                                       [22.51%]
>             0 L1-dcache-prefetch-misses                                  [22.53%]
>   13798415672 L1-icache-loads                                            [22.28%]
>    2017724676 L1-icache-load-misses      # 14.62% of all L1-icache hits  [22.49%]
>     254598008 LLC-loads                                                  [22.86%]
>      16035378 LLC-load-misses            #  6.30% of all LL-cache hits   [ 5.36%]
>     307019606 LLC-stores                                                 [ 5.60%]
>      13665033 LLC-store-misses                                           [ 5.43%]
>      17715554 LLC-prefetches                                             [ 5.57%]
>       4187006 LLC-prefetch-misses                                        [ 5.44%]
>    7811502895 dTLB-loads                                                 [10.72%]
>      40547330 dTLB-load-misses           #  0.52% of all dTLB cache hits [16.31%]
>    6144202516 dTLB-stores                                                [21.58%]
>       6313363 dTLB-store-misses                                          [21.91%]
> <not counted> dTLB-prefetches
> <not counted> dTLB-prefetch-misses
>
> 60.000812523 seconds time elapsed
>
> If EPT is disabled, this problem is gone.
>
> I suspect that the kvm hypervisor is involved in this problem.
> Based on that suspicion, I want to find two adjacent versions of kvm-kmod,
> one that triggers this problem and one that does not (e.g. 2.6.39, 3.0-rc1),
> then either analyze the differences between these two versions or bisect the
> patches between them to find the key patches.
>
> Any better ideas?
>
> Thanks,
> Zhang Haoyu