RE: [PATCH 0/1] Solve zero page causing multiple page faults
From: Liu, Yuan1
Subject: RE: [PATCH 0/1] Solve zero page causing multiple page faults
Date: Tue, 2 Apr 2024 07:43:21 +0000
> -----Original Message-----
> From: Liu, Yuan1 <yuan1.liu@intel.com>
> Sent: Monday, April 1, 2024 11:41 PM
> To: peterx@redhat.com; farosas@suse.de
> Cc: qemu-devel@nongnu.org; hao.xiang@bytedance.com;
> bryan.zhang@bytedance.com; Liu, Yuan1 <yuan1.liu@intel.com>; Zou, Nanhai
> <nanhai.zou@intel.com>
> Subject: [PATCH 0/1] Solve zero page causing multiple page faults
>
> 1. Description of multiple page faults for received zero pages
>    a. The -mem-prealloc feature and hugepage backend are not enabled on
>       the destination.
>    b. After receiving the zero pages, the destination first checks
>       whether the current page content is 0 via buffer_is_zero(); this
>       read access may itself cause a read page fault (a standalone
>       sketch of this follows below).
>
>    perf record -e page-faults information below
>    13.75%  13.75%  multifdrecv_0  qemu-system-x86_64  [.] buffer_zero_avx512
>    11.85%  11.85%  multifdrecv_1  qemu-system-x86_64  [.] buffer_zero_avx512
>            multifd_recv_thread
>            nocomp_recv
>            multifd_recv_zero_page_process
>            buffer_is_zero
>            select_accel_fn
>            buffer_zero_avx512
>
>    c. Other page faults mainly come from write operations to normal and
>       zero pages.
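
A standalone sketch (plain C, not taken from the QEMU sources) of the read
fault described in 1.b: scanning a never-touched anonymous mapping with a
buffer_is_zero()-style loop makes the kernel populate those pages, which
shows up as minor page faults even though nothing was written.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

static bool is_zero(const volatile unsigned char *buf, size_t len)
{
    /* Stand-in for buffer_is_zero(): it has to read every byte. */
    for (size_t i = 0; i < len; i++) {
        if (buf[i]) {
            return false;
        }
    }
    return true;
}

int main(void)
{
    const size_t len = 64 * 1024 * 1024;    /* stand-in for guest RAM */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct rusage before, after;

    if (p == MAP_FAILED) {
        return 1;
    }

    getrusage(RUSAGE_SELF, &before);
    is_zero(p, len);                        /* read-only scan */
    getrusage(RUSAGE_SELF, &after);

    printf("minor faults caused by the scan alone: %ld\n",
           after.ru_minflt - before.ru_minflt);
    munmap(p, len);
    return 0;
}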
>
> 2. Solution
>    a. During the multifd migration process, the received pages are
>       tracked through the RAMBlock's receivedmap.
>
>    b. If a received zero page is not yet set in receivedmap, the
>       destination does not check whether the page content is 0, thus
>       avoiding a read fault.
>
>    c. If the zero page has already been set in receivedmap, the page is
>       filled with 0 directly (a rough sketch of this check follows
>       below).
>
>    There are two reasons for this:
>    1. A page that has already been sent once or more is unlikely to
>       still be a zero page, so scanning it first rarely pays off.
>    2. The first time the destination receives a zero page, the backing
>       anonymous memory has never been touched and is therefore already
>       zero, so there is no need to scan it in the first round.
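
A minimal sketch of the receive-side decision described above (plain C,
not the actual patch code; the helper name and signature are
placeholders):

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the zero-page receive path: if this guest page has never
 * been received before, the anonymous RAM mapping behind it is still
 * zero-initialized, so neither scan it (no read fault) nor write it
 * (no write fault).  Otherwise its content is unknown, so clear it
 * explicitly without scanning it first.
 */
static void recv_zero_page(unsigned char *host_page, size_t page_size,
                           bool received_before)
{
    if (!received_before) {
        /* First reception: untouched anonymous memory is already 0.
         * The real code would only mark the page in receivedmap here. */
        return;
    }

    /* Page was received before; its current content may be non-zero. */
    memset(host_page, 0, page_size);
}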
>
> 3. Test Results: VM with 16 vCPUs and 64G memory, 2 multifd channels,
>    and 100G network bandwidth
>
> 3.1 Test case: 16 vCPUs are idle and only 2G of memory is used
> +-----------+--------+--------+----------+
> |MultiFD    | total  |downtime|  Page    |
> |Nocomp     |  time  |        |  Faults  |
> |           |  (ms)  |  (ms)  |          |
> +-----------+--------+--------+----------+
> |with       |        |        |          |
> |recvbitmap |    7335|     180|      2716|
> +-----------+--------+--------+----------+
> |without    |        |        |          |
> |recvbitmap |    7771|     153|    121357|
> +-----------+--------+--------+----------+
>
> +-----------+--------+--------+--------+-------+--------+-------------+
> |MultiFD    | total  |downtime| SVM    |SVM    | IOTLB  | IO PageFault|
> |QPL        |  time  |        | IO TLB |IO Page| MaxTime| MaxTime     |
> |           |  (ms)  |  (ms)  | Flush  |Faults | (us)   |  (us)       |
> +-----------+--------+--------+--------+-------+--------+-------------+
> |with       |        |        |        |       |        |             |
> |recvbitmap |   10224|     175|     410|  27429|       1|          447|
> +-----------+--------+--------+--------+-------+--------+-------------+
> |without    |        |        |        |       |        |             |
> |recvbitmap |   11253|     153|   80756|  38655|      25|        18349|
> +-----------+--------+--------+--------+-------+--------+-------------+
>
>
> 3.2 Test case: 16 vCPUs are idle and 56G of memory (not zero) is used
> +-----------+--------+--------+----------+
> |MultiFD    | total  |downtime|  Page    |
> |Nocomp     |  time  |        |  Faults  |
> |           |  (ms)  |  (ms)  |          |
> +-----------+--------+--------+----------+
> |with       |        |        |          |
> |recvbitmap |   16825|     165|     52967|
> +-----------+--------+--------+----------+
> |without    |        |        |          |
> |recvbitmap |   12987|     159|   2672677|
> +-----------+--------+--------+----------+
>
> +-----------+--------+--------+--------+-------+--------+-------------+
> |MultiFD    | total  |downtime| SVM    |SVM    | IOTLB  | IO PageFault|
> |QPL        |  time  |        | IO TLB |IO Page| MaxTime| MaxTime     |
> |           |  (ms)  |  (ms)  | Flush  |Faults | (us)   |  (us)       |
> +-----------+--------+--------+--------+-------+--------+-------------+
> |with       |        |        |        |       |        |             |
> |recvbitmap |  132315|      77|     890| 937105|      60|         9581|
> +-----------+--------+--------+--------+-------+--------+-------------+
> |without    |        |        |        |       |        |             |
> |recvbitmap | >138333|     N/A| 1647701| 981899|      43|        21018|
> +-----------+--------+--------+--------+-------+--------+-------------+
>
>
> From the test results, both page faults and IOTLB flush operations are
> significantly reduced. The reason is that zero page processing no longer
> triggers read faults, and a large number of zero pages do not even
> trigger write faults (Test 3.1), because after the destination starts,
> the content of never-accessed pages is already 0.
>
> I have a concern here: the RAM memory is allocated by mmap with the
> anonymous flag, and if the first received zero page is not set to 0
> explicitly, does this guarantee that the memory backing the received
> zero pages reads as 0?
I got the answer here, from the mmap(2) man page:

  MAP_ANONYMOUS
    The mapping is not backed by any file; its contents are initialized to
    zero. The fd argument is ignored; however, some implementations require
    fd to be -1 if MAP_ANONYMOUS (or MAP_ANON) is specified, and portable
    applications should ensure this. The offset argument should be zero.
    The use of MAP_ANONYMOUS in conjunction with MAP_SHARED is supported on
    Linux only since kernel 2.4.
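
A quick self-contained check of that statement (ordinary C, independent of
QEMU): a fresh MAP_ANONYMOUS mapping reads back as all zeroes without any
explicit initialization, which is what makes skipping the write for the
first received zero page safe.

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

int main(void)
{
    const size_t page = 4096;
    unsigned char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    assert(p != MAP_FAILED);
    for (size_t i = 0; i < page; i++) {
        assert(p[i] == 0);      /* contents are zero-initialized */
    }
    munmap(p, page);
    return 0;
}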
> In this case, the performance impact on live migration is not big
> because the destination is not the bottleneck.
>
> When using QPL (an SVM-capable device), even though the IOTLB behavior
> is improved, the overall performance is still seriously degraded,
> because a large number of IO page faults are still generated.
>
> Previous discussion links:
> 1. https://lore.kernel.org/all/CAAYibXib+TWnJpV22E=adncdBmwXJRqgRjJXK7X71J=bDfaxDg@mail.gmail.com/
> 2. https://lore.kernel.org/all/PH7PR11MB594123F7EEFEBFCE219AF100A33A2@PH7PR11MB5941.namprd11.prod.outlook.com/
>
> Yuan Liu (1):
> migration/multifd: solve zero page causing multiple page faults
>
>  migration/multifd-zero-page.c | 4 +++-
>  migration/multifd-zlib.c      | 1 +
>  migration/multifd-zstd.c      | 1 +
>  migration/multifd.c           | 1 +
>  migration/ram.c               | 4 ++++
>  migration/ram.h               | 1 +
>  6 files changed, 11 insertions(+), 1 deletion(-)
>
> --
> 2.39.3