From: | David Hildenbrand |
Subject: | Re: [PATCH v1 1/3] util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc() |
Date: | Tue, 20 Jul 2021 16:34:04 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 |
     memset_thread_failed = false;
     threads_created_flag = false;
     memset_num_threads = get_memset_num_threads(smp_cpus);
@@ -534,7 +558,7 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         memset_thread[i].numpages = numpages_per_thread + (i < leftover);
         memset_thread[i].hpagesize = hpagesize;
         qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
-                           do_touch_pages, &memset_thread[i],
+                           touch_fn, &memset_thread[i],
                            QEMU_THREAD_JOINABLE);
         addr += memset_thread[i].numpages * hpagesize;
     }

Do you have an indication of what the speed differential is for the old read/write dance vs the kernel madvise? We needed to use threads previously because the read/write dance is pretty terribly slow.

The kernel patch has some performance numbers: https://lkml.kernel.org/r/20210712083917.16361-1-david@redhat.com
For example (compressed),

**************************************************
4096 MiB MAP_PRIVATE:
**************************************************
Anon 4 KiB     : Read/Write     :  1054.041 ms
Anon 4 KiB     : POPULATE_WRITE :   572.582 ms
Memfd 4 KiB    : Read/Write     :  1106.561 ms
Memfd 4 KiB    : POPULATE_WRITE :   805.881 ms
Memfd 2 MiB    : Read/Write     :   357.606 ms
Memfd 2 MiB    : POPULATE_WRITE :   356.937 ms
tmpfs          : Read/Write     :  1105.954 ms
tmpfs          : POPULATE_WRITE :   822.826 ms
file           : Read/Write     :  1107.439 ms
file           : POPULATE_WRITE :   857.622 ms
hugetlbfs      : Read/Write     :   356.127 ms
hugetlbfs      : POPULATE_WRITE :   355.138 ms

**************************************************
4096 MiB MAP_SHARED:
**************************************************
Anon 4 KiB     : Read/Write     :  1060.350 ms
Anon 4 KiB     : POPULATE_WRITE :   782.885 ms
Anon 2 MiB     : Read/Write     :   357.992 ms
Anon 2 MiB     : POPULATE_WRITE :   357.808 ms
Memfd 4 KiB    : Read/Write     :  1100.391 ms
Memfd 4 KiB    : POPULATE_WRITE :   804.394 ms
Memfd 2 MiB    : Read/Write     :   358.250 ms
Memfd 2 MiB    : POPULATE_WRITE :   357.334 ms
tmpfs          : Read/Write     :  1107.567 ms
tmpfs          : POPULATE_WRITE :   810.094 ms
file           : Read/Write     :  1289.509 ms
file           : POPULATE_WRITE :  1106.816 ms
hugetlbfs      : Read/Write     :   357.120 ms
hugetlbfs      : POPULATE_WRITE :   356.693 ms

For huge pages, it barely makes a difference with smallish VMs. In the other cases it speeds things up, but not by enough to allow dropping multi-threading.

The original MADV_POPULATE from 2016 https://lore.kernel.org/patchwork/patch/389581/ mentions that it especially helps speed up multi-threaded pre-faulting, due to reduced mmap_lock contention. I did not do any multi-threading benchmarks, though.
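
To make the two approaches being compared concrete, here is a minimal sketch, not the code from the patch. The helper names touch_pages_rw() and prefault() are made up for illustration, and the fallback #define assumes the Linux uapi value of MADV_POPULATE_WRITE (23, available since Linux 5.14). It tries a single madvise() call over the whole range and falls back to touching every page when the kernel rejects it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23 /* Linux uapi value, for older libc headers */
#endif

/* Hypothetical helper: the "read/write dance", one access per page. */
static void touch_pages_rw(char *area, size_t pagesize, size_t numpages)
{
    assert(pagesize > 0);
    for (size_t i = 0; i < numpages; i++) {
        volatile char *p = area + i * pagesize;
        *p = *p; /* read the byte and write it back, leaving data intact */
    }
}

/*
 * Hypothetical helper: prefault [area, area + pagesize * numpages).
 * Returns true if the kernel populated the range via madvise(), false if
 * we had to fall back to touching the pages from user space.
 */
static bool prefault(char *area, size_t pagesize, size_t numpages)
{
    if (madvise(area, pagesize * numpages, MADV_POPULATE_WRITE) == 0) {
        return true;
    }
    /* Older kernels fail with EINVAL; some mappings fail for other reasons. */
    touch_pages_rw(area, pagesize, numpages);
    return false;
}
```

In a multi-threaded setup like touch_all_pages(), each thread would run something like prefault() on its own slice of the area; whether that scales better than the read/write variant is exactly what has not been benchmarked here.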
[...]
Initialized with random garbage from the stack

+
+    /*
+     * Sense on every invocation, as MADV_POPULATE_WRITE cannot be used for
+     * some special mappings, such as mapping /dev/mem.
+     */
+    if (madv_populate_write_possible(area, hpagesize)) {
+        use_madv_populate_write = true;
+    }

but this implicitly assumes it was initialized to false.

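
One possible way to avoid relying on the initial value, sketched under assumptions rather than taken from the follow-up patch: assign the result of the probe directly, so the flag is defined on both paths. The body of madv_populate_write_possible() below is a stand-in written for this example (the real helper is not shown in this mail), and sense_populate_write() is a hypothetical wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23 /* Linux uapi value, for older libc headers */
#endif

/*
 * Stand-in for the patch's helper (assumed behaviour): probe the first
 * page and treat EINVAL as "not usable for this mapping or kernel".
 * Note that a successful probe populates that first page as a side effect.
 */
static bool madv_populate_write_possible(char *area, size_t pagesize)
{
    return madvise(area, pagesize, MADV_POPULATE_WRITE) == 0 ||
           errno != EINVAL;
}

/* Hypothetical wrapper showing the initialization pattern. */
static bool sense_populate_write(char *area, size_t pagesize)
{
    /* Direct assignment: no path leaves the flag holding stack garbage. */
    bool use_madv_populate_write =
        madv_populate_write_possible(area, pagesize);

    return use_madv_populate_write;
}
```

Declaring the variable as "bool use_madv_populate_write = false;" and keeping the if block from the patch would be equivalent.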
Indeed, thanks for catching that!

-- 
Thanks,

David / dhildenb