Re: [PATCH v0 0/4] backends/hostmem: add an ability to specify prealloc

qemu-devel

From:	David Hildenbrand
Subject:	Re: [PATCH v0 0/4] backends/hostmem: add an ability to specify prealloc timeout
Date:	Mon, 23 Jan 2023 14:56:39 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0

On 23.01.23 14:30, Daniil Tatianin wrote:

On 1/23/23 11:57 AM, David Hildenbrand wrote:

On 20.01.23 14:47, Daniil Tatianin wrote:

This series introduces a new qemu_prealloc_mem_with_timeout() API,
which allows limiting the maximum amount of time to be spent on memory
preallocation. It also adds prealloc statistics collection that is
exposed via an optional timeout handler.

This new API is then utilized by hostmem for guest RAM preallocation
controlled via new object properties called 'prealloc-timeout' and
'prealloc-timeout-fatal'.

This is useful for limiting VM startup time on systems with
unpredictable page allocation delays due to memory fragmentation or the
backing storage. The timeout can be configured to either simply emit a
warning and continue VM startup without having preallocated the entire
guest RAM or just abort startup entirely if that is not acceptable for
a specific use case.


The major use case for preallocation is memory resources that cannot be
overcommitted (hugetlb, file blocks, ...), to avoid running out of such
resources later, while the guest is already running, and crashing it.


Wouldn't you say that preallocating memory for the sake of speeding up
guest kernel startup & runtime is a valid use case of prealloc? This way
we can avoid expensive (for a multitude of reasons) page faults that
will otherwise slow down the guest significantly at runtime and affect
the user experience.

With "ordinary" memory (anon/shmem/file), there is no such guarantee
unless you effectively prevent swapping/writeback or run in an extremely
controlled environment. With anon memory, you further have to disable
KSM, because that could immediately de-duplicate the zeroed pages again.

For this reason, I am not aware of preallocation getting used for the
use case you mentioned. Performance-sensitive workloads want
determinism, and consequently usually use hugetlb + preallocation. Or
mlockall() to effectively allocate all memory and lock it before
starting the VM.

Regarding page faults: with THP, the guest will touch a 2 MiB range
once, and you'll get a 2 MiB page populated, requiring no further write
faults, which should already heavily reduce page faults when booting a
guest.
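
To put a rough number on that reduction, here is a back-of-the-envelope
sketch. It assumes a 4 GiB guest (an arbitrary size picked for
illustration) whose memory is touched exactly once, and that every 2 MiB
range really ends up backed by a huge page, which THP does not guarantee.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def write_faults(touched_bytes: int, page_size: int) -> int:
    """Upper bound on first-touch faults: one per page, rounded up."""
    return -(-touched_bytes // page_size)


guest_ram = 4 * GIB  # assumed guest size, for illustration only

small = write_faults(guest_ram, 4 * KIB)  # base pages
huge = write_faults(guest_ram, 2 * MIB)   # best case with THP

print(small, huge, small // huge)  # prints "1048576 2048 512"
```

So in the best case THP alone cuts the number of first-touch faults by
a factor of 512, without any preallocation.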

Preallocating all guest memory to make a guest kernel boot up faster
sounds a bit weird to me. Preallocating "some random part of guest
memory" sounds weird, too: what if the guest uses exactly the
memory locations you didn't preallocate?

I'd suggest doing some measurements to see whether there are actually
cases where "randomly preallocating some memory pages" is beneficial
when considering the overall startup time (setting up VM + starting the
OS).

Allocating only a fraction "because it takes too long" looks quite
useless in that (main use-case) context. We shouldn't encourage QEMU
users to play with fire in such a way. IOW, there should be no way
around "prealloc-timeout-fatal". Either preallocation succeeded and the
guest can run, or it failed, and the guest can't run.


Here we basically accept the fact that, e.g., with fragmented memory the
kernel might take a while in the page fault handler, especially for
hugetlb, because of page compaction that has to run for every fault.

This way we can prefault at least some number of pages and let the guest
fault the rest on demand later on during runtime even if it's slow and
would cause a noticeable lag.

Sorry, I don't really see the value of this "preallocating a random
portion of guest memory".

In practice, Linux guests will only touch all memory once that memory is
required (e.g., allocated), not by default during bootup.

What you could do is start the VM from a shmem/hugetlb/... file, and
concurrently start preallocating all memory from a second process. The
guest can boot up immediately and eventually you'll have all guest
memory allocated. It won't work with anon memory (memory-backend-ram)
and private mappings (shared=false), of course.
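
A minimal sketch of what such a second process could look like, assuming
the guest RAM lives in a file that QEMU maps shared (for example a
memory-backend-file with share=on) and that the file already has its
final size. The helper name preallocate_file() and the chunk size are
made up for illustration; it relies on fallocate being supported by the
backing filesystem and is not taken from the series.

```python
import os


def preallocate_file(path: str, chunk: int = 64 * 1024 * 1024) -> int:
    """Allocate backing storage for every byte of an existing file.

    Works in chunks so that a caller could report progress or stop
    early. Returns the number of bytes covered; the file size itself
    is left unchanged.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.fstat(fd).st_size
        done = 0
        while done < size:
            length = min(chunk, size - done)
            # Reserves storage for [done, done + length) without
            # touching any data already in that range.
            os.posix_fallocate(fd, done, length)
            done += length
        return done
    finally:
        os.close(fd)
```

The helper would be pointed at the same path the VM was started from
and run next to it, so the guest boots immediately while the remaining
memory gets allocated in the background. For hugetlb the chunk size
would presumably have to be a multiple of the huge page size.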

... but then, management tools can simply start QEMU with "-S", start
their own timer, and zap QEMU if it didn't manage to come up in time,
and simply start a new QEMU instance without preallocation enabled.

The "good" thing about that approach is that it will also cover any
implicit memory preallocation, like using mlock() or VFIO, that doesn't
run in ordinary per-hostmem preallocation context. If setting QEMU up
takes too long, you might want to try on a different hypervisor in your
cluster instead.
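
The management-side timer could be sketched roughly as follows. The
function name start_with_deadline(), the is_ready callback and the
polling interval are hypothetical; how a management tool actually
detects that QEMU has come up (for example, the QMP socket answering)
is left to the caller.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def start_with_deadline(
    argv: Sequence[str],
    is_ready: Callable[[], bool],
    timeout_s: float,
    poll_s: float = 0.1,
) -> Optional[subprocess.Popen]:
    """Start a process and give it timeout_s seconds to become ready.

    Returns the running process once is_ready() reports success.
    Returns None if the process exits on its own or misses the
    deadline; in the latter case it is killed first.
    """
    proc = subprocess.Popen(list(argv))
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return None  # the process exited by itself
        if is_ready():
            return proc
        time.sleep(poll_s)
    proc.kill()  # did not come up in time: zap it
    proc.wait()
    return None
```

A caller would first try the command line with preallocation enabled
and, on None, either call the function again with a command line
without preallocation or schedule the VM on another host.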


This approach definitely works too but again it assumes that we always
want 'prealloc-timeout-fatal' to be on, which is, for the most part, only
the case for working around issues that might be caused by overcommit.


Can you elaborate? Thanks.

--
Thanks,

David / dhildenb
