
Re: [Qemu-devel] [PATCH 07/15] qapi: use mmap for QmpInputVisitor


From: Peter Lieven
Subject: Re: [Qemu-devel] [PATCH 07/15] qapi: use mmap for QmpInputVisitor
Date: Mon, 4 Jul 2016 13:36:54 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

On 04.07.2016 at 13:18, Markus Armbruster wrote:
Paolo Bonzini <address@hidden> writes:

On 30/06/2016 16:12, Markus Armbruster wrote:
Implementing a stack as a "big enough" array can be wasteful.
Implementing it as a dynamically allocated list is differently wasteful.
Saving several mallocs and frees can be worth "wasting" a few pages of
memory for a short time.
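
For illustration, a rough sketch of the two approaches being weighed; the
types and sizes here are made up and simplified, not the actual
QmpInputVisitor code:

/* Variant 1: a "big enough" inline array.  One allocation covers
 * everything, but the whole struct -- including all the unused slots --
 * is zeroed up front, even when the stack is never pushed to. */
typedef struct FixedStackVisitor {
    void *stack[1024];        /* QObject * in the real code; worst-case depth */
    int top;
} FixedStackVisitor;

/* Variant 2: a dynamically allocated list.  It only pays for the depth
 * it actually uses, but every push/pop is a malloc/free, which costs
 * time and can fragment the heap. */
typedef struct StackNode {
    void *value;              /* QObject * in the real code */
    struct StackNode *next;   /* one heap allocation per nesting level */
} StackNode;

typedef struct ListStackVisitor {
    StackNode *top;
} ListStackVisitor;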
Most usage of QmpInputVisitor at startup comes from
object_property_set_qobject, which only sets small scalar objects.  The
stack is entirely unused in this case.
A quick test run shows ~300 qmp_input_visitor_new() calls during
startup, with at most two alive at the same time.

Why would it matter whether these are on the order of 150 bytes or 25000
bytes each?  How could this materially impact RSS?

There's one type of waste here that I understand: we zero the whole
QmpInputVisitor on allocation.

I'm not opposed to changing how the stack is implemented; I just want to
first understand why the current implementation behaves badly (assuming
it does).

The history behind this is that I observed that the RSS usage of QEMU
increased dramatically between QEMU 2.2.0 and 2.5.0. I could see this very
clearly because we use hugetlbfs everywhere, so I can cleanly distinguish
QEMU memory from VM memory. After bisecting one increase in RSS usage to
the introduction of RCU, the theory came up that memory gets fragmented
because allocation and deallocation patterns have changed. So I started to
trace all malloc calls above 4 kB and began to use mmap wherever it was
possible.
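
One rough way to do that kind of tracing is an LD_PRELOAD shim that logs
every large allocation. A minimal sketch, assuming glibc on Linux; the file
and output format are just illustrative, not necessarily the exact tooling
used here:

/* malloc_trace.c -- illustrative LD_PRELOAD shim that logs every
 * allocation of 4 kB or more (sketch only; a robust shim also has to
 * cope with dlsym() itself allocating during the first lookup).
 *
 * Build: gcc -shared -fPIC -o malloc_trace.so malloc_trace.c -ldl
 * Run:   LD_PRELOAD=./malloc_trace.so qemu-system-x86_64 ...
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <unistd.h>

static void *(*real_malloc)(size_t);

void *malloc(size_t size)
{
    if (!real_malloc) {
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    }
    void *p = real_malloc(size);
    if (size >= 4096) {
        /* format into a local buffer and write(2) directly, so the
         * logging itself does not re-enter malloc via stdio */
        char buf[64];
        int n = snprintf(buf, sizeof(buf), "malloc(%zu) = %p\n", size, p);
        write(STDERR_FILENO, buf, n);
    }
    return p;
}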

To give you an idea of the difference I observed, here is an example.
I have a blade with 22 vServers running on it. Including the OS, the
allocated memory with current master is approx. 6.5 GB. With current master
and the following environment variable set:

MALLOC_MMAP_THRESHOLD_=32768

the allocated memory stays at approx. 2 GB.
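
For reference, the same threshold can also be set programmatically from
within a process via glibc's mallopt(); a minimal sketch of the equivalent
of the environment variable above:

#include <malloc.h>   /* glibc: mallopt(), M_MMAP_THRESHOLD */

int main(void)
{
    /* Serve allocations of 32 kB and above via mmap() instead of the
     * brk heap, so freeing them returns the pages to the kernel at once
     * instead of leaving holes in the arena.  Setting the threshold
     * explicitly also disables glibc's dynamic adjustment of it. */
    mallopt(M_MMAP_THRESHOLD, 32768);

    /* ... rest of the program ... */
    return 0;
}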

Peter




