From: | Peter Lieven |
Subject: | Re: [Qemu-devel] [PATCH 07/15] qapi: use mmap for QmpInputVisitor |
Date: | Mon, 4 Jul 2016 13:36:54 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0 |
On 04.07.2016 at 13:18, Markus Armbruster wrote:
> Paolo Bonzini <address@hidden> writes:
>
>> On 30/06/2016 16:12, Markus Armbruster wrote:
>>> Implementing a stack as "big enough" array can be wasteful.
>>> Implementing it as dynamically allocated list is differently
>>> wasteful. Saving several mallocs and frees can be worth "wasting"
>>> a few pages of memory for a short time.
>>
>> Most usage of QmpInputVisitor at startup comes from
>> object_property_set_qobject, which only sets small scalar objects.
>> The stack is entirely unused in this case.
>
> A quick test run shows ~300 qmp_input_visitor_new() calls during
> startup, with at most two alive at the same time. Why would it matter
> whether these are in the order of 150 bytes or 25000 bytes each? How
> could this materially impact RSS?
>
> There's one type of waste here that I understand: we zero the whole
> QmpInputVisitor on allocation.
>
> I'm not opposed to changing how the stack is implemented, I just want
> to first understand why the current implementation behaves badly
> (assuming it does).
The history behind this is that I observed a dramatic increase in the RSS usage of QEMU between QEMU 2.2.0 and 2.5.0. I could see it very clearly because we use hugetlbfs everywhere, so I can clearly distinguish QEMU memory from VM memory.

After I had bisected one increase in RSS usage to the introduction of RCU, the theory came up that the memory gets fragmented because the alloc and dealloc patterns have changed. So I started to trace all malloc calls above 4 kB and to use mmap wherever it was possible.

To give you an idea of the difference I observed, here is an example. I have a blade with 22 vServers running on it. Including the OS, the allocated memory with current master is approx. 6.5 GB. With current master and the following environment variable set:

MALLOC_MMAP_THRESHOLD_=32768

the allocated memory stays at approx. 2 GB.

Peter
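For reference, the same threshold can also be set from inside a process instead of through the environment. The following is only a minimal sketch, assuming glibc: mallopt() and M_MMAP_THRESHOLD are glibc interfaces, while the helper names use_mmap_above() and demo_large_allocation() are made up for illustration and are not part of QEMU. Note that setting the threshold explicitly, by either route, turns off glibc's dynamic adjustment of it.

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: mallopt(), M_MMAP_THRESHOLD */
#include <stdlib.h>

/*
 * Hypothetical helper: ask glibc malloc to serve requests of at least
 * `bytes` via mmap() instead of the heap. This mirrors what
 * MALLOC_MMAP_THRESHOLD_=<bytes> does from the environment.
 * Returns 1 on success, 0 on error (mallopt()'s own convention).
 */
static int use_mmap_above(size_t bytes)
{
    return mallopt(M_MMAP_THRESHOLD, (int)bytes);
}

/*
 * Hypothetical demo: lower the threshold to 32 kB, then make a 64 kB
 * allocation. With the threshold in place, glibc backs this block with
 * its own mapping and returns it to the kernel on free(), so it cannot
 * fragment the main heap. Returns 0 on success, -1 on failure.
 */
static int demo_large_allocation(void)
{
    if (!use_mmap_above(32 * 1024)) {
        return -1;
    }

    void *p = malloc(64 * 1024);
    if (p == NULL) {
        return -1;
    }
    assert(p != NULL);

    free(p);
    return 0;
}
```

Calling use_mmap_above(32768) early in main() should behave like the environment setting above, but it has to happen before the allocations one cares about.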