
From: Michael S. Tsirkin
Subject: Re: [Qemu-ppc] [PATCH 1/5] virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate
Date: Tue, 5 Mar 2019 19:14:09 -0500

On Wed, Mar 06, 2019 at 10:35:12AM +1100, David Gibson wrote:
> On Tue, Mar 05, 2019 at 09:41:34AM -0500, Michael S. Tsirkin wrote:
> > On Tue, Mar 05, 2019 at 04:03:00PM +1100, David Gibson wrote:
> > > On Mon, Mar 04, 2019 at 09:29:24PM -0500, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 05, 2019 at 11:52:08AM +1100, David Gibson wrote:
> > > > > On Thu, Feb 28, 2019 at 08:36:58AM -0500, Michael S. Tsirkin wrote:
> > > > > > On Thu, Feb 14, 2019 at 03:39:12PM +1100, David Gibson wrote:
> > > > > > > When the balloon is inflated, we discard memory placed in it using
> > > > > > > madvise() with MADV_DONTNEED.  And when we deflate it we use
> > > > > > > MADV_WILLNEED, which sounds like it makes sense but is actually
> > > > > > > unnecessary.
> > > > > > > 
> > > > > > > The misleadingly named MADV_DONTNEED just discards the memory in
> > > > > > > question, it doesn't set any persistent state on it in-kernel; all
> > > > > > > that's necessary to bring the memory back is to touch it.
> > > > > > > MADV_WILLNEED in contrast specifically says that the memory will be
> > > > > > > used soon and faults it in.
> > > > > > > 
> > > > > > > This patch simplifies the balloon operation by dropping the
> > > > > > > madvise() on deflate.  This might have an impact on performance -
> > > > > > > it will move a delay at deflate time until that memory is actually
> > > > > > > touched, which might be more latency sensitive.  However:
> > > > > > > 
> > > > > > >   * Memory that's being given back to the guest by deflating the
> > > > > > >     balloon *might* be used soon, but it equally could just sit
> > > > > > >     around in the guest's pools until needed (or even be faulted
> > > > > > >     out again if the host is under memory pressure).
> > > > > > > 
> > > > > > >   * Usually, the timescale over which you'll be adjusting the
> > > > > > >     balloon is long enough that a few extra faults after deflation
> > > > > > >     aren't going to make a difference.
> > > > > > > 
> > > > > > > Signed-off-by: David Gibson <address@hidden>
> > > > > > > Reviewed-by: David Hildenbrand <address@hidden>
> > > > > > > Reviewed-by: Michael S. Tsirkin <address@hidden>
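
For reference, a minimal standalone sketch (not QEMU code) of the semantics
described in the quoted commit message: MADV_DONTNEED discards anonymous pages
outright, and simply touching the range afterwards faults fresh zero-filled
pages back in, so a MADV_WILLNEED on deflate only prefaults what a touch would
bring back anyway.

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 64 * 4096;
        unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        memset(buf, 0xaa, len);             /* populate the pages */

        /* "Inflate": discard the pages, as the balloon does. */
        madvise(buf, len, MADV_DONTNEED);

        /* "Deflate": no madvise() is required; touching the range faults
         * fresh zero-filled pages back in.  MADV_WILLNEED would merely
         * prefault them slightly earlier. */
        printf("after MADV_DONTNEED: buf[0] = 0x%02x\n", buf[0]);  /* 0x00 */

        madvise(buf, len, MADV_WILLNEED);   /* the optional prefault the patch drops */

        munmap(buf, len);
        return 0;
    }

Built with plain gcc on Linux, the read after MADV_DONTNEED prints 0x00 rather
than 0xaa, showing that discarded pages come back on their own when touched.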
> > > > > > 
> > > > > > I'm having second thoughts about this. It might affect performance;
> > > > > > it probably won't, but we have no idea.  It might cause latency
> > > > > > jitter after deflate where it previously didn't happen.  This kind
> > > > > > of patch should really be accompanied by benchmarking results, not
> > > > > > philosophy.
> > > > > 
> > > > > I guess I see your point, much as it's annoying to spend time
> > > > > benchmarking a device that's basically broken by design.
> > > > 
> > > > Because of the 4K page thing?
> > > 
> > > For one thing.  I believe David H has a bunch of other reasons.
> > > 
> > > > It's an annoying bug for sure.  There were
> > > > patches to add a feature bit to just switch to a plain s/g format, but they
> > > > were abandoned. You are welcome to revive them though.
> > > > Additionally or alternatively, we can easily add a field specifying
> > > > page size.
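
For context on the "4K page thing": the balloon queues carry guest page frame
numbers at a fixed 4 KiB granularity regardless of the host or guest page
size, which is what a negotiated page-size field would address.  A rough
sketch of the relevant definitions follows; the field names num_pages and
actual follow the virtio-balloon config space, while the commented-out
page_size field is purely hypothetical, shown only to illustrate the kind of
extension mentioned above:

    #include <stdint.h>

    /* PFNs on the inflate/deflate queues are always in 4 KiB units. */
    #define VIRTIO_BALLOON_PFN_SHIFT 12

    struct virtio_balloon_config {
        uint32_t num_pages;       /* requested balloon size, in 4 KiB pages */
        uint32_t actual;          /* current balloon size, in 4 KiB pages */
        /* uint32_t page_size; */ /* hypothetical negotiated page size; not in the spec */
    };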
> > > 
> > > We could, but I'm pretty disinclined to work on this when virtio-mem
> > > is a better solution in nearly every way.
> > 
> > Then one way would be to just let balloon be. Make it behave the same as
> > always and don't make changes to it :)
> 
> I'd love to, but it is in real world use, so I think we do need to fix
> serious bugs in it - at least if they can be fixed on one side,
> without needing to roll out both qemu and guest changes (which adding
> page size negotiation would require).


Absolutely. I'm just saying don't add optimizations in that case :)

> > 
> > > > > That said... I don't really know how I'd go about benchmarking it.  Any
> > > > > guesses at a suitable workload which would be most likely to show a
> > > > > performance degradation here?
> > > > 
> > > > Here's one idea - I tried to come up with a worst-case scenario here.
> > > > Basically based on an idea by Alex Duyck. All credits are his, all bugs
> > > > are mine:
> > > 
> > > Ok.  I'll try to find time to implement this and test it.
> > > 
> > > > Setup:
> > > > Memory - 15837 MB
> > > > Guest memory size - 5 GB
> > > > Swap - disabled
> > > > Test program - a simple program which allocates 4 GB of memory via
> > > > malloc, touches it via memset and exits.
> > > > Use case - the number of guests that can be launched completely,
> > > > including the successful execution of the test program.
> > > >
> > > > Procedure:
> > > >
> > > > Initial setup: a first guest is launched and, once its console is up,
> > > > the test allocation program is executed with a 4 GB memory request
> > > > (due to this the guest occupies almost 4-5 GB of memory in the host).
> > > > Afterwards the balloon is inflated by 4 GB in the guest.
> > > > We continue launching guests until a guest gets killed due to a low
> > > > memory condition in the host.
> > > > 
> > > > 
> > > > Now, repeatedly, in each guest in turn, the balloon is deflated and the
> > > > test allocation program is executed with a 4 GB memory request (due to
> > > > this the guest occupies almost 4-5 GB of memory in the host).
> > > > After the program finishes, the balloon is inflated by 4 GB again.
> > > > 
> > > > Then we switch to another guest.
> > > > 
> > > > Time how many cycles of this we can do.
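
The "test allocation program" referred to above is described only in outline
(allocate 4 GB via malloc, touch it via memset, exit), so the following is a
minimal sketch under that assumption rather than the exact program used:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        size_t len = 4ULL * 1024 * 1024 * 1024;   /* 4 GB */
        char *buf = malloc(len);
        if (!buf) {
            perror("malloc");
            return 1;
        }
        memset(buf, 0x5a, len);   /* touch every page so memory is really faulted in */
        free(buf);
        return 0;
    }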
> > > > 
> > > > 
> > > > Hope this helps.
> > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
> 
> -- 
> David Gibson                  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au        | minimalist, thank you.  NOT _the_ _other_
>                               | _way_ _around_!
> http://www.ozlabs.org/~dgibson




