From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convegence of live migration
Date: Mon, 13 May 2013 13:33:27 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, May 10, 2013 at 10:08:05AM -0500, Anthony Liguori wrote:
> "Daniel P. Berrange" <address@hidden> writes:
>
> > On Fri, May 10, 2013 at 08:07:51AM -0500, Anthony Liguori wrote:
> >> Chegu Vinod <address@hidden> writes:
> >>
> >> > If a user chooses to turn on the auto-converge migration capability
> >> > these changes detect the lack of convergence and throttle down the
> >> > guest, i.e. force the VCPUs out of the guest for some duration
> >> > and let the migration thread catch up and help converge.
> >> >
> >> > Verified the convergence using the following :
> >> > - SpecJbb2005 workload running on a 20VCPU/256G guest(~80% busy)
> >> > - OLTP like workload running on a 80VCPU/512G guest (~80% busy)
> >> >
> >> > Sample results with SpecJbb2005 workload : (migrate speed set to 20Gb
> >> > and migrate downtime set to 4 seconds).
> >>
> >> Would it make sense to separate out the "slow the VCPU down" part of
> >> this?
> >>
> >> That would give a management tool more flexibility to create policies
> >> around slowing the VCPU down to encourage migration.
> >>
> >> In fact, I wonder if we need anything in the migration path if we just
> >> expose the "slow the VCPU down" bit as a feature.
> >>
> >> Slow the VCPU down is not quite the same as setting priority of the VCPU
> >> thread largely because of the QBL so I recognize the need to have
> >> something for this in QEMU.
> >
> > Rather than the priority, could you perhaps do the VCPU slow down
> > using cfs_quota_us + cfs_period_us settings though ? These let you
> > place hard caps on scheduler time afforded to vCPUs and we can already
> > control those via libvirt + cgroups.
>
> The problem with the bandwidth controller is the same as with priorities.
> You can end up causing lock holder pre-emption which would negatively
> impact migration performance.
>
> It's far better for QEMU to voluntarily give up some time knowing that
> it's not holding the QBL since then migration can continue without
> impact.
IMHO it'd be nice to get some clear benchmark numbers of just how big
the lock holder pre-emption problem is when using cgroup hard caps,
before we invent another mechanism for throttling the CPUs that has
to be plumbed into the whole stack.
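
For anyone wanting to set up such a benchmark, here is a rough sketch of the
hard-cap arithmetic behind cfs_quota_us + cfs_period_us. The file names are
those of the cgroup v1 cpu controller. The helper names (cfs_quota_us,
apply_cap), the 100ms default period and the cgroup directory passed in are
assumptions made for illustration, not anything libvirt or QEMU provides.

```python
from pathlib import Path

# Bounds enforced by the kernel's CFS bandwidth controller.
MIN_US = 1_000             # quota and period may not go below 1 ms
MAX_PERIOD_US = 1_000_000  # period may not exceed 1 s


def cfs_quota_us(cpu_fraction: float, period_us: int = 100_000) -> int:
    """Return the quota (in us) that caps a group at cpu_fraction of one CPU.

    cpu_fraction may exceed 1.0 for a group that spans several CPUs.
    """
    if cpu_fraction <= 0:
        raise ValueError("cpu_fraction must be positive")
    if not MIN_US <= period_us <= MAX_PERIOD_US:
        raise ValueError("period_us must be between 1 ms and 1 s")
    # Clamp to the kernel minimum so a very small cap is still accepted.
    return max(MIN_US, int(cpu_fraction * period_us))


def apply_cap(cgroup_dir: str, cpu_fraction: float,
              period_us: int = 100_000) -> int:
    """Write the period and quota files under cgroup_dir (hypothetical path)."""
    directory = Path(cgroup_dir)
    quota = cfs_quota_us(cpu_fraction, period_us)
    (directory / "cpu.cfs_period_us").write_text(f"{period_us}\n")
    (directory / "cpu.cfs_quota_us").write_text(f"{quota}\n")
    return quota
```

Capping a vCPU thread group at 50% with the default period gives a quota of
50000us per 100000us period. Note this only caps scheduler time; it knows
nothing about whether the capped thread holds a lock, which is exactly the
effect the benchmark would need to measure.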
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|