From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convegence of live migration
Date: Mon, 13 May 2013 07:18:15 -0500
User-agent: Notmuch/0.15.2+77~g661dcf8 (http://notmuchmail.org) Emacs/23.3.1 (x86_64-pc-linux-gnu)

Paolo Bonzini <address@hidden> writes:
> Il 10/05/2013 17:11, Anthony Liguori ha scritto:
>> Chegu Vinod <address@hidden> writes:
>>
>>> On 5/10/2013 6:07 AM, Anthony Liguori wrote:
>>>> Chegu Vinod <address@hidden> writes:
>>>>
>>>>> If a user chooses to turn on the auto-converge migration capability,
>>>>> these changes detect the lack of convergence and throttle down the
>>>>> guest, i.e. force the VCPUs out of the guest for some duration
>>>>> so that the migration thread can catch up and converge.
>>>>>
>>>>> Verified the convergence using the following:
>>>>> - SpecJbb2005 workload running on a 20VCPU/256G guest (~80% busy)
>>>>> - OLTP-like workload running on an 80VCPU/512G guest (~80% busy)
>>>>>
>>>>> Sample results with SpecJbb2005 workload (migrate speed set to 20Gb
>>>>> and migrate downtime set to 4 seconds):
>>>> Would it make sense to separate out the "slow the VCPU down" part of
>>>> this?
>>>>
>>>> That would give a management tool more flexibility to create policies
>>>> around slowing the VCPU down to encourage migration.
>>>
>>> I believe one can always enhance libvirt tools to monitor the migration
>>> statistics and control the shares/entitlements of the vcpus via
>>> cgroups, thereby slowing the guest down to allow for convergence. (I had
>>> that listed in earlier versions of the patches as an option, and also
>>> noted that it requires external, i.e. tool-driven, monitoring and
>>> triggers, whereas this alternative is automatic after the
>>> initial setting of the capability.)
>>>
>>> Is that what you meant by your comment above, or are you talking about
>>> something outside the scope of cgroups and, from an implementation point
>>> of view, also outside the migration code path, i.e. a new knob that an
>>> external tool can use to just throttle down the vcpus of a guest?
>>
>> I'm suggesting a knob within QEMU to throttle the guest vcpus, one that
>> management tools could use to encourage convergence.
>>
>> For instance, consider an imaginary "vcpu_throttle" command that took a
>> number between 0 and 1 that throttled VCPU performance accordingly.
>>
>> Then migration would look like:
>>
>> 0) throttle = 1.0
>> 1) call migrate command to start migration
>> 2) query progress until you decide you aren't converging
>> 3) throttle *= 0.75; call vcpu_throttle $throttle
>> 4) goto (2)
>>
>> Now, I'm not opposed to a series like this that adds this sort of policy
>> to QEMU itself, but I want to make sure the pieces are also exposed so
>> that a management tool can implement its own policies.
>
> Note that QEMU can also throttle VCPUs as they dirty guest memory,
> rather than based on CPU time. That's something that management
> cannot do (you can approximate it based on the recent history if you
> provide dirtying statistics, but it's not the same thing).

Sure, but in that case I'd argue you would want to expose that as a
command that libvirt could invoke too.

Regards,

Anthony Liguori

>
> Paolo
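
For illustration only, the 0)-4) management-tool loop quoted above could be sketched roughly as below. This is a sketch under assumptions, not an existing QEMU or libvirt interface: "vcpu_throttle" is the imaginary command from the thread, and the callables `start_migration`, `is_done` and `is_converging`, as well as the `min_throttle` floor and `poll_interval`, are hypothetical names introduced here to keep the example self-contained.

```python
import time
from typing import Callable


def migrate_with_throttle(
    start_migration: Callable[[], None],   # hypothetical: issues the migrate command
    is_done: Callable[[], bool],           # hypothetical: True once migration completed
    is_converging: Callable[[], bool],     # hypothetical: policy decision from progress stats
    vcpu_throttle: Callable[[float], None],  # the imaginary "vcpu_throttle" command
    factor: float = 0.75,                  # multiplier from step 3 of the thread
    min_throttle: float = 0.05,            # assumption: never throttle below this floor
    poll_interval: float = 1.0,            # assumption: seconds between progress queries
) -> float:
    """Run the throttle-until-converged loop and return the final throttle value."""
    throttle = 1.0                         # step 0: VCPUs run at full speed
    start_migration()                      # step 1: start the migration
    while not is_done():                   # step 2: query progress
        if not is_converging() and throttle > min_throttle:
            # step 3: slow the VCPUs down a bit more
            throttle = max(throttle * factor, min_throttle)
            vcpu_throttle(throttle)
        time.sleep(poll_interval)          # step 4: wait, then go back to step 2
    return throttle
```

The convergence test is deliberately left to the caller, which is the point made in the thread: QEMU would expose the knob, and the management tool would own the policy.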