qemu-devel

Re: [Qemu-devel] [qom-cpu PATCH 2/2] i386: disable PMU passthrough mode by default


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [qom-cpu PATCH 2/2] i386: disable PMU passthrough mode by default
Date: Wed, 24 Jul 2013 10:44:11 -0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Jul 24, 2013 at 03:21:48PM +0200, Paolo Bonzini wrote:
> Il 24/07/2013 15:15, Eduardo Habkost ha scritto:
> > On Tue, Jul 23, 2013 at 09:43:06PM +0200, Paolo Bonzini wrote:
> >> Il 23/07/2013 19:41, Eduardo Habkost ha scritto:
> >>> On Tue, Jul 23, 2013 at 06:23:08PM +0200, Paolo Bonzini wrote:
> >>>> Il 23/07/2013 17:40, Eduardo Habkost ha scritto:
> >>>>> On Tue, Jul 23, 2013 at 05:09:02PM +0200, Paolo Bonzini wrote:
> >>>>>> Il 23/07/2013 16:13, Eduardo Habkost ha scritto:
> >>>>>>> On Tue, Jul 23, 2013 at 11:18:03AM +0200, Paolo Bonzini wrote:
> >>>>>>>> Il 22/07/2013 21:25, Eduardo Habkost ha scritto:
> >>>>>>>>> Bug description: QEMU currently gets all bits from
> >>>>>>>>> GET_SUPPORTED_CPUID for CPUID leaf 0xA and passes them directly to
> >>>>>>>>> the guest. This makes the guest ABI depend on host kernel and host
> >>>>>>>>> CPU capabilities, and breaks live migration if we migrate between
> >>>>>>>>> hosts with different capabilities (e.g. different number of PMU
> >>>>>>>>> counters).
> >>>>>>>>>
> >>>>>>>>> This patch adds a "pmu-passthrough" property to X86CPU, and sets it
> >>>>>>>>> to true only on "-cpu host", or on pc-*-1.5 and older machine-types.
> >>>>>>>>
> >>>>>>>> Can we just call the property "pmu"?  It doesn't have to be
> >>>>>>>> passthrough.
> >>>>>>>
> >>>>>>> Yes, but the only options we have today are "no PMU" and "passthrough
> >>>>>>> PMU". I wouldn't like to make "pmu=on" enable the passthrough behavior
> >>>>>>> implicitly (I don't want things that break live-migration to be
> >>>>>>> enabled without making it explicit that it is a
> >>>>>>> host-dependent/passthrough mode).
> >>>>>>
> >>>>>> I think "passthrough PMU" should be considered a bug except of course
> >>>>>> with "-cpu host".
> >>>>>>
> >>>>>> If "-cpu Nehalem,pmu=on" goes from passthrough to Nehalem-compatible in
> >>>>>> a future QEMU release, that'll be a bugfix.
> >>>>>
> >>>>> Exactly. But then I don't understand your suggestion. We still need a
> >>>>> property to enable passthrough behavior on old machine-types (not
> >>>>> perfect, but a best-effort way to try to keep compatibility),
> >>>>
> >>>> Do we?
> >>>>
> >>>> We only need "pmu=on"---which right now is buggy on old machine types
> >>>> because it will always passthrough.
> >>>
> >>> I am not sure I understand what you are arguing for.
> >>>
> >>> You agree that pmu=on needs to keep the buggy passthrough behavior on
> >>> pc-1.5 and older, right?
> >>
> >> I agree it needs to remain enabled on 1.5.  But if, for example, 1.8
> >> makes pmu=on emulate a Nehalem-compatible PMU, I think it is fine if
> >> pc-1.5 moves from a host-compatible PMU to a Nehalem-compatible PMU.
> > 
> > That's where I disagree. Today users are (luckily) able to migrate
> > safely between hosts with the same number of PMU counters. But if we
> > make, e.g., "qemu-1.6 -machine pc-1.5 -cpu Westmere" present a smaller
> > number of PMU counters than "qemu-1.5 -machine pc-1.5 -cpu Westmere" on
> > the same host, we will break an existing setup where everything was
> > working before, which is something we could have easily avoided.
> 
> But at the same time we will fix live migration from a Sandy Bridge host
> to a Westmere.  So it's a choice we have to make anyway.

True.

> 
> > (Just to clarify what breaking this means in practice: changing the
> > number of PMU counters under the guest on live-migration means the guest
> > will crash when trying to use counters that suddenly went away, and it
> > may crash a very long time after it was migrated.)
> 
> And at the same time we fix live migration of a Sandy Bridge to a Westmere.

Something that never worked in the first place. Breaking what is working
today, on the other hand, is a regression.
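
For illustration only, here is a rough sketch of why the passthrough case
works today only between similar hosts. The field layout of CPUID.0AH:EAX
follows the Intel SDM; the struct, the helper names and the "must match
exactly" rule are assumptions made up for this example and are not QEMU or
KVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected fields of CPUID.0AH:EAX (architectural performance monitoring). */
typedef struct PMUInfo {
    uint8_t version;        /* EAX[7:0]:   PMU version ID */
    uint8_t num_counters;   /* EAX[15:8]:  general-purpose counters */
    uint8_t counter_width;  /* EAX[23:16]: bit width of those counters */
} PMUInfo;

/* Decode a raw EAX value as returned for CPUID leaf 0xA. */
static PMUInfo pmu_info_from_eax(uint32_t eax)
{
    PMUInfo info = {
        .version       = eax & 0xff,
        .num_counters  = (eax >> 8) & 0xff,
        .counter_width = (eax >> 16) & 0xff,
    };
    return info;
}

/*
 * Hypothetical, conservative check: with passthrough, the guest saw the
 * source host's leaf 0xA at boot, so the destination has to report the
 * same PMU for the guest to keep working after migration.
 */
static bool pmu_passthrough_migration_safe(uint32_t src_eax, uint32_t dst_eax)
{
    PMUInfo src = pmu_info_from_eax(src_eax);
    PMUInfo dst = pmu_info_from_eax(dst_eax);

    return src.version == dst.version &&
           src.num_counters == dst.num_counters &&
           src.counter_width == dst.counter_width;
}
```

With made-up EAX values, pmu_passthrough_migration_safe(0x07300403,
0x07300403) is true (same PMU on both sides), while a source reporting more
counters than the destination, e.g. 0x07300803 vs. 0x07300403, is not.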

If users are interested in a fix for the new Sandy Bridge->Westmere
use-case, we can always say "please upgrade your VM to a newer
machine-type".

> 
> >> The reason is that pc-1.5 has never guaranteed any feature of the
> >> emulated PMU.
> > 
> > Right, current behavior is buggy and we never guaranteed anything, but
> > IMO we shouldn't break on purpose something that is working today.
> 
> Even if it is to fix something else?

I believe so, because machine-types allow us to have both: we can fix
the new use-cases in new machine-types while keeping existing working
setups without regressions on the older machine-types.
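
To make that concrete, a minimal sketch of the idea, assuming a simplified
per-machine-type default table. The struct, the table and the function names
are hypothetical stand-ins for illustration, not the actual QEMU compat
machinery, and "pc-1.6" is used only as an example of a newer machine-type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-machine-type default for the "pmu-passthrough" property. */
typedef struct MachineCompat {
    const char *machine;
    bool pmu_passthrough;
} MachineCompat;

static const MachineCompat compat_table[] = {
    { "pc-1.5", true  },  /* old machine-type: keep existing behavior */
    { "pc-1.6", false },  /* new machine-type: passthrough off by default */
};

/* "-cpu host" always passes through; otherwise the machine-type decides. */
static bool pmu_passthrough_default(const char *machine, bool cpu_is_host)
{
    size_t i;

    if (cpu_is_host) {
        return true;
    }
    for (i = 0; i < sizeof(compat_table) / sizeof(compat_table[0]); i++) {
        if (strcmp(compat_table[i].machine, machine) == 0) {
            return compat_table[i].pmu_passthrough;
        }
    }
    return false;  /* unknown (newer) machine-types: off */
}

/*
 * EAX for guest CPUID leaf 0xA: the host value when passing through,
 * zero (PMU version 0, i.e. no PMU) otherwise.
 */
static uint32_t guest_cpuid_0xa_eax(uint32_t host_eax, bool passthrough)
{
    return passthrough ? host_eax : 0;
}
```

This way "-machine pc-1.5 -cpu Westmere" keeps seeing the host PMU as before,
and only new machine-types get the new default.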

-- 
Eduardo
