qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [PATCH for-2.13 01/10] spapr: Avoid redundant calls to sp


From: Greg Kurz
Subject: Re: [Qemu-ppc] [PATCH for-2.13 01/10] spapr: Avoid redundant calls to spapr_cpu_reset()
Date: Mon, 18 Jun 2018 11:01:27 +0200

On Mon, 18 Jun 2018 13:42:37 +1000
David Gibson <address@hidden> wrote:

> On Fri, Apr 20, 2018 at 05:39:42PM +0200, Greg Kurz wrote:
> > On Fri, 20 Apr 2018 11:15:01 +0200
> > Greg Kurz <address@hidden> wrote:
> >   
> > > On Fri, 20 Apr 2018 16:34:37 +1000
> > > David Gibson <address@hidden> wrote:
> > >   
> > > > On Thu, Apr 19, 2018 at 03:48:23PM +0200, Greg Kurz wrote:    
> > > > > On Tue, 17 Apr 2018 17:17:13 +1000
> > > > > David Gibson <address@hidden> wrote:
> > > > >       
> > > > > > af81cf323c1 "spapr: CPU hotplug support" added a direct call to
> > > > > > spapr_cpu_reset() in spapr_cpu_init(), as well as registering it as 
> > > > > > a
> > > > > > reset callback.  That was in order to make sure that the reset 
> > > > > > function
> > > > > > got called for a newly hotplugged cpu, which would miss the global 
> > > > > > machine
> > > > > > reset.
> > > > > > 
> > > > > > However, this change means that spapr_cpu_reset() gets called twice 
> > > > > > for
> > > > > > normal cold-plugged cpus: once from spapr_cpu_init(), then again 
> > > > > > during
> > > > > > the system reset.  As well as being ugly in its redundancy, the 
> > > > > > first call
> > > > > > happens before the machine reset calls have happened, which will 
> > > > > > cause
> > > > > > problems for some things we're going to want to add.
> > > > > > 
> > > > > > So, we remove the reset call from spapr_cpu_init().  We instead put 
> > > > > > an
> > > > > > explicit reset call in the hotplug specific path.
> > > > > > 
> > > > > > Signed-off-by: David Gibson <address@hidden>
> > > > > > ---      
> > > > > 
> > > > > I had sent a tentative patch to do something similar earlier this 
> > > > > year:
> > > > > 
> > > > > https://patchwork.ozlabs.org/patch/862116/
> > > > > 
> > > > > but it got nacked for several reasons, one of them being you were
> > > > > "always wary of using the hotplugged parameter, because what qemu
> > > > > means by it often doesn't line up with what PAPR means by it."      
> > > > 
> > > > Yeah, I was and am wary of that, but convinced myself it was correct
> > > > in this case (which doesn't really interact with the PAPR meaning of
> > > > hotplug).
> > > >     
> > > > > >  hw/ppc/spapr.c                  |  6 ++++--
> > > > > >  hw/ppc/spapr_cpu_core.c         | 13 ++++++++++++-
> > > > > >  include/hw/ppc/spapr_cpu_core.h |  2 ++
> > > > > >  3 files changed, 18 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > > > index 7b2bc4e25d..81b50af3b5 100644
> > > > > > --- a/hw/ppc/spapr.c
> > > > > > +++ b/hw/ppc/spapr.c
> > > > > > @@ -3370,9 +3370,11 @@ static void spapr_core_plug(HotplugHandler 
> > > > > > *hotplug_dev, DeviceState *dev,
> > > > > >  
> > > > > >          if (hotplugged) {      
> > > > > 
> > > > > ... but you rely on it here. Can you explain why it is
> > > > > okay now ?      
> > > > 
> > > > So the value I actually need here is "wasn't present at the last
> > > > system reset" (with false positives being mostly harmless, but not
> > > > false negatives).
> > > >     
> > > 
> > > Hmm... It is rather the other way around, sth like "will be caught
> > > by the initial machine reset".
> > >   
> > > > > Also, if QEMU is started with -incoming and the CPU core
> > > > > is hotplugged before migration begins, the following will
> > > > > return false:
> > > > > 
> > > > > static inline bool spapr_drc_hotplugged(DeviceState *dev)
> > > > > {
> > > > >     return dev->hotplugged && !runstate_check(RUN_STATE_INMIGRATE);
> > > > > }
> > > > > 
> > > > > and the CPU core won't be reset.      
> > > > 
> > > > Uh... spapr_dtc_hotplugged() would definitely be wrong here, which is
> > > > why I'm not using it.
> > > >     
> > > 
> > > This is how hotplugged is set in spapr_core_plug():
> > > 
> > >     bool hotplugged = spapr_drc_hotplugged(dev);
> > > 
> > > but to detect the "will be caught by the initial machine reset" condition,
> > > we only need to check dev->hotplugged actually.
> > >   
> > > > >       
> > > > > >              /*
> > > > > > -             * Send hotplug notification interrupt to the guest 
> > > > > > only
> > > > > > -             * in case of hotplugged CPUs.
> > > > > > +             * For hotplugged CPUs, we need to reset them (they 
> > > > > > missed
> > > > > > +             * out on the system reset), and send the guest a
> > > > > > +             * notification
> > > > > >               */
> > > > > > +            spapr_cpu_core_reset(core);      
> > > > > 
> > > > > spapr_cpu_reset() also sets the compat mode, which is used
> > > > > to set some properties in the DT, ie, this should be called
> > > > > before spapr_populate_hotplug_cpu_dt().      
> > > > 
> > > > Good point.  I've moved the reset to fix that.
> > > >     
> > 
> > Thinking of it again: since cold-plugged devices reach this before machine
> > reset, we would then attach to the DRC a DT blob based on a non-reset CPU 
> > :-\
> > 
> > Instead of registering a reset handler for each individual CPUs, maybe
> > we should rather register it a the CPU core level. The handler would
> > first reset all CPUs in the core and then setup the DRC for new cores only,
> > like it is done currently in spapr_core_plug().
> > 
> > spapr_core_plug() would then just need to register the reset handler,
> > and call it only for hotplugged cores.  
> 
> Handling the resets via the core level might be a good idea anyway,
> but I don't think it can address the problem we're hitting here.
> 
> I've investigated further and I'm pretty sure we can't fix this
> without generic code changes.  cpu_common_realizefn() (which is called
> by our ppc cpu realize hook via the parent_realize chain) contains
> this:
> 
>     if (dev->hotplugged) {
>         cpu_synchronize_post_init(cpu);
>         cpu_resume(cpu);
>     }
> 

Yes, these come from:

commit 13eed94ed5617b98e657163490584dc2a0cc4b32
Author: Igor Mammedov <address@hidden>
Date:   Tue Apr 23 10:29:36 2013 +0200

    cpu: Call cpu_synchronize_post_init() from DeviceClass::realize()
    
    If hotplugged, synchronize CPU state to KVM.
    
    Signed-off-by: Igor Mammedov <address@hidden>
    Reviewed-by: Eduardo Habkost <address@hidden>
    Signed-off-by: Andreas Färber <address@hidden>

and

commit 6afb4721f3e45da727110470a61aafcd6682395e
Author: Igor Mammedov <address@hidden>
Date:   Tue Apr 23 10:29:38 2013 +0200

    cpu: Resume CPU from DeviceClass::realize() if hot-plugged
    
    Signed-off-by: Igor Mammedov <address@hidden>
    Signed-off-by: Andreas Färber <address@hidden>

> So, as soon as the hotplugged cpu is realized, it's running, which
> means by the time we call the plug() hotplug handler we're too late to
> do any reset initialization.
> 
> I think there are two ways to look at this:
> 
> 1) The reset handlers are specifically about *system* reset, not
> device reset, and so we shoudln't really expect them to be called for
> hotplugged devices.  If we want to share reset initialization with
> "initial" initialization, we should explicitly call the reset handler
> from the (realize time) init code.. which is what we do now.
> 
> 2) Common core realize should _not_ activate the cpu.  Instead that
> should be the plug() handler's job.  This would require changing the
> x86 cpu plug handler (and whatever else) to kick off the cpu after
> realization.
> 
> For now I'm inclined to just let it stay at (1).
> 

Maybe Igor and Eduardo (now Cc'd) can provide a hint about 2) ?

> The problem I had which I thought required this, doesn't after all.  I
> came up with a different solution that involves moving the spapr caps
> initialization earlier, instead of the cpu reset later.  That turned
> out to be substantially easier than I first thought, and regardless of
> what we do above long term, I think it's a better way to handle the
> caps.
> 

Attachment: pgp1kgD7YpYUz.pgp
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]