qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 7/7] spapr.c: consider CPU core online state before allowi


From: David Gibson
Subject: Re: [PATCH v1 7/7] spapr.c: consider CPU core online state before allowing unplug
Date: Mon, 18 Jan 2021 12:18:48 +1100

On Fri, Jan 15, 2021 at 03:52:56PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 1/15/21 2:22 PM, Greg Kurz wrote:
> > On Thu, 14 Jan 2021 15:06:28 -0300
> > Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> > 
> > > The only restriction we have when unplugging CPUs is to forbid unplug of
> > > the boot cpu core. spapr_core_unplug_possible() does not contemplate the
> 
> I'll look into it.
> 
> > 
> > I can't remember why this restriction was introduced in the first place...
> > This should be investigated and documented if the limitation still stands.
> > 
> > > possibility of some cores being offlined by the guest, meaning that we're
> > > rolling the dice regarding on whether we're unplugging the last online
> > > CPU core the guest has.
> > > 
> > 
> > Trying to unplug the last CPU is obviously something that deserves
> > special care. LoPAPR is quite explicit on the outcome : this should
> > terminate the partition.
> > 
> > 13.7.4.1.1. Isolation of CPUs
> > 
> > The isolation of a CPU, in all cases, is preceded by the stop-self
> > RTAS function for all processor threads, and the OS insures that all
> > the CPU’s threads are in the RTAS stopped state prior to isolating the
> > CPU. Isolation of a processor that is not stopped produces unpredictable
> > results. The stopping of the last processor thread of a LPAR partition
> > effectively kills the partition, and at that point, ownership of all
> > partition resources reverts to the platform firmware.
> 
> 
> I was just investigating a reason why we should check for all thread
> states before unplugging the core, like David suggested in his reply.
> rtas_stop_self() was setting 'cs->halted = 1' without a thread activity
> check like ibm,suspend-me() does and I was wondering why. This text you sent
> explains it, quoting:
> 
> "> The isolation of a CPU, in all cases, is preceded by the stop-self
>  RTAS function for all processor threads, and the OS insures that all
>  the CPU’s threads are in the RTAS stopped state prior to isolating the
>  CPU."
> 
> 
> This seems to be corroborated by arch/powerpc/platform/pseries/hotplug-cpu.c:

Um... this seems like you're overcomplicating this.  The crucial point
here is that 'start-cpu' and 'stop-self' operate on individual
threads, where as dynamic reconfiguration hotplug and unplug works on
whole cores.

> static void pseries_cpu_offline_self(void)
> {
>       unsigned int hwcpu = hard_smp_processor_id();
> 
>       local_irq_disable();
>       idle_task_exit();
>       if (xive_enabled())
>               xive_teardown_cpu();
>       else
>               xics_teardown_cpu();
> 
>       unregister_slb_shadow(hwcpu);
>       rtas_stop_self();
> 
>       /* Should never get here... */
>       BUG();
>       for(;;);
> }
> 
> 
> IIUC this means that we can rely on cs->halted = 1 since it's coming right
> after a rtas_stop_self() call. This is still a bit confusing though and I
> wouldn't mind standardizing the 'CPU core is offline' condition with what
> ibm,suspend-me is already doing.

At the moment we have no tracking of whether a core is online.  We
kinda-sorta track whether a *thread* is online through stop-self /
start-cpu.

> David, what do you think?

I think we can rely on cs->halted = 1 when the thread is offline.
What I'm much less certain about is whether we can count on the thread
being offline when cs->halted = 1.

> > R1-13.7.4.1.1-1. For the LRDR option: Prior to issuing the RTAS
> > set-indicator specifying isolate isolation-state of a CPU DR
> > connector type, all the CPU threads must be in the RTAS stopped
> > state.
> > 
> > R1-13.7.4.1.1-2. For the LRDR option: Stopping of the last processor
> > thread of a LPAR partition with the stop-self RTAS function, must kill
> > the partition, with ownership of all partition resources reverting to
> > the platform firmware.
> > 
> > This is clearly not how things work today : linux doesn't call
> > "stop-self" on the last vCPU and even if it did, QEMU doesn't
> > terminate the VM.
> > 
> > If there's a valid reason to not implement this PAPR behavior, I'd like
> > it to be documented.
> 
> 
> I can only speculate. This would create a unorthodox way of shutting down
> the guest, when the user can just shutdown the whole thing gracefully.
> 
> Unless we're considering borderline cases, like the security risk mentioned
> in the kernel docs (Documentation/core-api/cpu_hotplug.rst):
> 
> "Such advances require CPUs available to a kernel to be removed either for
> provisioning reasons, or for RAS purposes to keep an offending CPU off
> system execution path. Hence the need for CPU hotplug support in the
> Linux kernel."
> 
> In this extreme scenario I can see a reason to kill the partition/guest
> by offlining the last online CPU - if it's compromising the host we'd
> rather terminate immediately instead of waiting for graceful
> shutdown.

I'm a bit confused by this comment.  You seem to be conflating
online/offline operations (start-cpu/stop-self) with hot plug/unplug
operations - they're obviously related, but they're not the same
thing.

> > > If we hit the jackpot, we're going to detach the core DRC and pulse the
> > > hotplug IRQ, but the guest OS will refuse to release the CPU. Our
> > > spapr_core_unplug() DRC release callback will never be called and the CPU
> > > core object will keep existing in QEMU. No error message will be sent
> > > to the user, but the CPU core wasn't unplugged from the guest.
> > > 
> > > If the guest OS onlines the CPU core again we won't be able to hotunplug 
> > > it
> > > either. 'dmesg' inside the guest will report a failed attempt to offline 
> > > an
> > > unknown CPU:
> > > 
> > > [  923.003994] pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16
> > > 
> > > This is the result of stopping the DRC state transition in the middle in 
> > > the
> > > first failed attempt.
> > > 
> > 
> > Yes, at this point only a machine reset can fix things up.
> > 
> > Given this is linux's choice not to call "stop-self" as it should do, I'm 
> > not
> > super fan of hardcoding this logic in QEMU, unless there are really good
> > reasons to do so.
> 
> I understand where are you coming from and I sympathize. Not sure about how 
> users
> would feel about that though. They expect a somewhat compatible behavior of
> multi-arch features like hotplug/hotunplug, and x86 will neither hotunplug 
> nor offline
> the last CPU as well.
> 
> There is a very high chance that, even if we pull this design off, I'll need 
> to go to
> Libvirt and disable it because we broke compatibility with how vcpu unplug 
> operated
> earlier.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]