[PATCH v1 7/7] spapr.c: consider CPU core online state before allowing u

From: Daniel Henrique Barboza
Subject: [PATCH v1 7/7] spapr.c: consider CPU core online state before allowing unplug
Date: Thu, 14 Jan 2021 15:06:28 -0300

The only restriction we have when unplugging CPUs is to forbid unplug of
the boot CPU core. spapr_core_unplug_possible() does not consider the
possibility of some cores being offlined by the guest, meaning that we're
rolling the dice on whether we're unplugging the last online CPU core the
guest has.

If we hit the jackpot, we're going to detach the core DRC and pulse the
hotplug IRQ, but the guest OS will refuse to release the CPU. Our
spapr_core_unplug() DRC release callback will never be called and the CPU
core object will keep existing in QEMU. No error message will be sent
to the user, even though the CPU core wasn't unplugged from the guest.

If the guest OS onlines the CPU core again we won't be able to hotunplug it
either. 'dmesg' inside the guest will report a failed attempt to offline an
unknown CPU:

[  923.003994] pseries-hotplug-cpu: Failed to offline CPU <NULL>, rc: -16

This is the result of the DRC state transition being stopped midway during
the first failed attempt.

We can avoid this, and potentially other bad things, by not attempting the
unplug at all in this scenario. Let's check the online/offline state of
the CPU cores in the guest before allowing the hotunplug, and forbid
removing a CPU core if it's the last one online in the guest.
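
For illustration only, the rule can be sketched outside QEMU as below. The
'toy_core' struct and toy_core_unplug_possible() are hypothetical names that
are not part of this patch or of QEMU. The 'halted' flag stands in for the
'halted' state of a core's first thread, which the patch uses as its
"offlined by the guest" signal. The boot core restriction is left out of the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU core; this is not a QEMU type. */
struct toy_core {
    int core_id;
    bool halted;    /* true when the guest has offlined this core */
};

/*
 * Return 0 if unplugging cores[target] is acceptable, -1 if it would
 * remove the last core that is still online in the guest.
 */
static int toy_core_unplug_possible(const struct toy_core *cores,
                                    size_t n, size_t target)
{
    assert(cores != NULL && target < n);

    /* A core that is already offline can always be unplugged. */
    if (cores[target].halted) {
        return 0;
    }

    /* The target is online: look for any other core that is online too. */
    for (size_t i = 0; i < n; i++) {
        if (i != target && !cores[i].halted) {
            return 0;
        }
    }

    /* No other online core was found, so the unplug must be refused. */
    return -1;
}
```

With three cores where only the first is online, the sketch refuses to
unplug the first core and accepts unplugging either of the other two.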

Reported-by: Xujun Ma <xuma@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 hw/ppc/spapr.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a2f01c21aa..d269dcd102 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3709,9 +3709,16 @@ static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev)
 static int spapr_core_unplug_possible(HotplugHandler *hotplug_dev, CPUCore *cc,
                                       Error **errp)
 {
+    CPUArchId *core_slot;
+    SpaprCpuCore *core;
+    PowerPCCPU *cpu;
+    CPUState *cs;
+    bool last_cpu_online = true;
     int index;
-    if (!spapr_find_cpu_slot(MACHINE(hotplug_dev), cc->core_id, &index)) {
+    core_slot = spapr_find_cpu_slot(MACHINE(hotplug_dev), cc->core_id,
+                                    &index);
+    if (!core_slot) {
         error_setg(errp, "Unable to find CPU core with core-id: %d",
                    cc->core_id);
         return -1;
@@ -3722,6 +3729,36 @@ static int spapr_core_unplug_possible(HotplugHandler *hotplug_dev, CPUCore *cc,
         return -1;
     }
+    /* Allow for any non-boot CPU core to be unplugged if already offline */
+    core = SPAPR_CPU_CORE(core_slot->cpu);
+    cs = CPU(core->threads[0]);
+    if (cs->halted) {
+        return 0;
+    }
+    /*
+     * Do not allow core unplug if it's the last core online.
+     */
+    cpu = POWERPC_CPU(cs);
+    CPU_FOREACH(cs) {
+        PowerPCCPU *c = POWERPC_CPU(cs);
+        if (c == cpu) {
+            continue;
+        }
+        if (!cs->halted) {
+            last_cpu_online = false;
+            break;
+        }
+    }
+    if (last_cpu_online) {
+        error_setg(errp, "Unable to unplug CPU core with core-id %d: it is "
+                   "the only CPU core online in the guest", cc->core_id);
+        return -1;
+    }
     return 0;
 }

