
From: Daniel Henrique Barboza
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11] spapr: reset DRCs after devices
Date: Fri, 17 Nov 2017 18:42:25 -0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 11/17/2017 04:19 PM, Michael Roth wrote:
Quoting Daniel Henrique Barboza (2017-11-17 10:21:27)

On 11/17/2017 10:56 AM, Greg Kurz wrote:
A DRC with a pending unplug request releases its associated device at
machine reset time.

In the case of LMB, when all DRCs for a DIMM device have been reset,
the DIMM gets unplugged, causing guest memory to disappear. This may
be very confusing for anything still using this memory.

This is exactly what happens with vhost backends, and QEMU aborts:

qemu-system-ppc64: used ring relocated for ring 2
qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
   `r >= 0' failed.

The issue is that each DRC registers a QEMU reset handler, and we
don't control the order in which these handlers are called (i.e.,
an LMB DRC may unplug a DIMM before the virtio device using the
memory on this DIMM can stop its vhost backend).

To avoid such situations, let's reset DRCs after all devices
have been reset.

Reported-by: Mallesh N. Koti <address@hidden>
Signed-off-by: Greg Kurz <address@hidden>
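The ordering hazard described above can be modeled outside QEMU: reset handlers registered qemu_register_reset()-style run in an order the devices don't control, so a pending-unplug LMB DRC can release a DIMM before the vhost device's reset runs. The following is a minimal, hypothetical sketch (names and structures are illustrative, not QEMU's):

```c
#include <assert.h>
#include <string.h>

typedef void (*ResetFn)(void *opaque);

/* Toy model of a global reset-handler list: handlers run in
 * registration order, which the individual devices don't control. */
static struct { ResetFn fn; void *opaque; } handlers[8];
static int nr_handlers;

static void register_reset(ResetFn fn, void *opaque)
{
    handlers[nr_handlers].fn = fn;
    handlers[nr_handlers].opaque = opaque;
    nr_handlers++;
}

static void run_all_resets(void)
{
    for (int i = 0; i < nr_handlers; i++) {
        handlers[i].fn(handlers[i].opaque);
    }
}

/* Record the order handlers actually ran in. */
static char trace[64];

static void lmb_drc_reset(void *opaque)
{
    strcat(trace, "drc ");   /* pending unplug: DIMM released here */
}

static void vhost_dev_reset(void *opaque)
{
    strcat(trace, "vhost "); /* vring may still live in that DIMM */
}
```

If the DRC happens to register first, the DIMM is gone before the vhost device gets a chance to stop its backend, which is the crash scenario the patch avoids by taking DRC reset out of this list entirely.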
I believe this has to do with the problem discussed in the thread about
memory unplug at reboot with vhost=on, right?
I don't see any problem with this patch, and it's probably the best short-term
solution for the problem (and potentially for other related LMB release
problems). That said, how hard would it be to forbid LMB unplug altogether if
the memory is being used by vhost?

The way I see it, the root problem is that a device unplug failed on the guest
side and we don't know about it, leaving the drc->unplug_requested flag set
around the code when it shouldn't be.
I'm not sure we want to go down the road of preemptively aborting
specific cases where we think unplug will fail in the guest. We already
need to deal with the cases where we don't know whether they'll fail
ahead of time, so it seems like more of a headache to add special cases
on top of that that management would need to deal with.

Just to be clear, I only advocate this kind of constraint if the LMB unplug,
in this specific scenario, has a 100% chance of failing. Otherwise, we're
better off rolling the dice.

And in this particular case we'd never be able to unplug the DIMM if
the guest happens to use it for the vring on each boot (unless maybe the
unplug request occurred during early boot), even though there's no
technical reason we shouldn't be able to at least handle it on the
next reset.

I am OK going this route if we have our bases covered. Is this
behavior of unplugging LMBs with pending unplug requests at reset documented
in the PAPR spec? If not, we would need to document it somewhere else
(or ask for a PAPR revision to add it).

Otherwise, I am not sure management will be OK with LMBs being
"inadvertently" hot unplugged at reset due to a failed LMB hot unplug that
might have occurred days ago and that the user doesn't even remember
anymore :)



Reading that thread, I see a comment from Michael S. Tsirkin saying that this
bug is somewhat expected: the vhost backend will not behave well if memory is
going away. Unlike other LMB unplug failures from the guest side, which might
happen for any number of reasons, this one we can foresee and avoid.

I think this is something worth considering. But for now,

Reviewed-by: Daniel Henrique Barboza <address@hidden>

   hw/ppc/spapr.c     |   21 +++++++++++++++++++++
   hw/ppc/spapr_drc.c |    7 -------
   2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6841bd294b3c..6285f7211f9a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
 
+static int spapr_reset_drcs(Object *child, void *opaque)
+{
+    sPAPRDRConnector *drc =
+        (sPAPRDRConnector *) object_dynamic_cast(child,
+                                                 TYPE_SPAPR_DR_CONNECTOR);
+
+    if (drc) {
+        spapr_drc_reset(drc);
+    }
+
+    return 0;
+}
+
 static void ppc_spapr_reset(void)
 {
     MachineState *machine = MACHINE(qdev_get_machine());
@@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void)
 
+    /* DRC reset may cause a device to be unplugged. This will cause troubles
+     * if this device is used by another device (eg, a running vhost backend
+     * will crash QEMU if the DIMM holding the vring goes away). To avoid such
+     * situations, we reset DRCs after all devices have been reset.
+     */
+    object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, NULL);

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 915e9b51c40c..e3b122968e89 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc)
 
-static void drc_reset(void *opaque)
-{
-    spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
-}
-
 bool spapr_drc_needed(void *opaque)
 {
     sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
@@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp)
     vmstate_register(DEVICE(drc), spapr_drc_index(drc), &vmstate_spapr_drc,
                      drc);
-    qemu_register_reset(drc_reset, drc);
@@ -529,7 +523,6 @@ static void unrealize(DeviceState *d, Error **errp)
     gchar *name;
 
-    qemu_unregister_reset(drc_reset, drc);
     vmstate_unregister(DEVICE(drc), &vmstate_spapr_drc, drc);
     root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
     name = g_strdup_printf("%x", spapr_drc_index(drc));
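The shape of the fix can be modeled outside QEMU as a two-phase reset: all devices first, then a recursive walk over the object tree that resets anything that "is a" DRC, analogous to what object_child_foreach_recursive() with spapr_reset_drcs() does in the patch. The structures and names below are an illustrative sketch, not QEMU's:

```c
#include <assert.h>
#include <string.h>

/* Toy object tree standing in for QEMU's composition tree. */
typedef struct Obj {
    const char *type;     /* "device", "drc", or "container" */
    struct Obj *child[4];
    int nr_child;
    int reset_seq;        /* 0 = not reset yet; otherwise reset order */
} Obj;

static int seq;

/* Phase 1: reset every device; a vhost backend stops cleanly here. */
static void reset_devices(Obj *o)
{
    if (strcmp(o->type, "device") == 0) {
        o->reset_seq = ++seq;
    }
    for (int i = 0; i < o->nr_child; i++) {
        reset_devices(o->child[i]);
    }
}

/* Phase 2: walk the whole tree and reset only DRCs, which may now
 * safely complete a pending unplug (analogue of spapr_reset_drcs()). */
static void reset_drcs(Obj *o)
{
    if (strcmp(o->type, "drc") == 0) {
        o->reset_seq = ++seq;
    }
    for (int i = 0; i < o->nr_child; i++) {
        reset_drcs(o->child[i]);
    }
}

static void machine_reset(Obj *root)
{
    reset_devices(root);  /* phase 1: all devices */
    reset_drcs(root);     /* phase 2: DRCs, which may unplug memory */
}
```

With this ordering, every device's reset sequence number is strictly smaller than every DRC's, which is exactly the guarantee the patch adds to ppc_spapr_reset().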
