qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 0/4] DEVICE_NOT_DELETED/DEVICE_UNPLUG_ERROR QAPI events


From: Daniel Henrique Barboza
Subject: Re: [PATCH 0/4] DEVICE_NOT_DELETED/DEVICE_UNPLUG_ERROR QAPI events
Date: Wed, 24 Mar 2021 16:09:59 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1



On 3/23/21 10:40 PM, David Gibson wrote:
On Tue, Mar 23, 2021 at 02:10:22PM -0300, Daniel Henrique Barboza wrote:


On 3/22/21 10:12 PM, David Gibson wrote:
On Fri, Mar 12, 2021 at 05:07:36PM -0300, Daniel Henrique Barboza wrote:
Hi,

This series adds 2 new QAPI events, DEVICE_NOT_DELETED and
DEVICE_UNPLUG_ERROR. They were (and are still being) discussed in [1].

Patches 1 and 3 are independent of the ppc patches and can be applied
separately. Patches 2 and 4 are based on David's ppc-for-6.0 branch and
are dependent on the QAPI patches.

Implementation looks fine, but I think there's a bit more to discuss
before we can apply.

I think it would make sense to re-order this and put UNPLUG_ERROR
first.  Its semantics are clearer, and I think there's a stronger case
for it.

Alright


I'm a bit less sold on DEVICE_NOT_DELETED, after consideration.  Does
it really tell the user/management anything useful beyond what
receiving neither a DEVICE_DELETED nor a DEVICE_UNPLUG_ERROR does?


It informs that the hotunplug operation exceed the timeout that QEMU
internals considers adequate, but QEMU can't assert that it was caused
by an error or an unexpected delay. The end result is that the device
is not going to be deleted from QMP, so DEVICE_NOT_DELETED.

Is it, though?  I mean, it is with this implementation for papr:
because we clear the unplug_requested flag, even if the guest later
tries to complete the unplug, it will fail.

But if I understand what Markus was saying correctly, that might not
be possible for all hotplug systems.  I believe Markus was suggesting
that DEVICE_NOT_DELETED could just mean that we haven't deleted the
device yet, but it could still happen later.

And in that case, I'm not yet sold on the value of a message that
essentially just means "Ayup, still dunno what's happening, sorry".

Perhaps we should just be straightforward and create a DEVICE_UNPLUG_TIMEOUT
event.

Hm... what if we added a "reason" field to UNPLUG_ERROR.  That could
be "guest rejected hotplug", or something more specific, in the rare
case that the guest has a way of signalling something more specific,
or "timeout" - but the later *only* to be sent in cases where on the
timeout we're able to block any later completion of the unplug (as we
can on papr).

I believe that's already covered by the existing API:


+# @DEVICE_UNPLUG_ERROR:
+#
+# Emitted when a device hot unplug error occurs.
+#
+# @device: device name
+#
+# @msg: Informative message

The 'informative message' would be the reason the event occurred. In patch
4/4, for the memory hotunplug refused by the guest, it is being set as:

     qapi_error = g_strdup_printf("Memory hotunplug rejected by the guest "
                                  "for device %s", dev->id);
     qapi_event_send_device_unplug_error(dev->id, qapi_error);



We could use the same DEVICE_UNPLUG_ERROR event in the CPU hotunplug timeout
case (currently on patch 2/4) by just changing 'msg', e.g.:


     qapi_error = g_strdup_printf("CPU hotunplug timeout for device %s", 
dev->id);
     qapi_event_send_device_unplug_error(dev->id, qapi_error);


Thanks,

DHB



Thoughs, Markus?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]