qemu-ppc

From: David Gibson
Subject: Re: [PATCH for-6.2 v6 6/7] spapr: use DEVICE_UNPLUG_ERROR to report unplug errors
Date: Tue, 10 Aug 2021 11:03:29 +1000

On Mon, Aug 09, 2021 at 03:47:14PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 8/7/21 11:06 AM, Markus Armbruster wrote:
> > Daniel Henrique Barboza <danielhb413@gmail.com> writes:
> > 
> > > Linux Kernel 5.12 is now unisolating CPU DRCs in the device_removal
> > > error path, signalling that the hotunplug process wasn't successful.
> > > > This allows us to send a DEVICE_UNPLUG_ERROR in drc_unisolate_logical()
> > > to signal this error to the management layer.
> > > 
> > > We also have another error path in spapr_memory_unplug_rollback() for
> > > > configured LMB DRCs. Kernels older than 5.13 will not unisolate the LMBs
> > > > in the hotunplug error path, but they will reconfigure them. Let's send
> > > the DEVICE_UNPLUG_ERROR event in that code path as well to cover the
> > > case of older kernels.
> > > 
> > > Reviewed-by: Greg Kurz <groug@kaod.org>
> > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> > > ---
> > >   hw/ppc/spapr.c     |  9 ++++++++-
> > >   hw/ppc/spapr_drc.c | 18 ++++++++++++------
> > >   2 files changed, 20 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 1611d7ab05..5459f9a7e9 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -29,6 +29,7 @@
> > >   #include "qemu/datadir.h"
> > >   #include "qapi/error.h"
> > >   #include "qapi/qapi-events-machine.h"
> > > +#include "qapi/qapi-events-qdev.h"
> > >   #include "qapi/visitor.h"
> > >   #include "sysemu/sysemu.h"
> > >   #include "sysemu/hostmem.h"
> > > @@ -3686,13 +3687,19 @@ void 
> > > spapr_memory_unplug_rollback(SpaprMachineState *spapr, DeviceState *dev)
> > >       /*
> > >        * Tell QAPI that something happened and the memory
> > > -     * hotunplug wasn't successful.
> > > +     * hotunplug wasn't successful. Keep sending
> > > +     * MEM_UNPLUG_ERROR even while sending DEVICE_UNPLUG_ERROR
> > > +     * until the deprecation MEM_UNPLUG_ERROR is due.
> > >        */
> > >       if (dev->id) {
> > >           qapi_error = g_strdup_printf("Memory hotunplug rejected by the 
> > > guest "
> > >                                        "for device %s", dev->id);
> > >           qapi_event_send_mem_unplug_error(dev->id, qapi_error);
> > >       }
> > > +
> > > +    qapi_event_send_device_unplug_error(!!dev->id, dev->id,
> > > +                                        dev->canonical_path,
> > > +                                        qapi_error != NULL, qapi_error);
> > >   }
> > 
> > When dev->id is null, we send something like
> > 
> >      {"event": "DEVICE_UNPLUG_ERROR",
> >       "data": {"path": "/machine/..."},
> >       "timestamp": ...}
> > 
> > Unless I'm missing something, this is all the information the management
> > application really needs.
> > 
> > When dev->id is non-null, we add to "data":
> > 
> >                "device": "dev123",
> >                "msg": "Memory hotunplug rejected by the guest for device 
> > dev123",
> > 
> > I'm fine with emitting the device ID when we have it.
> > 
> > What's the intended use of "msg"?
> > 
> > Could DEVICE_UNPLUG_ERROR ever be emitted for this device with a
> > different "msg"?
> 
> 
> It won't have a different 'msg' for the current use of the event in both ppc64
> and x86. It'll always be the same '<dev> hotunplug rejected by the guest'
> message.
> 
> The idea is that a future caller might want to insert a more informative
> message, such as "hotunplug failed: memory is being used by kernel space"
> or any other more specific condition. But then I guess we can argue that,
> if that time comes, one can just add this new optional 'msg' member in this
> event, and for now we can live without it.

Right.  We could also consider making the current message more
specific about why we chose to cancel the unplug: e.g. "guest
unisolated DRC after unplug request" for PAPR, and something
appropriate to the ACPI specifics for x86.  Not sure if that's useful
enough to justify it.
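
To make the payload shapes Markus describes concrete, here is a minimal standalone sketch. It is not QEMU code: format_unplug_error_data() is a hypothetical helper that uses plain snprintf() rather than the generated QAPI emitters, and it does no JSON escaping. It only illustrates that "path" is always present while "device" and "msg" appear when a value is available.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of QEMU: render the "data" object of a
 * DEVICE_UNPLUG_ERROR-style event into buf.  "path" is mandatory;
 * "device" and "msg" are emitted only when non-NULL.  No JSON escaping
 * is done, so inputs must not contain quotes or backslashes.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_unplug_error_data(char *buf, size_t size, const char *device,
                             const char *path, const char *msg)
{
    size_t used = 0;
    int n;

    assert(buf != NULL && path != NULL);

    n = snprintf(buf, size, "{");
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    used = (size_t)n;

    if (device) {
        /* Optional member: only present when the device has an id. */
        n = snprintf(buf + used, size - used, "\"device\": \"%s\", ", device);
        if (n < 0 || (size_t)n >= size - used) {
            return -1;
        }
        used += (size_t)n;
    }

    /* Mandatory member: the canonical QOM path is always available. */
    n = snprintf(buf + used, size - used, "\"path\": \"%s\"", path);
    if (n < 0 || (size_t)n >= size - used) {
        return -1;
    }
    used += (size_t)n;

    if (msg) {
        /* Optional member: the free-form message under discussion. */
        n = snprintf(buf + used, size - used, ", \"msg\": \"%s\"", msg);
        if (n < 0 || (size_t)n >= size - used) {
            return -1;
        }
        used += (size_t)n;
    }

    n = snprintf(buf + used, size - used, "}");
    if (n < 0 || (size_t)n >= size - used) {
        return -1;
    }
    used += (size_t)n;

    return (int)used;
}
```

If every caller ends up passing a non-NULL msg, the last branch is effectively unconditional, which is Markus's point about the member not really being optional.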

> Would you be opposed to renaming this new event to "DEVICE_UNPLUG_GUEST_ERROR"
> and then removing the 'msg' member? I guess this rename would make it clearer
> for management that we're reporting a guest side error, making any further
> clarifications via 'msg' unneeded.
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> 
> 
> 
> > 
> > If "msg" is useful when dev->id is non-null, then it's likely useful
> > when dev->id is null.  Why not
> > 
> >                "msg": "Memory hotunplug rejected by the guest",
> > 
> > always?
> > 
> > If we do that here, we'll likely do it everywhere, and then member @msg
> > isn't actually optional.
> > 
> > >   /* Callback to be called during DRC release. */
> > > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > > index a4d9496f76..8f0479631f 100644
> > > --- a/hw/ppc/spapr_drc.c
> > > +++ b/hw/ppc/spapr_drc.c
> > > @@ -17,6 +17,8 @@
> > >   #include "hw/ppc/spapr_drc.h"
> > >   #include "qom/object.h"
> > >   #include "migration/vmstate.h"
> > > +#include "qapi/error.h"
> > > +#include "qapi/qapi-events-qdev.h"
> > >   #include "qapi/visitor.h"
> > >   #include "qemu/error-report.h"
> > >   #include "hw/ppc/spapr.h" /* for RTAS return codes */
> > > @@ -160,6 +162,11 @@ static uint32_t drc_unisolate_logical(SpaprDrc *drc)
> > >            * means that the kernel is refusing the removal.
> > >            */
> > >           if (drc->unplug_requested && drc->dev) {
> > > +            const char qapi_error_fmt[] = \
> > 
> > Drop the superfluous \
> > 
> > > +"Device hotunplug rejected by the guest for device %s";
> > 
> > Unusual indentation.
> > 
> > > +
> > > +            g_autofree char *qapi_error = NULL;
> > > +
> > >               if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_LMB) {
> > >                   spapr = SPAPR_MACHINE(qdev_get_machine());
> > > @@ -169,14 +176,13 @@ static uint32_t drc_unisolate_logical(SpaprDrc *drc)
> > >               drc->unplug_requested = false;
> > >               if (drc->dev->id) {
> > > -                error_report("Device hotunplug rejected by the guest "
> > > -                             "for device %s", drc->dev->id);
> > > +                qapi_error = g_strdup_printf(qapi_error_fmt, 
> > > drc->dev->id);
> > > +                error_report(qapi_error_fmt, drc->dev->id);
> > 
> > Simpler:
> > 
> >                     qapi_error = ...
> >                     error_report("%s", qapi_error);
> > 
> > Matter of taste.  Maintainer decides.
> > 
> > >               }
> > > -            /*
> > > -             * TODO: send a QAPI DEVICE_UNPLUG_ERROR event when
> > > -             * it is implemented.
> > > -             */
> > > +            qapi_event_send_device_unplug_error(!!drc->dev->id, 
> > > drc->dev->id,
> > > +                                                drc->dev->canonical_path,
> > > +                                                qapi_error != NULL, 
> > > qapi_error);
> > 
> > My questions on "msg" apply.
> > 
> > >           }
> > >           return RTAS_OUT_SUCCESS; /* Nothing to do */
> > 
> 
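
For reference, the "format once, then reuse" shape Markus suggests above can be sketched outside QEMU roughly as follows. This is only an illustration under stated assumptions: report_error() is a stand-in for error_report(), report_unplug_rejected() is a hypothetical name, and snprintf()/malloc() replace g_strdup_printf() so the snippet builds without GLib.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define UNPLUG_REJECTED_FMT \
    "Device hotunplug rejected by the guest for device %s"

/* Stand-in for QEMU's error_report(): keeps the last message for inspection. */
static char last_report[256];

static void report_error(const char *msg)
{
    assert(msg != NULL);
    snprintf(last_report, sizeof(last_report), "%s", msg);
    fprintf(stderr, "%s\n", msg);
}

/*
 * Hypothetical helper: build the rejection message exactly once, log it,
 * and hand the same string back so the caller can reuse it as the event's
 * "msg".  Returns NULL when the device has no id or on allocation failure.
 * The caller owns the returned string and must free() it.
 */
char *report_unplug_rejected(const char *dev_id)
{
    int len;
    char *msg;

    if (!dev_id) {
        return NULL;
    }

    /* First pass computes the required length, second pass formats. */
    len = snprintf(NULL, 0, UNPLUG_REJECTED_FMT, dev_id);
    if (len < 0) {
        return NULL;
    }
    msg = malloc((size_t)len + 1);
    if (!msg) {
        return NULL;
    }
    snprintf(msg, (size_t)len + 1, UNPLUG_REJECTED_FMT, dev_id);

    /* Same idea as error_report("%s", qapi_error) in the review comment. */
    report_error(msg);
    return msg;
}
```

The format string is expanded in one place only, so the logged text and the event text cannot drift apart.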

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
