qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 5/7] spapr_drc.c: introduce unplug_timeout_timer


From: David Gibson
Subject: Re: [PATCH v3 5/7] spapr_drc.c: introduce unplug_timeout_timer
Date: Wed, 17 Feb 2021 12:20:54 +1100

On Thu, Feb 11, 2021 at 07:52:44PM -0300, Daniel Henrique Barboza wrote:
> The LoPAR spec provides no way for the guest kernel to report failure of
> hotplug/hotunplug events. This wouldn't be bad if those operations were
> granted to always succeed, but that's far for the reality.
> 
> What ends up happening is that, in the case of a failed hotunplug,
> regardless of whether it was a QEMU error or a guest misbehavior, the pSeries
> machine is retaining the unplug state of the device in the running guest.
> This state is cleanup in machine reset, where it is assumed that this state
> represents a device that is pending unplug, and the device is hotunpluged
> from the board. Until the reset occurs, any hotunplug operation of the same
> device is forbid because there is a pending unplug state.
> 
> This behavior has at least one undesirable side effect. A long standing 
> pending
> unplug state is, more often than not, the result of a hotunplug error. The 
> user
> had to dealt with it, since retrying to unplug the device is noy allowed, and 
> then
> in the machine reset we're removing the device from the guest. This means that
> we're failing the user twice - failed to hotunplug when asked, then 
> hotunplugged
> without notice.
> 
> Solutions to this problem range between trying to predict when the hotunplug 
> will
> fail and forbid the operation from the QEMU layer, from opening up the IRQ 
> queue
> to allow for multiple hotunplug attempts, from telling the users to 'reboot 
> the
> machine if something goes wrong'. The first solution is flawed because we 
> can't
> fully predict guest behavior from QEMU, the second solution is a trial and 
> error
> remediation that counts on a hope that the unplug will eventually succeed, 
> and the
> third is ... well.
> 
> This patch introduces a crude, but effective solution to hotunplug errors in
> the pSeries machine. For each unplug done, we'll timeout after some time. If
> a certain amount of time passes, we'll cleanup the hotunplug state from the 
> machine.
> During the timeout period, any unplug operations in the same device will still
> be blocked. After that, we'll assume that the guest failed the operation, and
> allow the user to try again. If the timeout is too short we'll prevent 
> legitimate
> hotunplug situations to occur, so we'll need to overestimate the regular time
> an unplug operation takes to succeed to account that.
> 
> The true solution for the hotunplug errors in the pSeries machines is a PAPR 
> change
> to allow for the guest to warn the platform about it. For now, the work done 
> in this
> timeout design can be used for the new PAPR 'abort hcall' in the future, 
> given that
> for both cases we'll need code to cleanup the existing unplug states of the 
> DRCs.
> 
> At this moment we're adding the basic wiring of the timer into the DRC. Next 
> patch
> will use the timer to timeout failed CPU hotunplugs.
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
>  hw/ppc/spapr_drc.c         | 36 ++++++++++++++++++++++++++++++++++++
>  include/hw/ppc/spapr_drc.h |  2 ++
>  2 files changed, 38 insertions(+)
> 
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 67041fb212..c88bb524c5 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -57,6 +57,8 @@ static void spapr_drc_release(SpaprDrc *drc)
>      drck->release(drc->dev);
>  
>      drc->unplug_requested = false;
> +    timer_del(drc->unplug_timeout_timer);
> +
>      g_free(drc->fdt);
>      drc->fdt = NULL;
>      drc->fdt_start_offset = 0;
> @@ -453,6 +455,24 @@ static const VMStateDescription 
> vmstate_spapr_drc_unplug_requested = {
>      }
>  };
>  
> +static bool spapr_drc_unplug_timeout_timer_needed(void *opaque)
> +{
> +    SpaprDrc *drc = opaque;
> +
> +    return timer_pending(drc->unplug_timeout_timer);
> +}
> +
> +static const VMStateDescription vmstate_spapr_drc_unplug_timeout_timer = {
> +    .name = "DRC unplug timeout timer",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_drc_unplug_timeout_timer_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_TIMER_PTR(unplug_timeout_timer, SpaprDrc),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static bool spapr_drc_needed(void *opaque)
>  {
>      SpaprDrc *drc = opaque;
> @@ -486,10 +506,20 @@ static const VMStateDescription vmstate_spapr_drc = {
>      },
>      .subsections = (const VMStateDescription * []) {
>          &vmstate_spapr_drc_unplug_requested,
> +        &vmstate_spapr_drc_unplug_timeout_timer,
>          NULL
>      }
>  };
>  
> +static void drc_unplug_timeout_cb(void *opaque)
> +{
> +    SpaprDrc *drc = opaque;
> +
> +    if (drc->unplug_requested) {
> +        drc->unplug_requested = false;
> +    }

Sorry, forgot to mention in first pass - I think we want some kind of
reporting here.  At least a trace, and maybe also a warn_report() or
the like.

Hrm.. looking wider, I wonder if we should add a DEVICE_DELETE_FAILED
message to QAPI to mirror the DEVICE_DELETED message.  Fixing PAPR is
pretty tedious, but fixing at least qemu's own interfaces is a bit
more approachable.  That could certainly be a follow up enhancement,
though.

> +}
> +
>  static void drc_realize(DeviceState *d, Error **errp)
>  {
>      SpaprDrc *drc = SPAPR_DR_CONNECTOR(d);
> @@ -512,6 +542,11 @@ static void drc_realize(DeviceState *d, Error **errp)
>      object_property_add_alias(root_container, link_name,
>                                drc->owner, child_name);
>      g_free(link_name);
> +
> +    drc->unplug_timeout_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> +                                             drc_unplug_timeout_cb,
> +                                             drc);
> +
>      vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), 
> &vmstate_spapr_drc,
>                       drc);
>      trace_spapr_drc_realize_complete(spapr_drc_index(drc));
> @@ -529,6 +564,7 @@ static void drc_unrealize(DeviceState *d)
>      name = g_strdup_printf("%x", spapr_drc_index(drc));
>      object_property_del(root_container, name);
>      g_free(name);
> +    timer_free(drc->unplug_timeout_timer);
>  }
>  
>  SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 02a63b3666..b2e6222d09 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -187,6 +187,8 @@ typedef struct SpaprDrc {
>      bool unplug_requested;
>      void *fdt;
>      int fdt_start_offset;
> +
> +    QEMUTimer *unplug_timeout_timer;
>  } SpaprDrc;
>  
>  struct SpaprMachineState;

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]