[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-ppc] [PATCH v2] spapr: fix memory hotplug error path
From: |
David Gibson |
Subject: |
Re: [Qemu-ppc] [PATCH v2] spapr: fix memory hotplug error path |
Date: |
Tue, 4 Jul 2017 20:57:55 +1000 |
User-agent: |
Mutt/1.8.0 (2017-02-23) |
On Tue, Jul 04, 2017 at 09:42:52AM +0200, Greg Kurz wrote:
> On Mon, 3 Jul 2017 14:03:07 -0300
> Daniel Henrique Barboza <address@hidden> wrote:
>
> > On 07/03/2017 11:54 AM, Greg Kurz wrote:
> > > QEMU shouldn't abort if spapr_add_lmbs()->spapr_drc_attach() fails.
> > > Let's propagate the error instead, like it is done everywhere else
> > > where spapr_drc_attach() is called.
> > >
> > > Signed-off-by: Greg Kurz <address@hidden>
> > > ---
> > > Changes in v2: - added rollback code
> > > ---
> > > hw/ppc/spapr.c | 26 ++++++++++++++++++++++----
> > > 1 file changed, 22 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 70b3fd374e2b..ba8f57a5a054 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -2601,6 +2601,7 @@ static void spapr_add_lmbs(DeviceState *dev,
> > > uint64_t addr_start, uint64_t size,
> > > int i, fdt_offset, fdt_size;
> > > void *fdt;
> > > uint64_t addr = addr_start;
> > > + Error *local_err = NULL;
> > >
> > > for (i = 0; i < nr_lmbs; i++) {
> > > drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
> > > @@ -2611,7 +2612,18 @@ static void spapr_add_lmbs(DeviceState *dev,
> > > uint64_t addr_start, uint64_t size,
> > > fdt_offset = spapr_populate_memory_node(fdt, node, addr,
> > >
> > > SPAPR_MEMORY_BLOCK_SIZE);
> > >
> > > - spapr_drc_attach(drc, dev, fdt, fdt_offset, errp);
> > > + spapr_drc_attach(drc, dev, fdt, fdt_offset, &local_err);
> > > + if (local_err) {
> > > + while (addr > addr_start) {
> > > + addr -= SPAPR_MEMORY_BLOCK_SIZE;
> > > + drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
> > > + addr / SPAPR_MEMORY_BLOCK_SIZE);
> > > + spapr_drc_detach(drc, dev, NULL);
> >
> > Question: why pass NULL instead of '&local_err' in detach() ?
>
> local_err already contains the error returned by spapr_drc_attach(), it
> cannot be used anymore.
>
> > Can we ignore any
> > any error that happens in detach() because you're propagating the
> > local_err set
> > in spapr_drc_attach() already?
>
> This is a rollback path: the best we can do is to try to undo the
> already attached DRCs...
Yeah, this is basically the classic exception-during-destructor
problem. Dealing with an error condition while in the middle of
rolling back another error can get really messy.
> > ps: I am aware that ATM detach() does not propagate anything to the errp
> > being
> > passed. I don't think it's wise to write new code based on this
> > assumption though.
> >
>
> ... and indeed detach() cannot fail with the current code base. :)
> David was even planning to drop the errp argument in the "spapr:
> Refactor spapr_drc_detach()" patch.
Yeah, but as you pointed out in comments on that patch, that's
probably not a good idea. Except.. then I looked deeper still and the
story gets more complex again. I'll revisit that when I respin those
DRC patches.
> Otherwise, maybe pass &error_abort since failing to rollback is
> likely to be a bug ?
Yeah, I think the only sane options here are either to ignore (second
order) errors, or abort.
My usual inclination would be to abort, but I'm a bit concerned that
provides a means by which the guest can crash qemu. I'm going to
apply the patch as is for now, since I'm pretty confident that it
makes things better than they weren. I hope we can continue to
improve the detach error handling as a follow up.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature