From: Greg Kurz
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/4] spapr: disable hotplugging without OS
Date: Wed, 24 May 2017 17:54:10 +0200

On Wed, 24 May 2017 12:14:02 +0200
Igor Mammedov <address@hidden> wrote:
> On Wed, 24 May 2017 11:28:57 +0200
> Greg Kurz <address@hidden> wrote:
>
> > On Wed, 24 May 2017 15:07:54 +1000
> > David Gibson <address@hidden> wrote:
> >
> > > On Tue, May 23, 2017 at 01:18:11PM +0200, Laurent Vivier wrote:
> > > > If the OS is not started, QEMU sends an event to the OS
> > > > that is lost and cannot be recovered. An unplug is not
> > > > able to restore QEMU in a coherent state.
> > > > So, while the OS is not started, disable CPU and memory hotplug.
> > > > We use option vector 6 to know if the OS is started
> > > >
> > > > Signed-off-by: Laurent Vivier <address@hidden>
> > >
> > > Urgh.. I'm not terribly confident that this is really correct. As
> > > discussed on the previous patch, you're essentially using OV6 as a
> > > flag that CAS is complete.
> > >
> > > But while it undoubtedly makes the race window much smaller, I don't
> > > see that there's any guarantee the guest OS will really be able to
> > > handle hotplug events immediately after CAS.
> > >
> > > In particular if the CAS process completes partially but then needs to
> > > trigger a reboot, I think that would end up setting the ov6 variable,
> > > but the OS would definitely not be in a state to accept events.
> wouldn't the guest on reboot pick up the updated fdt and online the cpu
> hotplugged before the crash along with the initial cpus?
>
Yes and that's what actually happens with cpus.
But catching up with the background for this series, I have the
impression that the issue isn't the fact that we lose an event if the OS
isn't started (which is not true), but rather something going wrong
when hotplugging and then unplugging memory, as described in this commit:

commit fe6824d12642b005c69123ecf8631f9b13553f8b
Author: Laurent Vivier <address@hidden>
Date:   Tue Mar 28 14:09:34 2017 +0200

    spapr: fix memory hot-unplugging

> > We never have any guarantee that the OS will process an event that
> > we've sent actually (think of a kernel crash just after a successful
> > CAS negotiation for example, or any failure with the various guest
> > components involved in the process of hotplug).
> >
> > > Mike, I really think we need some input from someone familiar with how
> > > these hotplug events are supposed to work. What do we need to do to
> > > handle lost or stale events, such as those delivered when an OS is not
> > > booted.
> > >
> >
> > AFAIK, in the PowerVM world, the HMC exposes a user configurable timeout.
> >
> > https://www.ibm.com/support/knowledgecenter/POWER8/p8hat/p8hat_dlparprocpoweraddp6.htm
> >
> > I'm not sure we can do anything better than being able to "cancel" a
> > previous hotplug attempt if it takes too long, but I'm not necessarily
> > the expert you're looking for :)
>
> From x86/ACPI world:
> - if hotplug happens early at boot, before the guest OS is running, the
>   hotplug notification (SCI interrupt) stays pending and once the guest
>   is up it will/should handle it and online the CPU
> - if the guest crashed and is rebooted, it will pick up the updated ACPI
>   tables (fdt equivalent) with all present cpus (including the one
>   hotplugged before the crash) and online the hotplugged cpu along with
>   the coldplugged ones
> - if the guest loses the SCI somehow, it's considered a guest issue and
>   such a cpu cannot be unplugged until the guest picks it up somehow
>   (reboot, manually running the cpus scan method from ACPI, or another
>   cpu hotplug event) and explicitly ejects it.
>
> Taking into account that CPUs don't support surprise removal and require
> guest cooperation, it's fine to leave the CPU plugged in until the guest
> ejects it. That's what I'd expect to happen on baremetal:
> you hotplug a CPU, hardware notifies the OS about it and that's all,
> the cpu won't suddenly pop out if the OS isn't able to online it.
>
> Moreover, that hotplugged cpu might be executing some code, or one of the
> already present cpus might be executing initialization routines to online
> it (think of host overcommit and arbitrary delays), so it is not really
> safe to remove a hotplugged but not onlined cpu without OS consent
> (i.e. explicit eject by OS/firmware). I think the lost event handling
> should be fixed on the guest side and not in QEMU.
>
>
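
For illustration only, the x86/ACPI behaviour quoted above could be modelled
roughly as below. This is a toy sketch, not QEMU or kernel code: the class
GuestCpuModel and all of its method names are made up for this example, and it
assumes that a reboot always onlines every CPU described by the tables and that
the guest can only eject a CPU it has onlined.

```python
class GuestCpuModel:
    """Toy model of guest-cooperative CPU hotplug (all names hypothetical)."""

    def __init__(self, coldplugged):
        self.present = set(coldplugged)  # CPUs described by the tables (ACPI/fdt)
        self.online = set()              # CPUs the guest has onlined
        self.pending = []                # notifications not yet handled
        self.guest_up = False

    def boot(self):
        # Assumption: on (re)boot the guest reads the current tables and
        # onlines every CPU listed there, hotplugged or coldplugged.
        self.guest_up = True
        self.online = set(self.present)

    def crash(self):
        self.guest_up = False
        self.online = set()

    def hotplug(self, cpu):
        # The CPU becomes present at once; the notification stays pending
        # until the guest gets around to handling it.
        self.present.add(cpu)
        self.pending.append(cpu)

    def drop_pending(self):
        # Simulates the guest losing the notification.
        self.pending.clear()

    def handle_events(self):
        # Only a running guest can consume notifications.
        if not self.guest_up:
            return
        while self.pending:
            cpu = self.pending.pop(0)
            if cpu in self.present:
                self.online.add(cpu)

    def request_unplug(self, cpu):
        # No surprise removal: the CPU leaves only if the guest ejects it,
        # which it can do only for a CPU it has onlined.
        if not self.guest_up or cpu not in self.online:
            return False
        self.online.discard(cpu)
        self.present.discard(cpu)
        return True


model = GuestCpuModel(coldplugged={0})
model.boot()
model.hotplug(1)
model.drop_pending()             # the guest never sees the event
model.handle_events()
print(model.request_unplug(1))   # refused: CPU 1 stays plugged in
model.crash()
model.boot()                     # the reboot picks CPU 1 up from the tables
print(model.request_unplug(1))   # the guest can now eject it
```

In this model a lost event never leaves the machine in an incoherent state:
the CPU simply stays present until the guest learns about it and ejects it.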