qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SM


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI
Date: Thu, 24 Nov 2016 01:31:50 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.0

On 11/24/16 01:01, Laszlo Ersek wrote:
> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
>>
>>
>> On 18/11/2016 11:36, Laszlo Ersek wrote:
>>> This is v3 of the series, with updates based on the v2 discussion:
>>> <http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg02687.html>.
>>>
>>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>>
>>> Tested with new OVMF patches (about to send out those as well).
>>> Regression tested with SeaBIOS (beyond simple functional tests with
>>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>>> with SeaBIOS).
>>>
>>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>>> crashes very quickly for me when running OVMF:
>>>
>>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>>
>>> It is my understanding that there are patches on the list for this:
>>>
>>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>>
>>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>>> differences.
>>
>> Hi Laszlo,
>>
>> sorry for the slightly delayed reply.
>>
>> First of all, I'm wondering if we would be better off adding a new port
>> 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.
> 
> BTW, what happens in QEMU if the guest reads an unimplemented port?
> Hm... unassigned_io_write() seems to be a no-op, and
> unassigned_io_read() returns all-bits-one. This means that for a new
> port, the negotiation protocol / values have to be reworked.
> 
> Port 0xB1 is occupied by ICH9 according to the spec:
> 
>   Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)
> 
>   I/O
>   Address  Read Target           Write Target          Internal Unit
>   -------  --------------------  --------------------  -------------
>   B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt
> 
> I wonder if we care -- after all, APM_STS (0xB3) is documented not to
> have any hardware effects ("scratchpad register").
> 
>> Second, I now remembered the reason why I was against broadcast SMI.
>> The reason is that it breaks hot-plug.
> 
> How does it break hot-plug? After reading your explanation below: is it
> that the broadcast SMI (possibly raised by the OS directly) would get to
> the new VCPU before the firmware relocated its SMBASE?
> 
>>
>> On hot-plug, the firmware (if it wants to use SMI for anything secure)
>> must relocate the SMBASE of the newly-hotplugged CPU before the OS has
>> any chance to put its fangs on it.  This is because the OS can send
>> direct SMIs and is in control of the area at 0x30000.  So OVMF is
>> already broken with respect to hotplug,
> 
> Yes. We theorized that there could be further edk2 core modules that
> hadn't been open sourced yet, necessary for handling VCPU hotplug.
> 
>> but I am not yet sure if these
>> patches would break it further.
> 
> Hard to say without seeing those modules.
> 
> I will speculate: assuming that the non-public modules fit together with
> the public modules in some way, I expect the broadcast SMI shouldn't
> break them. The reason is that the broadcast SMI / traditional delivery
> are the default method in UefiCpuPkg, and in practice they work better
> (more reliably) with the rest of the edk2 infrastructure, in my
> experience, than the relaxed sync method.
> 
> In his review today, Jordan wrote
> <https://lists.01.org/pipermail/edk2-devel/2016-November/005088.html>,
> 
>> I'm glad we'll be using a mechanism that broadcasts to all the
>> processors like the real hardware. It is a bit unfortunate that it
>> doesn't go through the b2 port for it.
> 
> If broadcast is how real hardware does it (even by default!,
> apparently), I expect those as-yet unreleased, hotplug-handling modules
> in edk2 should be able to cope with broadcast.
> 
>> The full solution is to follow a protocol similar to what real hardware
>> does.
> 
> Real hardware seems to use broadcast, according to the above...
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> 
>> On hot-plug, before the new CPU starts running, the DSDT should
>> generate an SMI (with a well-known value written to 0xB2).
> 
> I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
> that's just another way to say, "the firmware vendor asks the operating
> system to write to 0xB2". If the malicious OS is smart enough, it can
> realize (from the hardware signal to run the ACPI GPE handler, IIRC)
> that a new VCPU is available, and simply not trigger the SMI.
> 
>> The handler
>> then:
>>
>> 1) waits for SMM rendezvous
>>
>> 2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
>> react to either INITs or of course SIPIs (this would need to be
>> implemented separately for both TCG and KVM! but only in QEMU, not in
>> the kernel).
> 
> Okay, this does plug the hole I sketched out above. This logic (the
> QEMU-specific unparking) can be done in another platform API in OVMF I
> guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
> provide the infrastructure in platform code up to the separate SMI
> command handler. I think it again depends on those unreleased modules.
> 
>> 3) relocates SMBASE
>>
>> 4) records the presence of the new VCPU's APIC id for subsequent SMIs.
>>
>>
>> The other important things are:
>>
>> * new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
>> interfaces.  I think this would only need a new bit in the write
>> register at 0x4 ("CPU device control fields").
>>
>> The 0x0-0x3 write register, currently reserved for reading, might
>> become read/write for easier communication with the SMI handler.  The
>> SMI handler would write 1 to the new bit in order to unpark the VCPU.
>>
>> * how to enable this.  I think it would need a new SMM feature, just
>> like those that you are adding here, in order to make it negotiable.  If
>> it is not negotiated, but broadcast SMI is negotiated, you'd need to do
>> something such as not allowing CPU-hotplug.  (This is the only part that
>> I think is required for 2.9).
> 
> My hope is that we can work on this incrementally (or at least that we
> don't forfeit our chance at SMM + VCPU hotplug) by going down the
> broadcast SMI route first. Based on what Jordan wrote, this hope should
> not be unfounded.
> 
> Also, the SMM stack (across components: core edk2 code, OVMF platform
> code, QEMU and KVM) is now more than a year old, and we still have
> stability problems (hence this series). I'd like to stabilize the base
> features before getting wild with hotplug.
> 
> (The fact that OVMF lags behind SeaBIOS in a number of features is just
> business as usual.)
> 
> So, I'd vastly prefer if I needed to extend this series only with a way
> to prevent VCPUs from being hotplugged. I think negotiating SMI
> broadcast should be good enough for that, as a hook point, given that it
> runs in the firmware before the OS gets control (on S3 resume too).
> 
> I guess a global variable (or a board-level query function) that would
> cause pc_query_hotpluggable_cpus() to return an empty list would be
> frowned upon; what's the recommended method?
> 
> There is DeviceClass.hotpluggable, but that's not a dynamic property.
> 
>>
>> In order to trigger the SMI, the
>>
>>                  ifctx = aml_if(aml_equal(ins_evt, one));
>>                  {
>>                      aml_append(ifctx,
>>                          aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
>>                      aml_append(ifctx, aml_store(one, ins_evt));
>>                      aml_append(ifctx, aml_store(one, has_event));
>>                  }
>>
>> would be replaced by something like this pseudo-AML:
>>
>>              Store(And(smm_features, 2), Local1)
>>              ...
>>              Store(next_cpu_cmd, cpu_cmd)
>>              If (Equal(ins_evt, One)) {
>>                      If (Greater(Local1, Zero)) {
>>                              Store(CPU_HP_APM_CMD, apm_cmd)
>>                      }
>>                      CPU_NOTIFY_METHOD(cpu_data, dev_chk)
>>                      Store(One, ins_evt)
>>                      Store(One, has_event)
>>              }
>>
>>
>> Of course all this is for OVMF only.  SeaBIOS doesn't need to do
>> anything of this, because it actually likes to have its SMIs only on the
>> current CPU (and it had better be the BSP, since SMBASE is not relocated
>> elsewhere!).
>>
>> Igor, any thoughts?
>>
>> I understand that this is quite huge in both OVMF and QEMU, but we've
>> only been delaying it and we knew about it. :(
> 
> I agree about being aware of lack of VCPU hotplug support wrt. SMM.
> 
> I disagree about delaying it. I've been aware of problems with the base
> SMM features (of differing importance), for example
> 
> - that S3 resume with the Ia32 build of OVMF was almost hit-or-miss,
> 
> - or that the heavy use of UEFI variable services during Windows 8 boot
> caused user-visible choppiness in the rotating animation (because of how
> our SMIs worked) -- I think I even mentioned this to you.
> 
> Since the stalemate in
> 
>   http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg00125.html
> 
> from last November, I've never stopped disliking the lack of broadcast
> SMIs, but I couldn't do anything about it, despite it leaving the base
> feature set unreliable.

And let's not forget about "niceties" like

- host kernel commit 879ae1880449 ("KVM: x86: obey
KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()"), and

- kernel bugzilla <https://bugzilla.kernel.org/show_bug.cgi?id=116731>,

which ate up incredible amounts of time, sending a whole lot of
VFIO/OVMF/GPU-assign gamers from the vfio-users list complaining on
every forum they could think of.

(These issues weren't related to SMM directly, but PiSmmCpuDxeSmm has a
depex on gEfiMpServiceProtocolGuid, so anything that interferes with
multiprocessing in CpuDxe or MpInitLib is automatically an SMM bug too.)

I can think of yet more: have you gotten around looking into
<https://bugzilla.redhat.com/show_bug.cgi?id=1348092>, which is about
SMM breaking on KVM hosts that lack EPT?

I believe you haven't. Which is fine. I just take issue with the notion
that we've been "too lazy" to enable SMM + VCPU hotplug.

We got problems that come before.

Thanks
Laszlo

> Recently, the open source edk2 SMM stack has acquired a bunch of new
> code (you're aware), which exacerbate the instabilities (as I've
> documented on the edk2-devel list in excruciating detail). I was
> overjoyed when you finally suggested (!) to add broadcast SMI, getting
> the discussion un-stuck, because (a) it miraculously fixes everything,
> and (b) honestly, I see no other way from keeping the SMM stack from
> falling apart otherwise, *without* VCPU hotplug in the picture.
> 
> For the last few weeks, I've been working day and night (I'm not
> exaggerating) on testing & reviewing edk2 patches related to SMM or even
> just multiprocessing, catching errors, reporting them, and even
> debugging and fixing them myself. (See for example commits
> 
>   00650c531a90 UefiCpuPkg/MpInitLib/X64/MpFuncs.nasm: fix fatal typo
>   4af3ae146379 UefiCpuPkg/LocalApicLib: fix feature test for Extended
>                Topology CPUID leaf
>   1cbd83308989 UefiCpuPkg/MpInitLib: fix feature test for Extended
>                Topology CPUID leaf
> 
> ). Putting out fires more or less as a norm.
> 
> Furthermore, I haven't been pretending that VCPU hotplug "doesn't
> exist", see
> 
>   https://lists.01.org/pipermail/edk2-devel/2016-November/005131.html
> 
> for example, which I posted independently, ~8 hours ago.
> 
> So, no. I reject the idea that we've been procrastinating SMM enabled
> VCPU hotplug. We can't build the second floor while the foundation is
> shaking. And then we need to ask Intel to release more code as open source.
> 
> Thanks
> Laszlo
> 
>>
>> Paolo
>>
>>> Cc: "Kevin O'Connor" <address@hidden>
>>> Cc: "Michael S. Tsirkin" <address@hidden>
>>> Cc: Gerd Hoffmann <address@hidden>
>>> Cc: Paolo Bonzini <address@hidden>
>>>
>>> Thanks
>>> Laszlo
>>>
>>> Laszlo Ersek (3):
>>>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>>>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>>>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
>>>
>>>  docs/specs/q35-apm-sts.txt | 80 
>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/hw/i386/ich9.h     |  9 ++++++
>>>  include/hw/isa/apm.h       |  9 +++---
>>>  hw/acpi/piix4.c            |  2 +-
>>>  hw/isa/apm.c               | 15 ++++++---
>>>  hw/isa/lpc_ich9.c          | 64 +++++++++++++++++++++++++++++++++++--
>>>  hw/isa/vt82c686.c          |  2 +-
>>>  7 files changed, 168 insertions(+), 13 deletions(-)
>>>  create mode 100644 docs/specs/q35-apm-sts.txt
>>>
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]