[PATCH 0/3] kvm: fix two svm pmu virtualization bugs

From: Dongli Zhang
Subject: [PATCH 0/3] kvm: fix two svm pmu virtualization bugs
Date: Sat, 19 Nov 2022 04:28:58 -0800

This patchset fixes two svm pmu virtualization bugs.

1. The 1st bug is that "-cpu host,-pmu" cannot disable svm pmu virtualization.

Neither "-cpu EPYC" nor "-cpu host,-pmu" disables the pmu virtualization.
The line below still shows up on the VM linux side ...

[    0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.

... although we expect something like below.

[    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
[    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled

Patches 1-2 disable the pmu virtualization via KVM_PMU_CAP_DISABLE if the
per-vcpu "pmu" property is disabled.
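
As a rough illustration of the mechanism (not the code from patches 1-2),
the request can be sketched as a KVM_ENABLE_CAP call on the VM fd with
KVM_CAP_PMU_CAPABILITY and KVM_PMU_CAP_DISABLE. The helper names
pmu_disable_request() and vm_disable_pmu() are hypothetical, and the
#ifndef fallbacks only assume that older kernel headers may lack the two
constants. KVM only honors the capability before any vcpu exists, which is
why it has to run before the 1st vcpu is created.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallbacks for older kernel headers; values as in the Linux uapi headers. */
#ifndef KVM_CAP_PMU_CAPABILITY
#define KVM_CAP_PMU_CAPABILITY 212
#endif
#ifndef KVM_PMU_CAP_DISABLE
#define KVM_PMU_CAP_DISABLE (1 << 0)
#endif

/*
 * Hypothetical helper: fill in the KVM_ENABLE_CAP request that turns pmu
 * virtualization off. Returns false (and leaves *cap untouched) when the
 * "pmu" property is enabled and no request is needed.
 */
bool pmu_disable_request(bool pmu_enabled, struct kvm_enable_cap *cap)
{
    if (pmu_enabled) {
        return false;
    }
    memset(cap, 0, sizeof(*cap));
    cap->cap = KVM_CAP_PMU_CAPABILITY;
    cap->args[0] = KVM_PMU_CAP_DISABLE;
    return true;
}

/*
 * Hypothetical wrapper: issue the request on the VM fd. It is meant to be
 * called before the 1st vcpu is created. Returns 0 on success or when
 * there is nothing to do, and -1 with errno set if the ioctl fails.
 */
int vm_disable_pmu(int vm_fd, bool pmu_enabled)
{
    struct kvm_enable_cap cap;

    if (!pmu_disable_request(pmu_enabled, &cap)) {
        return 0;
    }
    return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

Keeping the request construction separate from the ioctl makes the decision
easy to check without access to /dev/kvm.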

I initially considered KVM_X86_SET_MSR_FILTER. Both KVM_X86_SET_MSR_FILTER
and KVM_PMU_CAP_DISABLE are configured via VM ioctls; I finally used the
latter because it is easier to use.

2. The 2nd bug is that un-reclaimed perf events (after QEMU system_reset)
at the KVM side may inject random unwanted/unknown NMIs to the VM.

The svm pmu registers are not reset during QEMU system_reset.

(1). The VM resets (e.g., via QEMU system_reset or VM kdump/kexec) while it
is running "perf top". The pmu registers are not disabled gracefully.

(2). Although x86_cpu_reset() resets many registers to zero,
kvm_put_msrs() does not put the AMD pmu registers to the KVM side. As a
result, some pmu events are still enabled at the KVM side.

(3). The KVM pmc_speculative_in_use() always returns true so that the events
will not be reclaimed. The kvm_pmc->perf_event is still active.

(4). After the reboot, the VM kernel reports the error below:

[    0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
[    0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)

(5). In the worst case, the active kvm_pmc->perf_event is still able to
inject unknown NMIs randomly to the VM kernel.

[...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
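
To see why the leftover value matters, the "MSR c0010200 is 530076" value
from step (4) can be decoded by hand. The sketch below assumes the usual
AMD event select layout (event in bits 7:0, unit mask in bits 15:8, USR
bit 16, OS bit 17, interrupt enable bit 20, counter enable bit 22). The
struct and function names are hypothetical and exist only for this
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed AMD event select (PERF_CTL) bit layout, low 32 bits only. */
#define EVENTSEL_EVENT   0x000000ffULL  /* EventSelect[7:0] */
#define EVENTSEL_UMASK   0x0000ff00ULL  /* UnitMask */
#define EVENTSEL_USR     (1ULL << 16)   /* count in user mode */
#define EVENTSEL_OS      (1ULL << 17)   /* count in kernel mode */
#define EVENTSEL_INT     (1ULL << 20)   /* interrupt on overflow */
#define EVENTSEL_ENABLE  (1ULL << 22)   /* counter enabled */

/* Hypothetical container for the decoded fields. */
struct eventsel_fields {
    uint8_t event;
    uint8_t umask;
    bool usr;
    bool os;
    bool intr;
    bool enable;
};

/* Split a raw event select value into its fields. */
struct eventsel_fields decode_eventsel(uint64_t val)
{
    struct eventsel_fields f;

    f.event  = (uint8_t)(val & EVENTSEL_EVENT);
    f.umask  = (uint8_t)((val & EVENTSEL_UMASK) >> 8);
    f.usr    = (val & EVENTSEL_USR) != 0;
    f.os     = (val & EVENTSEL_OS) != 0;
    f.intr   = (val & EVENTSEL_INT) != 0;
    f.enable = (val & EVENTSEL_ENABLE) != 0;
    return f;
}
```

For 0x530076 this yields event 0x76 with unit mask 0x00 and the USR, OS,
interrupt and enable bits all set. In other words, the counter left over
from before the reset is still enabled and still raises an interrupt on
overflow, which matches the unknown NMIs in step (5).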

Patch 3 fixes the issue by resetting the AMD pmu registers in addition to
the Intel registers.
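
A minimal sketch of which registers such a reset has to cover, under the
assumption that the legacy K7 layout has 4 counters (event selects at
0xc0010000, counters at 0xc0010004) and the core extension has 6 counters
with interleaved control/counter MSRs starting at 0xc0010200. This is not
the code from patch 3; struct msr_entry and amd_pmu_reset_list() are
hypothetical names used only here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_K7_EVNTSEL0     0xc0010000u
#define MSR_K7_PERFCTR0     0xc0010004u
#define MSR_F15H_PERF_CTL0  0xc0010200u
#define MSR_F15H_PERF_CTR0  0xc0010201u

#define AMD_LEGACY_COUNTERS 4   /* assumed count for the K7 layout */
#define AMD_CORE_COUNTERS   6   /* assumed count with the core extension */

/* Hypothetical (index, value) pair, similar in spirit to a KVM MSR entry. */
struct msr_entry {
    uint32_t index;
    uint64_t data;
};

/*
 * Fill 'out' with the event select and counter MSRs, all set to zero.
 * Returns the number of entries written, or 0 if 'cap' is too small.
 */
size_t amd_pmu_reset_list(bool perfctr_core, struct msr_entry *out, size_t cap)
{
    size_t counters = perfctr_core ? AMD_CORE_COUNTERS : AMD_LEGACY_COUNTERS;
    /* Core extension MSRs are interleaved (ctl, ctr, ctl, ctr, ...). */
    uint32_t step = perfctr_core ? 2 : 1;
    uint32_t sel0 = perfctr_core ? MSR_F15H_PERF_CTL0 : MSR_K7_EVNTSEL0;
    uint32_t ctr0 = perfctr_core ? MSR_F15H_PERF_CTR0 : MSR_K7_PERFCTR0;
    size_t n = 0;

    if (cap < 2 * counters) {
        return 0;
    }
    for (size_t i = 0; i < counters; i++) {
        out[n].index = sel0 + (uint32_t)i * step;  /* event select */
        out[n].data = 0;
        n++;
        out[n].index = ctr0 + (uint32_t)i * step;  /* counter */
        out[n].data = 0;
        n++;
    }
    return n;
}
```

Writing zero to every event select clears the enable bit, so the KVM side
no longer has a reason to keep the corresponding perf event alive.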

This patchset does not cover PerfMonV2 until the patchset below is merged
into the KVM side.

[PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support

Dongli Zhang (3):
      kvm: introduce a helper before creating the 1st vcpu
      i386: kvm: disable KVM_CAP_PMU_CAPABILITY if "pmu" is disabled
      target/i386/kvm: get and put AMD pmu registers

 accel/kvm/kvm-all.c    |   7 ++-
 include/sysemu/kvm.h   |   2 +
 target/arm/kvm64.c     |   4 ++
 target/i386/cpu.h      |   5 +++
 target/i386/kvm/kvm.c  | 104 +++++++++++++++++++++++++++++++++++++++++++-
 target/mips/kvm.c      |   4 ++
 target/ppc/kvm.c       |   4 ++
 target/riscv/kvm.c     |   4 ++
 target/s390x/kvm/kvm.c |   4 ++
 9 files changed, 134 insertions(+), 4 deletions(-)

Thank you very much!

Dongli Zhang

