[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_n
|
From: |
Vitaly Kuznetsov |
|
Subject: |
Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state() |
|
Date: |
Thu, 26 Oct 2023 10:52:46 +0200 |
Cc'ing Max :-) At first glance the condition in vmx_set_nested_state()
is correct so I guess we either have a stale
KVM_STATE_NESTED_RUN_PENDING when in SMM or stale smm.flags when outside
of it...
Philippe Mathieu-Daudé <philmd@linaro.org> writes:
> Cc'ing Vitaly.
>
> On 26/10/23 07:49, Eiichi Tsukata wrote:
>> Hi all,
>>
>> Here is additional details on the issue.
>>
>> We've found this issue when testing Windows Virtual Secure Mode (VSM) VMs.
>> We sometimes saw live migration failures of VSM-enabled VMs. It turned
>> out that the issue happens during live migration when VMs change boot related
>> EFI variables (ex: BootOrder, Boot0001).
>> After some debugging, I've found the race I mentioned in the commit message.
>>
>> Symptom
>> =======
>>
>> When it happnes with the latest Qemu which has commit
>> https://github.com/qemu/qemu/commit/7191f24c7fcfbc1216d09
>> Qemu shows the following error message on destination.
>>
>> qemu-system-x86_64: Failed to put registers after init: Invalid argument
>>
>> If it happens with older Qemu which doesn't have the commit, then we see
>> CPU dump something like this:
>>
>> KVM internal error. Suberror: 3
>> extra data[0]: 0x0000000080000b0e
>> extra data[1]: 0x0000000000000031
>> extra data[2]: 0x0000000000000683
>> extra data[3]: 0x000000007f809000
>> extra data[4]: 0x0000000000000026
>> RAX=0000000000000000 RBX=0000000000000000 RCX=0000000000000000
>> RDX=0000000000000f61
>> RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000
>> RSP=0000000000000000
>> R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000
>> R11=0000000000000000
>> R12=0000000000000000 R13=0000000000000000 R14=0000000000000000
>> R15=0000000000000000
>> RIP=000000000000fff0 RFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
>> SS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> DS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> FS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> GS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> LDT=0000 0000000000000000 ffffffff 00c00000
>> TR =0040 000000007f7df050 00068fff 00808b00 DPL=0 TSS64-busy
>> GDT= 000000007f7df000 0000004f
>> IDT= 000000007f836000 000001ff
>> CR0=80010033 CR2=000000000000fff0 CR3=000000007f809000 CR4=00000668
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000d00
>> Code=?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? <??> ??
>> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
>> ?? ?? ??
>>
>> In the above dump, CR3 is pointing to SMRAM region though SMM=0.
>>
>> Repro
>> =====
>>
>> Repro step is pretty simple.
>>
>> * Run SMM enabled Linux guest with secure boot enabled OVMF.
>> * Run the following script in the guest.
>>
>> /usr/libexec/qemu-kvm &
>> while true
>> do
>> efibootmgr -n 1
>> done
>>
>> * Do live migration
>>
>> On my environment, live migration fails in 20%.
>>
>> VMX specific
>> ============
>>
>> This issue is VMX sepcific and SVM is not affected as the validation
>> in svm_set_nested_state() is a bit different from VMX one.
>>
>> VMX:
>>
>> static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>> struct kvm_nested_state __user
>> *user_kvm_nested_state,
>> struct kvm_nested_state *kvm_state)
>> {
>> .. /* * SMM temporarily disables VMX, so we cannot
>> be in guest mode,
>> * nor can VMLAUNCH/VMRESUME be pending. Outside SMM, SMM flags
>> * must be zero.
>> */ if (is_smm(vcpu) ?
>> (kvm_state->flags &
>> (KVM_STATE_NESTED_GUEST_MODE |
>> KVM_STATE_NESTED_RUN_PENDING))
>> : kvm_state->hdr.vmx.smm.flags)
>> return -EINVAL;
>> ..
>>
>> SVM:
>>
>> static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>> struct kvm_nested_state __user
>> *user_kvm_nested_state,
>> struct kvm_nested_state *kvm_state)
>> {
>> .. /* SMM temporarily disables SVM, so we cannot be in guest
>> mode. */ if (is_smm(vcpu) && (kvm_state->flags &
>> KVM_STATE_NESTED_GUEST_MODE))
>> return -EINVAL;
>> ..
>>
>> Thanks,
>>
>> Eiichi
>>
>>> On Oct 26, 2023, at 14:42, Eiichi Tsukata <eiichi.tsukata@nutanix.com>
>>> wrote:
>>>
>>> kvm_put_vcpu_events() needs to be called before kvm_put_nested_state()
>>> because vCPU's hflag is referred in KVM vmx_get_nested_state()
>>> validation. Otherwise kvm_put_nested_state() can fail with -EINVAL when
>>> a vCPU is in VMX operation and enters SMM mode. This leads to live
>>> migration failure.
>>>
>>> Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
>>> ---
>>> target/i386/kvm/kvm.c | 13 +++++++++----
>>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>>> index e7c054cc16..cd635c9142 100644
>>> --- a/target/i386/kvm/kvm.c
>>> +++ b/target/i386/kvm/kvm.c
>>> @@ -4741,6 +4741,15 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>>> return ret;
>>> }
>>>
>>> + /*
>>> + * must be before kvm_put_nested_state so that HF_SMM_MASK is set
>>> during
>>> + * SMM.
>>> + */
>>> + ret = kvm_put_vcpu_events(x86_cpu, level);
>>> + if (ret < 0) {
>>> + return ret;
>>> + }
>>> +
>>> if (level >= KVM_PUT_RESET_STATE) {
>>> ret = kvm_put_nested_state(x86_cpu);
>>> if (ret < 0) {
>>> @@ -4787,10 +4796,6 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>>> if (ret < 0) {
>>> return ret;
>>> }
>>> - ret = kvm_put_vcpu_events(x86_cpu, level);
>>> - if (ret < 0) {
>>> - return ret;
>>> - }
>>> if (level >= KVM_PUT_RESET_STATE) {
>>> ret = kvm_put_mp_state(x86_cpu);
>>> if (ret < 0) {
>>> --
>>> 2.41.0
>>>
>>
>>
>
--
Vitaly