qemu-devel

Re: [question] VFIO Device Migration: The vCPU may be paused during vfio


From: Kunkun Jiang
Subject: Re: [question] VFIO Device Migration: The vCPU may be paused during vfio device DMA in iommu nested stage mode && vSVA
Date: Mon, 27 Sep 2021 20:30:03 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

Hi Kevin:

On 2021/9/24 14:47, Tian, Kevin wrote:
From: Kunkun Jiang <jiangkunkun@huawei.com>
Sent: Friday, September 24, 2021 2:19 PM

Hi all,

I encountered a problem in vfio device migration testing: the
vCPU may be paused during a vfio-pci DMA in iommu nested
stage mode with vSVA. This can cause migration to fail, along
with other problems that depend on the device hardware and
driver implementation.

It may be a bit early to discuss this issue; after all, iommu
nested stage mode and vSVA are not yet mature. But judging
from the current implementation, we will definitely encounter
this problem in the future.
Yes, this is a known limitation to support migration with vSVA.

This is the current process of vSVA processing translation fault
in iommu nested stage mode (take SMMU as an example):

guest os             4. handle translation fault
                     5. send CMD_RESUME to vSMMU

qemu                 3. inject fault into guest os
(vfio/vsmmu)         6. deliver response to host os

host os              2. notify qemu
(vfio/smmu)          7. send CMD_RESUME to SMMU

SMMU                 1. address translation fault
                     8. retry or terminate

The order is 1 ---> 8.

Currently, qemu may pause the vCPU at any step. It is possible to
pause the vCPU at steps 1-5, that is, in the middle of a DMA. This
can cause migration to fail, along with other problems that depend
on the device hardware and driver implementation. For example, the
device state cannot be changed from RUNNING && SAVING to SAVING,
because the device's DMA is not finished.

As far as I can see, the vCPU should not be paused during a device
I/O process such as DMA. However, live migration currently pays no
attention to the state of the vfio device when pausing the vCPU;
and as long as the vCPU is not paused, the vfio device keeps
running. This looks like a *deadlock*.
Basically this requires:

1) stopping vCPU after stopping device (could selectively enable
this sequence for vSVA);
How can we tell whether vSVA is enabled?
In fact, this problem exists whenever we are in iommu nested
stage mode, with or without vSVA. In no-vSVA mode, a fault can
also be generated by modifying the guest device driver.

2) when stopping the device, the driver should block new requests
from the vCPU (queued to a pending list) and then drain all in-flight
requests including faults;
     * blocking further requires switching the cmd portal from the
fast path to the slow trap-emulation path before stopping
the device;

3) save the pending requests in the vm image and replay them
after the vm is resumed;
     * finally disable blocking by switching back to the fast-path for
the cmd portal;
Has any related patch been sent out and discussed? I might have
overlooked it.

We may be able to discuss and finalize a specification for this
problem.

Thanks,
Kunkun Jiang
Do you have any ideas to solve this problem?
Looking forward to your reply.

We verified above flow can work in our internal POC.

Thanks
Kevin




