qemu-devel

From: Tian, Kevin
Subject: RE: [question] VFIO Device Migration: The vCPU may be paused during vfio device DMA in iommu nested stage mode && vSVA
Date: Sun, 26 Sep 2021 02:48:06 +0000

> From: Kirti Wankhede <kwankhede@nvidia.com>
> Sent: Friday, September 24, 2021 5:29 PM
> 
> On 9/24/2021 12:17 PM, Tian, Kevin wrote:
> >> From: Kunkun Jiang <jiangkunkun@huawei.com>
> >> Sent: Friday, September 24, 2021 2:19 PM
> >>
> >> Hi all,
> >>
> >> I encountered a problem in a vfio device migration test. The
> >> vCPU may be paused during vfio-pci DMA in iommu nested
> >> stage mode && vSVA. This may lead to migration failure and
> >> other problems related to device hardware and driver
> >> implementation.
> >>
> >> It may be a bit early to discuss this issue, after all, the iommu
> >> nested stage mode and vSVA are not yet mature. But judging
> >> from the current implementation, we will definitely encounter
> >> this problem in the future.
> >
> > Yes, this is a known limitation to support migration with vSVA.
> >
> >>
> >> This is the current process of vSVA processing translation fault
> >> in iommu nested stage mode (take SMMU as an example):
> >>
> >> guest os       4. handle translation fault     5. send CMD_RESUME to vSMMU
> >>
> >> qemu           3. inject fault into guest os   6. deliver response to host os
> >> (vfio/vsmmu)
> >>
> >> host os        2. notify the qemu              7. send CMD_RESUME to SMMU
> >> (vfio/smmu)
> >>
> >> SMMU           1. address translation fault    8. retry or terminate
> >>
> >> The order is 1--->8.
> >>
> >> Currently, qemu may pause the vCPU at any step. It is possible to
> >> pause the vCPU at steps 1-5, that is, in the middle of a DMA. This
> >> may lead to migration failure and other problems related to device
> >> hardware and driver implementation. For example, the device status
> >> cannot be changed from RUNNING && SAVING to SAVING,
> >> because the device DMA is not over.
> >>
> >> As far as I can see, vCPU should not be paused during a device
> >> IO process, such as DMA. However, currently live migration
> >> does not pay attention to the state of vfio device when pausing
> >> the vCPU. And if the vCPU is not paused, the vfio device is
> >> always running. This looks like a *deadlock*.
> >
> > Basically this requires:
> >
> > 1) stopping vCPU after stopping device (could selectively enable
> > this sequence for vSVA);
> >
> 
> I don't think this change is required. When vCPUs are at halt vCPU
> states are already saved, step 4 or 5 will be taken care by that. Then
> when device is transitioned in SAVING state, save qemu and host os state
> in the migration stream, i.e. state at step 2 and 3, depending on that
> take action while resuming, about step 6 or 7 to run.
> 

This is not like normal pending CPU interrupts, which can be saved and
migrated.

Here, to save the device state you need to drain in-flight requests. But
in-flight requests may already have hit I/O page faults and be waiting for
fault completion from the CPU. If you pause the vCPU in the middle, the
fault is never fixed, so the in-flight requests cannot be drained (unless
the device supports preemption per fault, which imho is not the case for
most SVA-capable devices). Then you'll either fail migration or migrate
broken device state.

vCPUs have to continue running until device draining has completed.
This requirement could be indicated via the migration region.
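
To make the ordering argument concrete, here is a toy sketch in Python. It
is not QEMU or VFIO code: ToyVM, Request and migrate are hypothetical names,
and the assumption that the guest fixes a fault in one tick is made only for
illustration. It just shows that a faulted request can never drain once the
vCPU is paused, while draining first and pausing afterwards succeeds.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """One in-flight DMA request; `faulted` means it hit an I/O page fault."""
    ident: int
    faulted: bool = False
    done: bool = False


@dataclass
class ToyVM:
    """Hypothetical model: a device with in-flight DMA plus one vCPU."""
    requests: list = field(default_factory=list)
    vcpu_running: bool = True

    def step(self) -> None:
        """Advance one tick: the guest fixes faults only while the vCPU runs."""
        for req in self.requests:
            if req.done:
                continue
            if req.faulted:
                if self.vcpu_running:
                    req.faulted = False  # guest handled the fault (steps 4-5)
            else:
                req.done = True  # DMA completes once translation succeeds

    def drain(self, max_ticks: int = 10) -> bool:
        """Try to drain all in-flight requests; return True on success."""
        for _ in range(max_ticks):
            if all(r.done for r in self.requests):
                return True
            self.step()
        return all(r.done for r in self.requests)


def migrate(pause_vcpu_first: bool) -> bool:
    """Return True if the device could be drained before saving its state."""
    vm = ToyVM(requests=[Request(0), Request(1, faulted=True)])
    if pause_vcpu_first:
        vm.vcpu_running = False  # order under discussion: vCPU stops first
        return vm.drain()
    drained = vm.drain()         # alternative: drain while the vCPU runs
    vm.vcpu_running = False      # only then pause the vCPU
    return drained
```

In this model migrate(True) cannot drain the faulted request, whereas
migrate(False) drains both requests before the vCPU is paused.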

Thanks
Kevin
