On Wed, Nov 25, 2020 at 8:09 AM Jason Wang <jasowang@redhat.com> wrote:
On 2020/11/21 2:50 AM, Eugenio Pérez wrote:
This series enables vDPA software assisted live migration for vhost-net
devices. This is a new method of vhost device migration: instead of
relying on the vDPA device's dirty logging capability, SW assisted LM
intercepts the dataplane, forwarding the descriptors between VM and device.
In this migration mode, qemu offers a new vring to the device to
read and write into, and disables vhost notifiers, processing guest and
vhost notifications in qemu. On used buffer relay, qemu will mark the
dirty memory as with plain virtio-net devices. This way, devices do
not need to have dirty page logging capability.
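To illustrate the dirty marking done on used buffer relay, here is a minimal sketch, not the code from this series. It assumes a 4 KiB page size and a flat one-bit-per-page bitmap; `mark_dirty` and `dirty_pages` are hypothetical names chosen for the example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB guest pages


def mark_dirty(bitmap: bytearray, gpa: int, length: int) -> None:
    """Set one bit for every guest page touched by [gpa, gpa + length)."""
    if length <= 0:
        return
    first = gpa // PAGE_SIZE
    last = (gpa + length - 1) // PAGE_SIZE
    for page in range(first, last + 1):
        bitmap[page // 8] |= 1 << (page % 8)


def dirty_pages(bitmap: bytearray) -> list:
    """Return the indexes of all pages whose bit is set."""
    return [i for i in range(len(bitmap) * 8)
            if (bitmap[i // 8] >> (i % 8)) & 1]
```

In this sketch, the relay would call `mark_dirty` once for each device-writable buffer the device reports as used, passing the guest physical address and the written length.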
This series is a POC doing SW LM for vhost-net devices, which already
have dirty page logging capabilities. None of the changes have any actual
effect with current devices until the last two patches (26 and 27) are
applied, but they can be rebased on top of any other. These check that
the device meets all requirements, and disable vhost-net device logging
so migration goes through SW LM. This last patch is not meant to be
applied in the final revision; it is in the series just for testing
purposes.
To use SW assisted LM these vhost-net devices need to be instantiated:
* With IOMMU (iommu_platform=on,ats=on)
* Without event_idx (event_idx=off)
So a question is at what level do we want to implement qemu assisted
live migration. To me it could be done at two levels:
1) generic vhost level which makes it work for both vhost-net/vhost-user
and vhost-vDPA
2) a specific type of vhost
To me, having a generic one looks better but it would be much more
complicated. So what I read from this series is that it is a vhost-kernel
specific software assisted live migration, which is a good start.
Actually it may even have a real use case, e.g. it can save dirty bitmaps
for guests with large memory. But we need to address the above
limitations first.
So I would like to know what's the reason for mandating iommu platform
and ats? And I think we need to fix the case of event idx support.
There is no specific reason for mandating iommu & ats; it just
started that way.
I will extend the patch to support those cases too.
Just the notification forwarding (with no descriptor relay) can be
achieved with patches 7 and 9, and starting migration. Partially applying
patches 13 to 24 will not work while migrating on the source, and patch
25 is needed for the destination to resume network activity.
It is based on the ideas of DPDK SW assisted LM, in the series of
DPDK's https://patchwork.dpdk.org/cover/48370/ .
Actually we're better than that since there's no need for tricks like a
hardcoded IOVA for the mediated (shadow) virtqueue.
I notice that you do GPA->VA translations and try to establish a VA->VA
(use VA as IOVA) mapping via device IOTLB. This shortcut should work for
vhost-kernel/user but not vhost-vDPA. The reason is that there's no
guarantee that the whole 64-bit address range can be used as IOVA. One
example is that a hardware IOMMU like Intel's usually has 47 or 52
bits of address width.
So we probably need an IOVA allocator that can make sure the IOVAs do not
overlap and fit within [1]. We can probably build the IOVA for guest VA
via memory listeners. Then we have
1) IOVA for GPA
2) IOVA for shadow VQ
And advertise IOVA to VA mapping to vhost.
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b48dc03e575a872404f33b04cd237953c5d7498
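The allocator idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed implementation: it is a simple bump allocator with no freeing, the usable range `[first, last]` is assumed to be known in advance (for example, whatever range the device reports), and `IovaAllocator`, `Mapping`, `map` and `translate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    iova: int   # address the device uses
    vaddr: int  # qemu virtual address backing it
    size: int


class IovaAllocator:
    """Hand out non-overlapping IOVA ranges inside [first, last]."""

    def __init__(self, first: int, last: int, align: int = 4096):
        self.next = first
        self.last = last
        self.align = align
        self.mappings = []

    def map(self, vaddr: int, size: int) -> Mapping:
        if size <= 0:
            raise ValueError("size must be positive")
        # Round the next free address up to the alignment.
        iova = -(-self.next // self.align) * self.align
        end = iova + size - 1
        if end > self.last:
            raise MemoryError("IOVA range exhausted")
        self.next = end + 1
        mapping = Mapping(iova, vaddr, size)
        self.mappings.append(mapping)
        return mapping

    def translate(self, iova: int) -> int:
        """Return the VA that backs a given IOVA."""
        for m in self.mappings:
            if m.iova <= iova < m.iova + m.size:
                return m.vaddr + (iova - m.iova)
        raise KeyError(hex(iova))
```

With something like this, guest memory regions seen by the memory listener and the shadow VQ rings would each get their own `map()` call, and the resulting IOVA to VA pairs are what would be advertised to vhost. A real allocator would also need to handle unmapping and holes, which this sketch leaves out.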
Got it, will control it too.
Maybe for vhost-net we could directly send iotlb miss for [0,~0ULL].