From: Eugenio Perez Martin
Subject: Re: [RFC PATCH 00/27] vDPA software assisted live migration
Date: Wed, 25 Nov 2020 13:03:16 +0100
On Wed, Nov 25, 2020 at 8:09 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/11/21 上午2:50, Eugenio Pérez wrote:
> > This series enables vDPA software assisted live migration for
> > vhost-net devices. This is a new method of vhost device migration:
> > instead of relying on the vDPA device's dirty logging capability, SW
> > assisted LM intercepts the dataplane, forwarding the descriptors
> > between VM and device.
> >
> > In this migration mode, qemu offers a new vring to the device to
> > read and write into, and disables vhost notifiers, processing guest
> > and vhost notifications in qemu. On used buffer relay, qemu marks the
> > dirty memory just as with plain virtio-net devices. This way, devices
> > do not need to have dirty page logging capability.
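(To make the dirty tracking above concrete: a minimal sketch, assuming
4 KiB pages. set_dirty_page() is a hypothetical stand-in for qemu's
migration dirty bitmap machinery, not a real API.)

#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

#define PAGE_SHIFT 12 /* assume 4 KiB pages */

/* Hypothetical stand-in for qemu's migration dirty bitmap. */
static void set_dirty_page(uint64_t gfn)
{
    printf("dirty gfn 0x%" PRIx64 "\n", gfn);
}

/* After relaying a used buffer that wrote `len` bytes at guest physical
 * address `gpa`, flag every page the write touched. */
static void mark_used_buffer_dirty(uint64_t gpa, uint64_t len)
{
    uint64_t first = gpa >> PAGE_SHIFT;
    uint64_t last = (gpa + len - 1) >> PAGE_SHIFT;

    for (uint64_t gfn = first; gfn <= last; gfn++) {
        set_dirty_page(gfn);
    }
}

int main(void)
{
    /* A 3000-byte write at 0x1f80 crosses a page boundary: gfns 1 and 2. */
    mark_used_buffer_dirty(0x1f80, 3000);
    return 0;
}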
> >
> > This series is a POC doing SW LM for vhost-net devices, which already
> > have dirty page logging capabilities. None of the changes have any
> > actual effect with current devices until the last two patches (26 and
> > 27) are applied, but they can be rebased on top of any other. These
> > check that the device meets all requirements and disable vhost-net
> > device logging so migration goes through SW LM. The last patch is not
> > meant to be applied in the final revision; it is in the series just
> > for testing purposes.
> >
> > To use SW assisted LM, these vhost-net devices need to be instantiated:
> > * With IOMMU (iommu_platform=on,ats=on)
> > * Without event_idx (event_idx=off)
>
>
> So a question is at what level we want to implement qemu assisted
> live migration. To me it could be done at two levels:
>
> 1) generic vhost level which makes it work for both vhost-net/vhost-user
> and vhost-vDPA
> 2) a specific type of vhost
>
> To me, having a generic one looks better but it would be much more
> complicated. So what I read from this series is that it is a vhost
> kernel specific software assisted live migration, which is a good
> start. Actually it may even have a real use case, e.g. it can save
> dirty bitmaps for guests with large memory. But we need to address the
> above limitations first.
>
> So I would like to know what's the reason for mandating iommu platform
> and ats. And I think we need to fix the case of event idx support.
>
There is no specific reason for mandating iommu & ats; it just started
that way.
I will extend the patch to support those cases too.
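For reference, with the current RFC that means an instantiation along
these lines (illustrative fragment, unrelated options omitted; a vIOMMU
such as intel-iommu must also be present for iommu_platform=on to work):

    -netdev tap,id=net0,vhost=on \
    -device virtio-net-pci,netdev=net0,iommu_platform=on,ats=on,event_idx=off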
>
> >
> > Just the notification forwarding (with no descriptor relay) can be
> > achieved with patches 7 and 9, and starting migration. Partially
> > applying patches 13 to 24 will not work while migrating on the
> > source, and patch 25 is needed for the destination to resume network
> > activity.
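(For intuition, that notification forwarding boils down to qemu draining
the kick eventfd it now owns and re-raising the notification towards the
device. A minimal sketch, assuming plain eventfds and leaving out the
vhost ioctl plumbing; all names are hypothetical.)

#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

static void forward_guest_kick(int guest_kick_fd, int device_kick_fd)
{
    uint64_t n;

    /* Drain the guest notification (the read resets the eventfd)... */
    if (read(guest_kick_fd, &n, sizeof(n)) == sizeof(n)) {
        uint64_t one = 1;

        /* ...descriptor relay would happen here, then kick the device. */
        write(device_kick_fd, &one, sizeof(one));
    }
}

int main(void)
{
    int guest_fd = eventfd(0, 0);  /* guest's kick fd, held by qemu */
    int device_fd = eventfd(0, 0); /* device's kick fd */
    uint64_t one = 1, got = 0;

    write(guest_fd, &one, sizeof(one)); /* simulate a guest kick */
    forward_guest_kick(guest_fd, device_fd);

    read(device_fd, &got, sizeof(got));
    printf("device saw %u kick(s)\n", (unsigned)got);
    return 0;
}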
> >
> > It is based on the ideas of DPDK SW assisted LM, in the series of
>
>
> Actually we're better than that, since there's no need for tricks like
> a hardcoded IOVA for the mediated (shadow) virtqueue.
>
>
> > DPDK's https://patchwork.dpdk.org/cover/48370/ .
>
>
> I notice that you do GPA->VA translations and try to establish a VA->VA
> (use VA as IOVA) mapping via device IOTLB. This shortcut should work
> for vhost-kernel/user but not vhost-vDPA. The reason is that there's no
> guarantee that the whole 64bit address range can be used as IOVA. One
> example is that a hardware IOMMU like Intel's usually has 47 or 52 bits
> of address width.
>
> So we probably need an IOVA allocator that can make sure the IOVAs do
> not overlap and fit within [1]. We can probably build the IOVA for
> guest VA via memory listeners. Then we have
>
> 1) IOVA for GPA
> 2) IOVA for shadow VQ
>
> And advertise IOVA to VA mapping to vhost.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b48dc03e575a872404f33b04cd237953c5d7498
>
Got it, I will handle that too.
Maybe for vhost-net we could directly send an iotlb miss for [0, ~0ULL].
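(As a strawman for the allocator discussion: a minimal sketch of a bump
allocator bounded by the IOMMU address width. IOVAAllocator/iova_alloc
are hypothetical names, not qemu or kernel API; a real allocator would
also need to free and reuse ranges.)

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

typedef struct IOVAAllocator {
    uint64_t next;  /* next free IOVA */
    uint64_t limit; /* first invalid IOVA, e.g. 1ULL << 47 for intel */
} IOVAAllocator;

static void iova_allocator_init(IOVAAllocator *a, unsigned addr_width)
{
    a->next = 0;
    a->limit = addr_width >= 64 ? UINT64_MAX : (1ULL << addr_width);
}

/* Hand out a fresh, non-overlapping range below the address width. */
static bool iova_alloc(IOVAAllocator *a, uint64_t size, uint64_t *iova)
{
    if (size == 0 || size > a->limit - a->next) {
        return false; /* would overflow the usable IOVA space */
    }
    *iova = a->next;
    a->next += size;
    return true;
}

int main(void)
{
    IOVAAllocator a;
    uint64_t gpa_iova, svq_iova;

    iova_allocator_init(&a, 47); /* e.g. an intel IOMMU, 47-bit width */

    /* 1) IOVA for guest memory (here, one 1 GiB region)... */
    iova_alloc(&a, 1ULL << 30, &gpa_iova);
    /* 2) ...and IOVA for the shadow virtqueue. */
    iova_alloc(&a, 4096, &svq_iova);

    printf("GPA region at IOVA 0x%" PRIx64 ", shadow VQ at 0x%" PRIx64 "\n",
           gpa_iova, svq_iova);
    return 0;
}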
>
> >
> > Comments are welcome.
> >
> > Thanks!
> >
> > Eugenio Pérez (27):
> > vhost: Add vhost_dev_can_log
> > vhost: Add device callback in vhost_migration_log
> > vhost: Move log resize/put to vhost_dev_set_log
> > vhost: add vhost_kernel_set_vring_enable
> > vhost: Add hdev->dev.sw_lm_vq_handler
> > virtio: Add virtio_queue_get_used_notify_split
> > vhost: Route guest->host notification through qemu
> > vhost: Add a flag for software assisted Live Migration
> > vhost: Route host->guest notification through qemu
> > vhost: Allocate shadow vring
> > virtio: const-ify all virtio_tswap* functions
> > virtio: Add virtio_queue_full
> > vhost: Send buffers to device
> > virtio: Remove virtio_queue_get_used_notify_split
> > vhost: Do not invalidate signalled used
> > virtio: Expose virtqueue_alloc_element
> > vhost: add vhost_vring_set_notification_rcu
> > vhost: add vhost_vring_poll_rcu
> > vhost: add vhost_vring_get_buf_rcu
> > vhost: Return used buffers
> > vhost: Add vhost_virtqueue_memory_unmap
> > vhost: Add vhost_virtqueue_memory_map
> > vhost: unmap qemu's shadow virtqueues on sw live migration
> > vhost: iommu changes
> > vhost: Do not commit vhost used idx on vhost_virtqueue_stop
> > vhost: Add vhost_hdev_can_sw_lm
> > vhost: forbid vhost devices logging
> >
> > hw/virtio/vhost-sw-lm-ring.h | 39 +++
> > include/hw/virtio/vhost.h | 5 +
> > include/hw/virtio/virtio-access.h | 8 +-
> > include/hw/virtio/virtio.h | 4 +
> > hw/net/virtio-net.c | 39 ++-
> > hw/virtio/vhost-backend.c | 29 ++
> > hw/virtio/vhost-sw-lm-ring.c | 268 +++++++++++++++++++
> > hw/virtio/vhost.c | 431 +++++++++++++++++++++++++-----
> > hw/virtio/virtio.c | 18 +-
> > hw/virtio/meson.build | 2 +-
> > 10 files changed, 758 insertions(+), 85 deletions(-)
> > create mode 100644 hw/virtio/vhost-sw-lm-ring.h
> > create mode 100644 hw/virtio/vhost-sw-lm-ring.c
>
>
> So this looks like a pretty huge patchset, which I'm trying to think of
> ways to split. An idea is to do this in two steps:
>
> 1) implement a shadow virtqueue mode for vhost first (w/o live
> migration). Then we can test descriptor relay, IOVA allocation, etc.
How would that mode be activated if it is not tied to live migration?
A new backend/command line switch?
Maybe it is better to also start without iommu & ats support and add
them on top.
> 2) add live migration support on top
>
> And it looks to me that it's better to split the shadow virtqueue
> (virtio driver part) into an independent file, and to use a generic
> name (w/o "shadow") so it can be reused by other use cases as well.
>
I think the same.
Thanks!
> Thoughts?
>