
Re: [RFC PATCH] vhost-vdpa: Do not send empty IOTLB update batches


From: Eugenio Perez Martin
Subject: Re: [RFC PATCH] vhost-vdpa: Do not send empty IOTLB update batches
Date: Wed, 11 Aug 2021 18:40:51 +0200

On Thu, Aug 5, 2021 at 9:10 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> > On 2021/8/5 3:04 PM, Eugenio Perez Martin wrote:
> > On Thu, Aug 5, 2021 at 8:16 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Wed, Aug 4, 2021 at 10:44 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>> With the introduction of batch hinting, meaningless batches with no
> >>> IOTLB updates can be created if the memory region was skipped by
> >>> vhost_vdpa_listener_skipped_section. This is the case for host notifier
> >>> memory regions, but others could fall into this category. This caused
> >>> the vdpa device to receive DMA mapping settings with no changes, a
> >>> possibly expensive operation for nothing.
> >>>
> >>> To avoid that, the VHOST_IOTLB_BATCH_BEGIN hint is delayed until we have
> >>> a meaningful (not skipped) mapping or unmapping operation, and
> >>> VHOST_IOTLB_BATCH_END is not written unless at least one _UPDATE /
> >>> _INVALIDATE has been issued.
> >> Hi Eugenio:
> >>
> >> This should work, but it looks to me that it's better to optimize the kernel.
> >>
> >> E.g. make sure we don't send set_map() if there is no real update
> >> between batch start and end.
> >>
> > Hi Jason,
> >
> > I think we should do both in parallel anyway.
>
>
> Ok, I'm fine with this.
>
>
> >   We also obtain a
> > (currently unmeasured) decrease in startup time for qemu with vdpa
> > this way, for example. Of course, this particular RFC has room to
> > improve or to change entirely.
> >
> > I've made these changes in the kernel too, just counting the number of
> > memory updates and not calling set_map if no actual changes have been
> > made.
>
>
> Right, that is what we want to have.
>
>
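
For reference, the idea on the kernel side would be something like the
following sketch. Names like "batch_state" and "flush_mappings" are
illustrative placeholders, not the actual drivers/vhost/vdpa.c code:

    /*
     * Sketch only: the batch is flushed to the device at BATCH_END, so
     * we count the UPDATE / INVALIDATE messages seen since BATCH_BEGIN
     * and skip the flush when nothing actually changed.
     * flush_mappings() stands in for the parent driver's set_map() path.
     */
    struct batch_state {
        unsigned int n_updates; /* messages that really changed the IOTLB */
    };

    void flush_mappings(void); /* placeholder, e.g. ops->set_map(...) */

    static void process_iotlb_msg(struct batch_state *s, int msg_type)
    {
        switch (msg_type) {
        case VHOST_IOTLB_UPDATE:
        case VHOST_IOTLB_INVALIDATE:
            s->n_updates++;
            break;
        case VHOST_IOTLB_BATCH_END:
            if (s->n_updates > 0) {
                flush_mappings();
            }
            s->n_updates = 0; /* reset for the next batch */
            break;
        }
    }
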
> >
> >> This makes sure that it can work for all kinds of userspace (not only for Qemu).
> >>
> >> Another optimization is to batch the memory region transaction by adding:
> >>
> >> memory_region_transaction_begin() and memory_region_transaction_end() in
> >>
> >> both vhost_vdpa_host_notifiers_init() and vhost_vdpa_host_notifiers_uninit().
> >>
> >> This can make sure there is only one memory transaction when
> >> adding/removing host notifier regions.
> >>
> > That solves the number of memory region updates, but it does not
> > solve the case where the updated memory regions are not interesting
> > to the device.
>
>
> Kind of, I guess with this we only get one more set_map().
>
>
> > In other words, where all the changes meet
> > vhost_vdpa_listener_skipped_section() == true. I did not spend a lot
> > of time trying to trigger these though; maybe it happens when
> > hot-plugging a device, for example?
>
>
> Yes, so the transaction is a per-device optimization that can't help in this case.
>

I've left it out, since we already obtain 0 IOTLB update commits with
this approach. Let me know if you think it should be included.
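
For reference, the transaction batching you suggest would look roughly
like this in vhost_vdpa_host_notifiers_init(). This is only a sketch on
top of the current code, not something I have tested or posted:

    static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
    {
        int i;

        /* Fold all per-queue notifier updates into a single commit */
        memory_region_transaction_begin();
        for (i = dev->vq_index; i < dev->vq_index + dev->nvqs; i++) {
            if (vhost_vdpa_host_notifier_init(dev, i)) {
                break;
            }
        }
        memory_region_transaction_end();
    }
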

>
> >
> > We could abstract these changes in memory_region_transaction_begin() /
> > memory_region_transaction_end() wrappers though.
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>   include/hw/virtio/vhost-vdpa.h |  1 +
> >>>   hw/virtio/vhost-vdpa.c         | 38 +++++++++++++++++++++++-----------
> >>>   2 files changed, 27 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>> index e98e327f12..0a7edbe4eb 100644
> >>> --- a/include/hw/virtio/vhost-vdpa.h
> >>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>> @@ -23,6 +23,7 @@ typedef struct vhost_vdpa {
> >>>       int device_fd;
> >>>       int index;
> >>>       uint32_t msg_type;
> >>> +    size_t n_iotlb_sent_batch;
>
>
> Not a native speaker, but we probably need a better name, e.g. "n_mr_updated"?
>

I totally agree.

>
> >>>       MemoryListener listener;
> >>>       struct vhost_dev *dev;
> >>>       VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index 6ce94a1f4d..2517fc6103 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -89,19 +89,13 @@ static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
> >>>       return ret;
> >>>   }
> >>>
> >>> -static void vhost_vdpa_listener_begin(MemoryListener *listener)
> >>> +static void vhost_vdpa_listener_begin_batch(struct vhost_vdpa *v)
> >>>   {
> >>> -    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> >>> -    struct vhost_dev *dev = v->dev;
> >>> -    struct vhost_msg_v2 msg = {};
> >>>       int fd = v->device_fd;
> >>> -
> >>> -    if (!(dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH))) {
> >>> -        return;
> >>> -    }
> >>> -
> >>> -    msg.type = v->msg_type;
> >>> -    msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
> >>> +    struct vhost_msg_v2 msg = {
> >>> +        .type = v->msg_type,
> >>> +        .iotlb.type = VHOST_IOTLB_BATCH_BEGIN,
> >>> +    };
> >>>
> >>>       if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> >>>           error_report("failed to write, fd=%d, errno=%d (%s)",
> >>> @@ -120,6 +114,11 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)
> >>>           return;
> >>>       }
> >>>
> >>> +    if (v->n_iotlb_sent_batch == 0) {
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    v->n_iotlb_sent_batch = 0;
> >>>       msg.type = v->msg_type;
> >>>       msg.iotlb.type = VHOST_IOTLB_BATCH_END;
> >>>
> >>> @@ -170,6 +169,14 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> >>>
> >>>       llsize = int128_sub(llend, int128_make64(iova));
> >>>
> >>> +    if (v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH)) {
> >>> +        if (v->n_iotlb_sent_batch == 0) {
> >>> +            vhost_vdpa_listener_begin_batch(v);
> >>> +        }
> >>> +
> >>> +        v->n_iotlb_sent_batch++;
> >>> +    }
>
>
> Let's abstract this as a helper to be reused by region_del.
>
> The rest looks good.
>
> Thanks
>
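
Sure. For the non-RFC version, that helper could look something like
this sketch, which just folds the duplicated check from region_add /
region_del (keeping this RFC's field name for now):

    static void vhost_vdpa_iotlb_batch_begin_once(struct vhost_vdpa *v)
    {
        if (!(v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH))) {
            return;
        }

        /* Send BATCH_BEGIN lazily, only before the first real update */
        if (v->n_iotlb_sent_batch == 0) {
            vhost_vdpa_listener_begin_batch(v);
        }

        v->n_iotlb_sent_batch++;
    }
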

I sent a PATCH v2 instead of a non-RFC v1 by mistake. Please let me
know if I should do something to change it.

Thanks!

>
> >>> +
> >>>       ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> >>>                                vaddr, section->readonly);
> >>>       if (ret) {
> >>> @@ -221,6 +228,14 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >>>
> >>>       llsize = int128_sub(llend, int128_make64(iova));
> >>>
> >>> +    if (v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH)) {
> >>> +        if (v->n_iotlb_sent_batch == 0) {
> >>> +            vhost_vdpa_listener_begin_batch(v);
> >>> +        }
> >>> +
> >>> +        v->n_iotlb_sent_batch++;
> >>> +    }
> >>> +
> >>>       ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> >>>       if (ret) {
> >>>           error_report("vhost_vdpa dma unmap error!");
> >>> @@ -234,7 +249,6 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> >>>    * depends on the addnop().
> >>>    */
> >>>   static const MemoryListener vhost_vdpa_memory_listener = {
> >>> -    .begin = vhost_vdpa_listener_begin,
> >>>       .commit = vhost_vdpa_listener_commit,
> >>>       .region_add = vhost_vdpa_listener_region_add,
> >>>       .region_del = vhost_vdpa_listener_region_del,
> >>> --
> >>> 2.27.0
> >>>
>



