From: Eugenio Perez Martin
Subject: Re: [RFC PATCH v4 08/20] vhost: Route guest->host notification through shadow virtqueue
Date: Thu, 14 Oct 2021 14:00:05 +0200

On Wed, Oct 13, 2021 at 5:27 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2021/10/1 下午3:05, Eugenio Pérez 写道:
> > Shadow virtqueue notifications forwarding is disabled when vhost_dev
> > stops, so code flow follows usual cleanup.
> >
> > Also, host notifiers must be disabled at SVQ start,
>
>
> Any reason for this?
>

It will be addressed in a later series, sorry.

>
> > and they will not
> > start if SVQ has been enabled when device is stopped. This is trivial
> > to address, but it is left out for simplicity at this moment.
>
>
> It looks to me this patch also contains the following logics
>
> 1) codes to enable svq
>
> 2) codes to let svq to be enabled from QMP.
>
> I think they need to be split out,

I agree that we can split this further, separating the code that belongs
to SVQ from the code that belongs to vhost-vdpa. It will be addressed in
a future series.

> we may endup with the following
> series of patches
>

With "series of patches" do you mean to send every step in a separated
series? There are odds of having the need of modifying code already
sent & merged with later ones. If you confirm to me that it is fine, I
can do it that way for sure.

> 1) svq skeleton with enable/disable
> 2) route host notifier to svq
> 3) route guest notifier to svq
> 4) codes to enable svq
> 5) enable svq via QMP
>

I'm totally fine with that, but some code is never called if the QMP
command is not added. The compiler complains about static functions that
are never called, which makes things like bisecting through these
commits impossible, unless I use attribute((unused)) or similar. Or have
I missed something?
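
For illustration, a minimal sketch of the workaround (the handler name
is hypothetical):

/*
 * Hypothetical handler introduced by an early patch but only wired up
 * by a later one. G_GNUC_UNUSED (glib's wrapper around
 * __attribute__((unused))) silences -Werror=unused-function so every
 * intermediate commit still builds, keeping the series bisectable.
 */
G_GNUC_UNUSED static void vhost_svq_example_handler(EventNotifier *n)
{
    /* Placeholder body; the real logic arrives in a later patch */
    event_notifier_test_and_clear(n);
}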

We could do it that way with the code that belongs to SVQ though, since
all of it is declared in headers. But delaying "enable svq via qmp" to
the last patch makes debugging harder, as we cannot just enable
notification forwarding without buffer forwarding.

If I introduce a change in the notifications code, I can simply go to
these commits and enable SVQ for notifications only. This way I can get
an idea of which part is failing. A similar logic can be applied to
devices other than vp_vdpa. We would lose that ability if we moved the
QMP enabling to the end of the series.

>
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   qapi/net.json                      |   2 +-
> >   hw/virtio/vhost-shadow-virtqueue.h |   8 ++
> >   include/hw/virtio/vhost-vdpa.h     |   4 +
> >   hw/virtio/vhost-shadow-virtqueue.c | 138 ++++++++++++++++++++++++++++-
> >   hw/virtio/vhost-vdpa.c             | 116 +++++++++++++++++++++++-
> >   5 files changed, 264 insertions(+), 4 deletions(-)
> >
> > diff --git a/qapi/net.json b/qapi/net.json
> > index a2c30fd455..fe546b0e7c 100644
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -88,7 +88,7 @@
> >   #
> >   # @enable: true to use the alternate shadow VQ notifications
> >   #
> > -# Returns: Always error, since SVQ is not implemented at the moment.
> > +# Returns: Error if failure, or 'no error' for success.
> >   #
> >   # Since: 6.2
> >   #
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
> > b/hw/virtio/vhost-shadow-virtqueue.h
> > index 27ac6388fa..237cfceb9c 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -14,6 +14,14 @@
> >
> >   typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> >
> > +EventNotifier *vhost_svq_get_svq_call_notifier(VhostShadowVirtqueue *svq);
>
>
> Let's move this function to another patch since it's unrelated to the
> guest->host routing.
>

Right, I missed it while squashing commits and in later reviews.

>
> > +void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int 
> > call_fd);
> > +
> > +bool vhost_svq_start(struct vhost_dev *dev, unsigned idx,
> > +                     VhostShadowVirtqueue *svq);
> > +void vhost_svq_stop(struct vhost_dev *dev, unsigned idx,
> > +                    VhostShadowVirtqueue *svq);
> > +
> >   VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx);
> >
> >   void vhost_svq_free(VhostShadowVirtqueue *vq);
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index 0d565bb5bd..48aae59d8e 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -12,6 +12,8 @@
> >   #ifndef HW_VIRTIO_VHOST_VDPA_H
> >   #define HW_VIRTIO_VHOST_VDPA_H
> >
> > +#include <gmodule.h>
> > +
> >   #include "qemu/queue.h"
> >   #include "hw/virtio/virtio.h"
> >
> > @@ -24,6 +26,8 @@ typedef struct vhost_vdpa {
> >       int device_fd;
> >       uint32_t msg_type;
> >       MemoryListener listener;
> > +    bool shadow_vqs_enabled;
> > +    GPtrArray *shadow_vqs;
> >       struct vhost_dev *dev;
> >       QLIST_ENTRY(vhost_vdpa) entry;
> >       VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
> > b/hw/virtio/vhost-shadow-virtqueue.c
> > index c4826a1b56..21dc99ab5d 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -9,9 +9,12 @@
> >
> >   #include "qemu/osdep.h"
> >   #include "hw/virtio/vhost-shadow-virtqueue.h"
> > +#include "hw/virtio/vhost.h"
> > +
> > +#include "standard-headers/linux/vhost_types.h"
> >
> >   #include "qemu/error-report.h"
> > -#include "qemu/event_notifier.h"
> > +#include "qemu/main-loop.h"
> >
> >   /* Shadow virtqueue to relay notifications */
> >   typedef struct VhostShadowVirtqueue {
> > @@ -19,14 +22,146 @@ typedef struct VhostShadowVirtqueue {
> >       EventNotifier kick_notifier;
> >       /* Shadow call notifier, sent to vhost */
> >       EventNotifier call_notifier;
> > +
> > +    /*
> > +     * Borrowed virtqueue's guest to host notifier.
> > +     * To borrow it in this event notifier allows to register on the event
> > +     * loop and access the associated shadow virtqueue easily. If we use 
> > the
> > +     * VirtQueue, we don't have an easy way to retrieve it.
> > +     *
> > +     * So shadow virtqueue must not clean it, or we would lose VirtQueue 
> > one.
> > +     */
> > +    EventNotifier host_notifier;
> > +
> > +    /* Guest's call notifier, where SVQ calls guest. */
> > +    EventNotifier guest_call_notifier;
>
>
> To be consistent, let's simply use "guest_notifier" here.
>

That could become confusing when the series later adds a guest -> qemu
kick notifier. Actually, I think it would be better to rename
host_notifier to something like host_svq_notifier. Or host_call and
guest_call, since "notifier" is already in the type, and repeating it
makes the name read a bit like Hungarian notation.

>
> > +
> > +    /* Virtio queue shadowing */
> > +    VirtQueue *vq;
> >   } VhostShadowVirtqueue;
> >
> > +/* Forward guest notifications */
> > +static void vhost_handle_guest_kick(EventNotifier *n)
> > +{
> > +    VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
> > +                                             host_notifier);
> > +
> > +    if (unlikely(!event_notifier_test_and_clear(n))) {
> > +        return;
> > +    }
>
>
> Is there a chance that we may stop the processing of available buffers
> during the svq enabling? There could be no kick from the guest in this case.
>

Actually, yes, I think you are right. The guest kick eventfd could have
been consumed by vhost, but there may still be pending buffers.

I think it would be better to check for available buffers first, then
clear the notifier unconditionally, and then re-check and process them
if any [1].

However, this problem arises later in the series: at this moment the
device is not reset and the guest's host notifier is not replaced, so
either vhost/the device receives the kick, or SVQ does and forwards it.

Does it make sense to you?
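
A minimal sketch of [1], for the later stage of the series where SVQ
forwards buffers; vhost_svq_process_avail() and vhost_svq_more_avail()
are hypothetical helpers:

/* Forward guest notifications without losing buffers whose kick was
 * already consumed by vhost before the switch to SVQ. */
static void vhost_handle_guest_kick(EventNotifier *n)
{
    VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
                                             host_notifier);

    do {
        /* Process buffers that may predate SVQ's ownership of the fd */
        vhost_svq_process_avail(svq);

        /* Clear unconditionally; a kick racing with the processing
         * above is covered by the availability re-check below. */
        event_notifier_test_and_clear(n);
    } while (vhost_svq_more_avail(svq));
}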

>
> > +
> > +    event_notifier_set(&svq->kick_notifier);
> > +}
> > +
> > +/*
> > + * Obtain the SVQ call notifier, where vhost device notifies SVQ that there
> > + * exists pending used buffers.
> > + *
> > + * @svq Shadow Virtqueue
> > + */
> > +EventNotifier *vhost_svq_get_svq_call_notifier(VhostShadowVirtqueue *svq)
> > +{
> > +    return &svq->call_notifier;
> > +}
> > +
> > +/*
> > + * Set the call notifier for the SVQ to call the guest
> > + *
> > + * @svq Shadow virtqueue
> > + * @call_fd call notifier
> > + *
> > + * Called on BQL context.
> > + */
> > +void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int 
> > call_fd)
> > +{
> > +    event_notifier_init_fd(&svq->guest_call_notifier, call_fd);
> > +}
> > +
> > +/*
> > + * Restore the vhost guest to host notifier, i.e., disables svq effect.
> > + */
> > +static int vhost_svq_restore_vdev_host_notifier(struct vhost_dev *dev,
> > +                                                unsigned vhost_index,
> > +                                                VhostShadowVirtqueue *svq)
> > +{
> > +    EventNotifier *vq_host_notifier = 
> > virtio_queue_get_host_notifier(svq->vq);
> > +    struct vhost_vring_file file = {
> > +        .index = vhost_index,
> > +        .fd = event_notifier_get_fd(vq_host_notifier),
> > +    };
> > +    int r;
> > +
> > +    /* Restore vhost kick */
> > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
>
>
> And remap the notification area if necessary.

Totally right, that step is missing in this series.

However, remapping the guest's host notifier memory region has no
advantage over using ioeventfd to perform guest -> SVQ notifications,
does it? With both methods, the flow needs to go through the hypervisor
kernel.

>
>
> > +    return r ? -errno : 0;
> > +}
> > +
> > +/*
> > + * Start shadow virtqueue operation.
> > + * @dev vhost device
> > + * @hidx vhost virtqueue index
> > + * @svq Shadow Virtqueue
> > + */
> > +bool vhost_svq_start(struct vhost_dev *dev, unsigned idx,
> > +                     VhostShadowVirtqueue *svq)
> > +{
> > +    EventNotifier *vq_host_notifier = 
> > virtio_queue_get_host_notifier(svq->vq);
> > +    struct vhost_vring_file file = {
> > +        .index = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + 
> > idx),
> > +        .fd = event_notifier_get_fd(&svq->kick_notifier),
> > +    };
> > +    int r;
> > +
> > +    /* Check that notifications are still going directly to vhost dev */
> > +    assert(virtio_queue_is_host_notifier_enabled(svq->vq));
> > +
> > +    /*
> > +     * event_notifier_set_handler already checks for guest's notifications 
> > if
> > +     * they arrive in the switch, so there is no need to explicitly check 
> > for
> > +     * them.
> > +     */
>
>
> If this is true, shouldn't we call vhost_set_vring_kick() before the
> event_notifier_set_handler()?
>

Not at this point of the series, but it could be another solution when
we need to reset the device and are unsure whether all buffers have been
read. But I think I prefer the solution exposed in [1], explicitly
calling vhost_handle_guest_kick here. Do you think differently?

> Btw, I think we should update the fd if set_vring_kick() was called
> after this function?
>

Kind of. This is currently bad in the code, but...

Backend callbacks vhost_ops->vhost_set_vring_kick and
vhost_ops->vhost_set_vring_addr are only called at
vhost_virtqueue_start, and they are always called with known data
already stored in the VirtQueue.

To avoid storing more state in vhost_vdpa, I think we should avoid
duplicating it: ignore the new kick_fd or address in SVQ mode, and
retrieve them again the moment the device is (re)started in SVQ mode.
QEMU already avoids things like allowing the guest to set addresses at
random times, using the VirtIOPCIProxy to store them.

I also see how duplicating that state could protect vdpa SVQ code
against future changes in vhost code, but that would make this series
bigger and more complex with no need at this moment, in my opinion.

Do you agree?

>
> > +    event_notifier_init_fd(&svq->host_notifier,
> > +                           event_notifier_get_fd(vq_host_notifier));
> > +    event_notifier_set_handler(&svq->host_notifier, 
> > vhost_handle_guest_kick);
> > +
> > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
>
>
> And we need to stop the notification area mmap.
>

Right.

>
> > +    if (unlikely(r != 0)) {
> > +        error_report("Couldn't set kick fd: %s", strerror(errno));
> > +        goto err_set_vring_kick;
> > +    }
> > +
> > +    return true;
> > +
> > +err_set_vring_kick:
> > +    event_notifier_set_handler(&svq->host_notifier, NULL);
> > +
> > +    return false;
> > +}
> > +
> > +/*
> > + * Stop shadow virtqueue operation.
> > + * @dev vhost device
> > + * @idx vhost queue index
> > + * @svq Shadow Virtqueue
> > + */
> > +void vhost_svq_stop(struct vhost_dev *dev, unsigned idx,
> > +                    VhostShadowVirtqueue *svq)
> > +{
> > +    int r = vhost_svq_restore_vdev_host_notifier(dev, idx, svq);
> > +    if (unlikely(r < 0)) {
> > +        error_report("Couldn't restore vq kick fd: %s", strerror(-r));
> > +    }
> > +
> > +    event_notifier_set_handler(&svq->host_notifier, NULL);
> > +}
> > +
> >   /*
> >    * Creates vhost shadow virtqueue, and instruct vhost device to use the 
> > shadow
> >    * methods and file descriptors.
> >    */
> >   VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx)
> >   {
> > +    int vq_idx = dev->vq_index + idx;
> >       g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 
> > 1);
> >       int r;
> >
> > @@ -44,6 +179,7 @@ VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev 
> > *dev, int idx)
> >           goto err_init_call_notifier;
> >       }
> >
> > +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> >       return g_steal_pointer(&svq);
> >
> >   err_init_call_notifier:
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index e0dc7508c3..36c954a779 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -17,6 +17,7 @@
> >   #include "hw/virtio/vhost.h"
> >   #include "hw/virtio/vhost-backend.h"
> >   #include "hw/virtio/virtio-net.h"
> > +#include "hw/virtio/vhost-shadow-virtqueue.h"
> >   #include "hw/virtio/vhost-vdpa.h"
> >   #include "exec/address-spaces.h"
> >   #include "qemu/main-loop.h"
> > @@ -272,6 +273,16 @@ static void vhost_vdpa_add_status(struct vhost_dev 
> > *dev, uint8_t status)
> >       vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> >   }
> >
> > +/**
> > + * Adaptor function to free shadow virtqueue through gpointer
> > + *
> > + * @svq   The Shadow Virtqueue
> > + */
> > +static void vhost_psvq_free(gpointer svq)
> > +{
> > +    vhost_svq_free(svq);
> > +}
> > +
> >   static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error 
> > **errp)
> >   {
> >       struct vhost_vdpa *v;
> > @@ -283,6 +294,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void 
> > *opaque, Error **errp)
> >       dev->opaque =  opaque ;
> >       v->listener = vhost_vdpa_memory_listener;
> >       v->msg_type = VHOST_IOTLB_MSG_V2;
> > +    v->shadow_vqs = g_ptr_array_new_full(dev->nvqs, vhost_psvq_free);
> >       QLIST_INSERT_HEAD(&vhost_vdpa_devices, v, entry);
> >
> >       vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > @@ -373,6 +385,17 @@ err:
> >       return;
> >   }
> >
> > +static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
> > +{
> > +    struct vhost_vdpa *v = dev->opaque;
> > +    size_t idx;
> > +
> > +    for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
> > +        vhost_svq_stop(dev, idx, g_ptr_array_index(v->shadow_vqs, idx));
> > +    }
> > +    g_ptr_array_free(v->shadow_vqs, true);
> > +}
> > +
> >   static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> >   {
> >       struct vhost_vdpa *v;
> > @@ -381,6 +404,7 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> >       trace_vhost_vdpa_cleanup(dev, v);
> >       vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
> >       memory_listener_unregister(&v->listener);
> > +    vhost_vdpa_svq_cleanup(dev);
> >       QLIST_REMOVE(v, entry);
> >
> >       dev->opaque = NULL;
> > @@ -557,7 +581,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, 
> > bool started)
> >       if (started) {
> >           uint8_t status = 0;
> >           memory_listener_register(&v->listener, &address_space_memory);
> > -        vhost_vdpa_host_notifiers_init(dev);
> > +        if (!v->shadow_vqs_enabled) {
> > +            vhost_vdpa_host_notifiers_init(dev);
> > +        }
>
>
> This looks like a trick, why not check and setup shadow_vqs inside:
>
> 1) vhost_vdpa_host_notifiers_init()
>
> and
>
> 2) vhost_vdpa_set_vring_kick()
>

Ok, I will move the checks there.
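
For instance, a sketch of how the check could look in
vhost_vdpa_set_vring_kick, assuming the shadow_vqs_enabled field from
this patch (the exact behavior in SVQ mode is illustrative):

static int vhost_vdpa_set_vring_kick(struct vhost_dev *dev,
                                     struct vhost_vring_file *file)
{
    struct vhost_vdpa *v = dev->opaque;

    if (v->shadow_vqs_enabled) {
        /* SVQ's kick fd is already attached to the vdpa device; the
         * guest's new fd only matters to SVQ at (re)start time. */
        return 0;
    }

    return vhost_vdpa_call(dev, VHOST_SET_VRING_KICK, file);
}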

>
> >           vhost_vdpa_set_vring_ready(dev);
> >           vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> >           vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> > @@ -663,10 +689,96 @@ static bool  vhost_vdpa_force_iommu(struct vhost_dev 
> > *dev)
> >       return true;
> >   }
> >
> > +/*
> > + * Start shadow virtqueue.
> > + */
> > +static bool vhost_vdpa_svq_start_vq(struct vhost_dev *dev, unsigned idx)
> > +{
> > +    struct vhost_vdpa *v = dev->opaque;
> > +    VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, idx);
> > +    return vhost_svq_start(dev, idx, svq);
> > +}
> > +
> > +static unsigned vhost_vdpa_enable_svq(struct vhost_vdpa *v, bool enable)
> > +{
> > +    struct vhost_dev *hdev = v->dev;
> > +    unsigned n;
> > +
> > +    if (enable == v->shadow_vqs_enabled) {
> > +        return hdev->nvqs;
> > +    }
> > +
> > +    if (enable) {
> > +        /* Allocate resources */
> > +        assert(v->shadow_vqs->len == 0);
> > +        for (n = 0; n < hdev->nvqs; ++n) {
> > +            VhostShadowVirtqueue *svq = vhost_svq_new(hdev, n);
> > +            bool ok;
> > +
> > +            if (unlikely(!svq)) {
> > +                g_ptr_array_set_size(v->shadow_vqs, 0);
> > +                return 0;
> > +            }
> > +            g_ptr_array_add(v->shadow_vqs, svq);
> > +
> > +            ok = vhost_vdpa_svq_start_vq(hdev, n);
> > +            if (unlikely(!ok)) {
> > +                /* Free still not started svqs */
> > +                g_ptr_array_set_size(v->shadow_vqs, n);
> > +                enable = false;

[2]

> > +                break;
> > +            }
> > +        }
>
>
> Since there's almost no logic could be shared between enable and
> disable. Let's split those logic out into dedicated functions where the
> codes looks more easy to be reviewed (e.g have a better error handling etc).
>

Maybe it could be clearer in the code, but the reused logic is the
disabling of SVQ and the fallback in case it cannot be enabled, at [2].
But I'm not against splitting it into two different functions if that
makes review easier.
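
As a sketch of that split (function names are illustrative), the shared
fallback would simply be the disable path:

static void vhost_vdpa_disable_svqs(struct vhost_vdpa *v)
{
    struct vhost_dev *hdev = v->dev;
    unsigned n;

    /* Stop every started SVQ and restore the regular vhost virtqueue */
    for (n = 0; n < v->shadow_vqs->len; ++n) {
        unsigned vq_idx = vhost_vdpa_get_vq_index(hdev, n);
        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, n);

        vhost_svq_stop(hdev, n, svq);
        vhost_virtqueue_start(hdev, hdev->vdev, &hdev->vqs[n], vq_idx);
    }
    g_ptr_array_set_size(v->shadow_vqs, 0);
}

static bool vhost_vdpa_enable_svqs(struct vhost_vdpa *v)
{
    struct vhost_dev *hdev = v->dev;
    unsigned n;

    for (n = 0; n < hdev->nvqs; ++n) {
        VhostShadowVirtqueue *svq = vhost_svq_new(hdev, n);

        if (unlikely(!svq)) {
            g_ptr_array_set_size(v->shadow_vqs, 0);
            return false;
        }
        g_ptr_array_add(v->shadow_vqs, svq);

        if (unlikely(!vhost_vdpa_svq_start_vq(hdev, n))) {
            /* Drop the svq that failed to start, then fall back [2] */
            g_ptr_array_set_size(v->shadow_vqs, n);
            vhost_vdpa_disable_svqs(v);
            return false;
        }
    }

    return true;
}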

>
> > +    }
> > +
> > +    v->shadow_vqs_enabled = enable;
> > +
> > +    if (!enable) {
> > +        /* Disable all queues or clean up failed start */
> > +        for (n = 0; n < v->shadow_vqs->len; ++n) {
> > +            unsigned vq_idx = vhost_vdpa_get_vq_index(hdev, n);
> > +            VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, 
> > n);
> > +            vhost_svq_stop(hdev, n, svq);
> > +            vhost_virtqueue_start(hdev, hdev->vdev, &hdev->vqs[n], vq_idx);
> > +        }
> > +
> > +        /* Resources cleanup */
> > +        g_ptr_array_set_size(v->shadow_vqs, 0);
> > +    }
> > +
> > +    return n;
> > +}
> >
> >   void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error 
> > **errp)
> >   {
> > -    error_setg(errp, "Shadow virtqueue still not implemented");
> > +    struct vhost_vdpa *v;
> > +    const char *err_cause = NULL;
> > +    bool r;
> > +
> > +    QLIST_FOREACH(v, &vhost_vdpa_devices, entry) {
> > +        if (v->dev->vdev && 0 == strcmp(v->dev->vdev->name, name)) {
> > +            break;
> > +        }
> > +    }
>
>
> I think you can iterate the NetClientStates to ge tthe vhost-vdpa backends.
>

Right, I missed it.
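
A sketch of that lookup, assuming qemu_find_netdev() and that the
backend can be reached from the vhost-vdpa NetClientState:

/* Sketch: look up the backend by netdev id instead of walking
 * vhost_vdpa_devices. */
void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable,
                                  Error **errp)
{
    NetClientState *nc = qemu_find_netdev(name);

    if (!nc || nc->info->type != NET_CLIENT_DRIVER_VHOST_VDPA) {
        error_setg(errp, "Can't enable shadow vq on %s: not a vhost-vdpa"
                   " netdev", name);
        return;
    }

    /* ... continue as in the patch, with the vhost_vdpa taken from nc
     * (how to reach it from the NetClientState is assumed here) ... */
}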

>
> > +
> > +    if (!v) {
> > +        err_cause = "Device not found";
> > +        goto err;
> > +    } else if (v->notifier[0].addr) {
> > +        err_cause = "Device has host notifiers enabled";
>
>
> I don't get this.
>

At this moment of the series you can enable guest -> SVQ -> vdpa device
if the device is not using the host notifiers memory region. The right
solution is to disable it for the guest and to handle it in SVQ;
otherwise, the guest kick would bypass SVQ and go directly to the
device.

It can be done in the same patch, or we can at least disable (i.e.,
unmap) the notifiers at this moment and handle them in a later patch.
But for prototyping the solution I just ignored it in this series; it
will be handled one way or another in the next one. I prefer the latter,
handling it in a different patch, but let me know if you think
otherwise.

> Btw this function should be implemented in an independent patch after
> svq is fully functional.
>

(The reasons for that are already discussed at the top of this mail :) ).

Thanks!

> Thanks
>
>
> > +        goto err;
> > +    }
> > +
> > +    r = vhost_vdpa_enable_svq(v, enable);
> > +    if (unlikely(!r)) {
> > +        err_cause = "Error enabling (see monitor)";
> > +        goto err;
> > +    }
> > +
> > +err:
> > +    if (err_cause) {
> > +        error_setg(errp, "Can't enable shadow vq on %s: %s", name, 
> > err_cause);
> > +    }
> >   }
> >
> >   const VhostOps vdpa_ops = {
>



