qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 0/7] vhost-vdpa multiqueue fixes


From: Si-Wei Liu
Subject: Re: [PATCH 0/7] vhost-vdpa multiqueue fixes
Date: Fri, 29 Apr 2022 19:07:10 -0700
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1



On 4/28/2022 7:30 PM, Jason Wang wrote:
On Wed, Apr 27, 2022 at 5:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:


On 4/27/2022 1:38 AM, Jason Wang wrote:
On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:

On 4/26/2022 9:28 PM, Jason Wang wrote:
在 2022/3/30 14:33, Si-Wei Liu 写道:
Hi,

This patch series attempt to fix a few issues in vhost-vdpa
multiqueue functionality.

Patch #1 is the formal submission for RFC patch in:
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$

Patch #2 and #3 were taken from a previous patchset posted on
qemu-devel:
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$

albeit abandoned, two patches in that set turn out to be useful for
patch #4, which is to fix a QEMU crash due to race condition.

Patch #5 through #7 are obviously small bug fixes. Please find the
description of each in the commit log.

Thanks,
-Siwei
Hi Si Wei:

I think we need another version of this series?
Hi Jason,

Apologies for the long delay. I was in the middle of reworking the patch
"virtio: don't read pending event on host notifier if disabled", but
found out that it would need quite some code change for the userspace
fallback handler to work properly (for the queue no. change case
specifically).
We probably need this fix for -stable, so I wonder if we can have a
workaround first and do refactoring on top?
Hmmm, a nasty fix but may well address the segfault is something like this:

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 8ca0b80..3ac93a4 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -368,6 +368,10 @@ static void virtio_net_set_status(struct
VirtIODevice *vdev, uint8_t status)
       int i;
       uint8_t queue_status;

+    if (n->status_pending)
+        return;
+
+    n->status_pending = true;
       virtio_net_vnet_endian_status(n, status);
       virtio_net_vhost_status(n, status);

@@ -416,6 +420,7 @@ static void virtio_net_set_status(struct
VirtIODevice *vdev, uint8_t status)
               }
           }
       }
+    n->status_pending = false;
   }

   static void virtio_net_set_link_status(NetClientState *nc)
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index eb87032..95efea8 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -216,6 +216,7 @@ struct VirtIONet {
       VirtioNetRssData rss_data;
       struct NetRxPkt *rx_pkt;
       struct EBPFRSSContext ebpf_rss;
+    bool status_pending;
   };

   void virtio_net_set_netclient_name(VirtIONet *n, const char *name,

To be honest, I am not sure if this is worth a full blown fix to make it
completely work. Without applying vq suspend patch (the one I posted in
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/__;!!ACWV5N9M2RV99hQ!L4qque3YpPr-CGp12NrNdMMT1HROfEY_Juw2vnfZXHjOhtT0XJCR9GB8cvWEbJL9Aeh-WhDogBVArJn91P0$
 ),
it's very hard for me to effectively verify my code change - it's very
easy for the guest vq index to be out of sync if not stopping the vq
once the vhost is up and running (I tested it with repeatedly set_link
off and on).
Can we test via vmstop?
Yes, of coz, that's where the segfault happened. The tight loop of set_link on/off doesn't even work for the single queue case, hence that's why I doubted it ever worked for vhost-vdpa.


I am not sure if there's real chance we can run into issue
in practice due to the incomplete fix, if we don't fix the vq
stop/suspend issue first. Anyway I will try, as other use case e.g, live
migration is likely to get stumbled on it, too.
Ok, so I think we probably don't need the "nasty" fix above. Let's fix
it with the issue of stop/resume.
Ok, then does below tentative code change suffice the need? i.e. it would fail the request of changing queue pairs when the vhost-vdpa backend falls back to the userspace handler, but it's probably the easiest way to fix the vmstop segfault.

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index ed231f9..8ba9f09 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1177,6 +1177,7 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
     struct virtio_net_ctrl_mq mq;
     size_t s;
     uint16_t queue_pairs;
+    NetClientState *nc = qemu_get_queue(n->nic);

     s = iov_to_buf(iov, iov_cnt, 0, &mq, sizeof(mq));
     if (s != sizeof(mq)) {
@@ -1196,6 +1197,13 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
         return VIRTIO_NET_ERR;
     }

+    /* avoid changing the number of queue_pairs for vdpa device in
+     * userspace handler.
+     * TODO: get userspace fallback to work with future patch */
+    if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
+        return VIRTIO_NET_ERR;
+    }
+
     n->curr_queue_pairs = queue_pairs;
     /* stop the backend before changing the number of queue_pairs to avoid handling a
      * disabled queue */

Thanks,
-Siwei

Thanks

-Siwei


I have to drop it from the series and posted it later on
when ready. Will post a v2 with relevant patches removed.
Thanks

Regards,
-Siwei

Thanks


---

Eugenio Pérez (2):
     virtio-net: Fix indentation
     virtio-net: Only enable userland vq if using tap backend

Si-Wei Liu (5):
     virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa
     virtio: don't read pending event on host notifier if disabled
     vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa
     vhost-net: fix improper cleanup in vhost_net_start
     vhost-vdpa: backend feature should set only once

    hw/net/vhost_net.c         |  4 +++-
    hw/net/virtio-net.c        | 25 +++++++++++++++++++++----
    hw/virtio/vhost-vdpa.c     |  2 +-
    hw/virtio/virtio-bus.c     |  3 ++-
    hw/virtio/virtio.c         | 21 +++++++++++++--------
    include/hw/virtio/virtio.h |  2 ++
    net/vhost-vdpa.c           |  4 +++-
    7 files changed, 45 insertions(+), 16 deletions(-)





reply via email to

[Prev in Thread] Current Thread [Next in Thread]