On 4/27/2022 1:38 AM, Jason Wang wrote:
On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
On 4/26/2022 9:28 PM, Jason Wang wrote:
在 2022/3/30 14:33, Si-Wei Liu 写道:
Hi,
This patch series attempt to fix a few issues in vhost-vdpa
multiqueue functionality.
Patch #1 is the formal submission for RFC patch in:
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$
Patch #2 and #3 were taken from a previous patchset posted on
qemu-devel:
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$
albeit abandoned, two patches in that set turn out to be useful for
patch #4, which is to fix a QEMU crash due to race condition.
Patch #5 through #7 are obviously small bug fixes. Please find the
description of each in the commit log.
Thanks,
-Siwei
Hi Si Wei:
I think we need another version of this series?
Hi Jason,
Apologies for the long delay. I was in the middle of reworking the patch
"virtio: don't read pending event on host notifier if disabled", but
found out that it would need quite some code change for the userspace
fallback handler to work properly (for the queue no. change case
specifically).
We probably need this fix for -stable, so I wonder if we can have a
workaround first and do refactoring on top?
Hmmm, a nasty fix but may well address the segfault is something like this:
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 8ca0b80..3ac93a4 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -368,6 +368,10 @@ static void virtio_net_set_status(struct
VirtIODevice *vdev, uint8_t status)
int i;
uint8_t queue_status;
+ if (n->status_pending)
+ return;
+
+ n->status_pending = true;
virtio_net_vnet_endian_status(n, status);
virtio_net_vhost_status(n, status);
@@ -416,6 +420,7 @@ static void virtio_net_set_status(struct
VirtIODevice *vdev, uint8_t status)
}
}
}
+ n->status_pending = false;
}
static void virtio_net_set_link_status(NetClientState *nc)
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index eb87032..95efea8 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -216,6 +216,7 @@ struct VirtIONet {
VirtioNetRssData rss_data;
struct NetRxPkt *rx_pkt;
struct EBPFRSSContext ebpf_rss;
+ bool status_pending;
};
void virtio_net_set_netclient_name(VirtIONet *n, const char *name,
To be honest, I am not sure if this is worth a full blown fix to make it
completely work. Without applying vq suspend patch (the one I posted in
https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/__;!!ACWV5N9M2RV99hQ!L4qque3YpPr-CGp12NrNdMMT1HROfEY_Juw2vnfZXHjOhtT0XJCR9GB8cvWEbJL9Aeh-WhDogBVArJn91P0$
),
it's very hard for me to effectively verify my code change - it's very
easy for the guest vq index to be out of sync if not stopping the vq
once the vhost is up and running (I tested it with repeatedly set_link
off and on).