From: Yuanhan Liu
Subject: [Qemu-devel] [PATCH 6/7] vhost-user: add multiple queue support
Date: Tue, 8 Sep 2015 15:38:46 +0800

From: Ouyang Changchun <address@hidden>
This patch is initially based on a patch from Nikolay Nikolaev.

Here is the latest version for adding vhost-user multiple queue support,
by creating a nc and vhost_net pair for each queue.

What differs from the last version is that this patch addresses two major
concerns from Michael and fixes one hidden bug.
- Concern #1: no feedback when the backend can't support the number of
  requested queues (given with the queues=# option).

  We address this by querying the VHOST_USER_PROTOCOL_F_MQ protocol
  feature first: if it is not set, the backend doesn't support the MQ
  feature, and max_queues is left at 1. Otherwise, we send another
  message, VHOST_USER_GET_QUEUE_NUM, to get the max_queues the backend
  supports.

  At the vhost-user initialization stage (net_vhost_user_start), we then
  initialize the first queue, which in the process also retrieves
  max_queues. We then do a simple comparison: if requested_queues >
  max_queues, we exit (which should be safe here, as the VM is not
  running yet). See the negotiation sketch after this list.
- Concern #2: some messages are sent more times than necessary.

  We came to an agreement with Michael that vhost-user messages can be
  categorized into two types: non-vring-specific messages, which should
  be sent only once, and vring-specific messages, which should be sent
  per queue.

  Here I introduced a helper function, vhost_user_one_time_request(),
  which treats the following messages as non-vring-specific:
VHOST_USER_GET_FEATURES
VHOST_USER_SET_FEATURES
VHOST_USER_GET_PROTOCOL_FEATURES
VHOST_USER_SET_PROTOCOL_FEATURES
VHOST_USER_SET_OWNER
VHOST_USER_RESET_DEVICE
VHOST_USER_SET_MEM_TABLE
VHOST_USER_GET_QUEUE_NUM
  For the above messages, we simply ignore them when they are not sent
  for the first time (see the message trace after this list).
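
For reference, here is a minimal sketch (in C, as in the patch below) of the
negotiation described under concern #1. The helper vhost_user_get_u64() is
hypothetical shorthand for the request/reply plumbing; the real code goes
through vhost_user_call():

    static int query_max_queues(struct vhost_dev *dev, uint64_t *max_queues)
    {
        uint64_t features;

        /* hypothetical helper: send a request, read back a u64 reply */
        if (vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES,
                               &features) < 0) {
            return -1;
        }

        if (!(features & (1ULL << VHOST_USER_PROTOCOL_F_MQ))) {
            /* backend has no MQ support: exactly one queue pair */
            *max_queues = 1;
            return 0;
        }

        /* MQ is supported: ask the backend for its actual limit */
        return vhost_user_get_u64(dev, VHOST_USER_GET_QUEUE_NUM, max_queues);
    }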
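
To illustrate the split, with queues=2 the traffic on the single shared
socket would look roughly like this (a sketch only; the exact ordering
depends on the initialization path):

    VHOST_USER_GET_FEATURES                 sent once (vq_index == 0)
    VHOST_USER_SET_OWNER                    sent once
    VHOST_USER_SET_MEM_TABLE                sent once
    VHOST_USER_SET_VRING_NUM   index 0, 1   per vring, first queue pair
    VHOST_USER_SET_VRING_ADDR  index 0, 1
    VHOST_USER_SET_VRING_KICK  index 0, 1
    VHOST_USER_SET_VRING_NUM   index 2, 3   second queue pair; one-time
    VHOST_USER_SET_VRING_ADDR  index 2, 3   requests are dropped, and ring
    VHOST_USER_SET_VRING_KICK  index 2, 3   indexes are offset by vq_index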
I also observed a hidden bug in the last version. We register the char dev
event handler N times, which is not only unnecessary but also buggy: a later
registration overwrites the former one, as qemu_chr_add_handlers() does not
chain handlers but replaces the old ones. So, in theory, invoking
qemu_chr_add_handlers() N times will not end up calling the handler N times.
The reason the handler is nevertheless invoked N times is that we start the
backend (the connection server) first, so by the time net_vhost_user_init()
is executed the connection is already established, and qemu_chr_add_handlers()
then invokes the handler immediately. The net effect is that the handler
(net_vhost_user_event) is called directly from net_vhost_user_init().
The solution I came up with is to make VhostUserState an upper-level
structure that includes N nc and vhost_net pairs:
    struct VhostUserNetPeer {
        NetClientState *nc;
        VHostNetState *vhost_net;
    };

    typedef struct VhostUserState {
        CharDriverState *chr;
        bool running;
        int queues;
        struct VhostUserNetPeer peers[];
    } VhostUserState;
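
With this layout, net_vhost_user_init() creates all N net clients first and
registers the chardev handler exactly once, passing the shared state as the
opaque pointer (condensed from the net/vhost-user.c hunk below):

    for (i = 0; i < s->queues; i++) {
        nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
        nc->queue_index = i;
        nc->opaque = s;    /* every queue points back at the shared state */
        s->peers[i].nc = nc;
    }
    /* registered once, after all queues exist */
    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);

This way net_vhost_user_event() fires once per connection event, no matter
how many queues are configured.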
Signed-off-by: Nikolay Nikolaev <address@hidden>
Signed-off-by: Changchun Ouyang <address@hidden>
Signed-off-by: Yuanhan Liu <address@hidden>
---
docs/specs/vhost-user.txt | 13 +++++
hw/virtio/vhost-user.c | 31 ++++++++++-
include/net/net.h | 1 +
net/vhost-user.c | 136 ++++++++++++++++++++++++++++++++--------------
qapi-schema.json | 6 +-
qemu-options.hx | 5 +-
6 files changed, 146 insertions(+), 46 deletions(-)
diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 43db9b4..99d79be 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -135,6 +135,19 @@ As older slaves don't support negotiating protocol features,
a feature bit was dedicated for this purpose:
#define VHOST_USER_F_PROTOCOL_FEATURES 30
+Multiple queue support
+----------------------
+Multiple queue is treated as a protocol extension, hence the slave has to
+implement protocol features first. Multiple queues are supported only when
+the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set.
+
+The maximum number of queues the slave supports can be queried with the
+message VHOST_USER_GET_QUEUE_NUM. The master should stop when the number of
+requested queues is bigger than that.
+
+As all queues share one connection, the master uses a unique index for each
+queue in the sent message to identify a specific queue.
+
Message types
-------------
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8046bc0..11e46b5 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -187,6 +187,23 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
0 : -1;
}
+static bool vhost_user_one_time_request(VhostUserRequest request)
+{
+    switch (request) {
+    case VHOST_USER_GET_FEATURES:
+    case VHOST_USER_SET_FEATURES:
+    case VHOST_USER_GET_PROTOCOL_FEATURES:
+    case VHOST_USER_SET_PROTOCOL_FEATURES:
+    case VHOST_USER_SET_OWNER:
+    case VHOST_USER_RESET_DEVICE:
+    case VHOST_USER_SET_MEM_TABLE:
+    case VHOST_USER_GET_QUEUE_NUM:
+        return true;
+    default:
+        return false;
+    }
+}
+
static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
void *arg)
{
@@ -206,6 +223,14 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
else
msg_request = request;
+    /*
+     * Non-vring specific requests, like VHOST_USER_GET_FEATURES, only
+     * need to be sent once, for the first queue; later requests of the
+     * same type are simply ignored.
+     */
+    if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0) {
+        return 0;
+    }
+
msg.request = msg_request;
msg.flags = VHOST_USER_VERSION;
msg.size = 0;
@@ -268,17 +293,20 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
case VHOST_USER_SET_VRING_NUM:
case VHOST_USER_SET_VRING_BASE:
memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+    msg.state.index += dev->vq_index;
msg.size = sizeof(m.state);
break;
case VHOST_USER_GET_VRING_BASE:
memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+    msg.state.index += dev->vq_index;
msg.size = sizeof(m.state);
need_reply = 1;
break;
case VHOST_USER_SET_VRING_ADDR:
memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
+ msg.addr.index += dev->vq_index;
msg.size = sizeof(m.addr);
break;
@@ -286,7 +314,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
case VHOST_USER_SET_VRING_CALL:
case VHOST_USER_SET_VRING_ERR:
file = arg;
- msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
+ msg.u64 = (file->index + dev->vq_index) & VHOST_USER_VRING_IDX_MASK;
msg.size = sizeof(m.u64);
if (ioeventfd_enabled() && file->fd > 0) {
fds[fd_num++] = file->fd;
@@ -330,6 +358,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
error_report("Received bad msg size.");
return -1;
}
+ msg.state.index -= dev->vq_index;
memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
break;
default:
diff --git a/include/net/net.h b/include/net/net.h
index 6a6cbef..6f20656 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -92,6 +92,7 @@ struct NetClientState {
NetClientDestructor *destructor;
unsigned int queue_index;
unsigned rxfilter_notify_enabled:1;
+ void *opaque;
};
typedef struct NICState {
diff --git a/net/vhost-user.c b/net/vhost-user.c
index 2d6bbe5..7d4ac69 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -15,10 +15,16 @@
#include "qemu/config-file.h"
#include "qemu/error-report.h"
+struct VhostUserNetPeer {
+ NetClientState *nc;
+ VHostNetState *vhost_net;
+};
+
typedef struct VhostUserState {
- NetClientState nc;
CharDriverState *chr;
- VHostNetState *vhost_net;
+ bool running;
+ int queues;
+ struct VhostUserNetPeer peers[];
} VhostUserState;
typedef struct VhostUserChardevProps {
@@ -29,48 +35,75 @@ typedef struct VhostUserChardevProps {
VHostNetState *vhost_user_get_vhost_net(NetClientState *nc)
{
- VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+ VhostUserState *s = nc->opaque;
assert(nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
- return s->vhost_net;
-}
-
-static int vhost_user_running(VhostUserState *s)
-{
- return (s->vhost_net) ? 1 : 0;
+ return s->peers[nc->queue_index].vhost_net;
}
static int vhost_user_start(VhostUserState *s)
{
VhostNetOptions options;
+ VHostNetState *vhost_net;
+ int max_queues;
+ int i = 0;
-    if (vhost_user_running(s)) {
+    if (s->running) {
         return 0;
     }
options.backend_type = VHOST_BACKEND_TYPE_USER;
- options.net_backend = &s->nc;
options.opaque = s->chr;
- s->vhost_net = vhost_net_init(&options);
+ options.net_backend = s->peers[i].nc;
+ vhost_net = s->peers[i++].vhost_net = vhost_net_init(&options);
+
+    max_queues = vhost_net_get_max_queues(vhost_net);
+    if (s->queues > max_queues) {
+        error_report("you are asking for more queues than supported: %d",
+                     max_queues);
+        return -1;
+    }
+
+    for (; i < s->queues; i++) {
+        options.net_backend = s->peers[i].nc;
+
+        s->peers[i].vhost_net = vhost_net_init(&options);
+        if (!s->peers[i].vhost_net) {
+            return -1;
+        }
+    }
+ s->running = true;
- return vhost_user_running(s) ? 0 : -1;
+ return 0;
}
static void vhost_user_stop(VhostUserState *s)
{
- if (vhost_user_running(s)) {
- vhost_net_cleanup(s->vhost_net);
+ int i;
+ VHostNetState *vhost_net;
+
+    if (!s->running) {
+        return;
+    }
+
+    for (i = 0; i < s->queues; i++) {
+        vhost_net = s->peers[i].vhost_net;
+        if (vhost_net) {
+            vhost_net_cleanup(vhost_net);
+        }
}
- s->vhost_net = 0;
+ s->running = false;
}
static void vhost_user_cleanup(NetClientState *nc)
{
- VhostUserState *s = DO_UPCAST(VhostUserState, nc, nc);
+ VhostUserState *s = nc->opaque;
+ VHostNetState *vhost_net = s->peers[nc->queue_index].vhost_net;
+
+    if (vhost_net) {
+        vhost_net_cleanup(vhost_net);
+    }
- vhost_user_stop(s);
qemu_purge_queued_packets(nc);
+
+    if (nc->queue_index == s->queues - 1) {
+        g_free(s);
+    }
}
static bool vhost_user_has_vnet_hdr(NetClientState *nc)
@@ -89,7 +122,7 @@ static bool vhost_user_has_ufo(NetClientState *nc)
static NetClientInfo net_vhost_user_info = {
.type = NET_CLIENT_OPTIONS_KIND_VHOST_USER,
- .size = sizeof(VhostUserState),
+ .size = sizeof(NetClientState),
.cleanup = vhost_user_cleanup,
.has_vnet_hdr = vhost_user_has_vnet_hdr,
.has_ufo = vhost_user_has_ufo,
@@ -97,18 +130,25 @@ static NetClientInfo net_vhost_user_info = {
static void net_vhost_link_down(VhostUserState *s, bool link_down)
{
- s->nc.link_down = link_down;
+ NetClientState *nc;
+ int i;
- if (s->nc.peer) {
- s->nc.peer->link_down = link_down;
- }
+ for (i = 0; i < s->queues; i++) {
+ nc = s->peers[i].nc;
- if (s->nc.info->link_status_changed) {
- s->nc.info->link_status_changed(&s->nc);
- }
+ nc->link_down = link_down;
+
+ if (nc->peer) {
+ nc->peer->link_down = link_down;
+ }
- if (s->nc.peer && s->nc.peer->info->link_status_changed) {
- s->nc.peer->info->link_status_changed(s->nc.peer);
+ if (nc->info->link_status_changed) {
+ nc->info->link_status_changed(nc);
+ }
+
+ if (nc->peer && nc->peer->info->link_status_changed) {
+ nc->peer->info->link_status_changed(nc->peer);
+ }
}
}
@@ -118,7 +158,8 @@ static void net_vhost_user_event(void *opaque, int event)
switch (event) {
case CHR_EVENT_OPENED:
-        vhost_user_start(s);
+        if (vhost_user_start(s) < 0) {
+            exit(1);
+        }
net_vhost_link_down(s, false);
error_report("chardev \"%s\" went up", s->chr->label);
break;
@@ -131,24 +172,28 @@ static void net_vhost_user_event(void *opaque, int event)
}
static int net_vhost_user_init(NetClientState *peer, const char *device,
- const char *name, CharDriverState *chr)
+ const char *name, VhostUserState *s)
{
NetClientState *nc;
- VhostUserState *s;
+ CharDriverState *chr = s->chr;
+ int i;
- nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
+ for (i = 0; i < s->queues; i++) {
+ nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
- snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
- chr->label);
+ snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
+ i, chr->label);
- s = DO_UPCAST(VhostUserState, nc, nc);
+ /* We don't provide a receive callback */
+ nc->receive_disabled = 1;
- /* We don't provide a receive callback */
- s->nc.receive_disabled = 1;
- s->chr = chr;
- nc->queue_index = 0;
+ nc->queue_index = i;
+ nc->opaque = s;
- qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
+ s->peers[i].nc = nc;
+ }
+
+ qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, s);
return 0;
}
@@ -227,8 +272,10 @@ static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
int net_init_vhost_user(const NetClientOptions *opts, const char *name,
NetClientState *peer, Error **errp)
{
+ int queues;
const NetdevVhostUserOptions *vhost_user_opts;
CharDriverState *chr;
+ VhostUserState *s;
assert(opts->kind == NET_CLIENT_OPTIONS_KIND_VHOST_USER);
vhost_user_opts = opts->vhost_user;
@@ -244,6 +291,11 @@ int net_init_vhost_user(const NetClientOptions *opts, const char *name,
return -1;
}
+ queues = vhost_user_opts->has_queues ? vhost_user_opts->queues : 1;
+ s = g_malloc0(sizeof(VhostUserState) +
+ queues * sizeof(struct VhostUserNetPeer));
+ s->queues = queues;
+ s->chr = chr;
- return net_vhost_user_init(peer, "vhost_user", name, chr);
+ return net_vhost_user_init(peer, "vhost_user", name, s);
}
diff --git a/qapi-schema.json b/qapi-schema.json
index 67fef37..55c33db 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2480,12 +2480,16 @@
#
# @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
#
+# @queues: #optional number of queues to be created for multiqueue vhost-user
+# (default: 1) (Since 2.5)
+#
# Since 2.1
##
{ 'struct': 'NetdevVhostUserOptions',
'data': {
'chardev': 'str',
- '*vhostforce': 'bool' } }
+ '*vhostforce': 'bool',
+ '*queues': 'int' } }
##
# @NetClientOptions
diff --git a/qemu-options.hx b/qemu-options.hx
index 77f5853..5bfa7a3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1963,13 +1963,14 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
netdev. @code{-net} and @code{-device} with parameter @option{vlan} create the
required hub automatically.
-@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off]
+@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n]
Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
be a unix domain socket backed one. The vhost-user uses a specifically defined
protocol to pass vhost ioctl replacement messages to an application on the other
end of the socket. On non-MSIX guests, the feature can be forced with
-@var{vhostforce}.
+@var{vhostforce}. Use 'queues=@var{n}' to specify the number of queues to
+be created for multiqueue vhost-user.
Example:
@example
--
1.9.0