qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] hw/rdma: Add support for GID state changes for


From: Marcel Apfelbaum
Subject: Re: [Qemu-devel] [PATCH] hw/rdma: Add support for GID state changes for non-qmp frameworks
Date: Fri, 24 May 2019 08:24:30 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1


Hi Yuval,

On 5/5/19 1:55 PM, Yuval Shaia wrote:
Any GID change in guest must be propogate to host. This is already done
by firing QMP event to managment system such as libvirt which in turn
will update the host with the relevant change.

Agreed, *any* management software can do that.


When qemu is executed on non-qmp framework (ex from command-line) we
need to update the host instead.
Fix it by adding support to update the RoCE device's Ethernet function
IP list from qemu via netlink.

I am not sure this is the right approach. I don't think QEMU should actively change
the host network configuration.
As you pointed out yourself, the management software should make such changes.

I agree you cannot always assume the QEMU instance is managed by libvirt,
what about adding this functionality to rdma-multiplexer? The multiplexer may
register to the same QMP event.

Even if you think the multiplexer is not the right way to do it, even a simple bash script
disguised  as a systemd service can subscribe to the QMP event and make the
change on the host.

What do you think?

Thanks,
Marcel


Signed-off-by: Yuval Shaia <address@hidden>
---
  configure              |  6 ++++
  hw/rdma/rdma_backend.c | 74 +++++++++++++++++++++++++++++++++++++++++-
  2 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 5b183c2e39..1f707b1a62 100755
--- a/configure
+++ b/configure
@@ -3132,6 +3132,8 @@ fi
cat > $TMPC <<EOF &&
  #include <sys/mman.h>
+#include <libmnl/libmnl.h>
+#include <linux/rtnetlink.h>
int
  main(void)
@@ -3144,10 +3146,13 @@ main(void)
  }
  EOF
+pvrdma_libs="-lmnl"
+
  if test "$rdma" = "yes" ; then
      case "$pvrdma" in
      "")
          if compile_prog "" ""; then
+            libs_softmmu="$libs_softmmu $pvrdma_libs"
              pvrdma="yes"
          else
              pvrdma="no"
@@ -3156,6 +3161,7 @@ if test "$rdma" = "yes" ; then
      "yes")
          if ! compile_prog "" ""; then
              error_exit "PVRDMA is not supported since mremap is not 
implemented"
+                        " or libmnl-devel is not installed"
          fi
          pvrdma="yes"
          ;;
diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index 05f6b03221..bc57b1a624 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -16,6 +16,11 @@
  #include "qemu/osdep.h"
  #include "qapi/qapi-events-rdma.h"
+#include "linux/if_addr.h"
+#include "libmnl/libmnl.h"
+#include "linux/rtnetlink.h"
+#include "net/if.h"
+
  #include <infiniband/verbs.h>
#include "contrib/rdmacm-mux/rdmacm-mux.h"
@@ -47,6 +52,61 @@ static void dummy_comp_handler(void *ctx, struct ibv_wc *wc)
      rdma_error_report("No completion handler is registered");
  }
+static int netlink_route_update(const char *ifname, union ibv_gid *gid,
+                                __u16 type)
+{
+    char buf[MNL_SOCKET_BUFFER_SIZE];
+    struct nlmsghdr *nlh;
+    struct ifaddrmsg *ifm;
+    struct mnl_socket *nl;
+    int ret;
+    uint32_t ipv4;
+
+    nl = mnl_socket_open(NETLINK_ROUTE);
+    if (!nl) {
+        rdma_error_report("Fail to connect to netlink\n");
+        return -EIO;
+    }
+
+    ret = mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID);
+    if (ret < 0) {
+        rdma_error_report("Fail to bind to netlink\n");
+        goto out;
+    }
+
+    nlh = mnl_nlmsg_put_header(buf);
+    nlh->nlmsg_type = type;
+    nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+    nlh->nlmsg_seq = 1;
+
+    ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm));
+    ifm->ifa_index = if_nametoindex(ifname);
+    if (gid->global.subnet_prefix) {
+        ifm->ifa_family = AF_INET6;
+        ifm->ifa_prefixlen = 64;
+        ifm->ifa_flags = IFA_F_PERMANENT;
+        ifm->ifa_scope = RT_SCOPE_UNIVERSE;
+        mnl_attr_put(nlh, IFA_ADDRESS, sizeof(*gid), gid);
+    } else {
+        ifm->ifa_family = AF_INET;
+        ifm->ifa_prefixlen = 24;
+        memcpy(&ipv4, (char *)&gid->global.interface_id + 4, sizeof(ipv4));
+        mnl_attr_put(nlh, IFA_LOCAL, 4, &ipv4);
+    }
+
+    ret = mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);
+    if (ret < 0) {
+        rdma_error_report("Fail to send msg to to netlink\n");
+        goto out;
+    }
+
+    ret = 0;
+
+out:
+    mnl_socket_close(nl);
+    return ret;
+}
+
  static inline void complete_work(enum ibv_wc_status status, uint32_t 
vendor_err,
                                   void *ctx)
  {
@@ -1123,7 +1183,13 @@ int rdma_backend_add_gid(RdmaBackendDev *backend_dev, 
const char *ifname,
                                              gid->global.subnet_prefix,
                                              gid->global.interface_id);
- return ret;
+    /*
+     * We ignore return value since operation might completed sucessfully
+     * by the QMP consumer
+     */
+    netlink_route_update(ifname, gid, RTM_NEWADDR);
+
+    return 0;
  }
int rdma_backend_del_gid(RdmaBackendDev *backend_dev, const char *ifname,
@@ -1149,6 +1215,12 @@ int rdma_backend_del_gid(RdmaBackendDev *backend_dev, 
const char *ifname,
                                              gid->global.subnet_prefix,
                                              gid->global.interface_id);
+ /*
+     * We ignore return value since operation might completed sucessfully
+     * by the QMP consumer
+     */
+    netlink_route_update(ifname, gid, RTM_DELADDR);
+
      return 0;
  }




reply via email to

[Prev in Thread] Current Thread [Next in Thread]