qemu-devel
Re: [Qemu-devel] [PATCH v5] net: add support of mac-programming over mac


From: Luiz Capitulino
Subject: Re: [Qemu-devel] [PATCH v5] net: add support of mac-programming over macvtap in QEMU side
Date: Fri, 7 Jun 2013 09:46:12 -0400

On Wed,  5 Jun 2013 18:42:13 +0800
Amos Kong <address@hidden> wrote:

> Currently macvtap based macvlan device is working in promiscuous
> mode, we want to implement mac-programming over macvtap through
> Libvirt for better performance.
> 
> Design:
> QEMU notifies Libvirt when rx-filter config is changed in guest,
> then Libvirt query the rx-filter information by a monitor command,
> and sync the change to macvtap device. Related rx-filter config
> of the nic contains main mac, rx-mode items and vlan table.
> 
> This patch adds a QMP event to notify management of rx-filter change,
> and adds a monitor command for management to query rx-filter
> information.
> 
> For reducing length of output, we just return the entries of vlan
> filter table that have active vlan.
> 
> Event_throttle API can avoid the events to flood QMP client, but it
> could cause an unexpected delay. So a flag for each nic is used to
> avoid events flooding, if management doesn't query rx-filter after
> it receives one event, new events won't be emitted to QMP monitor.
> 
> There maybe exist an uncontrollable delay if we let Libvirt do the
> real change, guests normally expect rx-filter updates immediately.
> But it's another separate issue, we can investigate it when the
> work in Libvirt side is done.

I think I completely misunderstood your testing results.

I had understood that: 1. changing the MAC often and quickly enough to
be a problem was a corner case, and 2. you actually can overflow
management.

I hope I really got it wrong, otherwise you'll be using that
flag as a replacement for the event throttle API, which would be
a big mistake.
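
To illustrate the difference I have in mind, here is a minimal sketch of
time-based throttling. It is not QEMU's throttle implementation; the
struct, the function names and the caller-supplied millisecond clock are
all hypothetical. Events arriving too soon are coalesced and delivered
later, which is where the delay mentioned in the commit message comes
from, but delivery never depends on the client issuing a query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rate limiter state; not a QEMU structure. */
struct throttle {
    int64_t min_interval_ms; /* minimum spacing between emitted events */
    int64_t last_emit_ms;    /* time of the most recent emission */
    bool has_emitted;        /* false until the first event goes out */
    bool pending;            /* an event was held back and awaits flush */
};

static void throttle_init(struct throttle *t, int64_t min_interval_ms)
{
    assert(t != NULL);
    t->min_interval_ms = min_interval_ms;
    t->last_emit_ms = 0;
    t->has_emitted = false;
    t->pending = false;
}

/* Returns true if the event may be emitted now, else marks it pending. */
static bool throttle_event(struct throttle *t, int64_t now_ms)
{
    assert(t != NULL);
    if (!t->has_emitted || now_ms - t->last_emit_ms >= t->min_interval_ms) {
        t->last_emit_ms = now_ms;
        t->has_emitted = true;
        t->pending = false;
        return true;
    }
    t->pending = true;
    return false;
}

/* Timer callback: emits the held-back event once the interval elapsed. */
static bool throttle_flush(struct throttle *t, int64_t now_ms)
{
    assert(t != NULL);
    if (t->pending && now_ms - t->last_emit_ms >= t->min_interval_ms) {
        t->last_emit_ms = now_ms;
        t->pending = false;
        return true;
    }
    return false;
}
```

With a 100 ms interval, a burst of changes at 0, 10 and 50 ms produces
one event at 0 ms and one coalesced event when the timer fires at 100 ms.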

Can you please add your test results & analysis to this commit message?

Two small comments below.

> Signed-off-by: Amos Kong <address@hidden>
> ---
> v2: add argument to filter mac-table info of single nic (Stefan)
>     update the document, add event notification
> v3: rename to rx-filter, add main mac, avoid events flooding (MST)
>     fix error process (Stefan), fix qmp interface (Eric)
> v4: process qerror in hmp, cleanup (Luiz)
>     set flag for each device, add device path in event, add
>     helper for g_strdup_printf (MST)
>     fix qmp document (Eric)
> v5: add path in doc, define notify flag to unsigned (Eric)
>     add vlan table (Jason), drop monitor cmd
> ---
>  QMP/qmp-events.txt        |  20 +++++++++
>  hw/net/virtio-net.c       | 112 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/monitor/monitor.h |   1 +
>  include/net/net.h         |   3 ++
>  monitor.c                 |   1 +
>  net/net.c                 |  47 +++++++++++++++++++
>  qapi-schema.json          |  89 ++++++++++++++++++++++++++++++++++++
>  qmp-commands.hx           |  66 +++++++++++++++++++++++++++
>  8 files changed, 339 insertions(+)
> 
> diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
> index 92fe5fb..885230e 100644
> --- a/QMP/qmp-events.txt
> +++ b/QMP/qmp-events.txt
> @@ -172,6 +172,26 @@ Data:
>    },
>    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>  
> +NIC_RX_FILTER_CHANGED
> +-----------------
> +
> +Emitted when rx-filter configuration of nic is changed by the guest.
> +Each nic has a flag to control event emit, the flag is set to false
> +when it emits one event of the nic, the flag is set to true when
> +management queries the rx-filter of the nic. This is used to avoid
> +events flooding.

Having this flag is an implementation detail. I think you should only
say that the event is emitted once until the query command is executed.
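
To make the suggested wording concrete, here is a minimal sketch of the
behaviour the documentation needs to describe. The struct and function
names are hypothetical stand-ins, not QEMU code, and starting in the
armed state is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-NIC state; not QEMU's NetClientState. */
struct nic_state {
    bool notify_armed;    /* true: the next filter change emits an event */
    unsigned events_sent; /* emitted events, counted for illustration */
};

static void nic_init(struct nic_state *nic)
{
    assert(nic != NULL);
    nic->notify_armed = true; /* assumption: armed from the start */
    nic->events_sent = 0;
}

/* Called on every guest rx-filter change; true if an event was emitted. */
static bool nic_filter_changed(struct nic_state *nic)
{
    assert(nic != NULL);
    if (!nic->notify_armed) {
        return false; /* suppressed until management queries */
    }
    nic->events_sent++;
    nic->notify_armed = false;
    return true;
}

/* Called when management runs the query command; re-arms the event. */
static void nic_query(struct nic_state *nic)
{
    assert(nic != NULL);
    nic->notify_armed = true;
}
```

From the client's point of view only the observable rule matters: one
event, then silence until the query is issued.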

> +
> +Data:
> +
> +- "name": net client name (json-string)
> +- "path": device path (json-string)
> +
> +{ "event": "NIC_RX_FILTER_CHANGED",
> +  "data": { "name": "vnet0",
> +            "path": "/machine/peripheral/vnet0/virtio-backend" },
> +  "timestamp": { "seconds": 1368697518, "microseconds": 326866 } }
> +}
> +
>  RESET
>  -----
>  
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 1ea9556..ae1eab6 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -21,6 +21,8 @@
>  #include "hw/virtio/virtio-net.h"
>  #include "net/vhost_net.h"
>  #include "hw/virtio/virtio-bus.h"
> +#include "qapi/qmp/qjson.h"
> +#include "monitor/monitor.h"
>  
>  #define VIRTIO_NET_VM_VERSION    11
>  
> @@ -192,6 +194,104 @@ static void virtio_net_set_link_status(NetClientState *nc)
>      virtio_net_set_status(vdev, vdev->status);
>  }
>  
> +static void rxfilter_notify(NetClientState *nc)
> +{
> +    QObject *event_data;
> +    VirtIONet *n = qemu_get_nic_opaque(nc);
> +
> +    if (nc->rxfilter_notify_enabled) {
> +        event_data = qobject_from_jsonf("{ 'name': %s, 'path': %s }",
> +                           n->netclient_name,
> +                           object_get_canonical_path(OBJECT(n->qdev)));
> +        monitor_protocol_event(QEVENT_NIC_RX_FILTER_CHANGED, event_data);
> +        qobject_decref(event_data);
> +        /* disable event notification to avoid events flooding */
> +        nc->rxfilter_notify_enabled = 0;
> +    }
> +}
> +
> +static char *mac_strdup_printf(uint8_t *mac)
> +{

mac can be const. Is there more code in QEMU that could use this
function? If there is, it's better to move it to a more generic
.c file.
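
Something along these lines is what I mean by a const-taking, reusable
helper. This is only a sketch: the name mac_format, the MAC_STR_LEN
macro and the caller-provided buffer are my assumptions, and it uses
plain snprintf() instead of g_strdup_printf() so that it stands alone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_STR_LEN 18 /* "xx:xx:xx:xx:xx:xx" plus the trailing NUL */

/*
 * Hypothetical helper: formats a 6-byte MAC address into buf.
 * Returns 0 on success, -1 if buf is too small for the full string.
 */
static int mac_format(char *buf, size_t len, const uint8_t mac[6])
{
    int n;

    assert(buf != NULL && mac != NULL);
    n = snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
                 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
    /* snprintf reports the untruncated length, so compare it to len */
    return (n == MAC_STR_LEN - 1 && (size_t)n < len) ? 0 : -1;
}
```

Because the parameter is const, callers can pass n->mac or a slice of
the mac table without a cast.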

> +    return g_strdup_printf("%.2x:%.2x:%.2x:%.2x:%.2x:%.2x", mac[0],
> +                            mac[1], mac[2], mac[3], mac[4], mac[5]);
> +}
> +
> +static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc)
> +{
> +    VirtIONet *n = qemu_get_nic_opaque(nc);
> +    RxFilterInfo *info;
> +    strList *str_list = NULL;
> +    strList *entry;
> +    VlanTableEntryList *vlan_list = NULL;
> +    VlanTableEntryList *vlan_entry;
> +    VlanTableEntry *table_entry;
> +    int i;
> +
> +    info = g_malloc0(sizeof(*info));
> +    info->name = g_strdup(nc->name);
> +    info->promiscuous = n->promisc;
> +
> +    if (n->nouni) {
> +        info->unicast = RX_STATE_NONE;
> +    } else if (n->alluni) {
> +        info->unicast = RX_STATE_ALL;
> +    } else {
> +        info->unicast = RX_STATE_NORMAL;
> +    }
> +
> +    if (n->nomulti) {
> +        info->multicast = RX_STATE_NONE;
> +    } else if (n->allmulti) {
> +        info->multicast = RX_STATE_ALL;
> +    } else {
> +        info->multicast = RX_STATE_NORMAL;
> +    }
> +
> +    info->broadcast_allowed = n->nobcast;
> +    info->multicast_overflow = n->mac_table.multi_overflow;
> +    info->unicast_overflow = n->mac_table.uni_overflow;
> +
> +    info->main_mac = mac_strdup_printf(n->mac);
> +
> +    for (i = 0; i < n->mac_table.first_multi; i++) {
> +        entry = g_malloc0(sizeof(*entry));
> +        entry->value = mac_strdup_printf(n->mac_table.macs + i * ETH_ALEN);
> +        entry->next = str_list;
> +        str_list = entry;
> +    }
> +    info->unicast_table = str_list;
> +
> +    str_list = NULL;
> +    for (i = n->mac_table.first_multi; i < n->mac_table.in_use; i++) {
> +        entry = g_malloc0(sizeof(*entry));
> +        entry->value = mac_strdup_printf(n->mac_table.macs + i * ETH_ALEN);
> +        entry->next = str_list;
> +        str_list = entry;
> +    }
> +    info->multicast_table = str_list;
> +
> +    i = 0;
> +    while (i < MAX_VLAN >> 5) {
> +        /* ignore the entries that have no active vlan */
> +        if (n->vlans[i] == 0) {
> +            i++;
> +            continue;
> +        }
> +        vlan_entry = g_malloc0(sizeof(*vlan_entry));
> +        vlan_entry->value = g_malloc0(sizeof(*table_entry));
> +        vlan_entry->value->index = i;
> +        vlan_entry->value->value = n->vlans[i++];
> +        vlan_entry->next = vlan_list;
> +        vlan_list = vlan_entry;
> +    }
> +    info->vlan_table = vlan_list;
> +
> +    /* enable event notification after query */
> +    nc->rxfilter_notify_enabled = 1;
> +
> +    return info;
> +}
> +
>  static void virtio_net_reset(VirtIODevice *vdev)
>  {
>      VirtIONet *n = VIRTIO_NET(vdev);
> @@ -420,6 +520,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
>  {
>      uint8_t on;
>      size_t s;
> +    NetClientState *nc = qemu_get_queue(n->nic);
>  
>      s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on));
>      if (s != sizeof(on)) {
> @@ -442,6 +543,8 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
>          return VIRTIO_NET_ERR;
>      }
>  
> +    rxfilter_notify(nc);
> +
>      return VIRTIO_NET_OK;
>  }
>  
> @@ -487,6 +590,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
>  {
>      struct virtio_net_ctrl_mac mac_data;
>      size_t s;
> +    NetClientState *nc = qemu_get_queue(n->nic);
>  
>      if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) {
>          if (iov_size(iov, iov_cnt) != sizeof(n->mac)) {
> @@ -495,6 +599,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
>          s = iov_to_buf(iov, iov_cnt, 0, &n->mac, sizeof(n->mac));
>          assert(s == sizeof(n->mac));
>          qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
> +        rxfilter_notify(nc);
> +
>          return VIRTIO_NET_OK;
>      }
>  
> @@ -559,6 +665,8 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
>          n->mac_table.multi_overflow = 1;
>      }
>  
> +    rxfilter_notify(nc);
> +
>      return VIRTIO_NET_OK;
>  }
>  
> @@ -567,6 +675,7 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
>  {
>      uint16_t vid;
>      size_t s;
> +    NetClientState *nc = qemu_get_queue(n->nic);
>  
>      s = iov_to_buf(iov, iov_cnt, 0, &vid, sizeof(vid));
>      vid = lduw_p(&vid);
> @@ -584,6 +693,8 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
>      else
>          return VIRTIO_NET_ERR;
>  
> +    rxfilter_notify(nc);
> +
>      return VIRTIO_NET_OK;
>  }
>  
> @@ -1312,6 +1423,7 @@ static NetClientInfo net_virtio_info = {
>      .receive = virtio_net_receive,
>          .cleanup = virtio_net_cleanup,
>      .link_status_changed = virtio_net_set_link_status,
> +    .query_rx_filter = virtio_net_query_rxfilter,
>  };
>  
>  static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
> diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> index 1a6cfcf..1942cc4 100644
> --- a/include/monitor/monitor.h
> +++ b/include/monitor/monitor.h
> @@ -41,6 +41,7 @@ typedef enum MonitorEvent {
>      QEVENT_BLOCK_JOB_READY,
>      QEVENT_DEVICE_DELETED,
>      QEVENT_DEVICE_TRAY_MOVED,
> +    QEVENT_NIC_RX_FILTER_CHANGED,
>      QEVENT_SUSPEND,
>      QEVENT_SUSPEND_DISK,
>      QEVENT_WAKEUP,
> diff --git a/include/net/net.h b/include/net/net.h
> index 43d85a1..30e4b04 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -49,6 +49,7 @@ typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
>  typedef void (NetCleanup) (NetClientState *);
>  typedef void (LinkStatusChanged)(NetClientState *);
>  typedef void (NetClientDestructor)(NetClientState *);
> +typedef RxFilterInfo *(QueryRxFilter)(NetClientState *);
>  
>  typedef struct NetClientInfo {
>      NetClientOptionsKind type;
> @@ -59,6 +60,7 @@ typedef struct NetClientInfo {
>      NetCanReceive *can_receive;
>      NetCleanup *cleanup;
>      LinkStatusChanged *link_status_changed;
> +    QueryRxFilter *query_rx_filter;
>      NetPoll *poll;
>  } NetClientInfo;
>  
> @@ -74,6 +76,7 @@ struct NetClientState {
>      unsigned receive_disabled : 1;
>      NetClientDestructor *destructor;
>      unsigned int queue_index;
> +    unsigned rxfilter_notify_enabled:1;
>  };
>  
>  typedef struct NICState {
> diff --git a/monitor.c b/monitor.c
> index 6ce2a4e..5e64fe8 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = {
>      [QEVENT_BLOCK_JOB_READY] = "BLOCK_JOB_READY",
>      [QEVENT_DEVICE_DELETED] = "DEVICE_DELETED",
>      [QEVENT_DEVICE_TRAY_MOVED] = "DEVICE_TRAY_MOVED",
> +    [QEVENT_NIC_RX_FILTER_CHANGED] = "NIC_RX_FILTER_CHANGED",
>      [QEVENT_SUSPEND] = "SUSPEND",
>      [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK",
>      [QEVENT_WAKEUP] = "WAKEUP",
> diff --git a/net/net.c b/net/net.c
> index 43a74e4..33abffe 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -961,6 +961,53 @@ void print_net_client(Monitor *mon, NetClientState *nc)
>                     nc->info_str);
>  }
>  
> +RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name,
> +                                      Error **errp)
> +{
> +    NetClientState *nc;
> +    RxFilterInfoList *filter_list = NULL, *last_entry = NULL;
> +
> +    QTAILQ_FOREACH(nc, &net_clients, next) {
> +        RxFilterInfoList *entry;
> +        RxFilterInfo *info;
> +
> +        /* only query rx-filter information of nic */
> +        if (nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC) {
> +            continue;
> +        }
> +        if (has_name && strcmp(nc->name, name) != 0) {
> +            continue;
> +        }
> +
> +        if (nc->info->query_rx_filter) {
> +            info = nc->info->query_rx_filter(nc);
> +            entry = g_malloc0(sizeof(*entry));
> +            entry->value = info;
> +
> +            if (!filter_list) {
> +                filter_list = entry;
> +            } else {
> +                last_entry->next = entry;
> +            }
> +            last_entry = entry;
> +        } else if (has_name) {
> +            error_setg(errp, "net client(%s) doesn't support"
> +                       " rx-filter querying", name);
> +            break;
> +        }
> +    }
> +
> +    if (filter_list == NULL && !error_is_set(errp)) {
> +        if (has_name) {
> +            error_setg(errp, "invalid net client name: %s", name);
> +        } else {
> +            error_setg(errp, "no net client supports rx-filter querying");
> +        }
> +    }
> +
> +    return filter_list;
> +}
> +
>  void do_info_network(Monitor *mon, const QDict *qdict)
>  {
>      NetClientState *nc, *peer;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index ef1f657..fb6f12c 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -3618,3 +3618,92 @@
>              '*cpuid-input-ecx': 'int',
>              'cpuid-register': 'X86CPURegister32',
>              'features': 'int' } }
> +
> +##
> +# @RxState:
> +#
> +# Packets receiving state
> +#
> +# @normal: filter assigned packets according to the mac-table
> +#
> +# @none: don't receive any assigned packet
> +#
> +# @all: receive all assigned packets
> +#
> +##
> +{ 'enum': 'RxState', 'data': [ 'normal', 'none', 'all' ] }
> +
> +##
> +# @VlanTableEntry
> +#
> +# Entry detail of vlan filter table
> +#
> +# @index: index of vlan filter table
> +#
> +# @value: 32 bit of indicator for active vlans
> +#
> +# Since 1.6
> +##
> +
> +{ 'type': 'VlanTableEntry', 'data': { 'index': 'int', 'value': 'uint32' } }
> +
> +##
> +# @RxFilterInfo:
> +#
> +# Rx-filter information for a nic.
> +#
> +# @name: net client name
> +#
> +# @promiscuous: whether promiscuous mode is enabled
> +#
> +# @multicast: multicast receive state
> +#
> +# @unicast: unicast receive state
> +#
> +# @broadcast-allowed: whether to receive broadcast
> +#
> +# @multicast-overflow: multicast table is overflowed or not
> +#
> +# @unicast-overflow: unicast table is overflowed or not
> +#
> +# @main-mac: the main macaddr string
> +#
> +# @vlan-table: a list of @VlanTableEntry
> +#
> +# @unicast-table: a list of unicast macaddr string
> +#
> +# @multicast-table: a list of multicast macaddr string
> +#
> +# Since 1.6
> +##
> +
> +{ 'type': 'RxFilterInfo',
> +  'data': {
> +    'name':               'str',
> +    'promiscuous':        'bool',
> +    'multicast':          'RxState',
> +    'unicast':            'RxState',
> +    'broadcast-allowed':  'bool',
> +    'multicast-overflow': 'bool',
> +    'unicast-overflow':   'bool',
> +    'main-mac':           'str',
> +    'vlan-table':         ['VlanTableEntry'],
> +    'unicast-table':      ['str'],
> +    'multicast-table':    ['str'] }}
> +
> +##
> +# @query-rx-filter:
> +#
> +# Return rx-filter information for all nics (or for the given nic).
> +#
> +# @name: #optional net client name
> +#
> +# Returns: list of @RxFilterInfo for all nics (or for the given nic).
> +#          Returns an error if the given @name doesn't exist, or given
> +#          nic doesn't support rx-filter querying, or no net client
> +#          supports rx-filter querying
> +#
> +# Since: 1.6
> +##
> +{ 'command': 'query-rx-filter', 'data': { '*name': 'str' },
> +  'returns': ['RxFilterInfo'] }
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index ffd130e..e76e9a2 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -2932,3 +2932,69 @@ Example:
>  <- { "return": {} }
>  
>  EQMP
> +    {
> +        .name       = "query-rx-filter",
> +        .args_type  = "name:s?",
> +        .mhandler.cmd_new = qmp_marshal_input_query_rx_filter,
> +    },
> +
> +SQMP
> +query-rx-filter
> +---------------
> +
> +Show rx-filter information.
> +
> +Returns a json-array of rx-filter information for all nics (or for the
> +given nic), returning an error if the given nic doesn't exist, or
> +given nic doesn't support rx-filter querying, or no net client
> +supports rx-filter querying.
> +
> +The query will clear the event notification flag of each nic, then qemu
> +will start to emit event to QMP monitor.
> +
> +Each array entry contains the following:
> +
> +- "name": net client name (json-string)
> +- "promiscuous": promiscuous mode is enabled (json-bool)
> +- "multicast": multicast receive state (one of 'normal', 'none', 'all')
> +- "unicast": unicast receive state  (one of 'normal', 'none', 'all')
> +- "broadcast-allowed": allow to receive broadcast (json-bool)
> +- "multicast-overflow": multicast table is overflowed (json-bool)
> +- "unicast-overflow": unicast table is overflowed (json-bool)
> +- "main-mac": main macaddr string (json-string)
> +- "vlan-table": a json-array of vlan-table entry
> +    - "index": index of vlan filter table (json-int)
> +    - "value": 32 bit of indicator for active vlans
> +- "unicast-table": a json-array of unicast macaddr string
> +- "multicast-table": a json-array of multicast macaddr string
> +
> +Example:
> +
> +-> { "execute": "query-rx-filter", "arguments": { "name": "vnet0" } }
> +<- { "return": [
> +        {
> +            "promiscuous": true,
> +            "name": "vnet0",
> +            "main-mac": "52:54:00:12:34:56",
> +            "unicast": "normal",
> +            "vlan-table": [
> +                { "index": 0,
> +                  "entry": 1
> +                }
> +            ],
> +            "unicast-table": [
> +            ],
> +            "multicast": "normal",
> +            "multicast-overflow": false,
> +            "unicast-overflow": false,
> +            "multicast-table": [
> +                "01:00:5e:00:00:01",
> +                "33:33:00:00:00:01",
> +                "33:33:ff:12:34:56"
> +            ],
> +            "broadcast-allowed": false
> +        }
> +      ]
> +   }
> +
> +EQMP
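
For whoever consumes "vlan-table" on the management side, here is a
sketch of how one entry might be expanded into VLAN ids. It assumes that
bit b of the entry with index i stands for VLAN id i * 32 + b; the patch
text only says the value is a 32-bit indicator of active vlans, so that
layout is my assumption, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expands one (index, value) vlan-table entry into VLAN ids.
 * Writes at most max_ids ids and returns how many were written.
 * Assumed layout: bit b of entry `index` means VLAN id index * 32 + b.
 */
static size_t vlan_entry_to_ids(unsigned index, uint32_t value,
                                uint16_t *ids, size_t max_ids)
{
    size_t count = 0;
    unsigned bit;

    assert(ids != NULL || max_ids == 0);
    for (bit = 0; bit < 32 && count < max_ids; bit++) {
        if (value & (UINT32_C(1) << bit)) {
            ids[count++] = (uint16_t)(index * 32 + bit);
        }
    }
    return count;
}
```

Under that assumption, an entry with index 0 and value 1 describes a
single active VLAN with id 0.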



