[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest
From: |
Michael S. Tsirkin |
Subject: |
Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest |
Date: |
Sun, 18 May 2014 12:04:06 +0300 |
On Fri, May 16, 2014 at 01:02:51PM +0800, Jason Wang wrote:
> On 05/15/2014 05:45 PM, Michael S. Tsirkin wrote:
> > On Thu, May 15, 2014 at 05:22:28PM +0800, Jason Wang wrote:
> >> On 05/15/2014 04:28 PM, Michael S. Tsirkin wrote:
> >>> Thanks, looks good.
> >>> Some minor comments below,
> >>>
> >>> On Thu, May 15, 2014 at 03:16:47PM +0800, Jason Wang wrote:
> >>>> It's hard to track all mac addresses and their configurations (e.g
> >>>> vlan or ipv6) in qemu. Without those informations, it's impossible to
> >>> s/those informations/this information/
> >>>
> >>>> build proper garp packet after migration. The only possible solution
> >>>> to this is let guest (who knew all configurations) to do this.
> >>> s/knew/knows/
> >>>
> >>>> So, this patch introduces a new readonly config status bit of virtio-net,
> >>>> VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
> >>>> presence of its link through config update interrupt.When guest has
> >>>> done the announcement, it should ack the notification through
> >>>> VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
> >>>> feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
> >>>> Linux guest).
> >>>>
> >>>> During load, a counter of announcing rounds were set so that the after
> >>> s/were/is/
> >>> s/the after/after/
> >> Will correct those typos.
> >>>> the vm is running it can trigger rounds of config interrupts to notify
> >>>> the guest to build and send the correct garps.
> >>>>
> >>>> Tested with ping to guest with vlan during migration. Without the
> >>>> patch, lots of the packets were lost after migration. With the patch,
> >>>> could not notice packet loss after migration.
> >>> below changelog should go after ---, until the ack list.
> >>>
> >> Ok.
> >>>> Reference:
> >>>> RFC v2:
> >>>> https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
> >>>> RFC v1:
> >>>> https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
> >>>> V7:
> >>>> https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html
> >>>>
> >>>> Changes from RFC v2:
> >>>> - use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
> >>>> - compat self announce for 2.0 machine type
> >>>>
> >>>> Changes from RFC v1:
> >>>> - clean VIRTIO_NET_S_ANNOUNCE bit during reset
> >>>> - free announce timer during clean
> >>>> - make announce work for non-vhost case
> >>>>
> >>>> Changes from V7:
> >>>> - Instead of introducing a global method for each kind of nic, this
> >>>> version limits the changes to virtio-net itself.
> >>>>
> >>>> Cc: Liuyongan <address@hidden>
> >>>> Cc: Amos Kong <address@hidden>
> >>>> Signed-off-by: Jason Wang <address@hidden>
> >>>> ---
> >>>> hw/net/virtio-net.c | 48
> >>>> ++++++++++++++++++++++++++++++++++++++++
> >>>> include/hw/i386/pc.h | 5 ++++
> >>>> include/hw/virtio/virtio-net.h | 16 +++++++++++++
> >>>> 3 files changed, 69 insertions(+), 0 deletions(-)
> >>>>
> >>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >>>> index 940a7cf..98d59e9 100644
> >>>> --- a/hw/net/virtio-net.c
> >>>> +++ b/hw/net/virtio-net.c
> >>>> @@ -25,6 +25,7 @@
> >>>> #include "monitor/monitor.h"
> >>>>
> >>>> #define VIRTIO_NET_VM_VERSION 11
> >>>> +#define VIRTIO_NET_ANNOUNCE_ROUNDS 3
> >>>>
> >>>> #define MAC_TABLE_ENTRIES 64
> >>>> #define MAX_VLAN (1 << 12) /* Per 802.1Q definition */
> >>> I would make it 5 to be consistent with SELF_ANNOUNCE_ROUNDS
> >>> in savevm.c
> >>>
> >>> in fact, let's move SELF_ANNOUNCE_ROUNDS to include/migration/vmstate.h
> >>> and reuse it.
> >> Ok.
> >>>> @@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet *n, uint8_t
> >>>> status)
> >>>> (n->status & VIRTIO_NET_S_LINK_UP) && vdev->vm_running;
> >>>> }
> >>>>
> >>>> +static void virtio_net_announce_timer(void *opaque)
> >>>> +{
> >>>> + VirtIONet *n = opaque;
> >>>> + VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >>>> +
> >>>> + if (n->announce &&
> >>> I would make it > 0 here, just in case it becomes negative as a result
> >>> of some bug.
> >> Sure.
> >>>> + virtio_net_started(n, vdev->status) &&
> >>>> + vdev->guest_features & (0x1 << VIRTIO_NET_F_GUEST_ANNOUNCE) &&
> >>>> + vdev->guest_features & (0x1 << VIRTIO_NET_F_CTRL_VQ)) {
> >>>> +
> >>>> + n->announce--;
> >>>> + n->status |= VIRTIO_NET_S_ANNOUNCE;
> >>>> + virtio_notify_config(vdev);
> >>>> + } else {
> >>>> + timer_del(n->announce_timer);
> >>> why do this here?
> >>>
> >>>> + n->announce = 0;
> >>> why is this here?
> >>>
> >> If guest shuts down the device, there's no need to do the announcing.
> > It's still weird.
> > Either announce is 0 and then timer was not running
> > or it's > 0 and then this won't trigger.
>
> Right, the logic here is for QEMU_CLOCK_REALTIME. But there's another
> question, we use QEMU_CLOCK_VIRTUAL while qemu is using
> QEMU_CLOCK_REALTIME for its announcing. This looks fine but not sure
> whether this is safe.
Meaning QEMU_CLOCK_REALTIME that qemu uses?
Not sure either but it doesn't modify guest state so it seems safe.
> And if the link was down, it's better for us to
> stop the announcing?
I think that it doesn't matter: guest won't announce when
link is down, anyway.
Not worth it to write extra logic here in the host.
> >
> >>>> + }
> >>>> +}
> >>>> +
> >>>> static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> >>>> {
> >>>> VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >>>> @@ -147,6 +167,8 @@ static void virtio_net_set_status(struct
> >>>> VirtIODevice *vdev, uint8_t status)
> >>>>
> >>>> virtio_net_vhost_status(n, status);
> >>>>
> >>>> + virtio_net_announce_timer(n);
> >>>> +
> >>> why do this here? why not right after we set announce counter?
> >> The reasons are:
> >>
> >> - The counters were set in load, but the device is not running so we
> >> could not inject the interrupt at that time.
> > I see. This makes sense - but this isn't intuitive.
> > Why don't we simply start timer with current time?
> > Need to make sure it runs fine if time passes, but
> > I think it's fine.
>
> Not sure I get the point, I didn't see any differences except for an
> extra timer fire.
The only reason you call virtio_net_announce_timer from set_status
is because it gets run on vm start/stop.
It's true but not intuitive.
Just run timer always from timer, it's clearer this way :)
> >
> >> - We can stop the progress when guest is shutting down the device.
> > On shut down guest will reset device stopping timer - this seems enough.
>
> Yes, I see.
> >>>> for (i = 0; i < n->max_queues; i++) {
> >>>> q = &n->vqs[i];
> >>>>
> >>>> @@ -322,6 +344,9 @@ static void virtio_net_reset(VirtIODevice *vdev)
> >>>> n->nobcast = 0;
> >>>> /* multiqueue is disabled by default */
> >>>> n->curr_queues = 1;
> >>>> + timer_del(n->announce_timer);
> >>>> + n->announce = 0;
> >>>> + n->status &= ~VIRTIO_NET_S_ANNOUNCE;
> >>>>
> >>>> /* Flush any MAC and VLAN filter table state */
> >>>> n->mac_table.in_use = 0;
> >>>> @@ -731,6 +756,22 @@ static int virtio_net_handle_vlan_table(VirtIONet
> >>>> *n, uint8_t cmd,
> >>>> return VIRTIO_NET_OK;
> >>>> }
> >>>>
> >>>> +static int virtio_net_handle_announce(VirtIONet *n, uint8_t cmd,
> >>>> + struct iovec *iov, unsigned int
> >>>> iov_cnt)
> >>>> +{
> >>>> + if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
> >>>> + n->status &= ~VIRTIO_NET_S_ANNOUNCE;
> >>>> + if (n->announce) {
> >>> I would make it > 0 here, just in case it becomes negative as a result
> >>> of some bug.
> >> Ok.
> >>>> + timer_mod(n->announce_timer,
> >>>> + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 50 +
> >>>> + (VIRTIO_NET_ANNOUNCE_ROUNDS - n->announce - 1) *
> >>>> 100);
> >>> savevm.c has this code, and a comment:
> >>> /* delay 50ms, 150ms, 250ms, ... */
> >>> 50 + (SELF_ANNOUNCE_ROUNDS - count - 1) * 100
> >>>
> >>> how about an API in include/migration/vmstate.h
> >>>
> >>> static inline
> >>> int64_t self_announce_delay(int round)
> >>> {
> >>> assert(round < SELF_ANNOUNCE_ROUNDS && round > 0);
> >>> /* delay 50ms, 150ms, 250ms, ... */
> >>> return 50 + (SELF_ANNOUNCE_ROUNDS - round - 1) * 100;
> >>> }
> >>>
> >>> or something to this end.
> >>>
> >> Good idea, will do this.
> >>>> + }
> >>>> + return VIRTIO_NET_OK;
> >>>> + } else {
> >>>> + return VIRTIO_NET_ERR;
> >>>> + }
> >>>> +}
> >>>> +
> >>>> static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
> >>>> struct iovec *iov, unsigned int iov_cnt)
> >>>> {
> >>>> @@ -794,6 +835,8 @@ static void virtio_net_handle_ctrl(VirtIODevice
> >>>> *vdev, VirtQueue *vq)
> >>>> status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
> >>>> } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
> >>>> status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov,
> >>>> iov_cnt);
> >>>> + } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
> >>>> + status = virtio_net_handle_announce(n, ctrl.cmd, iov,
> >>>> iov_cnt);
> >>>> } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
> >>>> status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
> >>>> } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
> >>>> @@ -1451,6 +1494,7 @@ static int virtio_net_load(QEMUFile *f, void
> >>>> *opaque, int version_id)
> >>>> qemu_get_subqueue(n->nic, i)->link_down = link_down;
> >>>> }
> >>>>
> >>>> + n->announce = VIRTIO_NET_ANNOUNCE_ROUNDS;
> >>> Well if virtio_net_handle_announce runs now it will start timer
> >>> in the past, right?
> >>> This seems wrong.
> >> Not sure I get the case. When in virtio_net_load() the vm is not even
> >> running so looks like virtio_net_handle_announce() could not run in the
> >> same time.
> > I see, this works because you decrement it when VM starts running.
> > I think it's not a good idea to rely on this though,
> > better do everything from timer, and handle
> > the case of command arriving too early.
> >
>
> Right, if QEMU_CLOCK_VIRTUAL is fine, we can do everything in a timer.
>
> For the case of command arriving too early. Is there anything else we
> need to do? Since we only start the next timer when we get ack command.
>
> Thanks
I think we need to make sure we don't set the timer in the past or
very far in the future.
- [Qemu-devel] [PATCH] virtio-net: announce self by guest, Jason Wang, 2014/05/15
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Michael S. Tsirkin, 2014/05/15
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Jason Wang, 2014/05/15
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Michael S. Tsirkin, 2014/05/15
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Jason Wang, 2014/05/16
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest,
Michael S. Tsirkin <=
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Jason Wang, 2014/05/19
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Michael S. Tsirkin, 2014/05/19
- Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest, Jason Wang, 2014/05/20