qemu-devel

Re: [Qemu-devel] [PATCH 1/2] add VirtIONet vhost_stopped flag to prevent


From: Dan Streetman
Subject: Re: [Qemu-devel] [PATCH 1/2] add VirtIONet vhost_stopped flag to prevent multiple stops
Date: Tue, 23 Apr 2019 04:49:57 -0400

On Mon, Apr 22, 2019 at 10:59 PM Jason Wang <address@hidden> wrote:
>
>
> On 2019/4/23 4:14 AM, Dan Streetman wrote:
> > On Sun, Apr 21, 2019 at 10:50 PM Jason Wang <address@hidden> wrote:
> >>
> >> On 2019/4/17 2:46 AM, Dan Streetman wrote:
> >>> From: Dan Streetman <address@hidden>
> >>>
> >>> Buglink: https://launchpad.net/bugs/1823458
> >>>
> >>> There is a race condition when using the vhost-user driver, between a guest
> >>> shutdown and the vhost-user interface being closed.  This is explained in
> >>> more detail at the bug link above; the short explanation is the vhost-user
> >>> device can be closed while the main thread is in the middle of stopping
> >>> the vhost_net.  In this case, the main thread handling shutdown will
> >>> enter virtio_net_vhost_status() and move into the n->vhost_started (else)
> >>> block, and call vhost_net_stop(); while it is running that function,
> >>> another thread is notified that the vhost-user device has been closed,
> >>> and (indirectly) calls into virtio_net_vhost_status() also.
> >>
> >> I think we need to figure out why there are multiple simultaneous
> >> vhost_net_stop() calls. E.g. vhost-user registers fd handlers like:
> >>
> >>           qemu_chr_fe_set_handlers(&s->chr, NULL, NULL,
> >>                                    net_vhost_user_event, NULL, nc0->name,
> >>                                    NULL, true);
> >>
> >> which uses the default main context, so it should be called only in
> >> the main thread.
> > net_vhost_user_event() schedules chr_closed_bh() to do its bottom half
> > work; does aio_bh_schedule_oneshot() execute its events from the main
> > thread?
>
>
> I think so if net_vhost_user_event() was called in main thread (it calls
> qemu_get_current_aio_context()).

ok, I'll check that, thanks!

I think my other patch, to remove the vhost_user_stop() call
completely from the net_vhost_user_event() handler for
CHR_EVENT_CLOSED, is still relevant; do you have thoughts on that?

>
>
> >
> > For reference, the call chain is:
> >
> > chr_closed_bh()
> >    qmp_set_link()
> >      nc->info->link_status_changed() -> virtio_net_set_link_status()
> >        virtio_net_set_status()
> >          virtio_net_vhost_status()
>
>
> The code was added by Marc in:
>
> commit e7c83a885f865128ae3cf1946f8cb538b63cbfba
> Author: Marc-André Lureau <address@hidden>
> Date:   Mon Feb 27 14:49:56 2017 +0400
>
>      vhost-user: delay vhost_user_stop
>
> Cc'ing him for more thoughts.
>
> Thanks
>
>
> >> Thanks
> >>
> >>
> >>>    Since the
> >>> vhost_net status hasn't yet changed, the second thread also enters
> >>> the n->vhost_started block, and also calls vhost_net_stop().  This
> >>> causes problems for the second thread when it tries to stop the network
> >>> that's already been stopped.
> >>>
> >>> This adds a flag to the struct that's atomically set to prevent more than
> >>> one thread from calling vhost_net_stop().  The atomic_fetch_inc() is likely
> >>> overkill and probably could be done with a simple check-and-set, but
> >>> since it's a race condition there would still be a (very, very) small
> >>> window without using an atomic to set it.
> >>>
> >>> Signed-off-by: Dan Streetman <address@hidden>
> >>> ---
> >>>    hw/net/virtio-net.c            | 3 ++-
> >>>    include/hw/virtio/virtio-net.h | 1 +
> >>>    2 files changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >>> index ffe0872fff..d36f50d5dd 100644
> >>> --- a/hw/net/virtio-net.c
> >>> +++ b/hw/net/virtio-net.c
> >>> @@ -13,6 +13,7 @@
> >>>
> >>>    #include "qemu/osdep.h"
> >>>    #include "qemu/iov.h"
> >>> +#include "qemu/atomic.h"
> >>>    #include "hw/virtio/virtio.h"
> >>>    #include "net/net.h"
> >>>    #include "net/checksum.h"
> >>> @@ -240,7 +241,7 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
> >>>                             "falling back on userspace virtio", -r);
> >>>                n->vhost_started = 0;
> >>>            }
> >>> -    } else {
> >>> +    } else if (atomic_fetch_inc(&n->vhost_stopped) == 0) {
> >>>            vhost_net_stop(vdev, n->nic->ncs, queues);
> >>>            n->vhost_started = 0;
> >>>        }
> >>> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> >>> index b96f0c643f..d03fd933d0 100644
> >>> --- a/include/hw/virtio/virtio-net.h
> >>> +++ b/include/hw/virtio/virtio-net.h
> >>> @@ -164,6 +164,7 @@ struct VirtIONet {
> >>>        uint8_t nouni;
> >>>        uint8_t nobcast;
> >>>        uint8_t vhost_started;
> >>> +    int vhost_stopped;
> >>>        struct {
> >>>            uint32_t in_use;
> >>>            uint32_t first_multi;
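
For illustration only, here is a minimal standalone sketch of the guard that
the quoted hunk adds: the first caller to bump the counter sees 0 and does the
stop, and any later caller sees a non-zero value and skips it.  This is not
QEMU code.  struct demo_net, demo_vhost_net_stop() and demo_stop_once() are
hypothetical names made up for this example, and it uses C11
atomic_fetch_add() from <stdatomic.h> as a stand-in for the
atomic_fetch_inc() helper used in the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for VirtIONet; only the fields discussed here. */
struct demo_net {
    uint8_t vhost_started;
    atomic_int vhost_stopped;   /* mirrors the new vhost_stopped field */
    int stop_calls;             /* counts how often the stop path really ran */
};

/* Hypothetical stand-in for vhost_net_stop(). */
static void demo_vhost_net_stop(struct demo_net *n)
{
    n->stop_calls++;
}

/*
 * Mirrors the "else if (atomic_fetch_inc(&n->vhost_stopped) == 0)" branch.
 * Returns 1 if this caller performed the stop, 0 if it was skipped because
 * an earlier caller had already claimed it.
 */
static int demo_stop_once(struct demo_net *n)
{
    /* atomic_fetch_add() returns the value held before the increment. */
    if (atomic_fetch_add(&n->vhost_stopped, 1) != 0) {
        return 0;
    }
    demo_vhost_net_stop(n);
    n->vhost_started = 0;
    return 1;
}
```

Calling demo_stop_once() twice on the same struct runs the stop path only
once.  Note that the sketch, like the quoted hunk, never resets the counter;
whether it needs to be cleared when vhost is started again is left open here.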


