Re: [PATCH 2/3] virtio-net: Only enable userland vq if using tap backend


From: Jason Wang
Subject: Re: [PATCH 2/3] virtio-net: Only enable userland vq if using tap backend
Date: Mon, 22 Nov 2021 14:30:04 +0800

On Mon, Nov 22, 2021 at 2:24 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Mon, Nov 22, 2021 at 3:39 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Nov 19, 2021 at 3:50 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Fri, Nov 19, 2021 at 3:44 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Thu, Nov 18, 2021 at 3:57 PM Eugenio Perez Martin
> > > > <eperezma@redhat.com> wrote:
> > > > >
> > > > > On Thu, Nov 18, 2021 at 6:06 AM Jason Wang <jasowang@redhat.com> 
> > > > > wrote:
> > > > > >
> > > > > > On Thu, Nov 18, 2021 at 3:29 AM Eugenio Pérez <eperezma@redhat.com> 
> > > > > > wrote:
> > > > > > >
> > > > > > > QEMU falls back on userland handlers even in the vhost-user and
> > > > > > > vhost-vdpa cases. These handlers assume a tap device can handle
> > > > > > > the packets.
> > > > > > >
> > > > > > > If a vdpa device fails to start, this can trigger a SIGSEGV. Do
> > > > > > > not fall back on the userland handlers unless it is actually
> > > > > > > possible to use them.
> > > > > >
> > > > > > It would be better to show the call trace here so we can see the
> > > > > > root cause.
> > > > > >
> > > > >
> > > > > Sure, I'll paste it here and include it in the next version:
> > > > > #1  0x000055955f696e92 in nc_sendv_compat (flags=<optimized out>, iovcnt=2, iov=0x7ffe73abe300, nc=0x7fcf22d6d010) at ../net/net.c:756
> > > > > #2  qemu_deliver_packet_iov (sender=<optimized out>, opaque=0x7fcf22d6d010, iovcnt=2, iov=0x7ffe73abe300, flags=<optimized out>) at ../net/net.c:784
> > > > > #3  qemu_deliver_packet_iov (sender=<optimized out>, flags=<optimized out>, iov=0x7ffe73abe300, iovcnt=2, opaque=0x7fcf22d6d010) at ../net/net.c:763
> > > > > #4  0x000055955f69a078 in qemu_net_queue_deliver_iov (iovcnt=2, iov=0x7ffe73abe300, flags=0, sender=0x5595631f5ac0, queue=0x559561c7baa0) at ../net/queue.c:179
> > > > > #5  qemu_net_queue_send_iov (queue=0x559561c7baa0, sender=0x5595631f5ac0, flags=flags@entry=0, iov=iov@entry=0x7ffe73abe300, iovcnt=iovcnt@entry=2, sent_cb=sent_cb@entry=0x55955f82ae60 <virtio_net_tx_complete>) at ../net/queue.c:246
> > > > > #6  0x000055955f697d43 in qemu_sendv_packet_async (sent_cb=0x55955f82ae60 <virtio_net_tx_complete>, iovcnt=2, iov=0x7ffe73abe300, sender=<optimized out>) at ../net/net.c:825
> > > > > #7  qemu_sendv_packet_async (sender=<optimized out>, iov=iov@entry=0x7ffe73abe300, iovcnt=iovcnt@entry=1662966768, sent_cb=sent_cb@entry=0x55955f82ae60 <virtio_net_tx_complete>) at ../net/net.c:794
> > > > > #8  0x000055955f82aba9 in virtio_net_flush_tx (q=0x0, q@entry=0x5595631edbf0) at ../hw/net/virtio-net.c:2577
> > > > > #9  0x000055955f82ade8 in virtio_net_tx_bh (opaque=0x5595631edbf0) at ../hw/net/virtio-net.c:2694
> > > > > #10 0x000055955f9e847d in aio_bh_call (bh=0x559561c7e590) at ../util/async.c:169
> > > > > #11 aio_bh_poll (ctx=ctx@entry=0x559561c81650) at ../util/async.c:169
> > > > > #12 0x000055955f9d6912 in aio_dispatch (ctx=0x559561c81650) at ../util/aio-posix.c:381
> > > > > #13 0x000055955f9e8322 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:311
> > > > > #14 0x00007fcf20a5495d in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
> > > > > #15 0x000055955f9f2fc0 in glib_pollfds_poll () at ../util/main-loop.c:232
> > > > > #16 os_host_main_loop_wait (timeout=<optimized out>) at ../util/main-loop.c:255
> > > > > #17 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:531
> > > > > #18 0x000055955f7eee49 in qemu_main_loop () at ../softmmu/runstate.c:726
> > > > > #19 0x000055955f6235c2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:50
> > > > >
> > > > > In nc_sendv_compat, nc->info is net_vhost_vdpa_info, so
> > > > > nc->info->receive is NULL.
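> > > > >
> > > > > For illustration, a minimal sketch of that failing shape (the names
> > > > > below are invented and the code is a deliberate simplification, not
> > > > > the literal net/net.c code):
> > > > >
> > > > >     #include <stddef.h>
> > > > >     #include <stdint.h>
> > > > >     #include <sys/types.h>
> > > > >
> > > > >     /* Stand-in for the relevant part of NetClientInfo. */
> > > > >     typedef struct InfoSketch {
> > > > >         ssize_t (*receive)(void *nc, const uint8_t *buf, size_t size);
> > > > >     } InfoSketch;
> > > > >
> > > > >     static ssize_t deliver_sketch(InfoSketch *info, void *nc,
> > > > >                                   const uint8_t *buf, size_t size)
> > > > >     {
> > > > >         /* For the vdpa backend .receive is left NULL, so this
> > > > >          * indirect call is the SIGSEGV seen in frame #1 above. */
> > > > >         return info->receive(nc, buf, size);
> > > > >     }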
> > > > >
> > > > > > >
> > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > ---
> > > > > > >  include/hw/virtio/virtio.h |  2 ++
> > > > > > >  hw/net/virtio-net.c        |  4 ++++
> > > > > > >  hw/virtio/virtio.c         | 21 +++++++++++++--------
> > > > > > >  3 files changed, 19 insertions(+), 8 deletions(-)
> > > > > > >
> > > > > > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > > > > > index 8bab9cfb75..1712ba0b4c 100644
> > > > > > > --- a/include/hw/virtio/virtio.h
> > > > > > > +++ b/include/hw/virtio/virtio.h
> > > > > > > @@ -105,6 +105,8 @@ struct VirtIODevice
> > > > > > >      VMChangeStateEntry *vmstate;
> > > > > > >      char *bus_name;
> > > > > > >      uint8_t device_endian;
> > > > > > > +    /* backend does not support userspace handler */
> > > > > > > +    bool disable_ioeventfd_handler;
> > > > > > >      bool use_guest_notifier_mask;
> > > > > > >      AddressSpace *dma_as;
> > > > > > >      QLIST_HEAD(, VirtQueue) *vector_queues;
> > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > index 004acf858f..8c5c4e5a9d 100644
> > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > @@ -3501,6 +3501,10 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > > > > >      nc = qemu_get_queue(n->nic);
> > > > > > >      nc->rxfilter_notify_enabled = 1;
> > > > > > >
> > > > > > > +    if (!nc->peer || nc->peer->info->type != NET_CLIENT_DRIVER_TAP) {
> > > > > > > +        /* Only tap can use userspace networking */
> > > > > > > +        vdev->disable_ioeventfd_handler = true;
> > > > > > > +    }
> > > > > > >      if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > > >          struct virtio_net_config netcfg = {};
> > > > > > >          memcpy(&netcfg.mac, &n->nic_conf.macaddr, ETH_ALEN);
> > > > > > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > > > > > index ea7c079fb0..1e04db6650 100644
> > > > > > > --- a/hw/virtio/virtio.c
> > > > > > > +++ b/hw/virtio/virtio.c
> > > > > > > @@ -3734,17 +3734,22 @@ static int virtio_device_start_ioeventfd_impl(VirtIODevice *vdev)
> > > > > > >              err = r;
> > > > > > >              goto assign_error;
> > > > > > >          }
> > > > > > > -        event_notifier_set_handler(&vq->host_notifier,
> > > > > > > -                                   virtio_queue_host_notifier_read);
> > > > > > > +
> > > > > > > +        if (!vdev->disable_ioeventfd_handler) {
> > > > > > > +            event_notifier_set_handler(&vq->host_notifier,
> > > > > > > +                                       virtio_queue_host_notifier_read);
> > > > > >
> > > > > > This is just about not responding to ioeventfd. Does this happen 
> > > > > > only
> > > > > > when ioeventfd is enabled? If yes, we probably need a consistent way
> > > > > > to deal with that. Would having a dummy receiver be simpler?
> > > > > >
> > > > >
> > > > > If you mean a NetClientInfo receiver, that would make QEMU actually
> > > > > read from the virtqueue, and I'm not sure that is the right behavior
> > > > > even for net devices. It seems much simpler for QEMU not to monitor
> > > > > virtqueue kicks at all, doesn't it?
> > > >
> > > > That does not look easy: the ioeventfd or vmexit monitoring is set up
> > > > by virtio-pci. As you've seen, even if you disable ioeventfd, the
> > > > notification can still come from the slow vmexit path.
> > > >
> > >
> > > Yes, but there are only two such paths
> > >
> > > > And virtio-pci is loosely coupled with its peer, which makes it even
> > > > more tricky to do that.
> > > >
> > >
> > > That's right, but vhost-user already does that for the guest notifier
> > > with use_guest_notifier_mask.
> >
> > I don't get this. Maybe you can point to the code? (I think we only
> > need to deal with kicks, not interrupts.)
> >
>
> Sorry for being unclear. What I meant is that I agree it's loosely
> coupled, but that coupling is already done for interrupts in the same
> way this patch does it for kicks.
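>
> As a rough illustration of that parallel (the names and layout below
> are invented here, not the exact virtio-pci code), the two flags have
> the same shape:
>
>     #include <stdbool.h>
>
>     typedef struct DevSketch {
>         bool use_guest_notifier_mask;   /* existing flag, consulted for interrupts */
>         bool disable_ioeventfd_handler; /* flag added by this patch, for kicks */
>     } DevSketch;
>
>     static void setup_notifiers_sketch(DevSketch *dev)
>     {
>         if (dev->use_guest_notifier_mask) {
>             /* transport consults the device before wiring up its own
>              * interrupt handling */
>         }
>         if (dev->disable_ioeventfd_handler) {
>             /* transport skips the userspace kick handler, as in this patch */
>         }
>     }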

Ok.

>
> > >
> > > I agree that introducing a dummy receiver is a much smaller change,
> > > but I still feel that QEMU can do nothing with that notification, so
> > > the right change is to disable it entirely.
> >
> > It depends on how we define "disabling".
> >
>
> I meant for QEMU to be totally unaware of the guest notification, since
> QEMU can do nothing about it. So it should not poll the ioeventfd, and
> it should not take a vmexit for it.

As discussed, that would be hard since the NIC (virtio-pci) is loosely
coupled with its peer. But maybe I'm wrong.

>
> > And more importantly, I think vhost-vDPA should deal with RARP the way
> > vhost-user does; otherwise we break migration for guests without
> > GUEST_ANNOUNCE. Any idea how to fix this?
> >
>
> But that's a related yet different topic, because the RARP is not sent
> in response to a guest kick: QEMU queues it directly.

My point is that, if we know we will support RARP, we'd better stick
with a .receive()-based method now.
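For illustration, a hypothetical sketch of such a .receive() method (the
function name and the idea of filtering on the RARP EtherType are
assumptions here, not existing QEMU code):

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/types.h>

    #define ETHERTYPE_RARP_SKETCH 0x8035 /* RARP EtherType */

    static ssize_t vdpa_receive_sketch(const uint8_t *buf, size_t size)
    {
        /* Consume frames queued from userspace (e.g. the self-announce
         * RARP sent after migration) instead of crashing on a NULL
         * callback. */
        if (size >= 14 &&
            ((buf[12] << 8) | buf[13]) == ETHERTYPE_RARP_SKETCH) {
            /* hand the announce to the device, e.g. via an SVQ-like path */
        }
        return size; /* report the frame as consumed either way */
    }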

>
> I think we can actually send the RARP by reusing some of the driver
> part of the SVQ code :).

Exactly, but then SVQ needs to keep running at the destination until
the last round of RARP is sent.

>
> > > However, it's not like it's
> > > going to be on the hot path anyway...
> > >
> > > > >
> > > > > net_vhost_user_info has a receive callback to handle the special
> > > > > case of reverse ARP. But I think vhost-user can't fall back to QEMU
> > > > > userspace networking at all.
> > > > >
> > > > > But the crash is still reproducible with ioeventfd=off, so I need to
> > > > > improve the patch either way.
> > > >
> > > > So I wonder if we can simply use the netclient's receive_disabled.
> > > >
> > >
> > > I missed that, but on a second review receive_disabled can be cleared
> > > if the code calls qemu_flush_or_purge_queued_packets.
> >
> > To tell the truth, I don't get why receive_disabled needs to be cleared
> > in that case.
> >
>
> I think it is an optimization: the backend signals that it cannot
> accept more packets. But I might be wrong, since what the backend
> actually signals is "didn't send all the packets", which is not exactly
> the same.

Yes, just a note: without RARP support, vhost-user sticks to a method
with receive_disabled only.

Thanks

>
> Thanks!
>
> > > To that point, a dummy receiver that always returns 0 would be
> > > better.
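> > >
> > > For illustration, a hypothetical sketch of that dummy receiver (the
> > > name is invented here; this is not a proposed patch):
> > >
> > >     #include <stddef.h>
> > >     #include <stdint.h>
> > >     #include <sys/types.h>
> > >
> > >     static ssize_t net_dummy_receive_sketch(void *opaque,
> > >                                             const uint8_t *buf,
> > >                                             size_t size)
> > >     {
> > >         /* Always return 0, as suggested above, so the generic path
> > >          * never dereferences a NULL callback; whether returning 0
> > >          * also interacts with receive_disabled is the open question
> > >          * in this thread. */
> > >         (void)opaque;
> > >         (void)buf;
> > >         (void)size;
> > >         return 0;
> > >     }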
> >
> > That's fine.
> >
> > Thanks
> >
> > >
> > > Thanks!
> > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks!
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > > +        }
> > > > > > >      }
> > > > > > >
> > > > > > > -    for (n = 0; n < VIRTIO_QUEUE_MAX; n++) {
> > > > > > > -        /* Kick right away to begin processing requests already in vring */
> > > > > > > -        VirtQueue *vq = &vdev->vq[n];
> > > > > > > -        if (!vq->vring.num) {
> > > > > > > -            continue;
> > > > > > > +    if (!vdev->disable_ioeventfd_handler) {
> > > > > > > +        for (n = 0; n < VIRTIO_QUEUE_MAX; n++) {
> > > > > > > +            /* Kick right away to begin processing requests already in vring */
> > > > > > > +            VirtQueue *vq = &vdev->vq[n];
> > > > > > > +            if (!vq->vring.num) {
> > > > > > > +                continue;
> > > > > > > +            }
> > > > > > > +            event_notifier_set(&vq->host_notifier);
> > > > > > >          }
> > > > > > > -        event_notifier_set(&vq->host_notifier);
> > > > > > >      }
> > > > > > >      memory_region_transaction_commit();
> > > > > > >      return 0;
> > > > > > > --
> > > > > > > 2.27.0
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>