Re: [PATCH 18/31] vhost: Shadow virtqueue buffers forwarding


From: Eugenio Perez Martin
Subject: Re: [PATCH 18/31] vhost: Shadow virtqueue buffers forwarding
Date: Tue, 22 Feb 2022 09:55:49 +0100

On Tue, Feb 22, 2022 at 8:26 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/2/21 下午4:15, Eugenio Perez Martin 写道:
> > On Mon, Feb 21, 2022 at 8:44 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> 在 2022/2/17 下午8:48, Eugenio Perez Martin 写道:
> >>> On Tue, Feb 8, 2022 at 9:16 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>> 在 2022/2/1 下午7:25, Eugenio Perez Martin 写道:
> >>>>> On Sun, Jan 30, 2022 at 7:47 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>> 在 2022/1/22 上午4:27, Eugenio Pérez 写道:
> >>>>>>> @@ -272,6 +590,28 @@ void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd)
> >>>>>>>      void vhost_svq_stop(VhostShadowVirtqueue *svq)
> >>>>>>>      {
> >>>>>>>          event_notifier_set_handler(&svq->svq_kick, NULL);
> >>>>>>> +    g_autofree VirtQueueElement *next_avail_elem = NULL;
> >>>>>>> +
> >>>>>>> +    if (!svq->vq) {
> >>>>>>> +        return;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    /* Send all pending used descriptors to guest */
> >>>>>>> +    vhost_svq_flush(svq, false);
> >>>>>> Do we need to wait for all the pending descriptors to be completed 
> >>>>>> here?
> >>>>>>
> >>>>> No, this function does not wait, it only completes the forwarding of
> >>>>> the *used* descriptors.
> >>>>>
> >>>>> The best example is the net rx queue in my opinion. This call will
> >>>>> check SVQ's vring used_idx and will forward the last used descriptors
> >>>>> if any, but all available descriptors will remain as available for
> >>>>> qemu's VQ code.
> >>>>>
> >>>>> To skip it would miss those last rx descriptors in migration.
> >>>>>
> >>>>> Thanks!
> >>>> So it's probably to not the best place to ask. It's more about the
> >>>> inflight descriptors so it should be TX instead of RX.
> >>>>
> >>>> I can imagine the migration last phase, we should stop the vhost-vDPA
> >>>> before calling vhost_svq_stop(). Then we should be fine regardless of
> >>>> inflight descriptors.
> >>>>
> >>> I think I'm still missing something here.
> >>>
> >>> To be on the same page. Regarding tx this could cause repeated tx
> >>> frames (one at source and other at destination), but never a missed
> >>> buffer not transmitted. The "stop before" could be interpreted as "SVQ
> >>> is not forwarding available buffers anymore". Would that work?
> >>
> >> Right, but this only work if
> >>
> >> 1) a flush to make sure TX DMA for inflight descriptors are all completed
> >>
> >> 2) just mark all inflight descriptor used
> >>
> > It currently trusts on the reverse: Buffers not marked as used (by the
> > device) will be available in the destination, so expect
> > retransmissions.
>
>
> I may miss something but I think we do migrate last_avail_idx. So there
> won't be a re-transmission, since we depend on qemu virtqueue code to
> deal with vring base?
>

On stop, vhost_virtqueue_stop calls vhost_vdpa_get_vring_base. In SVQ
mode, that returns last_used_idx. After that, the vhost.c code sets the
VirtQueue's last_avail_idx == last_used_idx, and that value is what
gets migrated, if I'm not wrong.
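
To illustrate the index hand-off I mean, here is a toy model (plain C,
not QEMU code; guest_avail_idx and svq_used_idx are made-up names just
for the example):

/* Toy model of the hand-off above: reporting the SVQ used index as the
 * vring base means every buffer the device has not marked as used
 * stays available for the destination to re-expose. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t guest_avail_idx = 10;  /* guest exposed 10 buffers */
    uint16_t svq_used_idx = 7;      /* device only used 7 of them */

    /* what gets stored as last_avail_idx and migrated in SVQ mode */
    uint16_t migrated_last_avail_idx = svq_used_idx;

    printf("buffers re-exposed at destination: %u\n",
           (uint16_t)(guest_avail_idx - migrated_last_avail_idx));
    return 0;
}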

vhost kernel migrates last_avail_idx, but it makes rx buffers
available on demand, unlike SVQ, so it does not need to unwind buffers
or anything like that. Because of how SVQ works with the rx queue,
that is not possible here: the destination would find no available
buffers for rx. And for tx you have already described the scenario.

In other words, we cannot treat SVQ as a vhost device in that regard:
SVQ aims for a total drain ("make all of the guest's buffers available
to the device ASAP"), whereas a vhost device can live with a lot of
available buffers and will consume them on demand. It's the same
problem as masking. So the difference in behavior is justified in my
opinion, and it can be improved in the future with vDPA in-flight
descriptor support.

If we restore the state that way in a virtio-net device, it will see
the available ones as expected, not as in-flight.

Another possibility is to transform all of these descriptors into
in-flight ones, but I feel it would create problems. Can we migrate
all rx queues as in-flight, with 0 bytes written? Is it worth it? I
haven't investigated that path much, but I think the emulated
virtio-net device does not support it at the moment. If I'm not wrong,
we would need to copy something like the body of
virtio_blk_load_device to go that route.

The current approach might be too net-centric, so let me know if this
behavior is unexpected or if we can do better otherwise.

Thanks!

> Thanks
>
>
> >
> > Thanks!
> >
> >> Otherwise there could be buffers that is inflight forever.
> >>
> >> Thanks
> >>
> >>
> >>> Thanks!
> >>>
> >>>> Thanks
> >>>>
> >>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>>
> >>>>>>> +
> >>>>>>> +    for (unsigned i = 0; i < svq->vring.num; ++i) {
> >>>>>>> +        g_autofree VirtQueueElement *elem = NULL;
> >>>>>>> +        elem = g_steal_pointer(&svq->ring_id_maps[i]);
> >>>>>>> +        if (elem) {
> >>>>>>> +            virtqueue_detach_element(svq->vq, elem, elem->len);
> >>>>>>> +        }
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    next_avail_elem = g_steal_pointer(&svq->next_guest_avail_elem);
> >>>>>>> +    if (next_avail_elem) {
> >>>>>>> +        virtqueue_detach_element(svq->vq, next_avail_elem,
> >>>>>>> +                                 next_avail_elem->len);
> >>>>>>> +    }
> >>>>>>>      }
>



