[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK
From: |
Eugenio Perez Martin |
Subject: |
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK |
Date: |
Fri, 13 Jan 2023 11:03:17 +0100 |
On Fri, Jan 13, 2023 at 10:51 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Fri, Jan 13, 2023 at 09:19:00AM +0100, Eugenio Perez Martin wrote:
> >On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >> >
> >> > To restore the device at the destination of a live migration we send the
> >> > commands through control virtqueue. For a device to read CVQ it must
> >> > have received the DRIVER_OK status bit.
> >>
> >> This probably requires the support from the parent driver and requires
> >> some changes or fixes in the parent driver.
> >>
> >> Some drivers did:
> >>
> >> parent_set_status():
> >> if (DRIVER_OK)
> >> if (queue_enable)
> >> write queue_enable to the device
> >>
> >> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
> >>
> >
> >I don't get your point here. No device should start reading CVQ (or
> >any other VQ) without having received DRIVER_OK.
> >
> >Some parent drivers do not support sending the queue enable command
> >after DRIVER_OK, usually because they clean part of the state like the
> >set by set_vring_base. Even vdpa_net_sim needs fixes here.
> >
> >But my understanding is that it should be supported so I consider it a
> >bug. Especially after queue_reset patches. Is that what you mean?
> >
> >> >
> >> > However this opens a window where the device could start receiving
> >> > packets in rx queue 0 before it receives the RSS configuration. To avoid
> >> > that, we will not send vring_enable until all configuration is used by
> >> > the device.
> >> >
> >> > As a first step, run vhost_set_vring_ready for all vhost_net backend
> >> > after
> >> > all of them are started (with DRIVER_OK). This code should not affect
> >> > vdpa.
> >> >
> >> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >> > ---
> >> > hw/net/vhost_net.c | 17 ++++++++++++-----
> >> > 1 file changed, 12 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >> > index c4eecc6f36..3900599465 100644
> >> > --- a/hw/net/vhost_net.c
> >> > +++ b/hw/net/vhost_net.c
> >> > @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev,
> >> > NetClientState *ncs,
> >> > } else {
> >> > peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >> > }
> >> > + r = vhost_net_start_one(get_vhost_net(peer), dev);
> >> > + if (r < 0) {
> >> > + goto err_start;
> >> > + }
> >> > + }
> >> > +
> >> > + for (int j = 0; j < nvhosts; j++) {
> >> > + if (j < data_queue_pairs) {
> >> > + peer = qemu_get_peer(ncs, j);
> >> > + } else {
> >> > + peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >> > + }
> >>
> >> I fail to understand why we need to change the vhost_net layer? This
> >> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
> >> vhost_vdpa_dev_start()?
> >>
> >
> >The vhost-net layer explicitly calls vhost_set_vring_enable before
> >vhost_dev_start, and this is exactly the behavior we want to avoid.
> >Even if we make changes to vhost_dev, this change is still needed.
>
> I'm working on something similar since I'd like to re-work the following
> commit we merged just before 7.2 release:
> 4daa5054c5 vhost: enable vrings in vhost_dev_start() for vhost-user
> devices
>
> vhost-net wasn't the only one who enabled vrings independently, but it
> was easy enough for others devices to avoid it and enable them in
> vhost_dev_start().
>
> Do you think can we avoid in some way this special behaviour of
> vhost-net and enable the vrings in vhost_dev_start?
>
Actually looking forward to it :). If that gets merged before this
series, I think we could drop this patch.
If I'm not wrong the enable/disable dance is used just by vhost-user
at the moment.
Maxime, could you give us some hints about the tests to use to check
that changes do not introduce regressions in vhost-user?
Thanks!
> Thanks,
> Stefano
>
> >
> >And we want to explicitly enable CVQ first, which "only" vhost_net
> >knows which is. To perform that in vhost_vdpa_dev_start would require
> >quirks, involving one or more of:
> >* Ignore vq enable calls if the device is not the CVQ one. How to
> >signal what is the CVQ? Can we trust it will be the last one for all
> >kind of devices?
> >* Enable queues that do not belong to the last vhost_dev from the enable
> >call.
> >* Enable the rest of the queues from the last enable in reverse order.
> >* Intercalate the "net load" callback between enabling the last
> >vhost_vdpa device and enabling the rest of devices.
> >* Add an "enable priority" order?
> >
> >Thanks!
> >
> >> Thanks
> >>
> >> >
> >> > if (peer->vring_enable) {
> >> > /* restore vring enable state */
> >> > @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev,
> >> > NetClientState *ncs,
> >> > goto err_start;
> >> > }
> >> > }
> >> > -
> >> > - r = vhost_net_start_one(get_vhost_net(peer), dev);
> >> > - if (r < 0) {
> >> > - goto err_start;
> >> > - }
> >> > }
> >> >
> >> > return 0;
> >> > --
> >> > 2.31.1
> >> >
> >>
> >
>
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, (continued)
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, Jason Wang, 2023/01/12
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, Eugenio Perez Martin, 2023/01/13
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, Jason Wang, 2023/01/15
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, Michael S. Tsirkin, 2023/01/16
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, Eugenio Perez Martin, 2023/01/16
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, Jason Wang, 2023/01/17
[RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Eugenio Pérez, 2023/01/12
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Jason Wang, 2023/01/16
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Eugenio Perez Martin, 2023/01/16
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Jason Wang, 2023/01/17
[RFC v2 07/13] vdpa: delay set_vring_ready after DRIVER_OK, Eugenio Pérez, 2023/01/12
[RFC v2 08/13] vdpa: Negotiate _F_SUSPEND feature, Eugenio Pérez, 2023/01/12