[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK
From: |
Eugenio Perez Martin |
Subject: |
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK |
Date: |
Mon, 16 Jan 2023 17:16:24 +0100 |
On Mon, Jan 16, 2023 at 7:37 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2023/1/13 16:19, Eugenio Perez Martin 写道:
> > On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>> To restore the device at the destination of a live migration we send the
> >>> commands through control virtqueue. For a device to read CVQ it must
> >>> have received the DRIVER_OK status bit.
> >> This probably requires the support from the parent driver and requires
> >> some changes or fixes in the parent driver.
> >>
> >> Some drivers did:
> >>
> >> parent_set_status():
> >> if (DRIVER_OK)
> >> if (queue_enable)
> >> write queue_enable to the device
> >>
> >> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
> >>
> > I don't get your point here. No device should start reading CVQ (or
> > any other VQ) without having received DRIVER_OK.
>
>
> If I understand the code correctly:
>
> For CVQ, we do SET_VRING_ENABLE before DRIVER_OK, that's fine.
>
> For datapath_vq, we do SET_VRING_ENABLE after DRIVER_OK, this requires
> parent driver support (explained above)
>
>
> >
> > Some parent drivers do not support sending the queue enable command
> > after DRIVER_OK, usually because they clean part of the state like the
> > set by set_vring_base. Even vdpa_net_sim needs fixes here.
>
>
> Yes, so the question is:
>
> Do we need another backend feature for this? (otherwise thing may break
> silently)
>
>
> >
> > But my understanding is that it should be supported so I consider it a
> > bug.
>
>
> Probably, we need fine some proof in the spec, e.g in 3.1.1:
>
> """
>
> 7.Perform device-specific setup, including discovery of virtqueues for
> the device, optional per-bus setup, reading and possibly writing the
> device’s virtio configuration space, and population of virtqueues.
> 8.Set the DRIVER_OK status bit. At this point the device is “live”.
>
> """
>
> So if my understanding is correct, "discovery of virtqueues for the
> device" implies queue_enable here which is expected to be done before
> DRIVER_OK. But it doesn't say anything regrading to the behaviour of
> setting queue ready after DRIVER_OK.
>
> I'm not sure it's a real bug or not, may Michael and comment on this.
>
Right, input on this topic would be really appreciated.
>
> > Especially after queue_reset patches. Is that what you mean?
>
>
> We haven't supported queue_reset yet in Qemu. But it allows to write 1
> to queue_enable after DRIVER_OK for sure.
>
I was not clear, I meant in the emulated device. I'm testing this
series with the proposal of _F_STATE.
>
> >
> >>> However this opens a window where the device could start receiving
> >>> packets in rx queue 0 before it receives the RSS configuration. To avoid
> >>> that, we will not send vring_enable until all configuration is used by
> >>> the device.
> >>>
> >>> As a first step, run vhost_set_vring_ready for all vhost_net backend after
> >>> all of them are started (with DRIVER_OK). This code should not affect
> >>> vdpa.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>> hw/net/vhost_net.c | 17 ++++++++++++-----
> >>> 1 file changed, 12 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>> index c4eecc6f36..3900599465 100644
> >>> --- a/hw/net/vhost_net.c
> >>> +++ b/hw/net/vhost_net.c
> >>> @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev,
> >>> NetClientState *ncs,
> >>> } else {
> >>> peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >>> }
> >>> + r = vhost_net_start_one(get_vhost_net(peer), dev);
> >>> + if (r < 0) {
> >>> + goto err_start;
> >>> + }
> >>> + }
> >>> +
> >>> + for (int j = 0; j < nvhosts; j++) {
> >>> + if (j < data_queue_pairs) {
> >>> + peer = qemu_get_peer(ncs, j);
> >>> + } else {
> >>> + peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >>> + }
> >> I fail to understand why we need to change the vhost_net layer? This
> >> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
> >> vhost_vdpa_dev_start()?
> >>
> > The vhost-net layer explicitly calls vhost_set_vring_enable before
> > vhost_dev_start, and this is exactly the behavior we want to avoid.
> > Even if we make changes to vhost_dev, this change is still needed.
>
>
> Note that the only user of vhost_set_vring_enable() is vhost-user where
> the semantic is different:
>
> It uses that to changes the number of active queues:
>
> static int peer_attach(VirtIONet *n, int index)
>
> if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> => vhost_set_vring_enable(nc->peer, 1);
> }
>
> This is not the semantic of vhost-vDPA that tries to be complaint with
> virtio-spec. So I'm not sure how it can help here.
>
Right, but previous changes use enable callback to delay the enable of
the datapath virtqueues. I'll try to fit the changes in
virtio/vhost-vdpa though.
Thanks!
>
> >
> > And we want to explicitly enable CVQ first, which "only" vhost_net
> > knows which is.
>
>
> This should be known by net/vhost-vdpa.c.
>
>
> > To perform that in vhost_vdpa_dev_start would require
> > quirks, involving one or more of:
> > * Ignore vq enable calls if the device is not the CVQ one. How to
> > signal what is the CVQ? Can we trust it will be the last one for all
> > kind of devices?
> > * Enable queues that do not belong to the last vhost_dev from the enable
> > call.
> > * Enable the rest of the queues from the last enable in reverse order.
> > * Intercalate the "net load" callback between enabling the last
> > vhost_vdpa device and enabling the rest of devices.
> > * Add an "enable priority" order?
>
>
> Haven't had time in thinking through, but it would be better if we can
> limit the changes in vhost-vdpa layer. E.g currently the
> VHOST_VDPA_SET_VRING_ENABLE is done at vhost_dev_start().
>
> Thanks
>
>
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>> if (peer->vring_enable) {
> >>> /* restore vring enable state */
> >>> @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev,
> >>> NetClientState *ncs,
> >>> goto err_start;
> >>> }
> >>> }
> >>> -
> >>> - r = vhost_net_start_one(get_vhost_net(peer), dev);
> >>> - if (r < 0) {
> >>> - goto err_start;
> >>> - }
> >>> }
> >>>
> >>> return 0;
> >>> --
> >>> 2.31.1
> >>>
>
- Re: [RFC v2 05/13] vdpa net: add migration blocker if cannot migrate cvq, (continued)
[RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Eugenio Pérez, 2023/01/12
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Jason Wang, 2023/01/16
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK,
Eugenio Perez Martin <=
Re: [RFC v2 06/13] vhost: delay set_vring_ready after DRIVER_OK, Jason Wang, 2023/01/17
[RFC v2 07/13] vdpa: delay set_vring_ready after DRIVER_OK, Eugenio Pérez, 2023/01/12
[RFC v2 08/13] vdpa: Negotiate _F_SUSPEND feature, Eugenio Pérez, 2023/01/12
[RFC v2 09/13] vdpa: add feature_log parameter to vhost_vdpa, Eugenio Pérez, 2023/01/12
[RFC v2 10/13] vdpa net: allow VHOST_F_LOG_ALL, Eugenio Pérez, 2023/01/12