qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 6/7] multifd: Send header packet without flags if zero-cop


From: Leonardo Bras Soares Passos
Subject: Re: [PATCH v9 6/7] multifd: Send header packet without flags if zero-copy-send is enabled
Date: Tue, 26 Apr 2022 18:59:40 -0300

Hello Daniel, thank you for the feedback!

On Tue, Apr 26, 2022 at 5:11 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Mon, Apr 25, 2022 at 06:50:55PM -0300, Leonardo Bras wrote:
> > Since d48c3a0445 ("multifd: Use a single writev on the send side"),
> > sending the header packet and the memory pages happens in the same
> > writev, which can potentially make the migration faster.
> >
> > Using channel-socket as example, this works well with the default copying
> > mechanism of sendmsg(), but with zero-copy-send=true, it will cause
> > the migration to often break.
> >
> > This happens because the header packet buffer gets reused quite often,
> > and there is a high chance that by the time the MSG_ZEROCOPY mechanism get
> > to send the buffer, it has already changed, sending the wrong data and
> > causing the migration to abort.
> >
> > It means that, as it is, the buffer for the header packet is not suitable
> > for sending with MSG_ZEROCOPY.
> >
> > In order to enable zero copy for multifd, send the header packet on an
> > individual write(), without any flags, and the remanining pages with a
> > writev(), as it was happening before. This only changes how a migration
> > with zero-copy-send=true works, not changing any current behavior for
> > migrations with zero-copy-send=false.
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > ---
> >  migration/multifd.c | 29 ++++++++++++++++++++++++++---
> >  1 file changed, 26 insertions(+), 3 deletions(-)
> >
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 15fb668e64..6c940aaa98 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -639,6 +639,8 @@ static void *multifd_send_thread(void *opaque)
> >          if (p->pending_job) {
> >              uint64_t packet_num = p->packet_num;
> >              uint32_t flags = p->flags;
> > +            int iov_offset = 0;
> > +
>
> No need for this if you change:
>
> >              p->iovs_num = 1;
>
>    if (!migrate_use_zero_copy_send()) {
>       p->iovs_num = 1;
>    }
>

I understand the point now: setting p->iovs_num = 0 before
multifd_send_state->ops->send_prepare() causes p->iov[0] to be used for
pages instead of the header. I was not aware, so thanks for pointing that out!

But it's also necessary to have an else clause with p->iovs_num = 0, right?
It seems like the variable is not set anywhere else, and it would keep growing
after the second loop iteration, causing prepare() to access p->iov[]
outside bounds.

Am I missing something here?

>
> >              p->normal_num = 0;
> >
> > @@ -665,15 +667,36 @@ static void *multifd_send_thread(void *opaque)
> >              trace_multifd_send(p->id, packet_num, p->normal_num, flags,
> >                                 p->next_packet_size);
> >
> > -            p->iov[0].iov_len = p->packet_len;
> > -            p->iov[0].iov_base = p->packet;
> > +            if (migrate_use_zero_copy_send()) {
> > +                /* Send header without zerocopy */
> > +                ret = qio_channel_write_all(p->c, (void *)p->packet,
> > +                                            p->packet_len, &local_err);
> > +                if (ret != 0) {
> > +                    break;
> > +                }
> > +
> > +                if (!p->normal_num) {
> > +                    /* No pages will be sent */
> > +                    goto skip_send;
> > +                }
>
> Don't need this AFAIK, because the qio_channel_writev_all
> call will be a no-op if  iovs_num is zero
>

Oh, I see:
qio_channel_writev_all() will call qio_channel_writev_full_all() where
niov == 0 and thus nlocal_iov == 0, avoiding the loop that calls
qio_channel_writev_full().

I will remove that in v10


> >
> > -            ret = qio_channel_writev_all(p->c, p->iov, p->iovs_num,
> > +                /* Skip first iov : header */
> > +                iov_offset = 1;
>
> Don't need to set this

Agree, that makes sense since the offset part is discontinued.

>
> > +            } else {
> > +                /* Send header using the same writev call */
> > +                p->iov[0].iov_len = p->packet_len;
> > +                p->iov[0].iov_base = p->packet;
> > +            }
> > +
> > +            ret = qio_channel_writev_all(p->c, p->iov + iov_offset,
> > +                                         p->iovs_num - iov_offset,
> >                                           &local_err);
>
> This wouldn't need changing if we don't reserve iovs[0] when
> not required.

Agree.

>
> > +
> >              if (ret != 0) {
> >                  break;
> >              }
> >
> > +skip_send:
> >              qemu_mutex_lock(&p->mutex);
> >              p->pending_job--;
> >              qemu_mutex_unlock(&p->mutex);
> > --
> > 2.36.0
> >
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>

I will probably send a v10 shortly.

Thanks for reviewing!

Best regards,
Leo




reply via email to

[Prev in Thread] Current Thread [Next in Thread]