qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 2/3] io: Add zerocopy and errqueue


From: Daniel P . Berrangé
Subject: Re: [PATCH v1 2/3] io: Add zerocopy and errqueue
Date: Wed, 8 Sep 2021 22:57:06 +0100
User-agent: Mutt/2.0.7 (2021-05-04)

On Wed, Sep 08, 2021 at 05:09:33PM -0400, Peter Xu wrote:
> On Wed, Sep 08, 2021 at 05:25:50PM -0300, Leonardo Bras Soares Passos wrote:
> > On Tue, Sep 7, 2021 at 8:06 AM Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > > > Possibly, yes. This really need David G's input since he understands
> > > > the code in way more detail than me.
> > >
> > > Hmm I'm not entirely sure why we have the sync after each iteration;
> > > the case I can think of is if we're doing async sending, we could have
> > > two versions of the same page in flight (one from each iteration) -
> > > you'd want those to get there in the right order.
> > >
> > > Dave
> > 
> > Well, that's the thing: as we don't copy the buffer in MSG_ZEROCOPY,  we 
> > will in
> > fact have the same page in flight twice, instead of two versions,
> > given the buffer is
> > sent as it is during transmission.
> 
> That's an interesting point, which looks even valid... :)
> 
> There can still be two versions depending on when the page is read and feed to
> the NICs as the page can be changing during the window, but as long as the
> latter sent page always lands later than the former page on dest node then it
> looks ok.

The really strange scenario is around TCP re-transmits. The kernel may
transmit page version 1, then we have version 2 written. The kerenl now
needs to retransmit a packet due to network loss. The re-transmission will
contain version 2 payload which differs from the originally lost packet's
payload.

IOW, the supposed "reliable" TCP stream is no longer reliable and actually
changes its data retrospectively because we've intentionally violated the
rules the kernel documented for use of MSG_ZEROCOPY.

We think we're probably ok with migration as we are going to rely on the
face that we eventually pause the guest to stop page changes during the
final switchover. None the less I really strongly dislike the idea of
not honouring the kernel API contract, despite the potential performance
benefits it brings.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




reply via email to

[Prev in Thread] Current Thread [Next in Thread]