From: Eric Blake
Subject: Re: [PATCH v3 3/3] nbd/server: Allow MULTI_CONN for shared writable exports
Date: Wed, 27 Apr 2022 16:39:07 -0500
User-agent: NeoMutt/20220415-26-c08bba

On Wed, Apr 27, 2022 at 05:52:09PM +0200, Kevin Wolf wrote:
> Am 14.03.2022 um 21:38 hat Eric Blake geschrieben:
> > According to the NBD spec, a server that advertises
> > NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
> > not see any cache inconsistencies: when properly separated by a single
> > flush, actions performed by one client will be visible to another
> > client, regardless of which client did the flush.  We satisfy these
> > conditions in qemu when our block layer is backed by the local
> > filesystem (by virtue of the semantics of fdatasync(), and the fact
> > that qemu itself is not buffering writes beyond flushes).  It is
> > harder to state whether we satisfy these conditions for network-based
> > protocols, so the safest course of action is to allow users to opt-in
> > to advertising multi-conn.
> 
> Do you have an example of how this could be unsafe?

Nothing direct.  I tried to turn this on unconditionally in an earlier
version, and we waffled about whether we could prove that network
block backends (such as gluster) provide us the safety that the NBD
spec demands:

https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html
https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html

> 
> As I understand it, the NBD server has a single BlockBackend and
> therefore is a single client for the backend, be it file-posix or any
> network-based protocol. It doesn't really make a difference for the
> storage from how many different NBD clients the requests are coming.
> 
> I would have expected that cache coherency of the protocol level driver
> would only matter if you had two QEMU processes accessing the same file
> concurrently.

Or a multi-pathed connection to network storage, where one QEMU
process accesses the network device but its accesses may round-robin
among several servers, and where caching at an individual server may
be inconsistent with what another server sees unless a flush forces
the multi-path views to synchronize.
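The guarantee the spec asks for is cross-connection: once any one
client flushes, writes completed on another connection must be
visible everywhere.  Roughly, in libnbd's C API terms (untested
sketch; the URI, offset, and buffer size are placeholders, and the
export has to be at least that large):

/* Sketch: two connections to the same writable export.  If the server
 * advertised NBD_FLAG_CAN_MULTI_CONN, the read on conn2 must see the
 * write done on conn1 once either connection has flushed. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libnbd.h>

int main(void)
{
    const char *uri = "nbd://localhost:10809/export"; /* placeholder */
    char wbuf[512], rbuf[512];
    struct nbd_handle *conn1 = nbd_create();
    struct nbd_handle *conn2 = nbd_create();

    if (!conn1 || !conn2 ||
        nbd_connect_uri(conn1, uri) == -1 ||
        nbd_connect_uri(conn2, uri) == -1) {
        fprintf(stderr, "%s\n", nbd_get_error());
        exit(EXIT_FAILURE);
    }

    /* The pattern is only promised to be safe if the export
     * advertises multi-conn. */
    if (nbd_can_multi_conn(conn1) != 1) {
        fprintf(stderr, "export does not advertise MULTI_CONN\n");
        exit(EXIT_FAILURE);
    }

    memset(wbuf, 0xcd, sizeof wbuf);
    if (nbd_pwrite(conn1, wbuf, sizeof wbuf, 0, 0) == -1 ||
        nbd_flush(conn1, 0) == -1 ||   /* flush on either connection */
        nbd_pread(conn2, rbuf, sizeof rbuf, 0, 0) == -1) {
        fprintf(stderr, "%s\n", nbd_get_error());
        exit(EXIT_FAILURE);
    }

    /* Per the spec, rbuf must now match wbuf. */
    printf("consistent: %s\n",
           memcmp(wbuf, rbuf, sizeof wbuf) == 0 ? "yes" : "no");

    nbd_shutdown(conn1, 0);
    nbd_shutdown(conn2, 0);
    nbd_close(conn1);
    nbd_close(conn2);
    return 0;
}

Whether every protocol driver behind the server's single BlockBackend
really gives us that is the part I could not prove in the earlier
threads.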

> 
> In fact, I don't think we even need the flush restriction from the NBD
> spec. All clients see the same state (that of the NBD server
> BlockBackend) even without anyone issuing any flush. The flush is only
> needed to make sure that cached data is written to the backing storage
> when writeback caches are involved.
> 
> Please correct me if I'm misunderstanding something here.

Likewise, correct me if I'm being overly cautious.

I can certainly write a simpler v4 that just always advertises
MULTI_CONN if we allow more than one client, without any knob to
override it; it's just that it is harder to write a commit message
justifying why I think it is safe to do so.

> 
> > We may later tweak defaults to advertise
> > by default when the block layer can confirm that the underlying
> > protocol driver is cache consistent between multiple writers, but for
> > now, this at least allows savvy users (such as virt-v2v or nbdcopy) to
> > explicitly start qemu-nbd (new -m command-line option) or
> > qemu-storage-daemon (new qapi field 'multi-conn') with multi-conn
> > advertisement in a known-safe setup where the client end can then
> > benefit from parallel clients.
> > 
> > Note, however, that we don't want to advertise MULTI_CONN when we know
> > that a second client cannot connect (for historical reasons, qemu-nbd
> > defaults to a single connection while nbd-server-add and QMP commands
> > default to unlimited connections; but we already have existing means
> > to let either style of NBD server creation alter those defaults).  The
> > harder part of this patch is setting up an iotest to demonstrate
> > behavior of multiple NBD clients to a single server.  It might be
> > possible with parallel qemu-io processes, but I found it easier to do
> > in python with the help of libnbd, and help from Nir and Vladimir in
> > writing the test.
> > 
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> > Suggested-by: Nir Soffer <nsoffer@redhat.com>
> > Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@ya.ru>
> 
> > @@ -709,6 +714,17 @@ int main(int argc, char **argv)
> >                  exit(EXIT_FAILURE);
> >              }
> >              break;
> > +        case 'm':
> > +        {
> > +            Error *err = NULL;
> > +            multi_conn = qapi_enum_parse(&OnOffAuto_lookup, optarg,
> > +                                         ON_OFF_AUTO_AUTO, &err);
> > +            if (err) {
> > +                error_report_err(err);
> > +                exit(EXIT_FAILURE);
> > +            }
> 
> I think this is the same as passing &error_fatal.

Yes, sounds right.
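
Untested, but with &error_fatal the whole case should shrink to
something like:

        case 'm':
            /* &error_fatal makes qapi_enum_parse() report the error and
             * exit(1) on failure, so the local Error * and the explicit
             * error_report_err()/exit() are no longer needed. */
            multi_conn = qapi_enum_parse(&OnOffAuto_lookup, optarg,
                                         ON_OFF_AUTO_AUTO, &error_fatal);
            break;

and the inner braces can go away along with the local variable.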

> 
> > +            break;
> > +        }
> >          case 'f':
> >              fmt = optarg;
> >              break;
> > diff --git a/tests/qemu-iotests/tests/nbd-multiconn b/tests/qemu-iotests/tests/nbd-multiconn
> > new file mode 100755
> > index 000000000000..7d1179b33b05
> > --- /dev/null
> > +++ b/tests/qemu-iotests/tests/nbd-multiconn
> > @@ -0,0 +1,157 @@
> > +#!/usr/bin/env python3

> > +
> > +import os
> > +from contextlib import contextmanager
> > +import iotests
> > +from iotests import qemu_img_create, qemu_io_silent
> 
> qemu_io_silent() doesn't exist any more, commit 72cfb937 removed it.

Sounds like I need to rebase and post v4 anyways, then.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org