Re: [Qemu-devel] [PATCH for-2.6 2/2] block/gluster: prevent data loss af
From: Niels de Vos
Subject: Re: [Qemu-devel] [PATCH for-2.6 2/2] block/gluster: prevent data loss after i/o error
Date: Wed, 6 Apr 2016 16:47:32 +0200
User-agent: Mutt/1.5.24 (2015-08-30)

On Wed, Apr 06, 2016 at 08:44:18AM -0400, Jeff Cody wrote:
> On Wed, Apr 06, 2016 at 01:02:16PM +0200, Kevin Wolf wrote:
> > [ Adding some CCs ]
> >
> > On 06.04.2016 at 05:29, Jeff Cody wrote:
> > > Upon receiving an I/O error after an fsync, by default gluster will
> > > dump its cache. However, QEMU will retry the fsync, which is especially
> > > useful when encountering errors such as ENOSPC when using the werror=stop
> > > option. When using caching with gluster, however, the last written data
> > > will be lost upon encountering ENOSPC. Using the cache xlator option of
Use "write-behind xlator" instead of "cache xlator". There are different
caches in Gluster.
> > > 'resync-failed-syncs-after-fsync' should cause gluster to retain the
> > > cached data after a failed fsync, so that ENOSPC and other transient
> > > errors are recoverable.
> > >
> > > Signed-off-by: Jeff Cody <address@hidden>
> > > ---
> > > block/gluster.c | 27 +++++++++++++++++++++++++++
> > > configure | 8 ++++++++
> > > 2 files changed, 35 insertions(+)
> > >
> > > diff --git a/block/gluster.c b/block/gluster.c
> > > index 30a827e..b1cf71b 100644
> > > --- a/block/gluster.c
> > > +++ b/block/gluster.c
> > > @@ -330,6 +330,23 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
> > >          goto out;
> > >      }
> > >
> > > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > > +    /* Without this, if fsync fails for a recoverable reason (for instance,
> > > +     * ENOSPC), gluster will dump its cache, preventing retries. This means
> > > +     * almost certain data loss. Not all gluster versions support the
> > > +     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> > > +     * discover during runtime if it is supported (this api returns success for
> > > +     * unknown key/value pairs) */
> >
> > Honestly, this sucks. There is apparently no way to operate gluster so
> > we can safely recover after a failed fsync. "We hope everything is fine,
> > but depending on your gluster version, we may now corrupt your image"
> > isn't very good.
> >
> > We need to consider very carefully if this is good enough to go on after
> > an error. I'm currently leaning towards "no". That is, we should only
> > enable this after Gluster provides us a way to make sure that the option
> > is really set.
Unfortunately it is also possible to disable the write-behind xlator.
Setting the option would fail in that case too :-/ At the moment there
is no really useful way to detect whether write-behind is disabled (it
is enabled by default).
> > > +    ret = glfs_set_xlator_option (s->glfs, "*-write-behind",
> > > +                                  "resync-failed-syncs-after-fsync",
> > > +                                  "on");
> > > +    if (ret < 0) {
> > > +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > > +        ret = -errno;
> > > +        goto out;
> > > +    }
> > > +#endif
> >
> > We also need to consider the case without CONFIG_GLUSTERFS_XLATOR_OPT.
> > In this case (as well as theoretically in the case that the option
> > didn't take effect - if only we could know about it), a failed
> > glfs_fsync_async() is fatal and we need to stop operating on the image,
> > i.e. set bs->drv = NULL like when we detect corruption in qcow2 images.
> > The guest will see a broken disk that fails all I/O requests, but that's
> > better than corrupting data.
> >
>
> Gluster versions that don't support CONFIG_GLUSTERFS_XLATOR_OPT will
> also not have the gluster patch that removes the file descriptor
> invalidation upon error (unless that was a relatively new
> bug/feature). But if that is the case, every operation on the file
> descriptor in those versions will return error. But it is also rather
> old versions that don't support glfs_set_xlator_option() I believe.
Indeed, glfs_set_xlator_option() was introduced with glusterfs-3.4.0. We
are currently on glusterfs-3.7, with 3.5 as the oldest supported
version. In ~2 months we hopefully have a 3.8 release, and that will
cause the end-of-life of 3.5. 3.4 has been EOL for about a year now.
Hopefully all our users have upgraded, but we know that some users will
stay on unsupported versions for a long time...
However, the "resync-failed-syncs-after-fsync" option was only
introduced recently, with glusterfs-3.7.9. You could detect this with
pkg-config glusterfs-api >= 7.3.7.9 if need be.
More details about the problem the option addresses are in the commit
message on http://review.gluster.org/13057 .
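
To make the combined idea concrete (gate on the library version at build
time, and treat a failed fsync as fatal when the option cannot be relied
on), here is a rough, self-contained sketch. Everything in it is
hypothetical: SketchGlusterState and the sketch_* helpers are made-up
names, not QEMU or libgfapi code, and the "disabled" flag only stands in
for the bs->drv = NULL idea mentioned above. It also assumes that the
version check plus a successful glfs_set_xlator_option() call is good
enough, which does not cover the case where write-behind is disabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; NOT QEMU's real structure. */
typedef struct SketchGlusterState {
    bool lib_has_resync_opt; /* assumed: build saw glusterfs-api >= 7.3.7.9 */
    bool option_call_ok;     /* glfs_set_xlator_option() returned 0 at open */
    bool disabled;           /* plays the role of setting bs->drv = NULL */
} SketchGlusterState;

/* A retry after a failed fsync is only considered safe when the library is
 * new enough to know the option AND the call to set it reported success. */
static bool sketch_fsync_retry_is_safe(const SketchGlusterState *s)
{
    assert(s != NULL);
    return s->lib_has_resync_opt && s->option_call_ok;
}

/* ret/err mimic the result of an fsync completion: ret == 0 on success,
 * otherwise err holds an errno value such as ENOSPC.
 * Returns 0 or a negative errno; marks the image as disabled when the
 * cached data may already have been dropped. */
static int sketch_handle_fsync_result(SketchGlusterState *s, int ret, int err)
{
    assert(s != NULL);
    if (ret == 0) {
        return 0;
    }
    if (!sketch_fsync_retry_is_safe(s)) {
        s->disabled = true; /* stop all further I/O instead of retrying */
    }
    return -err;
}
```

With lib_has_resync_opt false, an ENOSPC from fsync marks the state as
disabled; with both flags true the error is still returned, but the
caller is free to retry (e.g. under werror=stop).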
HTH,
Niels