qemu-block

RE: fdatasync semantics and block device backup


From: Bryan S Rosenburg
Subject: RE: fdatasync semantics and block device backup
Date: Tue, 28 Apr 2020 09:58:33 -0400

Kevin Wolf <address@hidden> wrote on 04/28/2020 07:11:24 AM:
>
> On 27.04.2020 at 21:49, Bryan S Rosenburg wrote:
> > Blockdev community,
> >
> > Our group would like to write block device backups directly to an object
> > store, using an interface such as s3fs or rclone-mount. We've run into
> > problems with both interfaces, and in both cases the problems revolve
> > around fdatasync system calls. With s3fs, fdatasync calls are painfully
> > slow. With rclone-mount, the calls are very fast but don't do anything.
> >
> > Syncing files to an object store is inherently problematic, as a proper
> > sync requires finalizing the object that holds the file. After
> > finalization, additional writes to the file require a new object to be
> > created and the old object to be copied and destroyed. This process
> > results in an N-squared performance problem for files that are synced
> > periodically as they are written, as is the case for qemu backups.
> >
> > Empirically, s3fs implements fdatasync, and hence backups written to s3fs
> > take an untenably long time. I can provide data and straces, if needed.
> >
> > Backups written to rclone-mount are much faster, but there are obvious
> > semantic problems. The backup job completes successfully before the file
> > is actually stable in the object store. And in fact, a lot of the work of
> > finalizing the file occurs during the "close" system call that is invoked
> > as part of the qmp_blockdev_del operation. The syscall causes that
> > operation to take so long that other commands time out waiting to "acquire
> > state change lock (held by monitor qemuProcessEventHandler)".
> >
> > My questions for the group are: Has anyone else tried writing backups to
> > file systems that don't have good support for fdatasync, and do you have
> > any advice other than "Don't do that." ?
>
> I think "don't do that" is a good answer actually.
>
> You may want to put an NBD indirection between QEMU and your object
> store, so that the close() syscall will just block a qemu-nbd process
> that has already closed its connection to QEMU instead of blocking all
> of QEMU.
>
> It is possible to disable fdatasync() by specifying cache=unsafe for
> the block device, so you could avoid the penalty of repeated syncs on
> s3fs.
>
> Of course, if s3fs requires an fsync before data is actually stable, in
> this case you couldn't consider your backup completed when the backup
> block job finishes successfully, but you would have to issue an fsync
> manually and wait for its result before you can consider the backup
> successful.
>
> Kevin
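
The N-squared cost described in the quoted message can be illustrated with a toy cost model. This is only a sketch under a stated assumption: every sync finalizes the object, and the next write after a sync copies everything written so far into a new object. The function name `chunks_recopied` and its parameters are hypothetical, and real s3fs behavior may differ.

```python
def chunks_recopied(total_chunks: int, sync_every: int) -> int:
    """Count chunks re-copied while writing a file that is synced periodically.

    Assumption (not measured): each sync finalizes the object, so the first
    write after a sync must copy all previously written chunks.
    """
    copied = 0
    written = 0
    while written < total_chunks:
        # Resuming writes after a finalize copies everything written so far.
        copied += written
        written += min(sync_every, total_chunks - written)
    return copied


if __name__ == "__main__":
    # Doubling the file size roughly quadruples the copy work.
    for size in (100, 200, 400):
        print(size, chunks_recopied(size, sync_every=10))
```

Under this model, a file that is never synced until the end incurs no re-copying, which is the effect cache=unsafe aims for.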


Thanks, Kevin.

It sounds like we should specify cache=unsafe, at least when using rclone-mount, so that QEMU does not assume the file system implements fdatasync when it does not.
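
The manual fsync that Kevin mentions could be issued along the following lines once the backup job has finished. This is a minimal sketch, not a tested procedure: the helper name `fsync_path` and the example path are hypothetical, it assumes a Linux host, and whether an fsync on a freshly opened descriptor actually makes the data stable depends on the FUSE file system (per the discussion above, rclone-mount may return immediately without doing anything).

```python
import os


def fsync_path(path: str) -> None:
    """Open an existing file and fsync it, raising OSError on failure."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Blocks until the file system reports the data as stable,
        # or raises OSError if the sync fails.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical backup target on the mounted object store.
    fsync_path("/mnt/objstore/backup.img")
```

The backup would be treated as successful only if this call returns without raising.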

- Bryan
