From: Dave Chinner
Subject: Re: [Qemu-block] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)
Date: Fri, 22 Jul 2016 18:58:57 +1000
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Jul 21, 2016 at 10:23:48AM -0400, Paolo Bonzini wrote:
> > > 1) avoid copying zero data, to keep the copy process efficient.  For this,
> > > SEEK_HOLE/SEEK_DATA are enough.
> > > 
> > > 2) copy file contents while preserving the allocation state of the file's
> > > extents.
> > 
> > Which is /very difficult/ to do safely and reliably.
> > i.e. the use of fiemap to duplicate the exact layout of a file
> > from userspace is only possible if you can /guarantee/ the source
> > file has not changed in any way during the copy operation at the
> > point in time you finalise the destination data copy.
> We don't do exactly that, exactly because it's messy when you have
> concurrent accesses (which shouldn't be done but you never know).

Which means you *cannot make the assumption it won't happen*.

FIEMAP is not guaranteed to tell you exactly where the data you need
to copy is in the file, and nothing you can do from userspace
changes that. I can't say it any clearer than that.

> When
> doing a copy, we use(d to use) FIEMAP the same way as you'd use lseek,
> querying one extent at a time.  If you proceed this way, all of these
> can cause the same races:
> - pread(ofs=10MB, len=10MB) returns all zeroes, so the 10MB..20MB is
> not copied
> - pread(ofs=10MB, len=10MB) returns non-zero data, so the 10MB..20MB is
> copied
> - lseek(SEEK_DATA, 10MB) returns 20MB, so the 10MB..20MB area is not
> copied
> - lseek(SEEK_HOLE, 10MB) returns 20MB, so the 10MB..20MB area is
> copied
> - ioctl(FIEMAP at 10MB) returns an extent starting at 20MB, so
> the 10MB..20MB area is not copied

No, FIEMAP is not guaranteed to behave like this. What is returned
is filesystem dependent. Filesystems that don't support holes will
return data extents. Filesystems that support compression might
return a compressed data extent rather than a hole. Encrypted files
might not expose holes at all, so people can't easily find known
plain text regions in the encrypted data. Filesystems could report
holes as deduplicated data, etc.  What do you do when FIEMAP returns
"OFFLINE" to indicate that the data is located elsewhere and will
need to be retrieved by the HSM operating on top of the filesystem
before layout can be determined?

All of the above are *valid* and *correct*, because the filesystem
defines what FIEMAP returns for a given file offset. Just because
ext4 and XFS have mostly the same behaviour, it doesn't mean that
every other filesystem behaves the same way.

The assumptions being made about FIEMAP behaviour will only lead to
user data corruption, as they already have several times in the past.
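
For case 1) quoted above (skipping zero data), a minimal sketch of a
copy loop built on lseek(SEEK_DATA)/lseek(SEEK_HOLE) rather than
FIEMAP could look like the following. It assumes Linux and Python 3;
the name copy_sparse and the 1MB chunk size are illustrative and do
not come from QEMU or any other tool in this thread. It makes no
attempt to preserve allocation state, and it is only correct if the
source file is not modified while the copy runs.

```python
import errno
import os


def copy_sparse(src_path, dst_path, chunk=1 << 20):
    """Copy only the ranges that lseek() reports as data.

    Hypothetical helper for illustration. Returns the number of bytes
    actually written to the destination.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        sfd, dfd = src.fileno(), dst.fileno()
        size = os.fstat(sfd).st_size
        offset = 0
        while offset < size:
            try:
                # Next offset >= 'offset' that the filesystem reports as data.
                data_start = os.lseek(sfd, offset, os.SEEK_DATA)
                # End of that data range (EOF counts as an implicit hole).
                data_end = os.lseek(sfd, data_start, os.SEEK_HOLE)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # no more data between 'offset' and EOF
                raise
            pos = data_start
            while pos < data_end:
                buf = os.pread(sfd, min(chunk, data_end - pos), pos)
                if not buf:
                    break  # source shrank underneath us; stop this range
                # pwrite() may be short; re-read from the new position.
                written = os.pwrite(dfd, buf, pos)
                pos += written
                copied += written
            offset = data_end
        # Give the destination the same length, leaving any tail as a hole.
        os.ftruncate(dfd, size)
    return copied
```

A filesystem that does not track holes will simply report the whole
file as one data range, so this degrades to a plain copy instead of
skipping data it should have copied.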


Dave Chinner