[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] copy: adjust fiemap handling of sparse files
From: |
Jim Meyering |
Subject: |
Re: [PATCH] copy: adjust fiemap handling of sparse files |
Date: |
Thu, 10 Feb 2011 11:53:38 +0100 |
Pádraig Brady wrote:
> On 09/02/11 23:35, Pádraig Brady wrote:
>>>> Subject: [PATCH] copy: adjust fiemap handling of sparse files
>>>>
>>>> Don't depend on heuristics to detect sparse files
>>>> if fiemap is available. Also don't introduce new
>>>> holes unless --sparse=always has been specified.
>
>>>> * src/copy.c (extent_copy): Pass the user specified
>>>> sparse mode, and handle as described above.
>>>> Also a redundant lseek has been suppressed when
>>>> there is no hole between two extents.
>>>
>>> Could that be done in two separate patches?
>
> Here is the first patch split out to suppress redundant lseek()s.
> It's expanded now to suppress lseek in the source as well as dest.
>
> $ fallocate -l $((1<<29)) /tmp/t.f
> $ strace -e _llseek cp-before /tmp/t.f /dev/null 2>&1 | wc -l
> 180
> $ strace -e _llseek cp /tmp/t.f /dev/null 2>&1 | wc -l
> 0
>
> Hmm, the above suggests to me that we could use the
> FIEMAP_EXTENT_UNWRITTEN flag, so as to skip the read()
> for these allocated but unwritten (zero) extents.
> We could treat such extents like holes, except we
> would only generate holes in the dest with SPARSE_ALWAYS.
> I'll do patch for that this evening.
>
> Anyway the following small reorg will also simplify
> possible future changes to introduce fallocate().
>
> Note the attached mostly changes indenting,
> with the significant change being just:
>
> $ git diff -w --no-ext-diff HEAD~
>
> diff --git a/src/copy.c b/src/copy.c
> index 9182c16..5b6ffe3 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -334,7 +334,10 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t
> buf_size,
> {
> off_t ext_start = scan.ext_info[i].ext_logical;
> uint64_t ext_len = scan.ext_info[i].ext_length;
> + uint64_t hole_size = ext_start - last_ext_start - last_ext_len;
>
> + if (hole_size)
> + {
> if (lseek (src_fd, ext_start, SEEK_SET) < 0)
> {
> error (0, errno, _("cannot lseek %s"), quote (src_name));
> @@ -356,11 +359,6 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t
> buf_size,
> /* When not inducing holes and when there is a hole between
> the end of the previous extent and the beginning of the
> current one, write zeros to the destination file. */
> - if (last_ext_start + last_ext_len < ext_start)
> - {
> - uint64_t hole_size = (ext_start
> - - last_ext_start
> - - last_ext_len);
> if (! write_zeros (dest_fd, hole_size))
> {
> error (0, errno, _("%s: write failed"), quote
> (dst_name));
Thanks! that does the job and passes the tests.
While using --sparse=always is a lot less common,
it looks like there's room for improvement there, too.
I.e., we should be able to eliminate all of these
lseek calls on the output descriptor there, too:
$ strace -e lseek cp --sparse=always /tmp/t.f /tmp/t2 2>&1|wc -l
16384