[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 4/8] add copy_file_range syscall support
From: |
Luis Henriques |
Subject: |
Re: [PATCH 4/8] add copy_file_range syscall support |
Date: |
Thu, 18 Mar 2021 14:57:58 +0000 |
On Fri, Mar 12, 2021 at 02:30:15AM +0100, David Disseldorp wrote:
> From: Luis Henriques <lhenriques@suse.de>
I guess you can drop the above as you've added quite a bit of work on top
of my original patch (which was just an RFC) ;-)
> This helper function uses the copy_file_range(2) Linux syscall for
> in-kernel transfer between two file descriptors. Copy-on-write
> enabled filesystems may optimize I/O by using reflinks.
>
> Signed-off-by: David Disseldorp <ddiss@suse.de>
> ---
> configure.ac | 6 ++++++
> src/extern.h | 4 ++++
> src/util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 60 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 0a35e4c..ed6e3d2 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -51,6 +51,12 @@ AC_COMPILE_CHECK_RETTYPE([minor], [0])
> AC_CHECK_FUNCS([fchmod fchown])
> # This is needed for mingw build
> AC_CHECK_FUNCS([setmode getpwuid getpwnam getgrgid getgrnam pipe fork getuid
> geteuid])
> +AC_CHECK_FUNC([copy_file_range],
> + [AC_CHECK_SIZEOF(off_t)
> + AC_CHECK_SIZEOF(loff_t)
> + if (test "$ac_cv_sizeof_off_t" -eq "$ac_cv_sizeof_loff_t");
> then
> + AC_DEFINE(HAVE_CPIO_REFLINK, [1], [cpio --reflink
> support])
> + fi])
>
> # gnulib modules
> gl_INIT
> diff --git a/src/extern.h b/src/extern.h
> index 3b59053..6be95cf 100644
> --- a/src/extern.h
> +++ b/src/extern.h
> @@ -171,6 +171,10 @@ void tape_toss_input (int in_des, off_t num_bytes);
> void copy_files_tape_to_disk (int in_des, int out_des, off_t num_bytes);
> void copy_files_disk_to_tape (int in_des, int out_des, off_t num_bytes, char
> *filename);
> void copy_files_disk_to_disk (int in_des, int out_des, off_t num_bytes, char
> *filename);
> +#ifdef HAVE_CPIO_REFLINK
> +ssize_t copy_files_range (int in_des, int out_des, off_t num_bytes);
> +#endif
> +
> void warn_if_file_changed (char *file_name, off_t old_file_size,
> time_t old_file_mtime);
> void create_all_directories (char const *name);
> diff --git a/src/util.c b/src/util.c
> index d81ef30..4c0d75f 100644
> --- a/src/util.c
> +++ b/src/util.c
> @@ -516,6 +516,56 @@ copy_files_disk_to_tape (int in_des, int out_des, off_t
> num_bytes,
> in_buff += size;
> }
> }
> +
> +#ifdef HAVE_CPIO_REFLINK
> +/*
> + * Use the copy_file_range(2) syscall for in-kernel transfer between two file
> + * descriptors. Copy-on-write enabled filesystems may optimize I/O by using
> + * reflinks.
> + */
> +ssize_t
> +copy_files_range (int in_des, int out_des, off_t num_bytes)
> +{
> + loff_t in_off = 0, out_off = 0;
> + ssize_t total = 0;
> + ssize_t ret;
> +
> + in_off = lseek (in_des, 0, SEEK_CUR);
> + if (in_off < 0)
> + return -errno;
> +
> + out_off = lseek (out_des, 0, SEEK_CUR);
> + if (out_off < 0)
> + return -errno;
> +
As I've already told you somewhere else, copy_file_range (crf for short!)
implementations may expand any existing holes in the original file.
That's OK, I guess, but it may be worth adding a comment here and possibly
on the cpio(1) manpage, when documenting the --reflink option.
Another option would be to add a bit more complexity here by calling crf
in a loop and lseek'ing SEEK_DATA/SEEK_HOLE. That's what my original RFC
patch was doing, but I agree that it's probably not worth it.
My suggestion is to, at least, add here a small comment mentioning this
decision and referring to the cfr manpage for the details.
Cheers,
--
Luís
> + while (total < num_bytes)
> + {
> + ret = copy_file_range (in_des, &in_off, out_des, &out_off,
> + num_bytes - total, 0);
> + /* simplify read/write fallback: don't partially seek on error */
> + if (ret == 0)
> + return -ENODATA;
> + if (ret < 0)
> + return -errno;
> + total += ret;
> + }
> +
> + /* copy complete. seek to new offset */
> + in_off = lseek (in_des, in_off, SEEK_SET);
> + if (in_off < 0)
> + error(1, errno, "failed final src seek");
> +
> + out_off = lseek (out_des, out_off, SEEK_SET);
> + if (out_off < 0)
> + error(1, errno, "failed final dst seek");
> +
> + output_bytes += total;
> + input_bytes += total;
> +
> + return total;
> +}
> +#endif /* HAVE_CPIO_REFLINK */
> +
> /* Copy a file using the input and output buffers, which may start out
> partly full. After the copy, the files are not closed nor the last
> block flushed to output, and the input buffer may still be partly
> --
> 2.26.2
>
- cpio copy-out reflink and chain support, David Disseldorp, 2021/03/11
- [PATCH 1/8] open device with O_NONBLOCK option, David Disseldorp, 2021/03/11
- [PATCH 5/8] add --reflink option, David Disseldorp, 2021/03/11
- [PATCH 7/8] ensure uniform post-trailer-zeros length, David Disseldorp, 2021/03/11
- [PATCH 3/8] add --chain test coverage, David Disseldorp, 2021/03/11
- [PATCH 6/8] attempt copy_file_range if explicitly requested, David Disseldorp, 2021/03/11
- [PATCH 4/8] add copy_file_range syscall support, David Disseldorp, 2021/03/11
- Re: [PATCH 4/8] add copy_file_range syscall support,
Luis Henriques <=
- [PATCH 2/8] add --chain option, David Disseldorp, 2021/03/11
- [PATCH 8/8] add --reflink test coverage, David Disseldorp, 2021/03/11
- Re: cpio copy-out reflink and chain support, Luis Henriques, 2021/03/18