bug-cpio
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 4/8] add copy_file_range syscall support


From: Luis Henriques
Subject: Re: [PATCH 4/8] add copy_file_range syscall support
Date: Thu, 18 Mar 2021 14:57:58 +0000

On Fri, Mar 12, 2021 at 02:30:15AM +0100, David Disseldorp wrote:
> From: Luis Henriques <lhenriques@suse.de>

I guess you can drop the above as you've added quite a bit of work on top
of my original patch (which was just an RFC) ;-)

> This helper function uses the copy_file_range(2) Linux syscall for
> in-kernel transfer between two file descriptors. Copy-on-write
> enabled filesystems may optimize I/O by using reflinks.
> 
> Signed-off-by: David Disseldorp <ddiss@suse.de>
> ---
>  configure.ac |  6 ++++++
>  src/extern.h |  4 ++++
>  src/util.c   | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 60 insertions(+)
> 
> diff --git a/configure.ac b/configure.ac
> index 0a35e4c..ed6e3d2 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -51,6 +51,12 @@ AC_COMPILE_CHECK_RETTYPE([minor], [0])
>  AC_CHECK_FUNCS([fchmod fchown])
>  # This is needed for mingw build
>  AC_CHECK_FUNCS([setmode getpwuid getpwnam getgrgid getgrnam pipe fork getuid 
> geteuid])
> +AC_CHECK_FUNC([copy_file_range],
> +              [AC_CHECK_SIZEOF(off_t)
> +               AC_CHECK_SIZEOF(loff_t)
> +               if (test "$ac_cv_sizeof_off_t" -eq "$ac_cv_sizeof_loff_t"); 
> then
> +                   AC_DEFINE(HAVE_CPIO_REFLINK, [1], [cpio --reflink 
> support])
> +               fi])
>  
>  # gnulib modules
>  gl_INIT
> diff --git a/src/extern.h b/src/extern.h
> index 3b59053..6be95cf 100644
> --- a/src/extern.h
> +++ b/src/extern.h
> @@ -171,6 +171,10 @@ void tape_toss_input (int in_des, off_t num_bytes);
>  void copy_files_tape_to_disk (int in_des, int out_des, off_t num_bytes);
>  void copy_files_disk_to_tape (int in_des, int out_des, off_t num_bytes, char 
> *filename);
>  void copy_files_disk_to_disk (int in_des, int out_des, off_t num_bytes, char 
> *filename);
> +#ifdef HAVE_CPIO_REFLINK
> +ssize_t copy_files_range (int in_des, int out_des, off_t num_bytes);
> +#endif
> +
>  void warn_if_file_changed (char *file_name, off_t old_file_size,
>                             time_t old_file_mtime);
>  void create_all_directories (char const *name);
> diff --git a/src/util.c b/src/util.c
> index d81ef30..4c0d75f 100644
> --- a/src/util.c
> +++ b/src/util.c
> @@ -516,6 +516,56 @@ copy_files_disk_to_tape (int in_des, int out_des, off_t 
> num_bytes,
>        in_buff += size;
>      }
>  }
> +
> +#ifdef HAVE_CPIO_REFLINK
> +/*
> + * Use the copy_file_range(2) syscall for in-kernel transfer between two file
> + * descriptors. Copy-on-write enabled filesystems may optimize I/O by using
> + * reflinks.
> + */
> +ssize_t
> +copy_files_range (int in_des, int out_des, off_t num_bytes)
> +{
> +  loff_t in_off = 0, out_off = 0;
> +  ssize_t total = 0;
> +  ssize_t ret;
> +
> +  in_off = lseek (in_des, 0, SEEK_CUR);
> +  if (in_off < 0)
> +    return -errno;
> +
> +  out_off = lseek (out_des, 0, SEEK_CUR);
> +  if (out_off < 0)
> +    return -errno;
> +

As I've already told you somewhere else, copy_file_range (crf for short!)
implementations may expand any existing holes in the original file.
That's OK, I guess, but it may be worth adding a comment here and possibly
on the cpio(1) manpage, when documenting the --reflink option.

Another option would be to add a bit more complexity here by calling crf
in a loop and lseek'ing SEEK_DATA/SEEK_HOLE.  That's what my original RFC
patch was doing, but I agree that it's probably not worth it.

My suggestion is to, at least, add here a small comment mentioning this
decision and referring to the cfr manpage for the details.

Cheers,
--
Luís

> +  while (total < num_bytes)
> +    {
> +      ret = copy_file_range (in_des, &in_off, out_des, &out_off,
> +                          num_bytes - total, 0);
> +      /* simplify read/write fallback: don't partially seek on error */
> +      if (ret == 0)
> +        return -ENODATA;
> +      if (ret < 0)
> +     return -errno;
> +      total += ret;
> +    }
> +
> +  /* copy complete. seek to new offset */
> +  in_off = lseek (in_des, in_off, SEEK_SET);
> +  if (in_off < 0)
> +    error(1, errno, "failed final src seek");
> +
> +  out_off = lseek (out_des, out_off, SEEK_SET);
> +  if (out_off < 0)
> +    error(1, errno, "failed final dst seek");
> +
> +  output_bytes += total;
> +  input_bytes += total;
> +
> +  return total;
> +}
> +#endif /* HAVE_CPIO_REFLINK */
> +
>  /* Copy a file using the input and output buffers, which may start out
>     partly full.  After the copy, the files are not closed nor the last
>     block flushed to output, and the input buffer may still be partly
> -- 
> 2.26.2
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]