bug-coreutils

bug#6131: [PATCH]: fiemap support for efficient sparse file copy


From: jeff.liu
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Fri, 07 May 2010 22:13:19 +0800
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Hello All,

Adding fiemap ioctl(2) support to cp(1) for efficient sparse file copying has
been discussed a few times in the past few months. Thanks to Jim for the
review comments.

I have just worked out a new patch set against the latest upstream code and
rerun the tests I showed before; they all work well.

There is a minor code change in this post to handle the case where ioctl(2)
fails in the middle of the fiemap copy process.
My thought is: if the failure happens on the first ioctl(2) call, fall back to
the standard copy as usual.
Otherwise, we should abort the copy to avoid corrupting the destination file.
To determine whether it is the first ioctl(2) call that failed, the code checks
whether the variable 'i', the fiemap extent counter, is equal to 0; its value
will have been increased if a previous ioctl(2) call succeeded.
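
The rule above can be sketched in isolation as follows. This is only an
illustration, not code from the patches: the enum and the function
`fiemap_failure_action` are hypothetical names, and `extents_done` stands in
for the counter 'i'. The patch itself reports the same decision through its
bool return value and `*normal_copy_required`.

```c
#include <assert.h>

/* Hypothetical names, for illustration only.  */
enum copy_action
{
  FALL_BACK_TO_NORMAL_COPY,  /* nothing written yet: safe to start over */
  ABORT_COPY                 /* destination partly written: stop here */
};

/* Decide what to do when ioctl (FS_IOC_FIEMAP) fails.  EXTENTS_DONE
   plays the role of the counter `i': it stays 0 until an earlier
   ioctl call has succeeded and its extents have been processed.  */
enum copy_action
fiemap_failure_action (unsigned int extents_done)
{
  return extents_done == 0 ? FALL_BACK_TO_NORMAL_COPY : ABORT_COPY;
}
```

The point of the split is that a fallback is only safe while the destination
is still untouched; once some extents have been written, restarting with the
normal copy is no longer safe, so the copy is aborted instead.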

Would you please review the patches and consider applying them if there are no
other issues?

From f8c78794a70f1fb45a2c61c8bbeca344087287ab Mon Sep 17 00:00:00 2001
From: Jie Liu <address@hidden>
Date: Fri, 7 May 2010 20:48:45 +0800
Subject: [PATCH 1/3] Add fiemap.h for fiemap ioctl(2) support.
 It is not shipped by default, so I have copied it from the kernel for now.
 I have updated its code style to follow the GNU coding style.

Signed-off-by: Jie Liu <address@hidden>
---
 gnulib       |    2 +-
 src/fiemap.h |  102 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+), 1 deletions(-)
 create mode 100644 src/fiemap.h

diff --git a/gnulib b/gnulib
index e6addf8..8df7efd 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit e6addf84d6331d634b5d76db03f59851f3de8894
+Subproject commit 8df7efddc8ffe398cde4106d32b39848e5948df9
diff --git a/src/fiemap.h b/src/fiemap.h
new file mode 100644
index 0000000..d33293b
--- /dev/null
+++ b/src/fiemap.h
@@ -0,0 +1,102 @@
+/* FS_IOC_FIEMAP ioctl infrastructure.
+   Some portions copyright (C) 2007 Cluster File Systems, Inc
+   Authors: Mark Fasheh <address@hidden>
+            Kalpak Shah <address@hidden>
+            Andreas Dilger <address@hidden>.  */
+
+/* Copied from the kernel, modified to follow GNU coding style by Jie Liu.  */
+
+#ifndef _LINUX_FIEMAP_H
+# define _LINUX_FIEMAP_H
+
+# include <stdint.h>
+
+struct fiemap_extent
+{
+  /* Logical offset in bytes for the start of the extent
+     from the beginning of the file.  */
+  uint64_t fe_logical;
+
+  /* Physical offset in bytes for the start of the extent
+     from the beginning of the disk.  */
+  uint64_t fe_physical;
+
+  /* Length in bytes for this extent.  */
+  uint64_t fe_length;
+
+  uint64_t fe_reserved64[2];
+
+  /* FIEMAP_EXTENT_* flags for this extent.  */
+  uint32_t fe_flags;
+
+  uint32_t fe_reserved[3];
+};
+
+struct fiemap
+{
+  /* Logical offset(inclusive) at which to start mapping(in).  */
+  uint64_t fm_start;
+
+  /* Logical length of mapping which userspace wants(in).  */
+  uint64_t fm_length;
+
+  /* FIEMAP_FLAG_* flags for request(in/out).  */
+  uint32_t fm_flags;
+
+  /* Number of extents that were mapped(out).  */
+  uint32_t fm_mapped_extents;
+
+  /* Size of fm_extents array(in).  */
+  uint32_t fm_extent_count;
+
+  uint32_t fm_reserved;
+
+  /* Array of mapped extents(out).  */
+  struct fiemap_extent fm_extents[0];
+};
+
+/* The maximum offset that can be mapped for a file.  */
+# define FIEMAP_MAX_OFFSET       (~0ULL)
+
+/* Sync file data before map.  */
+# define FIEMAP_FLAG_SYNC        0x00000001
+
+/* Map extended attribute tree.  */
+# define FIEMAP_FLAG_XATTR       0x00000002
+
+# define FIEMAP_FLAGS_COMPAT     (FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
+
+/* Last extent in file.  */
+# define FIEMAP_EXTENT_LAST              0x00000001
+
+/* Data location unknown.  */
+# define FIEMAP_EXTENT_UNKNOWN           0x00000002
+
+/* Location still pending.  Sets EXTENT_UNKNOWN.  */
+# define FIEMAP_EXTENT_DELALLOC          0x00000004
+
+/* Data cannot be read while fs is unmounted.  */
+# define FIEMAP_EXTENT_ENCODED           0x00000008
+
+/* Data is encrypted by fs.  Sets EXTENT_NO_BYPASS.  */
+# define FIEMAP_EXTENT_DATA_ENCRYPTED    0x00000080
+
+/* Extent offsets may not be block aligned.  */
+# define FIEMAP_EXTENT_NOT_ALIGNED       0x00000100
+
+/* Data mixed with metadata.  Sets EXTENT_NOT_ALIGNED.  */
+# define FIEMAP_EXTENT_DATA_INLINE       0x00000200
+
+/* Multiple files in block.  Sets EXTENT_NOT_ALIGNED.  */
+# define FIEMAP_EXTENT_DATA_TAIL         0x00000400
+
+/* Space allocated, but no data (i.e. zero).  */
+# define FIEMAP_EXTENT_UNWRITTEN         0x00000800
+
+/* File does not natively support extents.  Result merged for efficiency.  */
+# define FIEMAP_EXTENT_MERGED            0x00001000
+
+/* Space shared with other files.  */
+# define FIEMAP_EXTENT_SHARED            0x00002000
+
+#endif
-- 
1.5.4.3


From 12618891cf4f3aff6b65463887b689c2ad99aa8e Mon Sep 17 00:00:00 2001
From: Jie Liu <address@hidden>
Date: Fri, 7 May 2010 21:14:05 +0800
Subject: [PATCH 2/3] Add fiemap ioctl(2) support for efficient sparse file copy.

Signed-off-by: Jie Liu <address@hidden>
---
 src/copy.c |  154 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 154 insertions(+), 0 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index c16cef6..960e5fb 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -63,6 +63,10 @@

 #include <sys/ioctl.h>

+#ifndef HAVE_FIEMAP
+# include "fiemap.h"
+#endif
+
 #ifndef HAVE_FCHOWN
 # define HAVE_FCHOWN false
 # define fchown(fd, uid, gid) (-1)
@@ -149,6 +153,136 @@ clone_file (int dest_fd, int src_fd)
 #endif
 }

+#ifdef __linux__
+# ifndef FS_IOC_FIEMAP
+#  define FS_IOC_FIEMAP _IOWR ('f', 11, struct fiemap)
+# endif
+/* Perform a FIEMAP (available in mainline 2.6.27) copy if possible.
+   Call ioctl(2) with FS_IOC_FIEMAP to efficiently map the allocated
+   extents of a file, skipping its holes.  This saves the overhead of
+   dealing with holes via lseek(2) in the normal copy, and can result
+   in much faster backups of sparse files.  */
+static bool
+fiemap_copy_ok (int src_fd, int dest_fd, size_t buf_size,
+                off_t src_total_size, char const *src_name,
+                char const *dst_name, bool *normal_copy_required)
+{
+  bool fail = false;
+  bool last = false;
+  char fiemap_buf[4096] = { 0 };
+  struct fiemap *fiemap = (struct fiemap *)fiemap_buf;
+  struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
+  uint32_t count = (sizeof (fiemap_buf) - sizeof (*fiemap)) /
+                    sizeof (struct fiemap_extent);
+  off_t last_ext_logical = 0;
+  uint64_t last_ext_len = 0;
+  uint64_t last_read_size = 0;
+  unsigned int i = 0;
+
+  fiemap->fm_start = 0ULL;
+  do
+    {
+      fiemap->fm_length = FIEMAP_MAX_OFFSET;
+      fiemap->fm_extent_count = count;
+
+      /* When ioctl(2) fails, fall back to the normal copy only if it
+         is the first call that failed.  */
+      if (ioctl (src_fd, FS_IOC_FIEMAP, (unsigned long) fiemap) < 0)
+        {
+          /* If `i > 0', then at least one ioctl(2) has been performed before.  */
+          if (i == 0)
+            *normal_copy_required = true;
+          return false;
+        }
+
+      /* If 0 extents are returned, then more ioctls are not needed.  */
+      if (fiemap->fm_mapped_extents == 0)
+        break;
+
+      for (i = 0; i < fiemap->fm_mapped_extents; i++)
+        {
+          assert (fm_ext[i].fe_logical <= OFF_T_MAX);
+
+          off_t ext_logical = fm_ext[i].fe_logical;
+          uint64_t ext_len = fm_ext[i].fe_length;
+
+          if (lseek (src_fd, ext_logical, SEEK_SET) < 0LL)
+            {
+              error (0, errno, _("cannot lseek %s"), quote (src_name));
+              return fail;
+            }
+
+          if (lseek (dest_fd, ext_logical, SEEK_SET) < 0LL)
+            {
+              error (0, errno, _("cannot lseek %s"), quote (dst_name));
+              return fail;
+            }
+
+          if (fm_ext[i].fe_flags & FIEMAP_EXTENT_LAST)
+            {
+              last_ext_logical = ext_logical;
+              last_ext_len = ext_len;
+              last = true;
+            }
+
+          while (0 < ext_len)
+            {
+              char buf[buf_size];
+
+              /* Avoid reading into the holes if the remaining extent
+                 length is shorter than the buffer size.  */
+              size_t n_to_read = (ext_len < buf_size
+                                  ? ext_len : buf_size);
+
+              ssize_t n_read = read (src_fd, buf, n_to_read);
+              if (n_read < 0)
+                {
+#ifdef EINTR
+                  if (errno == EINTR)
+                    continue;
+#endif
+                  error (0, errno, _("reading %s"), quote (src_name));
+                  return fail;
+                }
+
+              if (n_read == 0)
+                {
+                  /* Figure out how many bytes read from the last extent.  */
+                  last_read_size = last_ext_len - ext_len;
+                  break;
+                }
+
+              if (full_write (dest_fd, buf, n_read) != n_read)
+                {
+                  error (0, errno, _("writing %s"), quote (dst_name));
+                  return fail;
+                }
+
+              ext_len -= n_read;
+            }
+
+          fiemap->fm_start = (fm_ext[i].fe_logical + fm_ext[i].fe_length);
+        }
+    } while (! last);
+
+  /* If a file ends with a hole, the sum of the last extent logical offset
+     and the read-returned size will be shorter than the actual size of the
+     file.  Use ftruncate to extend the length of the destination file.  */
+  if (last_ext_logical + last_read_size < src_total_size)
+    {
+      if (ftruncate (dest_fd, src_total_size) < 0)
+        {
+          error (0, errno, _("extending %s"), quote (dst_name));
+          return fail;
+        }
+    }
+
+  return ! fail;
+}
+#else
+static bool fiemap_copy_ok (ignored) { errno = ENOTSUP; return false; }
+#endif
+
 /* FIXME: describe */
 /* FIXME: rewrite this to use a hash table so we avoid the quadratic
    performance hit that's probably noticeable only on trees deeper
@@ -679,6 +813,25 @@ copy_reg (char const *src_name, char const *dst_name,
 #endif
         }

+      if (make_holes)
+        {
+          bool require_normal_copy = false;
+          /* Perform an efficient FIEMAP copy for sparse files; fall back to
+             the standard copy only if the first ioctl(2) call fails.  */
+          if (fiemap_copy_ok (source_desc, dest_desc, buf_size,
+                              src_open_sb.st_size, src_name,
+                              dst_name, &require_normal_copy))
+            goto preserve_metadata;
+          else
+            {
+              if (! require_normal_copy)
+                {
+                  return_val = false;
+                  goto close_src_and_dst_desc;
+                }
+            }
+        }
+
       /* If not making a sparse file, try to use a more-efficient
          buffer size.  */
       if (! make_holes)
@@ -807,6 +960,7 @@ copy_reg (char const *src_name, char const *dst_name,
         }
     }

+preserve_metadata:
   if (x->preserve_timestamps)
     {
       struct timespec timespec[2];
-- 
1.5.4.3


From 8822b8e3f3ee70b49efb8b8aebff373792956422 Mon Sep 17 00:00:00 2001
From: Jie Liu <address@hidden>
Date: Fri, 7 May 2010 21:31:56 +0800
Subject: [PATCH 3/3] Add test script for cp(1) fiemap copy.

Signed-off-by: Jie Liu <address@hidden>
---
 tests/cp/sparse-fiemap |   58 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 58 insertions(+), 0 deletions(-)
 create mode 100755 tests/cp/sparse-fiemap

diff --git a/tests/cp/sparse-fiemap b/tests/cp/sparse-fiemap
new file mode 100755
index 0000000..25a8fd6
--- /dev/null
+++ b/tests/cp/sparse-fiemap
@@ -0,0 +1,58 @@
+#!/bin/sh
+# Test cp --sparse=always through fiemap copy
+
+# Copyright (C) 2006-2010 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+if test "$VERBOSE" = yes; then
+  set -x
+  cp --version
+fi
+
+. $srcdir/test-lib.sh
+
+cp_orig=cp
+cp_new="$abs_top_builddir/src/cp"
+
+test -d "/ext4"              \
+  && sparse="/ext4/sparse"   \
+  && normal="/ext4/sparse1"  \
+  && fiemap="/ext4/sparse2"  \
+  || skip=1
+
+test "$skip" = 1 && skip_test_ "/ext4 does not exist"
+
+size=`expr 10 \* 1024`
+dd if=/dev/zero bs=4k count=1 seek=$size of=$sparse > /dev/null 2>&1 || framework_failure
+
+# Use time(1) instead of the shell built-in `time' command.
+# It supports the "--format" option, which makes it more convenient to
+# calculate the time spent by each `cp' in combination with bc(1) for
+# the performance measurement.
+TIME=`which time` || skip_test_ "time(1) does not exist"
+
+x=$(echo "1+2" | bc)
+test "$x" = 3 || skip_test_ "bc(1) does not exist"
+
+t1=$($TIME -f "%U + %S" $cp_orig --sparse=always $sparse $normal 2>&1 | bc) || fail=1
+t2=$($TIME -f "%U + %S" $cp_new --sparse=always $sparse $fiemap 2>&1 | bc)  || fail=1
+
+test "$fail" = 1 && skip_test_ "at least one sparse file copy failed"
+
+# Ensure that the sparse file copied through fiemap has the same size in bytes as the original.
+test `stat --printf %s $sparse` -eq `stat --printf %s $fiemap` || fail=1
+test "$(echo "$t2 < $t1" | bc)" = 1 || fail=1
+
+Exit $fail
-- 
1.5.4.3
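
For reviewers, the trailing-hole rule described in the comment before the
ftruncate call in patch 2 can be sketched on its own. This is only an
illustration under assumptions, not code from the patches: `ends_in_hole` is a
hypothetical helper, and it compares the end of the last data extent (offset
plus extent length) with the file size, which is a simplification of the
offset-plus-bytes-read check that the patch performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper, for illustration only.  Return true if the
   source ends in a hole, i.e. the last data extent stops before the
   end of the file, so the destination would have to be extended
   (for example with ftruncate) to reach SRC_TOTAL_SIZE.  */
bool
ends_in_hole (off_t last_ext_logical, uint64_t last_ext_len,
              off_t src_total_size)
{
  /* Offsets and sizes of a regular file are never negative.  */
  assert (0 <= last_ext_logical && 0 <= src_total_size);

  uint64_t data_end = (uint64_t) last_ext_logical + last_ext_len;
  return data_end < (uint64_t) src_total_size;
}
```

For a file like the one the test script creates (a single 4 KiB block written
at block offset 10240), and assuming the file system reports that block as one
extent, the extent ends exactly at the file size, so no extension would be
needed.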



Best Regards,
-Jeff

-- 
With Windows 7, Microsoft is asserting legal control over your computer and is
using this power to abuse computer users.






