bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#6131: [PATCH]: fiemap support for efficient sparse file copy


From: jeff.liu
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Fri, 21 May 2010 20:59:56 +0800
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Jim Meyering wrote:
> jeff.liu wrote:
>> Hi Jim,
>>
>> Thanks for your kind advise!
>>
>> I'd like to adopt the timeout(1) approach for the test work.
>>
>> My thought is:
>> 1. Create and mount a file-backed ext4 partition rather than relying on the 
>> HARD CODE path.
>> 2. Create a 2gb sparse file without extent allocated for it.
>> 3. It take nearly 30 seconds to transfer this file in normal copy, yet less 
>> than 1 second through
>> FIEMAP-copy, is it a worst-case scenario that makes the difference as large 
>> as possible?
>> 4. run FIEMAP-copy, use timeout(1) to limit it will complete in 1 second, I 
>> hope I understood your
>> opinion correctly ;).
>>
>> The revised patches are shown as following:
>>
>> >From f18e1801d1dfca9fa278572b8172a5f97da2adc1 Mon Sep 17 00:00:00 2001
>> From: Jie Liu <address@hidden>
>> Date: Thu, 13 May 2010 22:17:53 +0800
>> Subject: [PATCH 1/1] tests: add a new test for FIEMAP-copy
>>
>> * tests/cp/sparse-fiemap: Add a new test for FIEMAP-copy against a
>> loopbacked ext4 partition.
>> * tests/Makefile.am (sparse-fiemap): Reference the new test.
>>
>> Signed-off-by: Jie Liu <address@hidden>
>> ---
>>  tests/Makefile.am      |    2 +
>>  tests/cp/sparse-fiemap |   61 
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 63 insertions(+), 0 deletions(-)
>>  create mode 100644 tests/cp/sparse-fiemap
>>
>> diff --git a/tests/Makefile.am b/tests/Makefile.am
>> index 46d388a..a76c6a7 100644
>> --- a/tests/Makefile.am
>> +++ b/tests/Makefile.am
>> @@ -25,6 +25,7 @@ root_tests =                                       \
>>    cp/special-bits                           \
>>    cp/cp-mv-enotsup-xattr                    \
>>    cp/capability                                     \
>> +  cp/sparse-fiemap                              \
>>    dd/skip-seek-past-dev                             \
>>    install/install-C-root                    \
>>    ls/capability                                     \
>> @@ -319,6 +320,7 @@ TESTS =                                          \
>>    cp/same-file                                      \
>>    cp/slink-2-slink                          \
>>    cp/sparse                                 \
>> +  cp/sparse-fiemap                              \
>>    cp/special-f                                      \
>>    cp/src-base-dot                           \
>>    cp/symlink-slash                          \
> 
> I've applied your patches locally and have begun adjusting them.
> First, I removed the addition of cp/sparse-fiemap to the TESTS list above.
> Adding it to the root_tests is sufficient.
Thank you to point it out.

> Then, I've made the following changes to your test script:
>   - the original size of your test file of 2GiB was too small,
>       in that the old (pre-fiemap) cp copied it for me in less than
>       1 second when the backing file was on a tmpfs file system.
>       I've made the new size be 2TiB.  The fiemap copy is still so
>       quick that it completes in < .01 second.[*]
>   - no point in discarding stdout/stderr, since it all goes to the log
>   - raised timeout to 10 seconds to give more leeway on slow systems
>   - remove those "rm -f" uses.  They're not needed, since the test is
>       run in its own temp dir, which is removed automatically when done.
>   - remove the $? = 124 test -- the preceding test for success is sufficient
> 
> [*] I tried to count syscalls with strace but got a segfault.
> Using valgrind I get errors, so debugged enough to get a clean
> run, but possibly at the expense of correctness.  We'll need more
> tests to ensure that the non-sparse blocks in the copy all have
> the same offset/length as in the original.  
Is it make sense if we write a utility in C through FIEMAP to show the extent 
info of a file?
then wrap it in our current test scripts or a new test script to compare the 
non-sparse blocks
offset and length?

filefrag(8) can do such thing(http://e2fsprogs.sourceforge.net/), but maybe we 
can implement a
compacted version focus on furture extent maping related testing only for 
coreutils.

Details below.
> 
> diff --git a/tests/cp/sparse-fiemap b/tests/cp/sparse-fiemap
> old mode 100644
> new mode 100755
> index f9d3a94..814d537
> --- a/tests/cp/sparse-fiemap
> +++ b/tests/cp/sparse-fiemap
> @@ -28,8 +28,7 @@ cwd=`pwd`
>  cleanup_() { cd /; umount "$cwd/mnt"; }
> 
>  # Create an ext4 loopback file system
> -dd if=/dev/zero of=blob bs=8192 count=1000 > /dev/null 2>&1 \
> -                                               || skip=1
> +dd if=/dev/zero of=blob bs=8192 count=1000 || skip=1
>  mkdir mnt
>  mkfs -t ext4 -F blob ||
>    skip_test_ "failed to create ext4 file system"
> @@ -42,20 +41,15 @@ test $skip = 1 &&
> 
>  rm -f mnt/f
> 
> -# Create a 2gb sparse file
> -dd if=/dev/zero of=mnt/sparse bs=1k count=1 seek=2096128 > /dev/null 2>&1 || 
> framework_failure
> +# Create a 2TiB sparse file
> +dd if=/dev/zero of=mnt/sparse bs=1k count=1 seek=2G || framework_failure
> 
> -# It take more than 20 seconds to transfer the created sparse file
> -# through normal copy, by contrast, it take even less than 1 second
> -# through FIEMAP-copy.
> -timeout 1 cp --sparse=always mnt/sparse mnt/sparse_fiemap || fail=1
> -test $? = 124 && fail=1
> +# It takes many minutes to copy this sparse file using the old method.
> +# By contrast, it takes far less than 1 second using FIEMAP-copy.
> +timeout 10 cp --sparse=always mnt/sparse mnt/sparse_fiemap || fail=1
> 
>  # Ensure that the sparse file copied through fiemap has the same size
>  # in bytes as the original.
>  test `stat --printf %s $sparse` = `stat --printf %s $fiemap` || fail=1
> 
> -rm -f mnt/sparse
> -rm -f mnt/sparse_fiemap
> -
>  Exit $fail
> 
> ----------------------------------------
> On F13, x86_64, ext4, I did this:
> 
> dd if=/dev/null of=big bs=1 seek=2G
> valgrind ./cp --sparse=always big big2
> ==4771== Conditional jump or move depends on uninitialised value(s)
> ==4771==    at 0x40465A: fiemap_copy_ok (copy.c:205)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Syscall param lseek(offset) contains uninitialised byte(s)
> ==4771==    at 0x3269CE1540: __lseek_nocancel (syscall-template.S:82)
> ==4771==    by 0x4046D4: fiemap_copy_ok (copy.c:210)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Syscall param lseek(offset) contains uninitialised byte(s)
> ==4771==    at 0x3269CE1540: __lseek_nocancel (syscall-template.S:82)
> ==4771==    by 0x40472D: fiemap_copy_ok (copy.c:216)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Conditional jump or move depends on uninitialised value(s)
> ==4771==    at 0x404792: fiemap_copy_ok (copy.c:222)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Conditional jump or move depends on uninitialised value(s)
> ==4771==    at 0x40492B: fiemap_copy_ok (copy.c:229)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Conditional jump or move depends on uninitialised value(s)
> ==4771==    at 0x4047FA: fiemap_copy_ok (copy.c:235)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Syscall param read(count) contains uninitialised byte(s)
> ==4771==    at 0x3269CD41B0: __read_nocancel (syscall-template.S:82)
> ==4771==    by 0x404821: fiemap_copy_ok (copy.c:238)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==
> ==4771== Invalid read of size 8
> ==4771==    at 0x404952: fiemap_copy_ok (copy.c:265)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> ==4771==  Address 0x3ffeffdd68 is not stack'd, malloc'd or (recently) free'd
> ==4771==
> ==4771==
> ==4771== Process terminating with default action of signal 11 (SIGSEGV)
> ==4771==  Access not within mapped region at address 0x3FFEFFDD68
> ==4771==    at 0x404952: fiemap_copy_ok (copy.c:265)
> ==4771==    by 0x405B61: copy_reg (copy.c:822)
> ==4771==    by 0x408713: copy_internal (copy.c:2163)
> ==4771==    by 0x409237: copy (copy.c:2449)
> ==4771==    by 0x403AC9: do_copy (cp.c:754)
> ==4771==    by 0x4041E4: main (cp.c:1154)
> 
> ===========================================================
> The segv just above is due to hitting this line with i==0:
> 
>     fiemap->fm_start = (fm_ext[i-1].fe_logical + fm_ext[i-1].fe_length);
strange, code should break if there is no extent allocated for a file.
 /* If 0 extents are returned, then more ioctls are not needed.  */
      if (fiemap->fm_mapped_extents == 0)
        break;

> 
> the obvious fix is probably to do this instead:
> 
>     fiemap->fm_start = (fm_ext[i].fe_logical + fm_ext[i].fe_length);
I just found a bug for dealing with the 'fiemap->fm_start', maybe it is the 
root cause of the
segment fault.  above line still need to write as 'fm_ext[i-1].fe_logical 
+....' to calculate the
offset for the next ioctl(2).
> 
> All of the used-uninitialized errors can be papered over by
> clearing the fiemap_buf array, like this:
> 
> +  memset (fiemap_buf, 0, sizeof fiemap_buf);
I recalled why I initialized this buf before when you ask me the reason, I was 
intented to
initialize the 'fiemap->fm_start', so below line 'fiemap->fm_start = 0ULL' 
should be removed from
the loop.

>    do
>      {
>        fiemap->fm_start = 0ULL;
> 
> However, if these are all due solely to F13's valgrind not yet knowing the
> semantics of the FIEMAP ioctl, then that may be adequate.
as what I mentioned above, this line should be removed or remove out of the 
loop if we do not
initialize the fiemap buf.
> 
> Bottom line:
>   - you may consider your test-script patch accepted, with the patch above
>   - I'd like to see a new version of the copy.c-changing patch,
>     including at least a fix for the fm_ext[-1] access bug.
> 
> ===========================================================
> Solely for reference, here's the copy.c patch I used to avoid
> the valgrind-spotted problems:
> 
> diff --git a/src/copy.c b/src/copy.c
> index 960e5fb..e232eaa 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -179,6 +179,7 @@ fiemap_copy_ok (int src_fd, int dest_fd, size_t buf_size,
>    uint64_t last_read_size = 0;
>    unsigned int i = 0;
> 
> +  memset (fiemap_buf, 0, sizeof fiemap_buf);
>    do
>      {
>        fiemap->fm_start = 0ULL;
> @@ -187,7 +188,7 @@ fiemap_copy_ok (int src_fd, int dest_fd, size_t buf_size,
> 
>        /* When ioctl(2) fails, fall back to the normal copy only if it
>           is the first time we met.  */
> -      if (ioctl (src_fd, FS_IOC_FIEMAP, (unsigned long) fiemap) < 0)
> +      if (ioctl (src_fd, FS_IOC_FIEMAP, fiemap) < 0)
>          {
>            /* If `i > 0', then at least one ioctl(2) has been performed 
> before.  */
>            if (i == 0)
> @@ -261,7 +262,7 @@ fiemap_copy_ok (int src_fd, int dest_fd, size_t buf_size,
>                ext_len -= n_read;
>              }
> 
> -          fiemap->fm_start = (fm_ext[i-1].fe_logical + 
> fm_ext[i-1].fe_length);
> +          fiemap->fm_start = (fm_ext[i].fe_logical + fm_ext[i].fe_length);
>          }
>      } while (! last);
> 
> 
> 


-- 
With Windows 7, Microsoft is asserting legal control over your computer and is 
using this power to
abuse computer users.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]