[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#9157: [PATCH] dd: sparse conv flag
From: |
Jim Meyering |
Subject: |
bug#9157: [PATCH] dd: sparse conv flag |
Date: |
Sat, 10 Dec 2011 11:09:18 +0100 |
Roman Rybalko wrote:
> ---
> doc/coreutils.texi | 4 +++
> src/dd.c | 22 +++++++++++++++-
> tests/Makefile.am | 1 +
> tests/dd/sparse | 70
> ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 96 insertions(+), 1 deletions(-)
> create mode 100755 tests/dd/sparse
Thank you for the patch. It's nice to see accompanying test suite changes.
I see that copyright paperwork is now on file with the FSF, so...
Did you notice that cp now has the capability (using extents)
to preserve holes efficiently on some file system types?
Recent kernels also have SEEK_HOLE/SEEK_DATA support with which
it is more efficient to detect holes.
Here are some things we'll have to consider before
adding a new hole-punching option to dd:
Your patch may create a hole in the destination for each sequence of
length seek_size or greater of zero bytes in the input.
As you may have seen in the cp-related discussion, one may
want different options:
- preserve a file's hole/non-hole structure
- efficiently detect existing holes and fill them with explicit zeros in dest
- efficiently detect existing holes and seek-in-dest for each sequence of
zeros (longer than some minimum) in non-hole input
> diff --git a/doc/coreutils.texi b/doc/coreutils.texi
> index 424446c..761c698 100644
> --- a/doc/coreutils.texi
> +++ b/doc/coreutils.texi
> @@ -8127,6 +8127,10 @@ Pad every input block to size of @samp{ibs} with
> trailing zero bytes.
> When used with @samp{block} or @samp{unblock}, pad with spaces instead of
> zero bytes.
>
> address@hidden sparse
> address@hidden sparse
> +Make sparse output file.
Please say a little more here.
I.e., when might a hole be introduced?
When is this option useful?
> @end table
>
> The following ``conversions'' are really file flags
> diff --git a/src/dd.c b/src/dd.c
> index 0824f6c..0393740 100644
> --- a/src/dd.c
> +++ b/src/dd.c
> @@ -126,7 +126,8 @@ enum
> C_NOCREAT = 010000,
> C_EXCL = 020000,
> C_FDATASYNC = 040000,
> - C_FSYNC = 0100000
> + C_FSYNC = 0100000,
> + C_SPARSE = 0200000
> };
>
> /* Status bit masks. */
> @@ -268,6 +269,7 @@ static struct symbol_value const conversions[] =
> {"sync", C_SYNC}, /* Pad input records to ibs with NULs. */
> {"fdatasync", C_FDATASYNC}, /* Synchronize output data before
> finishing. */
> {"fsync", C_FSYNC}, /* Also synchronize output metadata. */
> + {"sparse", C_SPARSE}, /* Make sparse output file. */
> {"", 0}
> };
>
> @@ -533,6 +535,9 @@ Each CONV symbol may be:\n\
> fsync likewise, but also write metadata\n\
> "), stdout);
> fputs (_("\
> + sparse make sparse output file\n\
> +"), stdout);
> + fputs (_("\
> \n\
> Each FLAG symbol may be:\n\
> \n\
> @@ -985,6 +990,21 @@ iwrite (int fd, char const *buf, size_t size)
> {
> ssize_t nwritten;
> process_signals ();
> + if (conversions_mask & C_SPARSE)
> + {
> + off_t seek_size = 0;
> + while (total_written + seek_size < size && buf[total_written +
> seek_size] == 0)
> + ++seek_size;
> + if (seek_size)
> + {
> + off_t cur_off = 0;
> + cur_off = lseek(fd, seek_size, SEEK_CUR);
> + if (cur_off < 0)
> + break;
dd must not ignore lseek failure.
> + total_written += seek_size;
> + continue;
> + }
> + }
> nwritten = write (fd, buf + total_written, size - total_written);
> if (nwritten < 0)
> {
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index ebd1b11..0f1376a 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -364,6 +364,7 @@ TESTS = \
> dd/skip-seek \
> dd/skip-seek2 \
> dd/skip-seek-past-file \
> + dd/sparse \
> dd/stderr \
> dd/unblock \
> dd/unblock-sync \
> diff --git a/tests/dd/sparse b/tests/dd/sparse
> new file mode 100755
> index 0000000..f0e0806
> --- /dev/null
> +++ b/tests/dd/sparse
> @@ -0,0 +1,70 @@
> +#!/bin/sh
> +# Ensure that dd conv=sparse works.
> +
> +# Copyright (C) 2003, 2005-2011 Free Software Foundation, Inc.
Use only 2011 as the copyright year.
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +. "${srcdir=.}/init.sh"; path_prepend_ ../src
> +print_ver_ dd
> +
> +# sometimes we may read less than 1M
> +dd if=/dev/zero of=sample0 count=1 bs=1M 2> /dev/null || fail=1
> +[ "`stat -c %s sample0`" = "1048576" ] || fail=1
We'd write that like this instead:
(note use of test, not "[...]", use of $(...), not `...`)
test "$(stat -c %s sample0)" = 1048576 || fail=1
> +dd if=/dev/urandom of=sample1 count=1 bs=1M 2> /dev/null || fail=1
> +[ "`stat -c %s sample1`" = "1048576" ] || fail=1
> +
> +# test 1
> +dd if=sample1 of=test1 seek=0 bs=1M 2> /dev/null || fail=1
> +dd if=sample0 of=test1 seek=1 bs=1M 2> /dev/null || fail=1
> +dd if=sample1 of=test1 seek=2 bs=1M 2> /dev/null || fail=1
> +
> +# test 1-1
> +dd if=test1 of=out1-1 conv=sparse bs=1M 2> /dev/null || fail=1
> +dd if=sample1 of=out1-1-check bs=1M 2> /dev/null || fail=1
> +dd if=sample1 of=out1-1-check seek=2 bs=1M 2> /dev/null || fail=1
> +[ "`stat -c '%s %b %B' out1-1`" = "`stat -c '%s %b %B' out1-1-check`" ] ||
> fail=1
- bug#9157: [PATCH] dd: sparse conv flag,
Jim Meyering <=