bug-coreutils

bug#6131: [PATCH]: fiemap support for efficient sparse file copy


From: Joel Becker
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Thu, 15 Jul 2010 15:12:58 -0700
User-agent: Mutt/1.5.20 (2009-06-14)

On Thu, Jul 15, 2010 at 12:51:36AM +0100, Pádraig Brady wrote:
> On 14/07/10 18:45, Paul Eggert wrote:

        First and foremost, I re-concur with the broad strokes of the
--sparse={always,never,auto} conversation.  I think you all knew that,
though ;-)

> > It's not just fiemap.  It's also the Solaris interface with SEEK_HOLE
> > and SEEK_DATA.  The change should involve a module that isolates these
> > low-level details from copy.c.  copy.c should ask the new module for the
> > locations of the holes (or the non-holes: that could be more convenient).
> > On traditional hosts without fiemap or SEEK_DATA, the module should report
> > that it doesn't know where the holes are; this can let copy.c resort to
> > the existing heuristic of looking at the size and the disk usage and
> > using the --sparse=always approach if the file "smells" like it's sparse.

        While I think the final result wants to support both fiemap and
SEEK_HOLE, I think baby steps are in order.  If we just implement fiemap
right now, we can later turn that into init_extent_detection() and 
get_next_extent().
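
        A minimal sketch of how a fiemap-only first step could already
sit behind those two names.  Only init_extent_detection() and
get_next_extent() come from the message above; the signatures, the
struct extent_scan type, and the use of FIEMAP_FLAG_SYNC are assumptions
made for illustration, not the interface of any actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>   /* struct fiemap, struct fiemap_extent */

/* Hypothetical iterator state (not from the original message). */
struct extent_scan {
    int fd;
    uint64_t next_start;    /* logical offset at which to resume */
    bool done;              /* set once FIEMAP_EXTENT_LAST was seen */
};

/* Returns false when extents cannot be queried for FD, so the caller
   can fall back to its existing sparseness heuristic. */
bool init_extent_detection(struct extent_scan *scan, int fd)
{
    struct fiemap probe;

    assert(scan != NULL);
    memset(&probe, 0, sizeof probe);
    probe.fm_length = FIEMAP_MAX_OFFSET;
    probe.fm_extent_count = 0;          /* count only, return no extents */
    if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
        return false;

    scan->fd = fd;
    scan->next_start = 0;
    scan->done = false;
    return true;
}

/* Returns 1 and fills *EXT with the next extent, 0 when there are no
   more extents, -1 on error.  Fetches one extent per call for brevity. */
int get_next_extent(struct extent_scan *scan, struct fiemap_extent *ext)
{
    size_t size = sizeof(struct fiemap) + sizeof(struct fiemap_extent);
    struct fiemap *fm;
    int rc = -1;

    if (scan->done)
        return 0;
    fm = calloc(1, size);
    if (fm == NULL)
        return -1;

    fm->fm_start = scan->next_start;
    fm->fm_length = FIEMAP_MAX_OFFSET - scan->next_start;
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* assumption: flush before mapping */
    fm->fm_extent_count = 1;

    if (ioctl(scan->fd, FS_IOC_FIEMAP, fm) == 0) {
        if (fm->fm_mapped_extents == 0) {
            scan->done = true;
            rc = 0;
        } else {
            *ext = fm->fm_extents[0];
            scan->next_start = ext->fe_logical + ext->fe_length;
            if (ext->fe_flags & FIEMAP_EXTENT_LAST)
                scan->done = true;
            rc = 1;
        }
    }
    free(fm);
    return rc;
}
```

        A caller would loop on get_next_extent() until it returns 0.  A
SEEK_HOLE backend could later be slotted in behind the same two
functions without the caller changing.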

> >> 2. Performance optimization, invoke fallocate(2) if an extent flag
> >> is UNWRITTEN
> > 
> > This doesn't sound right.  A FIEMAP_EXTENT_UNWRITTEN extent is all
> > zeros, and so it should act as if it were a hole.  The goal is not to
> > copy the exact fiemap structure of the source (that's impossible): the
> > goal is to use as little time and space as possible.

        What he said.  If you find a FIEMAP_EXTENT_UNWRITTEN extent,
you just skip it.  It is a hole for the purposes of copying.  If someone
really wants to clone the extent layout, they can use reflink(8).
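
        As a rough illustration of "just skip it", the check on each
extent can be a single flag test.  The helper names extent_needs_copy()
and bytes_to_copy() below are hypothetical; only the
FIEMAP_EXTENT_UNWRITTEN flag and struct fiemap_extent come from the
kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/fiemap.h>   /* struct fiemap_extent, FIEMAP_EXTENT_UNWRITTEN */

/* Hypothetical helper: true if the extent holds data that must be read.
   An unwritten (preallocated) extent reads back as zeros, so for the
   purposes of copying it is treated exactly like a hole. */
bool extent_needs_copy(const struct fiemap_extent *ext)
{
    assert(ext != NULL);
    return (ext->fe_flags & FIEMAP_EXTENT_UNWRITTEN) == 0;
}

/* Hypothetical helper: number of bytes a copy would actually read,
   given the N extents in EXTS. */
uint64_t bytes_to_copy(const struct fiemap_extent *exts, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        if (extent_needs_copy(&exts[i]))
            total += exts[i].fe_length;
    return total;
}
```

        With one written 4096-byte extent followed by one unwritten
8192-byte extent, only the first 4096 bytes would be read and written.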

> > It's not clear to me that the fiemap stuff can be cleanly separated
> > from the fallocate stuff.  To some extent they're the same issue.
> > If they can easily be separated, that's better of course.
> 
> I see fiemap as optimizing reads,
> posix_fallocate() as optimizing writing zeros
> and fallocate() as optimizing allocation.
> 
> So not having thought much about implementation details,
> it seems like they could be logically separated.

        I think they should absolutely be separated.  The fiemap patch
doesn't have to do anything with fallocate()/posix_fallocate() on the
write side.

        Let's get a happy fiemap patch.  Then a happy
[posix_]fallocate() patch.  Then a happy SEEK_HOLE patch.

Joel

-- 

"For every complex problem there exists a solution that is brief,
     concise, and totally wrong."
                                        -Unknown

Joel Becker
Consulting Software Developer
Oracle
E-mail: address@hidden
Phone: (650) 506-8127




