coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] copy: adjust fiemap handling of sparse files


From: Pádraig Brady
Subject: Re: [PATCH] copy: adjust fiemap handling of sparse files
Date: Wed, 30 Mar 2011 12:02:44 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3

On 28/03/11 23:06, Pádraig Brady wrote:
> On 18/03/11 16:04, Pádraig Brady wrote:
>> On 18/03/11 13:48, Chris Mason wrote:
>>> Excerpts from Pádraig Brady's message of 2011-03-18 09:19:44 -0400:
>>>> On 18/03/11 12:04, Chris Mason wrote:
>>>>> Excerpts from Jim Meyering's message of 2011-03-18 07:52:43 -0400:
>>>>>> Pádraig Brady wrote:
>>>>>>>
>>>>>>> So am I right in thinking FIEMAP_FLAG_SYNC is no longer needed with 
>>>>>>> 2.6.38.
>>>>>>> I'm quite worried about implicit syncing in cp actually, where it might
>>>>>>> introduce bad performance regressions.
>>>>>>
>>>>>> Good point.
>>>>>>
>>>>>>> Maybe we could disable this flag if XFS || kernel >= 2.6.38?
>>>>>>> Or maybe we might do as you suggest above and disable extent_copy()
>>>>>>> completely, for kernels before 2.6.38.
>>>>>>> Not an ideal situation at all :(
>>>>>>
>>>>>> If there really is risk of data loss with ext4 on kernels < 2.6.38,
>>>>>> even if it's only on unusual files, then we'll have to consider
>>>>>> using a kernel version check to disable extent_copy.
>>>>>>
>>>>>> Is there a stand-alone ext4 reproducer?
>>>>>
>>>>> dd if=/dev/zero of=/mnt/foo bs=1M seek=1 count=1
>>>>>
>>>>> Without a sync the buggy ext4/btrfs will not find the extent, and report
>>>>> only holes.
>>>>
>>>> OK, FIEMAP_FLAG_SYNC is an effective workaround for that.
>>>> So as mentioned above, we might consider not using this flag for
>>>> XFS || kernel >= 2.6.38
>>>
>>> Btrfs before 2.6.38 may have real trouble though, even with the sync.
>>> We were returning overlapping ranges to you, so the destination would
>>> end up bigger than the original.  This could be fixed in cp by making
>>> sure to never seek backwards based on the extents returned.
>>>
>>> Example bad results:
>>>
>>> logical: [ 0 ... 1MB ] -> physical [ foo .. foo ]
>>> logical  [ 0 ... 2MB ] -> physical [ foo2 .. foo2 ]
>>>
>>> 100% a btrfs bug, but cp would do the 0 .. 1MB bit and then seek back to
>>> zero and do the 0 .. 2MB bit.  If cp had not seeked backwards you would
>>> have covered for me.  2.6.38 final fixed this problem.
>>
>>> I think even XFS was fixed relatively recently, 2.6.36 and later.
>>> Looking at the commit, the simple dd test above wouldn't have triggered
>>> it.  Actually, looking at this commit even the sync wouldn't save xfs
>>> before 2.6.36, Eric am I reading this right?
>>>
>>> So maybe just give in and look for 2.6.38 instead of trying to remember
>>> all the places where individual filesystems didn't suck.  Give the user
>>> a cp --sparse=fiemap-im-really-really-sure to override your assumptions
>>> about our bugs.
>>>
>>> commit 9af25465081480a75824fd7a16a37a5cfebeede9
>>> Author: Tao Ma <address@hidden>
>>> Date:   Mon Aug 30 02:44:03 2010 +0000
>>>
>>>     xfs: Make fiemap work with sparse files
>>>     
>>>     In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want
>>>     to return fi_extent_max extents, but actually it won't work for
>>>     a sparse file. The reason is that in xfs_getbmap we will
>>>     calculate holes and set it in 'out', while out is malloced by
>>>     bmv_count(fi_extent_max+1) which didn't consider holes. So in the
>>>     worst case, if 'out' vector looks like
>>>     [hole, extent, hole, extent, hole, ... hole, extent, hole],
>>>     we will only return half of fi_extent_max extents.
>>
>> OK thanks for the info. So:
>>
>> XFS may miss some extents for sparse files before 2.6.36
>> BTRFS and EXT4 need a sync before fiemap() before 2.6.38
>> BTRFS can return overlapping extents before 2.6.38
>>
>> It seems like we should at least detect overlapping extents in our:
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/extent-scan.c;hb=HEAD
>> and return false (don't use fiemap) in this case.
>> We've already stopped doing the lseeks() but that assumes non overlapping 
>> extents.
> 
> Attached is the proposed patch to address the overlapping extents case,
> which we'll need if we don't bar kernels before 2.6.38,
> but is good to have anyway in case it happens in future.
> 
> I think I'll do a follow up patch to get extent_scan_read()
> to read all extents before returning, which will improve
> the fix (and the previous merge extents change).

I did that change, but was worried about the possible large
mem requirements of scanning all extents up front,
or the inefficiencies of scanning twice.
So instead I took the approach of adjusting extents so that:

|  extent 1 |
|               extent 2  |

is changed to

|  extent 1 |
|           |   extent 2  |

That allows copying the data even in this case.
Chris, will that handle this specific BTRFS issue?

I now also detect and just fail for these cases:

|  extent 1     |
|  extent 2 |

or
              |  extent 1  |
|  extent 2 |

cheers,
Pádraig.

Attachment: fiemap-overlap.diff
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]