sparse file processing improvements
From: Pádraig Brady
Subject: sparse file processing improvements
Date: Mon, 06 Oct 2014 15:40:55 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
On 09/16/2014 06:00 PM, Bernhard Voelker wrote:
> On 09/16/2014 03:38 PM, Pádraig Brady wrote:
>> Bernhard it would be worth mentioning the above two points in the
>> commit message as they're not obvious, and then the patch is fine to push.
>
> Thanks, push with the following commit message:
>
> http://git.sv.gnu.org/cgit/coreutils.git/commit/?id=ed0c3c33c6
>
> tests: fix false du failure on newer XFS
>
> On XFS, when creating the ~2G test file 'big' in a for-loop by
> appending 20M each time, the file ends up using ~4G - visible in
> 'st_blocks'. The unused space would be reclaimed later.
> This feature is called "speculative preallocation" which aims at
> avoiding fragmentation.
> According to the XFS FAQ [1], there are two particular aspects of
> XFS speculative preallocation that are triggering this:
>
> 1. "Applications that repeatedly trigger preallocation and reclaim
> cycles [after file close] can cause fragmentation.
> Therefore, this pattern is detected and causes the preallocation
> to persist beyond the lifecycle of the file descriptor."
>
> 2. "Preallocation sizes grow as files grow larger."
>
> [1] http://xfs.org/index.php/XFS_FAQ
> After all, I consider this a useful feature, however, IMHO it should
> not be visible to the user in 'st_blocks': this is some magic that I,
> as a user, would expect from a file system, but I would never want to
> (have to) know about it.
I agree, but I guess it's exposed to simplify identifying and cleaning up
allocations in edge cases such as a crash?
Relatedly, I see that this speculative preallocation can become
permanent, and can break our --sparse scheme in cp, for example.
Trying the following on CentOS 7:
# Setup a test XFS file system
$ truncate -s1G xfs.img
$ mkfs.xfs !$
$ mkdir xfs.mnt
$ sudo mount xfs.img xfs.mnt
$ cd xfs.mnt
$ sudo chmod a+w .
# Create 10M file alternating between 1M zeros and random chunks
$ for i in $(yes zero | sed 1~2s/zero/urandom/ | head -n10); do
  dd iflag=fullblock if=/dev/$i of=sparse.in conv=notrunc oflag=append bs=1M count=1 status=none
done
# Punch out the zeros into holes (requires an updated util-linux)
$ fallocate -d sparse.in
# Note the deallocation is asynchronous!
$ ls -ls sparse.*
10,485,760 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:04 sparse.in
# But the leftover allocation is not permanent; it is gone after a remount
$ cd ..
$ sudo umount xfs.mnt
$ sudo mount xfs.img xfs.mnt
$ cd xfs.mnt
$ ls -ls sparse.*
5,242,880 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:04 sparse.in
# Now let's check how cp's hole generation requests are handled.
# Note cp will use lseek for the holes and read/write 64K at a time.
$ cp --sparse=always sparse.in sparse.out
$ ls -ls sparse.*
5,242,880 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:04 sparse.in
10,158,080 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:14 sparse.out
# So speculative preallocation has nullified our hole requests :/
# More problematically, this allocation is permanent!
$ cd ..
$ sudo umount xfs.mnt
$ sudo mount xfs.img xfs.mnt
$ cd xfs.mnt
$ ls -ls sparse.*
5,242,880 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:04 sparse.in
10,158,080 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:14 sparse.out
# Let's try again, but this time, after extending the file with a hole,
# also use fallocate(...PUNCH_HOLE...) on it to avoid any permanency
$ ~/cp-punch --sparse=always sparse.in sparse.out.punch
$ ls -ls sparse.*
5,242,880 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:04 sparse.in
10,158,080 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:14 sparse.out
5,242,880 -rw-rw-r--. 1 padraig 10,485,760 Oct 5 22:56 sparse.out.punch
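The first column of the ls -ls listings above appears to be the allocated size in bytes, which depends on a block size setting for ls. For anyone reproducing this without that setting, here is a minimal sketch of a helper that prints allocated bytes next to the apparent size. It assumes GNU stat; the name alloc_vs_size is purely illustrative and not part of coreutils.

```shell
# Hypothetical helper (not part of coreutils): for each file, print
# "ALLOCATED_BYTES APPARENT_SIZE NAME" using GNU stat.
#   %b = number of blocks allocated
#   %B = size in bytes of each block reported by %b
#   %s = apparent size in bytes
alloc_vs_size() {
  for f in "$@"; do
    out=$(stat -c '%b %B %s' -- "$f") || return 1
    blocks=${out%% *}   # first field
    rest=${out#* }      # remaining two fields
    bsize=${rest%% *}   # second field
    size=${rest#* }     # third field
    printf '%s %s %s\n' "$((blocks * bsize))" "$size" "$f"
  done
}
```

Running alloc_vs_size sparse.* after each step should make it easy to see whether the allocated figure tracks the holes or the full file size.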
I've attached 3 patches to improve various aspects of sparse file handling.
1. supports detecting holes < internal buffer size (currently 128KiB)
2. employs the punch hole scheme demonstrated above
3. reads sparse files efficiently even if not writing to a regular file
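As a rough command-line analogue of the punch hole scheme in item 2, the sketch below extends a file with a hole and then explicitly punches that same range. It is only an illustration built from truncate and the util-linux fallocate tool, not what the attached patches do in C, and the name append_hole is hypothetical.

```shell
# Hypothetical sketch: append LENGTH bytes of hole to FILE, then punch
# that range so any preallocated blocks in it are released.
# Usage: append_hole FILE LENGTH   (LENGTH must be > 0)
append_hole() {
  file=$1
  len=$2
  # Current apparent size is the offset where the new hole starts.
  old=$(stat -c %s -- "$file") || return 1
  # Extend the file; the new range reads back as zeros.
  truncate -s "$((old + len))" -- "$file" || return 1
  # Deallocate the new range; --punch-hole keeps the file size unchanged.
  # This fails on file systems that do not support hole punching.
  fallocate --punch-hole --offset "$old" --length "$len" -- "$file"
}
```

For example, append_hole sparse.out.test 1048576 would grow the file by 1MiB and return non-zero if the punch is unsupported, leaving the extended size in place either way.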
thanks,
Pádraig.
sparse-improvements.patch
Description: Text Data