From: Chris Mason
Subject: Re: [Bug-tar] stat() on btrfs reports the st_blocks with delay (data loss in archivers)
Date: Mon, 11 Jul 2016 13:30:58 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1

On 07/11/2016 11:16 AM, David Sterba wrote:
On Mon, Jul 11, 2016 at 11:00:55AM -0400, Chris Mason wrote:
So, the real bug is that we're letting some delalloc state hang around
after the truncate, probably related to IO in progress.  We do already
account for delalloc in what we return to stat, but there's a corner
case involving truncate where we screw it up.

So the original testcase:

    a) some "tool" creates sparse file
    b) that tool does not sync explicitly and exits ..
    c) tar is called immediately after that to archive the sparse file
    d) tar considers [2] the file to be completely sparse (because
       st_blocks is zero) and archives no data.  This is where the data
       loss happens.

will not happen. The application would basically have to mimic the
provided reproducer script, do the truncate/write loop, and be lucky
enough to let tar hit the short race window.
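
To make step d) concrete, here is a minimal sketch of the kind of check being described: a non-empty file that reports zero allocated blocks is treated as one big hole. This is a simplified model, not tar's actual code, and the function names are made up for illustration.

```python
import os


def looks_fully_sparse(st_size: int, st_blocks: int) -> bool:
    """Return True when a non-empty file reports zero allocated blocks.

    Hypothetical helper: models the heuristic described above, in which
    st_blocks == 0 is taken to mean "the file contains no data at all".
    """
    return st_size > 0 and st_blocks == 0


def path_looks_fully_sparse(path: str) -> bool:
    """Apply the same heuristic to a file on disk via stat()."""
    st = os.stat(path)  # st_blocks is counted in 512-byte units on Linux
    return looks_fully_sparse(st.st_size, st.st_blocks)
```

If stat() returns st_blocks == 0 for a file whose data has been written but not yet accounted for, a check like this would answer True and the data would be skipped.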

Looking harder, there is a race window that can trigger this without the
truncate loop:

1) The application calls write() and we make the pages delalloc (in-ram st_blocks goes up)
2) The VM calls write_cache_pages and we go find a contiguous delalloc range
3) We call cow_file_range on the locked range of pages
4) cow_file_range clears the delalloc bits (in-ram st_blocks goes down)

< ----- race begins here ----->

5) The IO is started
6) The IO completes and extents are inserted into the metadata
7) The on-disk/in-ram st_blocks goes up

< ---- race ends here ---->

This makes a ton more sense than leaking delalloc bits.
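
As a rough illustration of the window, the steps above can be modelled with two counters: blocks still marked delalloc, and blocks recorded in the metadata once the IO completes. This is a toy model only, not btrfs code; the class, field, and function names are invented for the sketch, and it assumes stat() reports the sum of the two counters.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InodeModel:
    """Toy stand-in for the accounting behind st_blocks (assumed, not btrfs)."""
    delalloc_blocks: int = 0  # in-ram, set by write(), cleared by cow_file_range
    extent_blocks: int = 0    # recorded once the IO completes

    def reported_st_blocks(self) -> int:
        # Assumption: stat() sees delalloc plus recorded extents.
        return self.delalloc_blocks + self.extent_blocks


def simulate_writeback(blocks: int) -> List[Tuple[str, int]]:
    """Walk the numbered steps and record what stat() would report."""
    inode = InodeModel()
    timeline = []

    inode.delalloc_blocks += blocks  # step 1: write() marks pages delalloc
    timeline.append(("1) write()", inode.reported_st_blocks()))

    inode.delalloc_blocks -= blocks  # step 4: delalloc bits cleared
    timeline.append(("4) cow_file_range", inode.reported_st_blocks()))

    # step 5: IO in flight, neither counter covers the data
    timeline.append(("5) IO started", inode.reported_st_blocks()))

    inode.extent_blocks += blocks    # steps 6-7: extents inserted
    timeline.append(("7) extents inserted", inode.reported_st_blocks()))
    return timeline


if __name__ == "__main__":
    for step, st_blocks in simulate_writeback(8):
        print(f"{step:22s} st_blocks={st_blocks}")
```

In this model a stat() issued between steps 4 and 7 sees zero blocks for a file that has data, which is the window marked above.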

