From: | Austin S. Hemmelgarn |
Subject: | Re: [Bug-tar] stat() on btrfs reports the st_blocks with delay (data loss in archivers) |
Date: | Wed, 6 Jul 2016 07:37:15 -0400 |
User-agent: | Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1 |
On 2016-07-05 05:28, Joerg Schilling wrote:
> Andreas Dilger <address@hidden> wrote:
>
>> I think in addition to fixing btrfs (because it needs to work with existing tar/rsync/etc. tools) it makes sense to *also* fix the heuristics of tar to handle this situation more robustly. One option is if st_blocks == 0 then tar should also check if st_mtime is less than 60s in the past, and if yes then it should call fsync() on the file to flush any unwritten data to disk, or assume the file is not sparse and read the whole file, so that it doesn't incorrectly assume that the file is sparse and skip archiving the file data.
>
> A broken filesystem is a broken filesystem. If you try to change gtar to work around a specific problem, it may fail in other situations.

The problem with this is that tar is assuming things that are not guaranteed to be true. There is absolutely nothing that says that st_blocks has to be non-zero if there's data in the file. In fact, the behavior that BTRFS used to have of reporting st_blocks to be 0 for files entirely inlined in the metadata is absolutely correct given the description of the field by POSIX, because there _are_ no blocks allocated to the file (because the metadata block is technically equivalent to the inode, which isn't counted by st_blocks). This is yet another example of an old interface (in this case, sparse file detection) being short-sighted (read in this case as non-existent).
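For concreteness, the check being argued about in this exchange can be sketched roughly as below. This is only an illustration of the logic as described in the thread, not GNU tar's actual code; the function names and the `window` parameter are hypothetical.

```python
import time


def naive_is_fully_sparse(st_size, st_blocks):
    """Heuristic under discussion: a non-empty file with no allocated
    blocks is assumed to be one big hole, so its data is never read."""
    return st_size > 0 and st_blocks == 0


def must_read_data(st_size, st_blocks, st_mtime, now=None, window=60.0):
    """Variant with the suggested safeguard: if st_blocks == 0 but the
    file was modified within `window` seconds, distrust st_blocks and
    read the data anyway. (`window` is a hypothetical parameter name.)"""
    if now is None:
        now = time.time()
    if st_size == 0:
        return False          # nothing to read
    if st_blocks != 0:
        return True           # blocks are allocated, so read them
    return (now - st_mtime) < window
```

Note that a 100-byte file stored inline in metadata and last modified an hour ago still comes out as "no need to read" under both versions, since st_blocks can legitimately stay at 0 for such a file. The time window only covers the delayed-allocation case.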
The proper fix for this is that tar (and anything else that handles sparse files differently) should be parsing the file regardless. It has to anyway for a normal sparse file to figure out where the sparse regions are, and optimizing for a file that's completely sparse (and therefore probably pre-allocated with fallocate) is not all that reasonable considering that this is going to be a very rare case in normal usage.
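A minimal sketch of what "parsing the file regardless" could look like: read the contents and treat all-zero blocks as holes, without consulting st_blocks at all. The name `data_regions` and the 4096-byte block size are assumptions for illustration, not anything taken from tar.

```python
def data_regions(path, block_size=4096):
    """Return a list of (offset, length) extents containing non-zero
    bytes, found purely by reading the file. st_blocks is never used."""
    regions = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            if any(chunk):  # at least one non-zero byte in this block
                if regions and regions[-1][0] + regions[-1][1] == offset:
                    # Contiguous with the previous data extent: extend it.
                    start, length = regions[-1]
                    regions[-1] = (start, length + len(chunk))
                else:
                    regions.append((offset, len(chunk)))
            offset += len(chunk)
    return regions
```

With this approach a completely sparse file is just the degenerate case where the returned list is empty, and a small inlined file is archived correctly whatever st_blocks says. The cost is that every byte gets read, including holes.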