From: Nathan Stratton Treadway
Subject: [Bug-tar] --one-file-system option's interaction with --listed-incremental
Date: Sun, 27 May 2012 17:24:17 -0400
User-agent: Mutt/1.5.20 (2009-06-14)
Recently on the amanda-users mailing list there was a report from a
system administrator who moved some files from one partition to another,
then mounted the new one on top of the path where those files were
previously located... and then found that on subsequent level-1 backups,
tar crossed into the new filesystem even though the --one-file-system
option was in effect.
I was able to reproduce the issue with a simple test case and the latest
version of tar:
# tar --version | head -n1; lsb_release -d
tar (GNU tar) 1.26
Description: Ubuntu precise (development branch)
# pwd
/root
# mkdir tartest
# echo "hi" > tartest/top-level-file
# mkdir tartest/subdir
# echo "hi" > tartest/subdir/subdir-file
# TARCMD="tar --create --directory /root/tartest --one-file-system --sparse
--ignore-failed-read --totals --verbose ."
# $TARCMD --file tartest_one-fs_lvl0.tar --listed-incremental
tartest_one-fs_lvl0.snar
tar: .: Directory is new
tar: ./subdir: Directory is new
./
./subdir/
./top-level-file
./subdir/subdir-file
Total bytes written: 10240 (10KiB, 2.4MiB/s)
As expected, both regular files and the "subdir" directory are included
in the archive.
Then I move the "subdir" directory to another partition, and mount the
new copy on top of "/root/tartest/subdir". (For this test I used a bind
mount to map a directory from an existing second partition, though in
the case of the original amanda-users report the second partition was a
newly-created one mounted on top of a path within the old one, in the
normal way).
# mkdir /srv/pseudopartition
# mv -v tartest/subdir/* /srv/pseudopartition/
`tartest/subdir/subdir-file' -> `/srv/pseudopartition/subdir-file'
removed `tartest/subdir/subdir-file'
# mount --bind /srv/pseudopartition/ tartest/subdir/
# ls -Rl tartest
tartest:
total 8
drwxr-xr-x 2 root root 4096 May 27 15:09 subdir
-rw-r--r-- 1 root root 3 May 27 14:45 top-level-file
tartest/subdir:
total 4
-rw-r--r-- 1 root root 3 May 27 14:46 subdir-file
After this "partition swap", the device id changes at the "subdir"
boundary:
# stat -c "File: %-30n Device: %Dh/%dd" tartest/ tartest/top-level-file
tartest/subdir/ tartest/subdir/subdir-file
File: tartest/ Device: 900h/2304d
File: tartest/top-level-file Device: 900h/2304d
File: tartest/subdir/ Device: 901h/2305d
File: tartest/subdir/subdir-file Device: 901h/2305d
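The same device-id comparison can be scripted. What follows is only a minimal Python sketch and is not taken from tar itself; the helper name crosses_filesystem and its lstat parameter are made up for illustration, and POSIX-style paths are assumed. It compares st_dev of a path with st_dev of its parent directory, which is the kind of test --one-file-system depends on:

```python
import os


def crosses_filesystem(path, lstat=os.lstat):
    """Return True if path lives on a different device than its parent directory.

    lstat is injectable (a hypothetical convenience) so the check can be
    exercised without real mount points.
    """
    path = os.path.abspath(path)          # also drops a trailing slash
    parent = os.path.dirname(path)
    return lstat(path).st_dev != lstat(parent).st_dev
```

With the layout above, crosses_filesystem("tartest/subdir/") would be expected to return True when run from /root, and crosses_filesystem("tartest/top-level-file") to return False.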
But when I do a level-1 backup, subdir/subdir-file is nevertheless
included in the new archive:
# cp -p tartest_one-fs_lvl0.snar tartest_split-fs_lvl1.snar
# $TARCMD --file tartest_split-fs_lvl1.tar --listed-incremental
tartest_split-fs_lvl1.snar
tar: ./subdir: Directory has been renamed
./
./subdir/
./subdir/subdir-file
This only happens for files that are mentioned in the snapshot file; if
I create a new file under subdir/, it doesn't get included in the
archive:
# echo "hello" > tartest/subdir/new-subdir-file
# cp -p tartest_one-fs_lvl0.snar tartest_split-fs_lvl1_try2.snar
# $TARCMD --file tartest_split-fs_lvl1_try2.tar --listed-incremental
tartest_split-fs_lvl1_try2.snar
tar: ./subdir: Directory has been renamed
./
./subdir/
./subdir/subdir-file
Total bytes written: 10240 (10KiB, 2.7MiB/s)
Also, I see that "subdir-file" is included again in every later
incremental backup (though a file added in the top-level directory is
only included in the first one, as expected):
# echo "hello" > tartest/new-top-level-file
# cp -p tartest_one-fs_lvl0.snar tartest_split-fs_repeattest.snar
# $TARCMD --file tartest_split-fs_repeattest.tar --listed-incremental
tartest_split-fs_repeattest.snar
tar: ./subdir: Directory has been renamed
./
./subdir/
./new-top-level-file
./subdir/subdir-file
Total bytes written: 10240 (10KiB, 2.6MiB/s)
# $TARCMD --file tartest_split-fs_repeattest.tar --listed-incremental
tartest_split-fs_repeattest.snar
./
./subdir/
./subdir/subdir-file
Total bytes written: 10240 (10KiB, 3.2MiB/s)
# $TARCMD --file tartest_split-fs_repeattest.tar --listed-incremental
tartest_split-fs_repeattest.snar
./
./subdir/
./subdir/subdir-file
Total bytes written: 10240 (10KiB, 3.3MiB/s)
I did some tracing with gdb and tracked the issue down to the following
program flow:
* early on, incremen.c:read_incr_db_2() populates the list of
"directory" entries from the snapshot file, including setting the
"dump" element of the "./subdir" node to contain "Ysubdir-file"
* then later, as part of the name.c:collect_and_sort_names()
operation, add_hierarchy_to_namelist() calls incremen.c:scan_directory().
scan_directory() in turn calls procdir(), which (in the case of the
"./subdir" path) notices that the one-file-system option is enabled
and stat_data->st_dev != st->parent->stat.st_dev, so it sets
directory->children to "NO_CHILDREN" -- and that value causes
scan_directory() to skip the loop that updates the dumpdir info for
that directory (so it's just left with the values originally read
in from the snapshot file).
* finally, back in create.c:create_archive(), in the second pass
through the name list, directory_contents() is called on the
"subdir" node, thus setting "q" to point to "Ysubdir-file"... and
since that starts with "Y", dump_file() is then called on
"subdir-file".
(Note that dump_file0() includes its own one_file_system_option
check, but by the time we've reached there the "parent" is already
"subdir", i.e. we've already crossed into the new partition, and
thus the check doesn't trigger.)
So that combination of events seems to explain both why tar crosses the
filesystem boundary for files listed in the snapshot, and why those
files are repeatedly included in all later incremental runs, even though
they haven't changed.
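The flow described above can be imitated with a toy model. This is a Python sketch under heavy simplification, not tar's actual code: the Directory class, the (name, changed) listing format and run_incremental() are assumptions made for illustration, and only the function names scan_directory and the NO_CHILDREN marker are borrowed from the description.

```python
from dataclasses import dataclass, field

ALL_CHILDREN = "ALL_CHILDREN"
NO_CHILDREN = "NO_CHILDREN"


@dataclass
class Directory:
    name: str
    dev: int                                   # device id seen at scan time
    children: str = ALL_CHILDREN
    dump: list = field(default_factory=list)   # e.g. ["Ysubdir-file"]


def scan_directory(d, parent_dev, listing, one_file_system=True):
    """Toy stand-in for scan_directory()/procdir().

    listing holds (name, changed) pairs for the directory's current contents.
    """
    if one_file_system and d.dev != parent_dev:
        d.children = NO_CHILDREN
        return  # dump list is not rebuilt; the snapshot entries stay as read
    d.children = ALL_CHILDREN
    d.dump = [("Y" if changed else "N") + name for name, changed in listing]


def files_to_dump(d):
    """Toy stand-in for the second pass: dump every entry marked 'Y'."""
    return [entry[1:] for entry in d.dump if entry.startswith("Y")]


def run_incremental():
    # State as loaded from the level-0 snapshot.
    subdir = Directory("./subdir", dev=2305, dump=["Ysubdir-file"])
    # subdir is now on device 2305 while its parent is on 2304.
    listing = [("subdir-file", False), ("new-subdir-file", True)]
    scan_directory(subdir, parent_dev=2304, listing=listing)
    return subdir, files_to_dump(subdir)
```

In this model run_incremental() selects the unchanged "subdir-file" and ignores "new-subdir-file", and repeating the scan gives the same answer because the dump list is never rebuilt.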
I can think of a few changes that seem like they would fix this issue,
but I am not confident enough in my understanding of all the possible
paths through the directory-tree-scanning routines to know which is best
(or what else I might be missing -- for example, I haven't tried to figure
out what would happen if the level-0 snapshot file included multiple
levels of sub-directories under the "subdir" mount point):
* in procdir(), when setting directory->children = NO_CHILDREN because
of the one-file-system check, also delete any existing
directory->dump .
* in scan_directory(), if the returned directory's "children" is set
to NO_CHILDREN, delete directory's existing "dump" (in place of the
makedumpdir()-and-loop that would be done for other "children"
values).
* in the create_archive() loop, just before the line
q = directory_contents (gnu_list_name->directory)
add either a check to see if gnu_list_name->directory->children
is set to NO_CHILDREN, or an explicit one-file-system check
against gnu_list_name->directory and its parent.
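The effect of these proposals can be illustrated on a toy model. The Python sketch below is not a patch and does not reflect tar's real data structures: Directory, scan_clearing_dump and files_to_dump_guarded are hypothetical names, and the 'Y' prefix handling is a simplification of the dumpdir entries mentioned above.

```python
from dataclasses import dataclass, field

ALL_CHILDREN = "ALL_CHILDREN"
NO_CHILDREN = "NO_CHILDREN"


@dataclass
class Directory:
    dev: int
    children: str = ALL_CHILDREN
    dump: list = field(default_factory=list)


def files_to_dump(d):
    """Unguarded second pass: dump every entry marked 'Y'."""
    return [entry[1:] for entry in d.dump if entry.startswith("Y")]


def scan_clearing_dump(d, parent_dev):
    """First two proposals: discard the stale dump list at the boundary."""
    if d.dev != parent_dev:
        d.children = NO_CHILDREN
        d.dump = []


def files_to_dump_guarded(d):
    """Third proposal: skip directories already marked NO_CHILDREN."""
    if d.children == NO_CHILDREN:
        return []
    return files_to_dump(d)
```

In this model either approach keeps "subdir-file" out of the archive. With the guarded variant the stale entry is still present in memory; whether it would also be written back to the snapshot file is not modelled here.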
Anyway, let me know if I can provide any other information.
Thanks.
Nathan
----------------------------------------------------------------------------
Nathan Stratton Treadway - address@hidden - Mid-Atlantic region
Ray Ontko & Co. - Software consulting services - http://www.ontko.com/
GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt ID: 1023D/ECFB6239
Key fingerprint = 6AD8 485E 20B9 5C71 231C 0C32 15F3 ADCD ECFB 6239