rdiff-backup-users


From: Robert Nichols
Subject: Re: cross-platform backup tool Same files from different source dir causes spurious diff files
Date: Tue, 8 Feb 2022 20:03:44 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0

On 2/8/22 6:44 PM, Mr. Clif wrote:
ok cool, good info,

I was just digging into it again, and the date I switched to the snapshot was 
recorded as Feb 1st. Here is a list of the mirror_metadata files leading up to 
that:

-rw------- 1 root root 2.7M Jan 21 05:25 mirror_metadata.2022-01-21T05:20:05-09:00.snapshot.gz
-rw------- 1 root root  632 Jan 23 05:25 mirror_metadata.2022-01-22T05:20:26-09:00.diff.gz
-rw------- 1 root root  790 Jan 24 05:26 mirror_metadata.2022-01-23T05:20:04-09:00.diff.gz
-rw------- 1 root root  783 Jan 25 05:24 mirror_metadata.2022-01-24T05:20:33-09:00.diff.gz
-rw------- 1 root root  778 Jan 26 05:29 mirror_metadata.2022-01-25T05:19:31-09:00.diff.gz
-rw------- 1 root root  731 Jan 27 05:25 mirror_metadata.2022-01-26T05:23:21-09:00.diff.gz
-rw------- 1 root root  723 Jan 28 05:27 mirror_metadata.2022-01-27T05:20:37-09:00.diff.gz
-rw------- 1 root root  786 Jan 29 05:29 mirror_metadata.2022-01-28T05:21:17-09:00.diff.gz
-rw------- 1 root root  772 Jan 30 05:26 mirror_metadata.2022-01-29T05:23:55-09:00.diff.gz
-rw------- 1 root root 2.7M Jan 30 05:26 mirror_metadata.2022-01-30T05:20:43-09:00.snapshot.gz
-rw------- 1 root root  725 Feb  1 05:26 mirror_metadata.2022-01-31T05:21:21-09:00.diff.gz
-rw------- 1 root root 2.6M Feb  3 15:33 mirror_metadata.2022-02-01T05:20:43-09:00.diff.gz
-rw------- 1 root root  613 Feb  4 05:16 mirror_metadata.2022-02-03T14:20:54-09:00.diff.gz
-rw------- 1 root root 1.7K Feb  5 05:17 mirror_metadata.2022-02-04T05:13:29-09:00.diff.gz
-rw------- 1 root root  852 Feb  6 05:55 mirror_metadata.2022-02-05T05:14:57-09:00.diff.gz
-rw------- 1 root root 1.7K Feb  7 06:36 mirror_metadata.2022-02-06T05:52:59-09:00.diff.gz
-rw------- 1 root root  73K Feb  8 05:39 mirror_metadata.2022-02-07T06:33:04-09:00.diff.gz
-rw------- 1 root root 2.7M Feb  8 05:39 mirror_metadata.2022-02-08T05:33:08-09:00.snapshot.gz

You will see that the mirror_metadata.2022-02-01T05:20:43-09:00.diff.gz with 
the modified date of Feb 3rd is about the same size as the previous snapshot 
file a couple of days before.

If you grep for the lines that match "^File" then I presume you get a good 
count of the number of files that changed, or at least were recorded for some 
reason. Here are those stats:

find increments -name "*2022-02-01*" -exec ls -lh {} \; | wc
   85287  767583 11064660
gzip -dc mirror_metadata.2022-01-30T05:20:43-09:00.snapshot.gz | egrep "^File " | wc
   89287  178574 4535737
gzip -dc mirror_metadata.2022-02-01T05:20:43-09:00.diff.gz | egrep "^File " | wc
   85288  170576 4374253

Notice how the number of increment files with that date in the name (the first 
wc output) is almost the same as the number of files listed in the 2022-02-01 
diff.gz file (the last wc output).

I also compared some of the entries in the snapshot file to the diff.gz file, 
and never found any differences. Of course I only checked a dozen or two.

I believe you are comparing the wrong files. Welcome to the confusing world of 
reverse diffs, where everything works backward. That 2.6MB 
mirror_metadata.2022-02-01T05:20:43-09:00.diff.gz has the differences that 
would be applied to a 2022-02-03T14:20:54-09:00 snapshot (i.e., the next 
_newer_ state) to construct a 2022-02-01 snapshot. The huge perceived change 
occurred between the 2022-02-01 backup and the 2022-02-03 backup.
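
To make the direction of each reverse diff concrete, here is a minimal Python
sketch. It is only an illustration, not part of rdiff-backup: the function
name reverse_diff_spans is hypothetical, and it assumes nothing beyond the
mirror_metadata.<timestamp>.{diff,snapshot}.gz naming visible in the listing
above. For every diff it reports which newer state the diff would be applied
to.

```python
import re
from datetime import datetime

# Matches names such as mirror_metadata.2022-02-01T05:20:43-09:00.diff.gz
NAME_RE = re.compile(r"^mirror_metadata\.(?P<ts>.+)\.(?P<kind>snapshot|diff)\.gz$")


def reverse_diff_spans(names):
    """Return (diff_time, newer_time) pairs for every diff file.

    Each diff holds the changes that turn the state at newer_time
    (the next newer timestamp in the directory) back into the state
    at diff_time. Hypothetical helper, based only on the file names.
    """
    entries = []
    for name in names:
        m = NAME_RE.match(name)
        if m:
            # Parse the timestamp so sorting is chronological.
            entries.append((datetime.fromisoformat(m["ts"]), m["ts"], m["kind"]))
    entries.sort()

    # Distinct timestamps, oldest first.
    stamps = []
    for _, ts, _ in entries:
        if not stamps or stamps[-1] != ts:
            stamps.append(ts)

    spans = []
    for _, ts, kind in entries:
        if kind != "diff":
            continue
        i = stamps.index(ts)
        if i + 1 < len(stamps):
            spans.append((ts, stamps[i + 1]))
    return spans


names = [
    "mirror_metadata.2022-01-31T05:21:21-09:00.diff.gz",
    "mirror_metadata.2022-02-01T05:20:43-09:00.diff.gz",
    "mirror_metadata.2022-02-03T14:20:54-09:00.diff.gz",
    "mirror_metadata.2022-02-08T05:33:08-09:00.snapshot.gz",
]
for older, newer in reverse_diff_spans(names):
    print(f"{older}.diff.gz: apply to the {newer} state to get {older}")
```

With the names above, the 2022-02-01 diff is paired with the 2022-02-03
state, which is why its size reflects changes made between those two backups.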

I would first look at some of the entries in that 
mirror_metadata.2022-02-03T14:20:54-09:00.diff.gz file and see if some of the 
same filenames appear in the huge 2022-02-01 diff. Hopefully you can spot what 
metadata changed. If you can't find any matching names in the 2022-02-03 diff, 
try the 2022-02-04 diff. As a last resort, I can send you a rather large** awk 
script that you can use to work back from the nearest future snapshot 
(currently 2022-02-08) to reconstruct a 2022-02-03 snapshot. Then you should 
certainly be able to see what differences that 2022-02-01 diff is applying.

** A bit over 3KB, somewhat more than I care to spew out to a mailing list.
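
For readers who want to try the comparison without the awk script, the
following is a rough Python sketch of the same idea. It is not the script
mentioned above. It rests on two assumptions about the metadata format that
should be checked against your own files: each record starts with a
"File <name>" line followed by indented attribute lines, and a diff marks a
file that did not exist in the older state with a "Type None" attribute. All
function names are hypothetical.

```python
import gzip


def parse_records(lines):
    """Map each file name to its list of attribute lines.

    Assumes records begin with 'File <name>' and that the attribute
    lines for that file follow until the next 'File ' line.
    """
    records, current = {}, None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("File "):
            current = line[len("File "):]
            records[current] = []
        elif current is not None and line.strip():
            records[current].append(line.strip())
    return records


def load(path):
    """Read a gzipped mirror_metadata file into a records dict."""
    with gzip.open(path, "rt", encoding="utf-8", errors="surrogateescape") as fh:
        return parse_records(fh)


def common_names(records_a, records_b):
    """File names that appear in both sets of records."""
    return sorted(set(records_a) & set(records_b))


def changed_attributes(attrs_a, attrs_b):
    """Attribute lines present in one record but not in the other."""
    return sorted(set(attrs_a) ^ set(attrs_b))


def apply_reverse_diff(newer_state, diff):
    """Build the older state from a newer state plus one reverse diff.

    Assumption: 'Type None' in a diff record means the file was
    absent in the older state, so it is dropped.
    """
    older = dict(newer_state)
    for name, attrs in diff.items():
        if "Type None" in attrs:
            older.pop(name, None)
        else:
            older[name] = attrs
    return older
```

To rebuild a 2022-02-03 state under these assumptions, load the 2022-02-08
snapshot and call apply_reverse_diff with the 2022-02-07, 2022-02-06,
2022-02-05, 2022-02-04 and 2022-02-03 diffs, in that order (newest first).
Then, for each name in the 2022-02-01 diff, changed_attributes on the rebuilt
record and the diff record shows which metadata fields differ.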

--
Bob Nichols     "NOSPAM" is really part of my email address.
                Do NOT delete it.
