Hi Sean,
Apologies it took me so long to reply.
# Now back it up to the backup host
find $filesys -mount $SKIPSTRING -print | \
tee $LISTFILE | /bin/cpio -o | gzip | \
ssh -l root $BU_HOST \
"dd of=/backups/$MYHOST/$DAY/$c_filesys.cpio.gz"
I am not quite sure what $SKIPSTRING is. If I am not mistaken, the -mount
option to find does not take any arguments. Apart from this, the
above command is equivalent to the following:
tar --rsh-command=/usr/bin/ssh \
--one-file-system \
-c -z -f $BU_HOST:/backups/$MYHOST/$DAY/$c_filesys.cpio.gz $filesys
(Of course, the exact path to the ssh utility may differ.) If $BU_HOST has
the rmt command installed in an unusual location (tar --show-defaults will
show you where tar expects it to be), you will also need to add
the following option:
--rmt-command=/path/to/rmt
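To see whether that extra option is needed, the compiled-in default can be read on the sending host. This is only a minimal sketch: it assumes a GNU tar whose --show-defaults output is a space-separated option list containing --rmt-command=PATH, and the helper name parse_rmt is made up for this example.

```shell
# parse_rmt: read a `tar --show-defaults' line on stdin and print the
# value of --rmt-command, if present. (Hypothetical helper name.)
parse_rmt() {
    tr ' ' '\n' | sed -n 's/^--rmt-command=//p'
}

# Ask the local tar where it expects rmt to live on the remote side.
# Prints "unknown" if this tar does not report an rmt default.
expected=$(tar --show-defaults 2>/dev/null | parse_rmt || true)
echo "tar expects rmt at: ${expected:-unknown}"

# If $BU_HOST keeps rmt somewhere else, pass that location with
# --rmt-command=/path/to/rmt on the tar command line.
```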
# Now send over the list for verification
cat $LISTFILE | ssh -l root $BU_HOST \
dd of=/backups/$MYHOST/$DAY/index.$c_filesys
# Now Verify what we sent (NOTE: we do the zcat/cpio on the localhost,
# not the BUHOST because the BUHOST handles six different machines at the
# same time. This reduces the workload while allowing us to send
# compressed data across the LAN).
ssh -l root $BU_HOST \
"dd if=/backups/$MYHOST/$DAY/$c_filesys.cpio.gz" | zcat | \
cpio -it >$OUTFILE
# Compare OUTFILE/LISTFILE
diff $LISTFILE $OUTFILE >$DIFILE
To reproduce exactly that, we have to extend the above tar
invocation with several new options:
tar --rsh-command=/usr/bin/ssh \
--one-file-system \
-c -z -f $BU_HOST:/backups/$MYHOST/$DAY/$c_filesys.cpio.gz \
--verbose --index-file $LISTFILE --show-stored-names \
$filesys
In the above invocation, --verbose prints a verbose file listing,
--index-file redirects it to the given file, and --show-stored-names shows
file names as stored in the archive rather than the absolute pathnames. This
may need some clarification: by default GNU tar does not store
absolute file names in the archive. Instead, it strips the leading
`/' and stores the resulting "stripped name" in the archive. This
is done to prevent accidental overwriting of vital data when extracting
from the archive. You can disable this feature using the -P option, in which
case you will not need to specify --show-stored-names. (That option
appeared in the CVS version of GNU tar. Buildable snapshots are
available from ftp://download.gnu.org.ua/pub/alpha/tar.)
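The difference between stripped and absolute member names is easy to see on a throwaway local archive. The following is a rough sketch only: the directory and file names are invented for the demonstration, and no remote host is involved.

```shell
# Local demonstration in a scratch directory (names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/data"
echo hello > "$demo/data/file.txt"

# Default: the leading `/' is stripped from the member name
# (tar reports this on stderr, which is silenced here).
tar -c -f "$demo/stripped.tar" "$demo/data/file.txt" 2>/dev/null
stripped=$(tar -t -f "$demo/stripped.tar")

# With -P the absolute name is stored and listed unchanged.
tar -P -c -f "$demo/absolute.tar" "$demo/data/file.txt"
absolute=$(tar -P -t -f "$demo/absolute.tar")

echo "default: $stripped"
echo "with -P: $absolute"
```

The first listing lacks the leading slash, so it would not match a list of absolute names produced elsewhere; the second one would.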
Now the following two commands will do the verification:
tar --rsh-command=/usr/bin/ssh \
-t -z -f $BU_HOST:/backups/$MYHOST/$DAY/$c_filesys.cpio.gz \
--index-file $OUTFILE
diff $LISTFILE $OUTFILE >$DIFILE
Note that GNU tar offers a verify mode, in which it compares
not only file names but also file contents and metadata. This
mode, however, currently works only for plain, non-compressed archives.
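For illustration, here is a small local sketch of that mode, assuming it is invoked with the -W (--verify) option and using a plain archive in a scratch directory. The directory and variable names are made up for the example.

```shell
# Scratch directory with one small file (hypothetical names).
vdir=$(mktemp -d)
mkdir -p "$vdir/data"
echo hello > "$vdir/data/file.txt"

# -W re-reads the archive after writing it and compares each member
# against the file system. Note the absence of -z: the archive is
# not compressed.
if (cd "$vdir" && tar -c -W -f plain.tar data); then
    verify_status=ok
else
    verify_status=failed
fi
echo "verify: $verify_status"
```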
We've tweaked this over the years to compensate for different issues.
The main thing we want is a list of what is getting backed up, then a
list of what was backed up. The problem we run into by trusting the
backup utility to make the LISTFILE is that if the backup utility
doesn't see the file, it wouldn't show up in LISTFILE, whereas find is
pretty thorough and gives us an independent check of what is on the "tape."
Well, if you prefer to use find, then the archive creation command should be
changed as follows:
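find $filesys -mount $SKIPSTRING -print |
tee $LISTFILE |
tar --rsh-command=/usr/bin/ssh \
--one-file-system \
--no-recursion \
-c -z -f $BU_HOST:/backups/$MYHOST/$DAY/$c_filesys.cpio.gz \
-P -T -
# `-T -' option takes a list of files to be archived from the stdin;
# --no-recursion keeps tar from descending again into the directories
# that find has already listed, so $filesys is not repeated here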
tar --rsh-command=/usr/bin/ssh \
-t -z -f $BU_HOST:/backups/$MYHOST/$DAY/$c_filesys.cpio.gz \
-P --index-file $OUTFILE
diff $LISTFILE $OUTFILE >$DIFILE
Here, we will have to use -P in both cases, to match the listing produced
by find.
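The whole find-based round trip can be tried locally before pointing it at $BU_HOST. This is a sketch under several assumptions: the scratch tree and file names are invented, the archive is written to a local file instead of a remote one, --no-recursion is added so that directories printed by find are not archived a second time, and trailing slashes are removed from tar's listing because tar prints directories as `dir/' while find prints `dir'.

```shell
# Scratch tree standing in for $filesys (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/sub"
echo one > "$work/src/a.txt"
echo two > "$work/src/sub/b.txt"

LISTFILE=$work/list
OUTFILE=$work/out
DIFILE=$work/diff

# Create: find decides what goes in, tar archives exactly that list.
find "$work/src" -mount -print |
    tee "$LISTFILE" |
    tar --no-recursion -P -c -z -f "$work/backup.tar.gz" -T -

# List: -P keeps the absolute names; sed drops the trailing slash
# that tar appends to directory members.
tar -P -t -z -f "$work/backup.tar.gz" | sed 's,/$,,' > "$OUTFILE"

# Compare what was requested with what was stored.
if diff "$LISTFILE" "$OUTFILE" > "$DIFILE"; then
    echo "lists match"
else
    echo "lists differ, see $DIFILE"
fi
```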
BTW: I've been reading that bzip2 might be better than gzip for
compression because you can recover from corrupt files and it compresses
better. Any thoughts on this?
It surely compresses much better than gzip. However, it also takes much
longer to compress.
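The trade-off depends on the data, so it is worth measuring on a sample of your own files. A rough sketch, using a generated text file as a stand-in for real backup data (the file contents are invented, and bzip2 is only tried if it is installed):

```shell
# Build a repetitive sample file; substitute a real file for a fairer test.
sample=$(mktemp)
i=0
while [ "$i" -lt 20000 ]; do
    echo "line $i of a hypothetical sample file"
    i=$((i + 1))
done > "$sample"

orig_size=$(wc -c < "$sample")
gz_size=$(gzip -c "$sample" | wc -c)
echo "original: $orig_size bytes, gzip: $gz_size bytes"

# bzip2 may not be present on every host, so check first.
if command -v bzip2 >/dev/null 2>&1; then
    bz_size=$(bzip2 -c "$sample" | wc -c)
    echo "bzip2: $bz_size bytes"
fi
# Prefix either compression command with `time' to compare run times.
```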
Regards,
Sergey