Extensions to Tar
Sun, 31 Mar 2002 02:02:31 +0100 (BST)
Sorry if this is the wrong place to report this; the tar manual suggests
this address for "bugs and suggestions", which is the closest match I could find.
I've been looking at extending tar to support various features which I
really would like to see for backups: file filtering (e.g. through gzip or
gpg) and remote backups. I've done some work on the former which I would
really like to see incorporated in tar if you are so minded.
I've implemented the "reserved" -y option for compressing individual files
in the archive using zlib (not gzip). It seems to be working okay, with a
few bugs which I am tracking down and fixing. Would you be interested in
adding this code to tar? I am happy to share it under the GPL. The main
disadvantage is that the extension makes zlib a build requirement for tar,
although it could be protected with #ifdefs.
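The distinction between zlib and gzip framing above matters: zlib's own wrapper is what `compress()` emits, not the gzip container. A minimal sketch of per-file filtering in that format (Python standard library for illustration; tar itself is C):

```python
import zlib

# Per-file filtering as described above: zlib's native framing (what
# zlib.compress emits), not the gzip wrapper -- matching the
# "zlib (not gzip)" distinction in the mail.
data = b"example file contents\n" * 200
compressed = zlib.compress(data, 9)   # level 9 = best compression
restored = zlib.decompress(compressed)
assert restored == data
assert len(compressed) < len(data)    # highly repetitive data shrinks
```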
I'm also considering adding the ability to filter files through GPG, for
encrypted backups. I've been doing this externally by piping through
GnuPG, but it's very kludgy and I'd prefer to see a built-in solution. Is
this of any interest to you?
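The external pipeline described above has a general shape: feed member data through any stdin-to-stdout filter process. A sketch of that shape (a real `gpg --symmetric` invocation would slot in as the command; here the Python interpreter stands in as the filter so the example runs without GnuPG installed):

```python
import subprocess
import sys
import zlib

def filter_through(cmd, data):
    # Pipe DATA through an external stdin->stdout filter process --
    # the shape of the external-GnuPG kludge described above.
    proc = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return proc.stdout

# Stand-in filter: the Python interpreter compressing stdin to stdout.
stand_in = [sys.executable, "-c",
            "import sys, zlib;"
            "sys.stdout.buffer.write(zlib.compress(sys.stdin.buffer.read()))"]
filtered = filter_through(stand_in, b"archive member data")
assert zlib.decompress(filtered) == b"archive member data"
```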
One problem I encountered with filtering was that in order to write the
file header, tar needs to know _in advance_ the compressed/filtered size
of the data. This means either compressing the entire file in memory
before writing it, or compressing twice (once to get the size and write
the header, then again to write the compressed data). Can anyone see any
ways around this problem? (Currently I compress in memory, which limits
the maximum size of file which can be compressed to the available virtual
memory.)
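The in-memory approach can be sketched like this (the two-field header here is a hypothetical stand-in for a real tar header, just to show why the filtered size must be known before the header is written):

```python
import io
import struct
import zlib

def write_filtered_member(out, name, data):
    # Compress the whole file in memory first, so the (simplified,
    # hypothetical) header can record the filtered size before any
    # data bytes are written -- the in-memory approach from the mail.
    payload = zlib.compress(data)
    out.write(struct.pack(">100sQ", name.encode(), len(payload)))
    out.write(payload)

def read_filtered_member(inp):
    name, size = struct.unpack(">100sQ", inp.read(108))
    return name.rstrip(b"\0").decode(), zlib.decompress(inp.read(size))

buf = io.BytesIO()
write_filtered_member(buf, "notes.txt", b"hello tar\n" * 100)
buf.seek(0)
name, data = read_filtered_member(buf)
```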
I have implemented support for these data types using custom values of the
typeflag header field: 'Y' for zlib-filtered data and 'E' for
GPG-encrypted data. I hope this is the "right" way to do it.
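The typeflag byte sits at offset 156 of a ustar header, so custom values like the ones proposed above round-trip through an ordinary header. A check using Python's `tarfile` for illustration ('Y' is the proposed value, not part of any standard):

```python
import tarfile

ti = tarfile.TarInfo(name="data.z")
ti.size = 1234
ti.type = b"Y"   # custom typeflag for zlib-filtered data, as proposed
header = ti.tobuf(format=tarfile.USTAR_FORMAT)

# typeflag is the single byte at offset 156 of the 512-byte header
assert header[156:157] == b"Y"

parsed = tarfile.TarInfo.frombuf(header, "utf-8", "surrogateescape")
assert parsed.type == b"Y"
assert parsed.size == 1234
```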
I would like tar to be able to save, restore and modify archives on a
remote system without using rmt, since it seems to be a big security hole.
I have untrusted clients who I want to be able to back up to a central
server without giving them rsh or ssh shell access or anything similar.
I'm considering writing a command-line program which could be run as a
'shell' to give file access (but not execution) through a pipe, sort of
like NFS-over-PPP but simpler and more efficient. Any other suggestions?
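The core of such a restricted 'shell' is that the server interprets a tiny verb set itself and never hands anything to a real shell. A sketch of one request handler (the verbs and wire format are hypothetical):

```python
import os
import tempfile

def serve_line(root, line):
    # One request of a tiny, hypothetical protocol spoken over the
    # client's pipe: only named verbs are understood, paths are
    # confined to ROOT, and nothing is ever executed.
    verb, _, name = line.strip().partition(" ")
    if os.path.isabs(name) or ".." in name.split("/"):
        return "ERR path refused"
    path = os.path.join(root, name)
    if verb == "SIZE":
        return "OK %d" % os.path.getsize(path)
    if verb == "LIST":
        return "OK " + " ".join(sorted(os.listdir(root)))
    return "ERR unknown verb"

with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "backup.tar"), "wb") as f:
        f.write(b"\0" * 512)
    size_reply = serve_line(root, "SIZE backup.tar")
    escape_reply = serve_line(root, "SIZE ../etc/passwd")
    exec_reply = serve_line(root, "RUN rm -rf /")
```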
I would also like to be able to 'update' archives by writing a new file,
rather than modifying the existing one, to create incremental backups
without risking damage to the original archive. Would anyone like me to
implement this?
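Selecting the members of such a separate incremental archive reduces to walking the tree for files newer than the previous backup's timestamp; the full archive is never rewritten. A sketch of that selection step:

```python
import os
import tempfile
import time

def changed_since(root, since):
    # Candidate files for a separate incremental archive: everything
    # whose mtime is newer than the previous backup's timestamp.
    found = []
    for dirpath, _, names in os.walk(root):
        for n in names:
            path = os.path.join(dirpath, n)
            if os.path.getmtime(path) > since:
                found.append(os.path.relpath(path, root))
    return sorted(found)

with tempfile.TemporaryDirectory() as root:
    for n in ("old.txt", "new.txt"):
        open(os.path.join(root, n), "w").close()
    cutoff = time.time()
    os.utime(os.path.join(root, "old.txt"), (cutoff - 3600, cutoff - 3600))
    os.utime(os.path.join(root, "new.txt"), (cutoff + 3600, cutoff + 3600))
    incremental = changed_since(root, cutoff)
```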
Finally, for differential/incremental backups, I was thinking about using
md5sums to detect changed files, but I'm not sure whether the extra CPU
time and disk access to compute the checksums is necessary on both sides.
Should I stick to using the mtime? And if not, where can I find space in
the headers for the md5sum?
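One possible compromise between the two approaches: trust mtime to rule files *out* cheaply, and spend CPU on an md5 only when the mtime has moved. The remaining blind spot is a rewrite that restores the original timestamp, as the sketch shows:

```python
import hashlib
import os
import tempfile

def file_changed(path, old_mtime, old_md5):
    # Cheap mtime check first; fall back to an md5 only when the
    # mtime has moved. A rewrite with a restored timestamp is the
    # one case this still misses.
    if os.path.getmtime(path) == old_mtime:
        return False
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() != old_md5

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "data")
    with open(path, "wb") as f:
        f.write(b"contents v1")
    mtime = os.path.getmtime(path)
    md5 = hashlib.md5(b"contents v1").hexdigest()
    unchanged = file_changed(path, mtime, md5)

    with open(path, "wb") as f:
        f.write(b"contents v2")
    # mtime passed in matches the file's current mtime -> skipped,
    # even though the content differs: the mtime blind spot.
    touched_same = file_changed(path, os.path.getmtime(path), md5)
    os.utime(path, (mtime + 10, mtime + 10))
    really_changed = file_changed(path, mtime, md5)
```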
Thanks in advance for your advice,
_ __ __ _
/ __/ / ,__(_)_ | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |