
Extensions to Tar

From: Chris Wilson
Subject: Extensions to Tar
Date: Sun, 31 Mar 2002 02:02:31 +0100 (BST)

Hi all,

Sorry if this is the wrong place to report this; the tar manual suggests 
this address for "bugs and suggestions", which is the closest match I 
could find.

I've been looking at extending tar to support several features that I 
would really like to have for backups: per-file filtering (e.g. through 
gzip or gpg) and remote backups. I've done some work on the former, which 
I would be glad to see incorporated into tar if you are so minded.

I've implemented the "reserved" -y option for compressing individual files 
in the archive using zlib (not gzip). It seems to be working okay, with a 
few bugs which I am tracking down and fixing. Would you be interested in 
adding this code to tar? I am happy to share it under the GPL. The main 
disadvantage is that building tar with this extension currently requires 
zlib, although the dependency could be protected with #ifdefs.
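
To show the shape of the per-file filter, here is a minimal Python sketch 
of the idea (an illustration only, not the actual patch; the function 
names are mine):

```python
import zlib

def filter_member(data: bytes, level: int = 6) -> bytes:
    # Compress one member's data with zlib framing (not the gzip container).
    return zlib.compress(data, level)

def unfilter_member(blob: bytes) -> bytes:
    # Inverse filter, applied when the member is extracted.
    return zlib.decompress(blob)
```

The basic correctness check is the round trip: 
unfilter_member(filter_member(data)) must give back the original data.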

I'm also considering adding the ability to filter files through GPG, for 
encrypted backups. I've been doing this externally by piping through 
GnuPG, but it's very kludgy and I'd prefer to see a built-in solution. Is 
this of any interest to you?

One problem I encountered with filtering is that in order to write the 
file header, tar needs to know _in advance_ the compressed/filtered size 
of the data. This means either compressing the entire file in memory 
before writing it, or compressing twice (once to get the size and write 
the header, then again to write the compressed data). Can anyone see any 
ways around this problem? (Currently I compress in memory, which limits 
the maximum size of file that can be compressed to the available virtual 
memory.)
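
One way to avoid holding the whole compressed file in memory is the 
two-pass approach: a first streaming pass that only counts output bytes 
(so the header can be written), then a second pass that writes them. A 
rough Python sketch of the counting pass, assuming the file does not 
change between the passes:

```python
import zlib

def compressed_size(path: str, chunk: int = 64 * 1024) -> int:
    # First pass: stream the file through zlib purely to measure the
    # compressed size, without buffering the compressed output.
    co = zlib.compressobj()
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            total += len(co.compress(buf))
    total += len(co.flush())
    return total
```

This trades memory for reading (and compressing) the file twice, and is 
only safe if the file cannot change between the two passes.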

I have implemented support for these data types using custom values of the 
typeflag header field: 'Y' for zlib-filtered data and 'E' for 
GPG-encrypted data. I hope this is the "right" way to do it.
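
For illustration, Python's tarfile module can write a header with such a 
custom typeflag (the 'Y' value is the one proposed above; the helper name 
is my own):

```python
import io
import tarfile

ZLIB_TYPE = b"Y"  # proposed typeflag for a zlib-filtered member

def add_filtered_member(tar: tarfile.TarFile, name: str, blob: bytes) -> None:
    # Store an already-filtered blob under the custom typeflag; a tar
    # that does not know the flag still sees a well-formed member.
    info = tarfile.TarInfo(name)
    info.size = len(blob)
    info.type = ZLIB_TYPE
    tar.addfile(info, io.BytesIO(blob))
```

Readers that do not recognise the flag generally treat the member as an 
opaque regular file, which is what you want for graceful degradation.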

I would like tar to be able to save, restore and modify archives on a 
remote system without using rmt, since it seems to be a big security hole. 
I have untrusted clients who I want to be able to back up to a central 
server without giving them rsh or ssh shell access or anything similar. 
I'm considering writing a command-line program which could be run as a 
'shell' to give file access (but not execution) through a pipe, sort of 
like NFS-over-PPP but simpler and more efficient. Any other suggestions?
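
The core of such a restricted 'shell' would be a dispatcher that accepts 
a few file operations and nothing else. A hypothetical Python sketch (the 
command names and the path-confinement scheme are my own assumptions):

```python
import os

def handle(line: str, root: str) -> bytes:
    # Serve one request read from the pipe. Only whole-file reads are
    # shown; writes would follow the same pattern. No command execution,
    # and every path is confined to the storage root.
    op, _, name = line.partition(" ")
    base = os.path.realpath(root)
    path = os.path.realpath(os.path.join(base, name))
    if not path.startswith(base + os.sep):
        raise PermissionError("path escapes storage root")
    if op == "GET":
        with open(path, "rb") as f:
            return f.read()
    raise ValueError("unsupported command: " + op)
```

The realpath check is the important part: it rejects "../" tricks and 
symlinks pointing outside the root before any file is touched.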

I would also like to be able to 'update' archives by writing a new file, 
rather than modifying the existing one, to create incremental backups 
without risking damage to the original archive. Would anyone like me to 
implement this?

Finally, for differential/incremental backups, I was thinking about using 
md5sums to detect changed files, but I'm not sure the extra CPU time and 
disk access needed to compute the checksums on both sides is justified. 
Should I stick to using the mtime? And if not, where can I find space in 
the headers for the md5sum?
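
The trade-off between the two detection strategies looks like this in a 
Python sketch (illustration only; the chunk size is arbitrary):

```python
import hashlib
import os

def changed_by_mtime(path: str, last_mtime: float) -> bool:
    # Cheap test: one stat() call, no file reads.
    return os.stat(path).st_mtime > last_mtime

def md5sum(path: str, chunk: int = 64 * 1024) -> str:
    # Content test: reads the whole file, but catches changes that
    # leave the timestamp untouched.
    h = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            h.update(buf)
    return h.hexdigest()
```

A common compromise is to stat first and hash only the files whose size 
or mtime changed, so the expensive pass runs over a small subset.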

Thanks in advance for your advice,

Cheers, Chris.
_  __ __     _
 / __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |
