Tar --unlink-first
From: jkb
Subject: Tar --unlink-first
Date: 3 May 2001 10:55:59 GMT
GNU tar (1.13) doesn't seem to honour the --unlink-first option.
For example:
bash$ pwd
/tmp/ttt
bash$ ls -lR
total 16
drwxr-xr-x 2 pubseq system 8192 May 3 10:51 d1
drwxr-xr-x 3 jkb system 8192 May 3 10:51 d2
./d1:
total 0
-rw-r--r-- 1 pubseq system 0 May 3 10:51 aaa
./d2:
total 8
-rw-r--r-- 1 jkb system 0 May 3 10:49 a
drwxr-xr-x 2 jkb system 8192 May 3 10:52 d1
./d2/d1:
total 1
-rw-r--r-- 1 jkb system 4 May 3 10:52 aaa
So, d1/aaa is owned by pubseq. d2/d1/aaa is owned by jkb.
As root, I'll backup /tmp/ttt to /tmp/ttt2:
bash# mkdir /tmp/ttt2
bash# (cd /tmp/ttt; $tar -cpf - .) | (cd /tmp/ttt2; $tar -xvvpf - --unlink-first)
drwxrwxrwx pubseq/system 0 2001-05-03 11:04 ./
drwxr-xr-x pubseq/system 0 2001-05-03 10:51 d1/
-rw-r--r-- pubseq/system 0 2001-05-03 10:51 d1/aaa
drwxr-xr-x jkb/system 0 2001-05-03 10:51 d2/
drwxr-xr-x jkb/system 0 2001-05-03 10:52 d2/d1/
-rw-r--r-- jkb/system 4 2001-05-03 10:52 d2/d1/aaa
-rw-r--r-- jkb/system 0 2001-05-03 10:49 d2/a
bash#
That's all fine. If I look in /tmp/ttt2 then I see exactly what I expect - a
copy of /tmp/ttt.
Now comes the bug:
bash# cd /tmp/ttt2/d2
bash# ls
a d1
bash# rm -rf d1
bash# ln -s ../d1 .
bash# ls -l
total 0
-rw-r--r-- 1 jkb system 0 May 3 10:49 a
lrwxrwxrwx 1 root system 5 May 3 11:07 d1 -> ../d1
This is in the directory the backups are being written TO.
So I'll repeat the backup command:
bash# (cd /tmp/ttt; $tar -cpf - .) | (cd /tmp/ttt2; $tar -xvvpf - --unlink-first)
drwxrwxrwx pubseq/system 0 2001-05-03 11:04 ./
drwxr-xr-x pubseq/system 0 2001-05-03 10:51 d1/
-rw-r--r-- pubseq/system 0 2001-05-03 10:51 d1/aaa
drwxr-xr-x jkb/system 0 2001-05-03 10:51 d2/
drwxr-xr-x jkb/system 0 2001-05-03 10:52 d2/d1/
-rw-r--r-- jkb/system 4 2001-05-03 10:52 d2/d1/aaa
-rw-r--r-- jkb/system 0 2001-05-03 10:49 d2/a
bash#
The output is just as before - we're copying d1/aaa and d2/d1/aaa. However:
bash# pwd
/tmp/ttt2/d2
bash# ls -l
total 0
-rw-r--r-- 1 jkb system 0 May 3 10:49 a
lrwxrwxrwx 1 root system 5 May 3 11:07 d1 -> ../d1
How come d1 is still a symlink? The original in /tmp/ttt/d2 is a directory
owned by jkb and I haven't changed that. --unlink-first should have removed
the d1 link before recreating d1 as a directory. The same thing happens if I
also specify --recursive-unlink.
Consequently the backup copy of aaa in /tmp/ttt2/d1 has been overwritten and
has changed ownership:
bash# pwd
/tmp/ttt2/d2
bash# ls -la ../d1
total 17
drwxr-xr-x 2 pubseq system 8192 May 3 10:51 .
drwxrwxrwx 4 pubseq system 8192 May 3 11:04 ..
-rw-r--r-- 1 jkb system 4 May 3 10:52 aaa
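One way to spot this situation before re-running the extraction is to look
for destination entries that are symlinks where the source has a real
directory. The following is only a rough sketch, assuming a POSIX shell and
find; the function name find_replaced_dirs is made up for illustration and is
not part of tar, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU tar): print each directory of the
# source tree whose counterpart in the destination tree is a symlink.
find_replaced_dirs() {
    src=$1
    dst=$2
    # List every real directory in the source tree, relative to its root.
    (cd "$src" && find . -type d -print) | while IFS= read -r rel; do
        # [ -L ] tests the destination entry itself, without following it.
        if [ -L "$dst/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done
}
```

With the trees above, "find_replaced_dirs /tmp/ttt /tmp/ttt2" should report
./d2/d1, which is exactly the entry that tar extracted through the symlink.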
This has serious implications for security when using GNU tar for backups.
Consider the case of using GNU tar (with --listed-incremental for example) to do
nightly backups to another disk. A user can create a symlink to /etc in their
directory. This is then backed up. The next day they remove their etc symlink
and create a directory called etc. In there they create a new password
file. This is then subsequently copied over the top of /etc/passwd. (I haven't
tested this theory, but it seems to reasonably follow from my own observations
so far.)
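As a stopgap for this kind of backup, the destination disk could be scanned
for symlinks that resolve to directories before each extraction, since those
are the entries tar appears to write through. This is a sketch only, assuming
a POSIX shell and find; list_dir_symlinks is a made-up name, and the check
says nothing about whether a given link is legitimate.

```shell
#!/bin/sh
# Hypothetical pre-extraction check: print every symlink under the backup
# root that currently resolves to a directory.
list_dir_symlinks() {
    root=$1
    find "$root" -type l -print | while IFS= read -r link; do
        # [ -d ] follows the link, so it is true only when the link
        # resolves to an existing directory.
        if [ -d "$link" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

Running "list_dir_symlinks /tmp/ttt2" before the second extraction above
would list /tmp/ttt2/d2/d1, which could then be removed or investigated by
hand.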
So, what does --unlink-first do? I think it handles symlinks to files
correctly, but not symlinks to directories. I'll investigate the source code. However
before I go too far down this route, do I have the latest version? 1.13 is a
couple of years old, but I couldn't find a newer release.
James
--
James Bonfield (address@hidden) Tel: 01223 402499 Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/