Re: explicit extraction of files behind (sym)links

From: Reuti
Subject: Re: explicit extraction of files behind (sym)links
Date: Fri, 22 Jul 2022 17:29:41 +0200

Hi Aiyion,

> On 22.07.2022 at 10:13, Aiyion.Prime wrote:
> Good morning everyone,
> I thought I had known my way around tar for a few years now, but learned 
> yesterday evening that I was wrong:
> I'm archiving a directory structure that contains large redundant files:
> onepath/readme
> onepath/binaryblob13
> anotherpath/readme
> anotherpath/binaryblob13

I don't know your complete workflow, hence I can give only a vague idea:

Assuming you are using symlinks in the above structure:

• Instead of archiving the complete directories recursively, create a list of 
files to be saved for `tar`: first all symlinks (as symlinks), then all real 
files.
• On extraction, `--occurrence=1` makes tar stop at the first occurrence of 
the requested member instead of reading the rest of the archive.
• In case it's a symlink, remove the extracted symlink and extract the real 
file it points to under the name of the symlink.

This should speed up the processing.
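
The steps above could be sketched roughly as follows. This is only an illustration, not a tested workflow for your package manager: it assumes GNU tar, `realpath -m` from GNU coreutils, and symlinks on the individual files (not on the directories). The demo tree, the archive name `demo.tar` and the variable names are made up for the example.

```shell
# Sketch only: assumes GNU tar, GNU coreutils and file-level symlinks.
# All paths and names below are hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"

# Demo tree: the blob exists once, the second path is a symlink to it.
mkdir -p src/onepath src/anotherpath
printf 'payload\n' > src/onepath/binaryblob13
ln -s ../onepath/binaryblob13 src/anotherpath/binaryblob13

# File list: all symlinks first (stored as symlinks), then all real files.
( cd src && {
    find . -type l | sed 's|^\./||' | sort
    find . -type f | sed 's|^\./||' | sort
} ) > filelist
( cd src && tar -cf ../demo.tar -T ../filelist )

# Extract one member; tar may stop reading after its first occurrence.
member=anotherpath/binaryblob13
mkdir out
tar -xf demo.tar -C out --occurrence=1 "$member"

# If a symlink came out, replace it with the real file it points to.
if [ -L "out/$member" ]; then
    target=$(readlink "out/$member")
    # Normalise the link target to the member name used in the archive.
    real=$(realpath -m --relative-to=out "out/$(dirname "$member")/$target")
    rm "out/$member"
    # -O writes the member's data to stdout; mode and mtime are not kept.
    tar -xf demo.tar -O --occurrence=1 "$real" > "out/$member"
fi
```

Writing the data via `-O` is the simplest way to give the real file the symlink's name, at the cost of losing its permissions and timestamps; if those matter they would have to be restored separately.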

-- Reuti

> I cannot change the paths, as this is to be fed to a package manager that 
> requires them.
> What I thought I could do, to not have an archive twice the size of 
> `binaryblob13`, was to use sym- or hardlinks and the `-h` flag for creation.
> So archiving this:
> onepath/
> secondpath -> onepath/
> using
> tar --sort=name --owner=0 --group=0 --numeric-owner -chvf normal_sized.tar 
> secondpath onepath ${mtime})
> That would work like a charm if said package manager extracted the whole 
> tar file.
> This is what it does though:
> tar xf $tar_file secondpath/binaryblob13
> And that works fine if I extract files from the directory first referenced in 
> the creation command (in the case above secondpath)
> but returns an error for the latter directory I archived, as it tries to 
> create a hardlink on disk pointing to what would've been the former extracted 
> file. As it does not exist I've got a problem.
> I'd like to avoid extracting all binaryblob13 references beforehand only to 
> have the link I extract point to something valid.
> Is there a flag to tell tar "I don't care if you have to search the archive 
> twice, but extract the original file instead of creating an (invalid) 
> hardlink"?
> I realize that's unusable for actual tape records, but maybe someone has a 
> hint for me here.
> Thanks in advance and have a nice morning,
> Aiyion
