bug-tar



From: Niessen, Chris
Subject: [Bug-tar] Incorrect listing of sparse files with more than 8G of real data
Date: Sat, 25 Oct 2014 21:08:38 +0000

If a sparse file with more than 8G of real data is stored in a POSIX format archive (which is done correctly in 1.28), listing the contents of the archive will fail.

 

Archiving a sparse file with more than 8G of real data results in two extended header entries being written: GNU.sparse.realsize and size.

When the archive is listed, both of those values are read and end up being stored in stat.st_size, which causes whichever value came first (happens to be GNU.sparse.realsize) to be lost.  (This is because size_decoder and sparse_size_decoder currently do exactly the same thing.)
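
To make the clobbering concrete, here is a minimal sketch in C. It is a simplified model only: the struct and the sketch_* function names are hypothetical stand-ins, not tar's actual declarations or signatures.

```c
/* Simplified model; NOT tar's real struct tar_stat_info. */
struct sketch_stat_info {
    long long st_size;           /* stands in for stat.st_size */
    long long archive_file_size; /* bytes the member occupies in the archive */
};

/* Models the handler for GNU.sparse.realsize as described above. */
void sketch_sparse_size_decoder(struct sketch_stat_info *st, long long value)
{
    st->st_size = value;
}

/* Models the current handler for "size": same store, same field. */
void sketch_size_decoder_current(struct sketch_stat_info *st, long long value)
{
    st->st_size = value;
}
```

Calling the first function and then the second leaves only the second value in st_size, and archive_file_size is never touched by either one.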

 

If the file has less than 8G of real data, then the amount of real data in the archive, which gets put in stat.st_size when the file header is first read, gets stashed in stat_info->archive_file_size in list.c:692 prior to the extended headers getting parsed.  Then, after the extended headers are parsed, and stat.st_size gets updated with the value in GNU.sparse.realsize by sparse_size_decoder, both values are available, and tar successfully lists the contents of the archive.

 

However, if the file has more than 8G of real data, the value stashed in stat_info->archive_file_size is the one from the file header, which was written as zero because the actual data size doesn’t fit in the POSIX header field; the actual data size is put in a “size” extended header instead.  Because that extended header hasn’t been read yet when list.c:692 runs, the actual size of the data never makes it into archive_file_size.  The listing operation then fails: tar cannot skip to the next member and displays errors.
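
For reference, the 8G threshold follows from the ustar header layout: the size field holds 11 octal digits, so the largest value it can carry is 8^11 - 1 bytes. The sketch below works that out in C. The helper names are hypothetical, and the "write zero when it does not fit" behaviour is taken from the description above rather than from tar's source.

```c
#include <stdint.h>

/* The ustar size field is 12 bytes: 11 octal digits plus a terminator. */
#define USTAR_SIZE_DIGITS 11

/* Largest size an 11-digit octal field can express: 8^11 - 1. */
uint64_t ustar_size_field_max(void)
{
    return (UINT64_C(1) << (3 * USTAR_SIZE_DIGITS)) - 1;
}

/* Hypothetical model of what ends up in the header field: the real
   size when it fits, otherwise zero (the real size then travels in
   the "size" extended header). */
uint64_t sketch_header_size_value(uint64_t data_size)
{
    return data_size <= ustar_size_field_max() ? data_size : 0;
}
```

Under this model, a member with exactly 8 GiB (8589934592 bytes) of real data is the first one whose header field reads zero, which is the value that gets stashed in archive_file_size before the extended headers are parsed.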

 

A patch to address this was submitted against 1.27

http://www.mail-archive.com/bug-tar%40gnu.org/msg03905.html

but it doesn’t seem to have made it into 1.28.

 

Before finding that patch, I generated my own that modifies size_decoder to put the value of the “size” extended header value into archive_file_size, and if archive_file_size and stat.st_size have the same value (meaning stat.st_size hasn’t been updated by a previously parsed extended header), then the “size” attribute will also get put into stat.st_size.  That way, stat.st_size will be updated properly for non-sparse files, but will not be clobbered for sparse ones.
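
The logic of that change can be sketched roughly as follows. This is an illustration of the idea in C, not the patch itself: the struct and function name are hypothetical, and it assumes the comparison is made before archive_file_size is overwritten.

```c
/* Simplified model; NOT tar's real struct tar_stat_info. */
struct sketch_stat_info {
    long long st_size;           /* stands in for stat.st_size */
    long long archive_file_size; /* bytes the member occupies in the archive */
};

/* Models the proposed handling of the "size" extended header. */
void sketch_size_decoder_patched(struct sketch_stat_info *st, long long value)
{
    /* Equal fields mean no earlier extended header has changed st_size
       since the header value was stashed (assumed to be checked first). */
    int st_size_untouched = (st->archive_file_size == st->st_size);

    /* "size" always describes the data actually stored in the archive. */
    st->archive_file_size = value;

    /* Only propagate to st_size when nothing else has set it, so a
       GNU.sparse.realsize value parsed earlier survives. */
    if (st_size_untouched)
        st->st_size = value;
}
```

With both fields starting at the header value of zero, a sparse member whose GNU.sparse.realsize was parsed first keeps that value in st_size, while a non-sparse member gets the "size" value in both fields.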

 

I can provide that patch if desired, but it's only two lines.

Thanks-

-Chris Niessen

 

 

