
Re: [Bug-tar] use optimal file system block size


From: Tim Kientzle
Subject: Re: [Bug-tar] use optimal file system block size
Date: Wed, 18 Jul 2018 21:20:05 -0700

bsdtar has a similar optimization.

It decouples reads and writes, so each side can use the I/O size that suits it 
best.

When it opens an archive for writing, it checks the target device type.  If 
it’s a character device (such as a tape drive), it writes the requested blocks 
exactly.  When the target device is a block device, however, it instead buffers 
and writes much larger blocks, padding the file at the end as necessary to 
ensure the final size is a multiple of the requested block size.  This produces 
exactly the same result as writing the blocks as requested, but much more 
efficiently.
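
As a rough illustration of the buffering and padding described above, here is a minimal Python sketch.  It is not bsdtar's actual code: the class name CoalescingWriter, its parameters, and the 2 MiB default I/O size are all hypothetical, and it only models the regular-file/block-device case.

```python
import io


class CoalescingWriter:
    """Collects small archive writes and hands large chunks to the sink.

    Hypothetical sketch: record_size is the size the user asked for
    (10 KiB by default), io_size is the larger size actually written.
    """

    def __init__(self, sink, record_size=10240, io_size=2 * 1024 * 1024):
        self.sink = sink
        self.record_size = record_size
        self.io_size = io_size
        self.buffer = bytearray()
        self.total = 0        # bytes accepted so far, excluding padding
        self.write_calls = 0  # number of writes issued to the sink

    def write(self, data):
        self.buffer += data
        self.total += len(data)
        # Flush only in io_size chunks, however small the inputs were.
        while len(self.buffer) >= self.io_size:
            self._emit(self.io_size)

    def _emit(self, n):
        self.sink.write(bytes(self.buffer[:n]))
        del self.buffer[:n]
        self.write_calls += 1

    def close(self):
        # Zero-pad so the final size is a multiple of record_size,
        # matching what record-by-record writing would have produced.
        pad = -self.total % self.record_size
        self.buffer += bytes(pad)
        if self.buffer:
            self._emit(len(self.buffer))


if __name__ == "__main__":
    sink = io.BytesIO()
    writer = CoalescingWriter(sink, record_size=10240, io_size=65536)
    for _ in range(101):
        writer.write(b"x" * 512)
    writer.close()
    print(len(sink.getvalue()), writer.write_calls)
```

The sink sees the same bytes it would have seen from record-sized writes; only the number and size of the write calls differ.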

Tim


> On Jul 18, 2018, at 9:58 AM, Andreas Dilger <address@hidden> wrote:
> 
> On Jul 18, 2018, at 9:03 AM, Ralph Corderoy <address@hidden> wrote:
>> 
>> Hi Christian,
>> 
>>> $ stat -c %o data/blob
>>> 2097152
>> ...
>>> tar does not explicitly use the block size of the file system
>>> where the files are located, but, for a reason I don't know (feel free to 
>>> educate me), 10 KiB:
>> 
>> Historic, that being 20 blocks where a block is 512 B.  See `Blocking
>> Factor'.  https://www.gnu.org/software/tar/manual/tar.html#SEC160
>> 
>> It can be changed.
>> 
>>   $ strace -e write -s 10 tar cbf 4096 foo.tar foo
>>   write(3, "foo\0\0\0\0\0\0\0"..., 2097152) = 2097152
>>   +++ exited with 0 +++
>>   $
>> 
>>> I would like to propose to use the native file system block size in
>>> favor of the currently used 10 KiB.
>> 
>> I can't see the default changing.  POSIX's pax(1) states for ustar
>> format that the default for character devices is 10 KiB, and allows for
>> multiples of 512 up to and including 32,256.  So you're suggesting the
>> default is to produce an incompatible tar file.
> 
> The IO size from the storage does not need to match the recordsize
> of the tar file.  It may be that writing to an actual tape character
> device needs to use 10KB writes, but for a regular file on a block
> device (which is 99% of tar usage) it can still write 10KB records,
> but just write a few hundred of them at a time.
> 
> What network filesystem are you using?  Typically, such small I/Os
> should be hidden from the filesystem with readahead and writeback
> cache, though of course there is still more overhead from having
> lots of system calls.
> 
> Cheers, Andreas
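
To put numbers on the sizes quoted above, here is a small Python sketch of the arithmetic.  The function names are hypothetical, and it assumes that os.stat()'s st_blksize field (Unix only) reports the same preferred I/O size hint that `stat -c %o` prints.

```python
import os

BLOCK = 512  # tar block size in bytes


def record_size(blocking_factor=20):
    """Record size in bytes for a given blocking factor (default 20)."""
    return blocking_factor * BLOCK


def records_per_write(preferred_io_size, rec_size=record_size()):
    """Whole records that fit in one write of the preferred I/O size."""
    return max(1, preferred_io_size // rec_size)


def preferred_io_size(path):
    """Preferred I/O size hint for path (assumed equal to `stat -c %o`)."""
    return os.stat(path).st_blksize


if __name__ == "__main__":
    print(record_size())               # default 10 KiB record
    print(record_size(4096))           # record size for `tar cbf 4096 ...`
    print(records_per_write(2097152))  # 10 KiB records per 2 MiB write
```

With the 2097152-byte hint reported for data/blob, about two hundred default-sized records fit in a single write, which is in line with "a few hundred of them at a time".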



