Re: Another rfe: "cp" this time
From: Pádraig Brady
Subject: Re: Another rfe: "cp" this time
Date: Tue, 01 May 2012 17:59:02 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0
On 05/01/2012 04:43 PM, Bruce Korb wrote:
> On Tue, May 1, 2012 at 1:17 AM, Pádraig Brady <address@hidden> wrote:
>> On 05/01/2012 05:15 AM, H. Peter Anvin wrote:
>>> On 04/27/2012 08:35 AM, Pádraig Brady wrote:
>>>> 32KiB buffer which it serially reads to and writes from.
>>> Why is that, though? At least for file-to-file copy, it could
>>> certainly do much better with mmap() + write().
>>>
>>> 32K is so 1990.
>>
>> Well 2009 which is when we changed from 4K (blksize) :)
>>
>> Just yesterday we bumped to 64K to minimize system
>> call overhead, which on modern machines gives another
>> 10% speed increase when copying files from cache.
>>
>> Though it should be emphasised that the bottleneck is
>> usually in the devices/network and so optimizations at
>> this level do not help in the general case, and your
>
> My particular whole issue is the network bottleneck. Doing
> the copy as sequential reads of any size results in an empty
> pipe while each piece gets acked.
Right.
> I don't know what would
> happen were the file mmaped. Would the fs layer do sequential
> readahead or would it send out requests for more data before
> earlier requests are satisfied? It'd likely take reading code to
> find out, but I'm guessing it will wind up serialized just like
> the read 32K at a time basic copy.
Yep, as I said in my reply.
> 64K would help if it caused
> however many concurrent requests for pieces of the 64K, but
> that isn't likely either.
Right. It would help a bit though.
> The big deal (for me) is to get enough
> concurrent requests going so the long wide pipe can be kept
> full of data. However that needs to happen.
Parallelizing cp seems like an effective but awkward solution
to me, as that logic would need to be replicated
in all programs wanting to read efficiently.
It should be possible, at least on a per-file basis,
to indicate that we want to read the file sequentially,
so that the kernel can use appropriate read-ahead
to keep its buffers full, and network protocols
can behave intelligently too, for example by
interleaving reads.
Now many coreutils already provide such a hint to the kernel:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=47076e3c
But copy.c does not currently. I'm not sure now why I didn't
do it for copy.c, but I'll look into adding it.
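For illustration, here is a rough sketch of what such a hint could look like
in a simple read/write copy loop. It assumes the hint is given with
posix_fadvise() and POSIX_FADV_SEQUENTIAL, and uses the 64KiB buffer size
mentioned above. The name copy_fd_sequential is made up for this example,
and this is not the coreutils code, just a minimal standalone version of
the idea.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L   /* for posix_fadvise() and ssize_t */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum { COPY_BUF_SIZE = 64 * 1024 };   /* 64KiB, as discussed in the thread */

/* Hypothetical helper: copy all data from SRC_FD to DST_FD.
   Return 0 on success, -1 on a read or write error.  */
int
copy_fd_sequential (int src_fd, int dst_fd)
{
  char buf[COPY_BUF_SIZE];

  /* Advisory only: say we intend to read the whole file (len 0 means
     "to end of file") in order.  Failure, e.g. ESPIPE on a pipe,
     is deliberately ignored since the copy still works without it.  */
  (void) posix_fadvise (src_fd, 0, 0, POSIX_FADV_SEQUENTIAL);

  for (;;)
    {
      ssize_t n_read = read (src_fd, buf, sizeof buf);
      if (n_read == 0)
        return 0;                     /* end of file */
      if (n_read < 0)
        {
          if (errno == EINTR)
            continue;                 /* interrupted, retry the read */
          return -1;
        }

      /* write() may accept fewer bytes than asked, so loop.  */
      for (ssize_t off = 0; off < n_read; )
        {
          ssize_t n_written = write (dst_fd, buf + off,
                                     (size_t) (n_read - off));
          if (n_written < 0)
            {
              if (errno == EINTR)
                continue;             /* interrupted, retry the write */
              return -1;
            }
          off += n_written;
        }
    }
}
```

The hint does not change what the program reads. It only gives the kernel
(and, one hopes, the network file system underneath) the chance to read
ahead more aggressively. How much that helps over a high latency link
depends on the file system in question.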
In the meantime you might be able to tweak cifs read-ahead
settings, or ask on a samba list as to how to more efficiently
stream over high latency links.
cheers,
Pádraig.