coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] cat,cp,mv,install,split: Set the minimum IO block size used to 2


From: Pádraig Brady
Subject: [PATCH] cat,cp,mv,install,split: Set the minimum IO block size used to 256KiB
Date: Wed, 28 Feb 2024 17:09:03 +0000
User-agent: Mozilla Thunderbird

>> On 11/30/23 12:11, Pádraig Brady wrote:
>>> Though that will generally give 128K, which is good when processing all
>>> of a file,
>>> but perhaps overkill when processing just the last part of a file.
>>
>> The 128 KiB number was computed as being better for apps like 'sed' that
>> typically read all or most of the file. 'tail' sometimes behaves that
>> way (e.g., 'tail -c +10') and so 'tail' should use 128 KiB in those
>> cases. The simplest way to do that is for 'tail' to use 128 KiB all the
>> time - that would cost little for uses like plain 'tail' and it could be
>> a significant win for uses like 'tail -c +10'.
>
> Yes I agree we should use io_blksize() in other routines in tail
> where we may dump lots of a file. However in this (most common) case
> the routine is dealing with the end of a regular file,
> so it's probably best to somewhat minimize the amount of data read,
> and more directly check the page_size which is issue at hand.
> I've pushed the fix at https://github.com/coreutils/coreutils/commit/73d119f4f
> where the adjustment (which also corresponds to what we do in wc) is:
>
>     if (sb->st_size % page_size == 0)
>       bufsize = MAX (BUFSIZ, page_size);
>
> I'll follow up with another patch to address the performance aspect,
> which uses io_blksize() where appropriate.

More testing shows that 256KiB does indeed give a 10-20% bump on modern systems.
Proposed change is attached.

cheers,
Pádraig.

Attachment: ioblksize-256.patch
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]