coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] cat,cp,mv,install,split: Set the minimum IO block size used


From: Pádraig Brady
Subject: Re: [PATCH] cat,cp,mv,install,split: Set the minimum IO block size used to 256KiB
Date: Wed, 28 Feb 2024 22:30:40 +0000
User-agent: Mozilla Thunderbird

On 28/02/2024 21:49, Jim Meyering wrote:
On Wed, Feb 28, 2024 at 9:09 AM Pádraig Brady <P@draigbrady.com> wrote:
  >> On 11/30/23 12:11, Pádraig Brady wrote:
  >>> Though that will generally give 128K, which is good when processing all
  >>> of a file,
  >>> but perhaps overkill when processing just the last part of a file.
  >>
  >> The 128 KiB number was computed as being better for apps like 'sed' that
  >> typically read all or most of the file. 'tail' sometimes behaves that
  >> way (e.g., 'tail -c +10') and so 'tail' should use 128 KiB in those
  >> cases. The simplest way to do that is for 'tail' to use 128 KiB all the
  >> time - that would cost little for uses like plain 'tail' and it could be
  >> a significant win for uses like 'tail -c +10'.
  >
  > Yes I agree we should use io_blksize() in other routines in tail
  > where we may dump lots of a file. However in this (most common) case
  > the routine is dealing with the end of a regular file,
  > so it's probably best to somewhat minimize the amount of data read,
  > and more directly check the page_size which is issue at hand.
  > I've pushed the fix at 
https://github.com/coreutils/coreutils/commit/73d119f4f
  > where the adjustment (which also corresponds to what we do in wc) is:
  >
  >     if (sb->st_size % page_size == 0)
  >       bufsize = MAX (BUFSIZ, page_size);
  >
  > I'll follow up with another patch to address the performance aspect,
  > which uses io_blksize() where appropriate.

More testing shows that 256KiB does indeed give a 10-20% bump on modern systems.
Proposed change is attached.

Thanks. Probably worth setting an alarm to check once a year :-)
I think you'll want to change this comment, too:
/* As of May 2014, 128KiB is determined to be the minimum

Well once a decade anyway :)

Fixed the comment, and pushed.

cheers,
Pádraig.

p.s. That Power10 system I tested on is awesome.
It ran the 645 coreutils tests in 5.8 seconds!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]