bug-grep

bug#32073: Improvements in Grep (Bug#32073)


From: Jim Meyering
Subject: bug#32073: Improvements in Grep (Bug#32073)
Date: Wed, 1 Jan 2020 16:51:00 -0800

On Wed, Jan 1, 2020 at 12:04 PM Sergiu Hlihor <address@hidden> wrote:
> Paul, I have to correct you. A production server usually runs a mix of
> applications, often including databases. For databases, having a read
> ahead means one IO less since usually database access patterns are random
> reads. Here it is actually best to disable read-ahead completely. In fact,
> I have to say that it is probably best to disable read-ahead completely and
> let applications deal with it, either automatically, e.g. by reading the
> optimal IO block size from the device, or in a configurable way with
> defaults good enough for today's servers. If you configure the OS to do
> read-ahead that hits all HDDs, you potentially induce unnecessary IO load
> for all applications that use them, which is totally unacceptable with
> HDDs. That's why it is best to be application-specific, ideally configured
> to use the optimal IO block size.
>
> So no, letting the OS do it is stupid.
>
> On Wed, 1 Jan 2020 at 20:42, Paul Eggert <address@hidden> wrote:
>>
>> On 1/1/20 1:15 AM, Sergiu Hlihor wrote:
>> > If you rely on OS, then
>> > you are at the mercy of whatever read ahead configuration you have.
>>
>> Right, and whatever changes you make to the OS and its read-ahead
>> configuration will work for all applications, not just for 'grep'. So,
>> change the OS to do that. There shouldn't be a need to change 'grep' in
>> particular (or 'cp' in particular, or 'awk' in particular, etc.).
>>
>> > The issue of large
>> > block sizes for IO operations is widespread across all tools from Linux,
>> > like rsync or cp and it's only getting worse
>>
>> Quite right. And it would be painful to have to modify all those tools,
>> and to maintain those modifications. So modify the OS instead. Scheduling
>> read-ahead is really the OS's job anyway.

Hi Sergiu,

If you would like to help make grep use larger buffer sizes, please
run and report benchmarks measuring how much of a difference it would
make, at least for your hardware. Here are some of the tests I ran to
justify raising it from ~32k to ~96k:
https://lists.gnu.org/archive/html/grep-devel/2018-10/msg00002.html
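
As a rough starting point for such a benchmark, here is a minimal sketch in
Python. It is not the method used in the tests linked above; it only times
sequential reads of one file with several read sizes. The function names
(time_read, benchmark) and the list of sizes are illustrative assumptions.
Note that it measures raw read() throughput, not grep itself, and that
repeated runs will be served from the page cache unless the cache is dropped
between runs (on Linux, for example, via /proc/sys/vm/drop_caches as root).

```python
import time


def time_read(path, bufsize):
    """Read the whole file with unbuffered reads of up to `bufsize` bytes.

    Returns (elapsed_seconds, total_bytes_read).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives a raw file object, so each read() maps to one
    # read request of at most `bufsize` bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def benchmark(path, sizes=(32 * 1024, 96 * 1024, 256 * 1024, 1024 * 1024)):
    """Time one full sequential read of `path` for each buffer size.

    The default sizes are hypothetical; pick values relevant to your
    hardware. Returns {bufsize: (elapsed_seconds, total_bytes_read)}.
    """
    results = {}
    for size in sizes:
        results[size] = time_read(path, size)
    return results
```

To use it, call benchmark() on a file much larger than RAM (or drop the page
cache before each size), then compare bytes per second across sizes. Several
runs per size, reported with the device type and read-ahead setting, would
make the numbers easier to interpret.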




