bug-grep
bug#73721: grep perf docs barely mention mem usage


From: mark.yagnatinsky
Subject: bug#73721: grep perf docs barely mention mem usage
Date: Wed, 9 Oct 2024 21:12:21 +0000

Thanks!!
Re "wouldn't help much": maybe it wouldn't help with speed.
But how could it possibly not help with memory usage?
In particular, the "commit charge" is surely far lower, right?
With a read() based approach, grep needs to explicitly allocate an array, and 
as far as the kernel knows, this array has arbitrary contents.
(In fact, the array matches a certain file on disk, but the kernel can't know 
that.)
If memory gets tight, then if the kernel wants to evict this array from RAM, it 
must find swap space for it in the page file or whatever.
With mmap(), the kernel just needs to set up a bit of book-keeping to note that 
"this virtual address range is backed by this file on disk".
If memory gets tight, it can simply evict the range from RAM, since it knows it 
can reconstruct it perfectly later.
Or am I missing something?
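
To make the contrast concrete, here is a minimal sketch in Python (not grep's actual code; the function names and the temporary demo file are made up for illustration). The first function copies the file into a private buffer, which the kernel sees as anonymous memory; the second maps the file read-only, so the pages stay file-backed.

```python
import mmap
import os
import tempfile


def search_with_read(path, needle):
    """Copy the whole file into a private buffer, then search it.

    The resulting bytes object is anonymous memory: under memory pressure
    the kernel can only evict it by writing it out to swap.
    """
    with open(path, "rb") as f:
        data = f.read()
    return data.find(needle)


def search_with_mmap(path, needle):
    """Map the file read-only and search the mapping.

    The pages are clean and file-backed, so the kernel may simply drop
    them and re-read them from the file if they are touched again.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m.find(needle)


# Hypothetical demo file, created only for this example.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"alpha\nbeta\ngamma\n")

read_offset = search_with_read(demo_path, b"gamma")
mmap_offset = search_with_mmap(demo_path, b"gamma")
os.remove(demo_path)

print(read_offset, mmap_offset)
```

Both calls return the same offset; the difference is only in what kind of memory holds the file contents while the search runs.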

Thanks again for responding so fast!

-----Original Message-----
From: Paul Eggert <eggert@cs.ucla.edu> 
Sent: Wednesday, October 9, 2024 3:03 PM
To: Yagnatinsky, Mark : IT (NYK) <mark.yagnatinsky@barclays.com>
Cc: 73721-done@debbugs.gnu.org
Subject: Re: bug#73721: grep perf docs barely mention mem usage

On 2024-10-09 10:01, mark.yagnatinsky--- via Bug reports for GNU grep wrote:
> After a bit of research, it seems that once upon a time, grep used mmap where 
> possible, but it no longer does this.
> Thus, peak memory usage will be proportional to the length of the longest 
> line in the file.
> Thus, if I use the "-z multiline hack" to search across lines, grep will read 
> the whole file into memory.
> Thus, if I try this on a huge file, I will likely have a bad time.
> (e.g., a 5 gig file would fail in a 32-bit grep, and would increase memory 
> pressure on the system on a 64-bit grep.)
> 
> Is the above about right?

Sounds right.
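
A rough sketch of why the record delimiter matters, in Python (this is not grep's buffering logic; `longest_record` and the sample data are hypothetical). A reader that works in fixed-size chunks still has to hold an unfinished record until its delimiter arrives, so the buffer grows with the longest record. With a NUL delimiter and a text file containing no NUL bytes, the longest record is the whole file.

```python
import io


def longest_record(stream, delimiter, chunk_size=4096):
    """Return the length of the longest delimiter-terminated record.

    Input is read in fixed-size chunks, but an unfinished record must stay
    buffered in `pending` until its delimiter is seen, so `pending` can
    grow as large as the longest record in the input.
    """
    longest = 0
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last delimiter is complete; the tail is not.
        *complete, pending = pending.split(delimiter)
        for record in complete:
            longest = max(longest, len(record))
    # Whatever is left at end of input counts as a final record.
    return max(longest, len(pending))


data = b"first line\nsecond, longer line\nlast\n"
by_newline = longest_record(io.BytesIO(data), b"\n")
by_nul = longest_record(io.BytesIO(data), b"\0")

print(by_newline, by_nul, len(data))
```

With newline as the delimiter, the longest record is one line; with NUL as the delimiter, it is the full length of the input.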

mmap likely wouldn't help much. As I recall, it typically made 'grep' 
slower.

