bug#73721: grep perf docs barely mention mem usage
From: mark.yagnatinsky
Subject: bug#73721: grep perf docs barely mention mem usage
Date: Wed, 9 Oct 2024 21:12:21 +0000
Thanks!!
Re "wouldn't help much": maybe it wouldn't help with speed.
But how could it possibly not help with memory usage?
In particular, the "commit charge" is surely far lower, right?
With a read() based approach, grep needs to explicitly allocate an array, and
as far as the kernel knows, this array has arbitrary contents.
(In fact, the array matches a certain file on disk, but the kernel can't know
that.)
If memory gets tight and the kernel wants to evict this array from RAM, it
must first find swap space for it (in the page file, for example).
With mmap(), the kernel just needs to set up a bit of book-keeping to note that
"this virtual address range is backed by this file on disk".
If memory gets tight, it can simply evict the range from RAM, since it knows it
can reconstruct it perfectly later.
Or am I missing something?
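To make the distinction concrete, here is a minimal sketch in Python rather than grep's C. The function names are hypothetical and this is not how grep is implemented; it only contrasts a private buffer filled by read() with a read-only file mapping. It assumes a non-empty search string.

```python
import mmap
import os


def count_matches_read(path, needle):
    """Copy the whole file into a private buffer, then search the copy.

    The buffer is ordinary heap memory, so the kernel has to treat its
    contents as arbitrary data and needs swap space to evict it.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    with open(path, "rb") as f:
        data = f.read()
    return data.count(needle)


def count_matches_mmap(path, needle):
    """Map the file read-only and search the mapping in place.

    The pages are backed by the file itself, so the kernel can drop
    them under memory pressure and re-read them from disk later.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0  # mmap refuses to map an empty file
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            count = 0
            pos = m.find(needle)
            while pos != -1:
                count += 1
                # Skip past this match so counting is non-overlapping,
                # matching the behaviour of bytes.count().
                pos = m.find(needle, pos + len(needle))
            return count
```

Both functions return the same count; the difference is only in what kind of memory holds the file contents while the search runs. Whether that difference matters in practice (speed, page-fault cost) is exactly the open question here.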
Thanks again for responding so fast!
-----Original Message-----
From: Paul Eggert <eggert@cs.ucla.edu>
Sent: Wednesday, October 9, 2024 3:03 PM
To: Yagnatinsky, Mark : IT (NYK) <mark.yagnatinsky@barclays.com>
Cc: 73721-done@debbugs.gnu.org
Subject: Re: bug#73721: grep perf docs barely mention mem usage
On 2024-10-09 10:01, mark.yagnatinsky--- via Bug reports for GNU grep wrote:
> After a bit of research, it seems that once upon a time, grep used mmap where
> possible, but it no longer does this.
> Thus, peak memory usage will be proportional to the length of the longest
> line in the file.
> Thus, if I use the "-z multiline hack" to search across lines, grep will read
> the whole file into memory.
> Thus, if I try this on a huge file, I will likely have a bad time.
> (e.g., a 5 gig file would fail in a 32-bit grep, and would increase memory
> pressure on the system on a 64-bit grep.)
>
> Is the above about right?
Sounds right.
mmap likely wouldn't help much. As I recall, it typically made 'grep'
slower.
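The point about peak memory tracking the longest line can be illustrated with a toy model. This is a sketch of a generic record reader, not grep's actual buffering code; the function name, chunk size, and in-memory input are assumptions made for illustration. The reader keeps only the unfinished tail of the input between reads, so its buffer is bounded by the longest record. When the delimiter is NUL (as with -z) and the input contains no NUL bytes, the whole input is one record.

```python
import io


def peak_record_buffer(stream, delimiter=b"\n", chunk_size=4096):
    """Return the largest number of bytes held at once while splitting
    `stream` into delimiter-terminated records."""
    buf = bytearray()
    peak = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        peak = max(peak, len(buf))
        # Discard every complete record; keep only the unfinished tail.
        last = buf.rfind(delimiter)
        if last != -1:
            del buf[: last + len(delimiter)]
    return peak


if __name__ == "__main__":
    data = b"line\n" * 10000  # 50,000 bytes, no NUL bytes

    # Newline-delimited: the buffer stays close to one chunk.
    print(peak_record_buffer(io.BytesIO(data), b"\n"))

    # NUL-delimited: no delimiter is ever found, so the buffer
    # grows to the size of the whole input.
    print(peak_record_buffer(io.BytesIO(data), b"\0"))
```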