Re: [bug-gawk] Memory leak


From: Stephane Delsert
Subject: Re: [bug-gawk] Memory leak
Date: Wed, 29 Mar 2017 16:35:34 +0000

Hi,

I compiled gawk (4.1.4) without the -O2 option and added the additional code
to the code.c source file.
The first attached file, REPORT_4_20MM.txt, was created without those
modifications, on a file of 20MM records.

The two other reports are the results of processing 1MM and 2MM records
with the additional messages enabled.

I launched the following process to run overnight on 100MM records:
zcat sdelse1.pip.gz.5t | tail -n +2 | head -100000000 | 
~/app/devl/bin/bin/valgrind --leak-check=full ~/app/devl/source/gawk-4.1.4/gawk 
-F '|' -f test.awk 2> REPORT_100MM.txt > /dev/null

At the same time, I am logging memory usage to a file every 30 minutes.
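
A minimal sketch of this kind of periodic logging, assuming a `ps` that accepts `-o rss=` (procps and the BSD `ps` do). The function names `log_rss` and `watch_rss` and the log format are illustrative only, not the script actually used for this run.

```shell
#!/bin/sh
# Append one "timestamp pid rss" sample for process $1 to log file $2.
# Returns non-zero once the process can no longer be found.
log_rss() {
    pid=$1
    logfile=$2
    # "rss=" prints the resident set size (usually in KiB) without a header.
    rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
    [ -n "$rss" ] || return 1
    printf '%s %s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$pid" "$rss" >> "$logfile"
}

# Sample every $3 seconds (default 1800, i.e. 30 minutes) until the process exits.
watch_rss() {
    interval=${3:-1800}
    while log_rss "$1" "$2"; do
        sleep "$interval"
    done
}
```

Running `watch_rss PID mem.log &` next to the long job, with PID being the process to observe, gives one line per sample. A resident size that keeps rising while the input is processed record by record would support the leak hypothesis.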

I hope that will be useful.

Thanks and best regards,

Stéphane.




-----Original Message-----
From: Andrew J. Schorr [mailto:address@hidden]
Sent: Tuesday, March 28, 2017 21:15
To: address@hidden
Cc: Vihan_Sharma - Vihan Sharma (LiveRamp) <address@hidden>; Stephane Delsert 
<address@hidden>; Fatima Aliane <address@hidden>; address@hidden
Subject: Re: [bug-gawk] Memory leak

On Tue, Mar 28, 2017 at 12:21:22PM -0600, address@hidden wrote:
> Thanks. I note that there are no 'definitely lost' leaks. I tend to 
> mistrust the 'possibly lost' reports as false positives, but I'm 
> willing to put some time in reviewing the code.

Agreed. I guess the question is whether we are somehow leaking NODE or BUCKET 
structures. I applied the attached patch to more_blocks to show when 
allocations occur, but I couldn't learn anything definitive from such a small 
dataset.

For a single input record, it allocates 800 NODEs and 400 BUCKETs:

bash-4.2$ tail -n +2 sample4gnu.pip | head -1 | gawk -f test.awk  | wc
debug: more_blocks(1) allocated 100; total 100
debug: more_blocks(2) allocated 100; total 100
debug: more_blocks(1) allocated 100; total 200
debug: more_blocks(1) allocated 100; total 300
debug: more_blocks(2) allocated 100; total 200
debug: more_blocks(1) allocated 100; total 400
debug: more_blocks(1) allocated 100; total 500
debug: more_blocks(1) allocated 100; total 600
debug: more_blocks(2) allocated 100; total 300
debug: more_blocks(1) allocated 100; total 700
debug: more_blocks(2) allocated 100; total 400
debug: more_blocks(1) allocated 100; total 800
      1       1      78

For the entire 344-record input file, it allocates 1200 NODEs and still 400
BUCKETs:

bash-4.2$ tail -n +2 sample4gnu.pip |  gawk -f test.awk  | wc
debug: more_blocks(1) allocated 100; total 100
debug: more_blocks(2) allocated 100; total 100
debug: more_blocks(1) allocated 100; total 200
debug: more_blocks(1) allocated 100; total 300
debug: more_blocks(2) allocated 100; total 200
debug: more_blocks(1) allocated 100; total 400
debug: more_blocks(1) allocated 100; total 500
debug: more_blocks(1) allocated 100; total 600
debug: more_blocks(2) allocated 100; total 300
debug: more_blocks(1) allocated 100; total 700
debug: more_blocks(2) allocated 100; total 400
debug: more_blocks(1) allocated 100; total 800
debug: more_blocks(1) allocated 100; total 900
debug: more_blocks(1) allocated 100; total 1000
debug: more_blocks(1) allocated 100; total 1100
debug: more_blocks(1) allocated 100; total 1200
    344     344  103280
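
The per-type totals can also be extracted mechanically instead of being read off the transcript. This is a minimal sketch, assuming the debug lines keep exactly the format shown above, that they are written to stderr (they appear even though stdout is piped to wc), and that id 1 is the NODE pool and id 2 the BUCKET pool, as the totals quoted above imply. The function name `summarize_blocks` is made up for illustration.

```shell
#!/bin/sh
# Reduce more_blocks debug lines on stdin to the highest running total per id.
summarize_blocks() {
    awk '
        /^debug: more_blocks\(/ {
            id = $2
            sub(/^more_blocks\(/, "", id)   # "more_blocks(1)" becomes "1)"
            sub(/\)$/, "", id)              # "1)" becomes "1"
            total = $NF + 0                 # last field is the running total
            if (total > max[id]) max[id] = total
        }
        END {
            # Assumption: id 1 is the NODE pool, id 2 is the BUCKET pool.
            printf "NODE %d\nBUCKET %d\n", max[1], max[2]
        }
    '
}
```

For example, `tail -n +2 sample4gnu.pip | gawk -f test.awk 2>&1 >/dev/null | summarize_blocks` sends only the debug stream into the summary, which makes it easy to compare totals across input sizes.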

Regards,
Andy

Attachment: REPORT_4_20MM.txt
Attachment: REPORT_1MM.txt
Attachment: REPORT_2MM.txt

