From: Alexandre Oliva
Subject: [bug-gawk] 4.1.3->4.1.4 = Linux-libre's deblob-check grows huge and takes forever
Date: Thu, 13 Jul 2017 02:53:04 -0300
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Hi,

I've upgraded the root in which I create and verify GNU Linux-libre
tarballs from Fedora/Freed-ora 25 to 26, which brought gawk from 4.1.3 to
4.1.4.

With 4.1.3, the verification run used about 1GB of RAM and took some 15
minutes.

With 4.1.4, I gave up after 2 hours of CPU time, and the process was at
6GB and growing.

I saw a number of regexp changes in the gawk 4.1.3 to 4.1.4 diff, so I
took the Fedora 25 gawk binary (4.1.3); it is now running on the Fedora 26
root with the previous memory use.
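
For anyone who wants to put numbers on such a comparison, here is a minimal
sketch of measuring the CPU time and peak memory of a child command.  It is
not part of deblob-check; the helper name measure() and the file names in
the usage comment are hypothetical, and it assumes a Unix system where
Python's resource module is available.

```python
import resource
import subprocess
import sys


def measure(cmd):
    """Run cmd to completion; return (exit_status, cpu_seconds, peak_rss).

    CPU time is the user+system time added to RUSAGE_CHILDREN by this run.
    peak_rss is ru_maxrss for all children waited for so far, so use a fresh
    interpreter per measurement.  Linux reports it in kibibytes.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    status = subprocess.run(cmd, stdout=subprocess.DEVNULL).returncode
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return status, cpu, after.ru_maxrss


# Hypothetical usage, once per gawk binary under test:
#   measure(["/path/to/gawk", "-f", "generated.awk", "input.txt"])
```

Running it once against each binary, in separate interpreter processes,
gives figures that can be compared directly.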

The command I use to perform this check is:

deblob-check --use-awk linux-libre-4.12.tar.bz2

deblob-check and the tarball can be downloaded from
http://linux-libre.fsfla.org/pub/linux-libre/releases/4.12-gnu/

The script generates and runs a gawk script with monster regexps that
match known blobs, known false positives, and patterns that catch likely
blobs, and it's running that generated script that's taking up a lot of
RAM and time.
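
To illustrate the general idea of a "monster regexp", here is a rough
sketch that joins many patterns into one alternation and flags lines that
look like blobs unless a known-false-positive pattern also matches.  This
is only an assumption-laden illustration, not deblob-check's actual logic:
the function names and the sample patterns are hypothetical, and Python's
backtracking matcher does not reproduce gawk's DFA behaviour.

```python
import re


def build_monster(patterns):
    """Join patterns into a single alternation; match nothing if empty."""
    if not patterns:
        return re.compile(r"(?!)")  # empty negative lookahead never matches
    return re.compile("|".join("(?:%s)" % p for p in patterns))


def suspicious_lines(lines, blob_patterns, false_positive_patterns):
    """Return (line_number, line) pairs that look like blobs and are not
    covered by any false-positive pattern."""
    blobs = build_monster(blob_patterns)
    accepted = build_monster(false_positive_patterns)
    hits = []
    for lineno, line in enumerate(lines, 1):
        if blobs.search(line) and not accepted.search(line):
            hits.append((lineno, line))
    return hits


# Hypothetical patterns: a run of four hex bytes looks like a blob,
# unless the line carries a marker we have decided to accept.
sample = [
    "int x = 0;",
    "0x12, 0x34, 0x56, 0x78,",
    "/* table */ 0x12, 0x34, 0x56, 0x78,",
]
found = suspicious_lines(sample,
                         [r"(?:0x[0-9a-f]{2},\s*){4}"],
                         [r"/\* table \*/"])
```

With real pattern lists numbering in the thousands, the compiled
alternation becomes very large, which is where the choice of regexp
engine starts to dominate run time and memory.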

deblob-check can use sed, python or perl instead of gawk, but gawk used
to be the best choice for this final check, because of its low memory
use compared with sed and its DFA-based regexp matching, which is not
available in python and perl.  (For deblobbing proper, python turns out
to be better, thanks to the much lower start-up time when compiling the
monster regexp.)

I haven't checked whether gawk 4.1.4 still beats the memory efficiency
of sed, but sed was barely usable for this purpose back then, and gawk
4.1.4 is unfortunately turning out to be unusable too.

Any recommendations as to how we could avoid this huge performance
regression in gawk, short of switching to a different regexp processing
engine?

Thanks in advance,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


