From: Andrew J. Schorr
Subject: Re: [bug-gawk] 4.1.3->4.1.4 = Linux-libre's deblob-check grows huge and takes forever
Date: Thu, 13 Jul 2017 14:20:28 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

His message included this URL:
http://linux-libre.fsfla.org/pub/linux-libre/releases/4.12-gnu/
I grabbed
http://linux-libre.fsfla.org/pub/linux-libre/releases/4.12-gnu/deblob-check
and
http://linux-libre.fsfla.org/pub/linux-libre/releases/4.12-gnu/linux-libre-4.12-gnu.tar.bz2
I started to bisect. In the master branch, I think commit 3a15491 is good.
I have 436 revisions left...
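
Since each iteration is slow, the remaining steps could be left to "git bisect run" with a time limit. What follows is only a rough sketch, not something that was run for this thread. The file name bisect-step.sh, the variables LIMIT, DEBLOB_CHECK, TARBALL and RUN_STEP, and both function names are made up for illustration. It also assumes that deblob-check exits 0 on a clean tarball and that it picks up the freshly built gawk from PATH; neither has been verified here.

```shell
#!/bin/sh
# bisect-step.sh: hypothetical helper for "git bisect run" (all names are assumptions).

# Seconds to wait before calling a revision bad. gawk 4.1.3 needed about
# 7 minutes in the timings quoted below, so 10 minutes leaves some headroom.
LIMIT=${LIMIT:-600}
# Assumed locations of the downloaded script and tarball; adjust as needed.
DEBLOB_CHECK=${DEBLOB_CHECK:-../deblob-check}
TARBALL=${TARBALL:-../linux-libre-4.12-gnu.tar.bz2}

# Translate the exit status of timeout(1) into what "git bisect run" expects:
# 0 means good, 125 means skip, other codes from 1 to 127 mean bad.
bisect_code() {
    case "$1" in
        0)   echo 0 ;;    # finished within the limit: good
        124) echo 1 ;;    # timeout(1) gave up on the run: bad
        *)   echo 125 ;;  # anything else is inconclusive: skip
    esac
}

# Build the checked-out gawk revision, then time-box one deblob-check run.
check_revision() {
    # A revision that does not build cannot be judged, so skip it.
    ./configure >/dev/null 2>&1 && make >/dev/null 2>&1 || return 125
    # Assumption: deblob-check finds the new gawk binary via PATH.
    PATH="$PWD:$PATH" timeout "$LIMIT" \
        "$DEBLOB_CHECK" --use-awk "$TARBALL" >/dev/null 2>&1
    status=$?
    return "$(bisect_code "$status")"
}

# Do the real work only when asked to, so the functions above can be
# sourced and tried out separately.
if [ "${RUN_STEP:-0}" = 1 ]; then
    check_revision
    exit $?
fi
```

It would be started with something like "git bisect start master 3a15491" followed by "git bisect run env RUN_STEP=1 sh ./bisect-step.sh". Halving 436 revisions takes about 9 more steps (2^9 = 512), so at up to 10 minutes per step the whole thing should finish in roughly an hour and a half.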
-Andy
On Thu, Jul 13, 2017 at 11:49:51AM -0600, address@hidden wrote:
> Where can I get the files from?
>
> Thanks,
>
> Arnold
>
> "Andrew J. Schorr" <address@hidden> wrote:
>
> > Hi,
> >
> > I grabbed the file. It's broken both in master and in stable.
> >
> > gawk 4.1.3:
> >
> > bash-4.2$ /bin/time ./deblob-check --use-awk linux-libre-4.12-gnu.tar.bz2
> > 472.19user 96.71system 6:59.57elapsed 135%CPU (0avgtext+0avgdata 876968maxresident)k
> > 0inputs+0outputs (0major+63559391minor)pagefaults 0swaps
> >
> > Master branch (I ctrl-c'ed after 13 minutes):
> >
> > bash-4.2$ /bin/time ./deblob-check --use-awk linux-libre-4.12-gnu.tar.bz2
> > ^C813.84user 17.63system 13:23.65elapsed 103%CPU (0avgtext+0avgdata 1122292maxresident)k
> > 0inputs+0outputs (0major+11885984minor)pagefaults 0swaps
> >
> > Stable branch (I ctrl-c'ed after 13 minutes):
> >
> > bash-4.2$ /bin/time ./deblob-check --use-awk linux-libre-4.12-gnu.tar.bz2
> > ^C828.07user 23.17system 13:35.58elapsed 104%CPU (0avgtext+0avgdata 1590252maxresident)k
> > 456inputs+72outputs (4major+15541814minor)pagefaults 0swaps
> >
> > Kind of a pain to bisect, since each iteration will be so slow. I haven't
> > tried yet.
> >
> > -Andy
> >
> > On Thu, Jul 13, 2017 at 01:42:21AM -0600, address@hidden wrote:
> > > If neither of those is any better, then let's work offline to isolate
> > > when things broke. "git bisect" is quite good at that. :-) If possible,
> > > I'd prefer to fix the problem instead of leaving things alone.
> > >
> > > Thanks,
> > >
> > > Arnold
> > >
> > > address@hidden wrote:
> > >
> > > > Hi.
> > > >
> > > > Can you try building from the gawk-4.1-stable branch in the git repo
> > > > and let me know if you still have the problem?
> > > >
> > > > I'm also curious if you build from master in the repo what happens.
> > > >
> > > > Thanks,
> > > >
> > > > Arnold
> > > >
> > > > Alexandre Oliva <address@hidden> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > I've upgraded the root in which I create and verify GNU Linux-libre
> > > > > > tarballs from Fedora/Freed-ora 25 to 26, which brought gawk from
> > > > > > 4.1.3 to 4.1.4.
> > > > >
> > > > > With 4.1.3, it used about 1GB of RAM and took some 15 minutes to run.
> > > > >
> > > > > > With 4.1.4, I gave up after 2 hours of CPU time, and the process was
> > > > > > at 6GB and growing.
> > > > >
> > > > > > I saw a number of regexp changes in the gawk 4.1.3-4.1.4 diff, so I
> > > > > > took the Fedora 25 binary and it's running on the Fedora 26 root
> > > > > > with the previous memory use.
> > > > >
> > > > > The command I use to perform this check is:
> > > > >
> > > > > deblob-check --use-awk linux-libre-4.12.tar.bz2
> > > > >
> > > > > deblob-check and the tarball can be downloaded from
> > > > > http://linux-libre.fsfla.org/pub/linux-libre/releases/4.12-gnu/
> > > > >
> > > > > > The script generates and runs a gawk script with monster regexps that
> > > > > > match known blobs, known false positives, and patterns that catch
> > > > > > likely blobs, and it's running that generated script that's taking
> > > > > > up a lot of RAM and time.
> > > > >
> > > > > > deblob-check can use sed, python or perl instead of gawk, but gawk
> > > > > > used to be the best choice for this final checking, because of the
> > > > > > low memory use compared with sed, and the DFA-based regexp not
> > > > > > available in python and perl. (for deblobbing proper, python turns
> > > > > > out to be better due to the much lower start-up time compiling the
> > > > > > monster regexp)
> > > > >
> > > > > I haven't checked whether gawk 4.1.4 still beats the memory efficiency
> > > > > of sed, but sed was barely usable for this purpose back then, and gawk
> > > > > 4.1.4 is unfortunately turning out to be unusable too.
> > > > >
> > > > > > Any recommendations as to how we could avoid this huge performance
> > > > > > regression in gawk, short of switching to a different regexp
> > > > > > processing engine?
> > > > >
> > > > > Thanks in advance,
> > > > >
> > > > > --
> > > > > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
> > > > > You must be the change you wish to see in the world. -- Gandhi
> > > > > Be Free! -- http://FSFLA.org/ FSF Latin America board member
> > > > > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
--
Andrew Schorr e-mail: address@hidden
Telemetry Investments, L.L.C. phone: 917-305-1748
545 Fifth Ave, Suite 1108 fax: 212-425-5550
New York, NY 10017-3630