coreutils

parallel sort at fault? [Re: [PATCH] tests: avoid gross inefficiency...


From: Jim Meyering
Subject: parallel sort at fault? [Re: [PATCH] tests: avoid gross inefficiency...
Date: Wed, 09 Feb 2011 18:29:20 +0100

Jim Meyering wrote:
> Running "make -j25 check" on a nominal-12-core F14 system would
> cause serious difficulty leading to an OOM kill -- and this is brand new.
> It worked fine yesterday.  I tracked it down to all of the make processes
> working on the "built_programs.list" (in src/Makefile.am) rule
>
> built_programs.list:
>       @echo $(bin_PROGRAMS) $(bin_SCRIPTS) | tr ' ' '\n' \
>         | sed -e 's,$(EXEEXT)$$,,' | $(ASSORT) -u | tr '\n' ' '
>
> Which made me realize we were running that submake over 400 times,
> once per test script (including skipped ones).  That's well worth
> avoiding, even if it means a new temporary file.
>
> I don't know the root cause of the OOM-kill (preceded by interminable
> minutes of a seemingly hung and barely responsive system) or why it started
> happening today (afaics, none of the programs involved was updated),
> but this does fix it...

FYI,
I've tracked this down a little further.
The horrid performance (hung system and eventual OOM-kill)
is related to the use of sort above.  This is the definition:

    ASSORT = LC_ALL=C sort

If I revert my earlier patch and instead simply
insist that sort not do anything in parallel,

    ASSORT = LC_ALL=C sort --parallel=1

then there is no hang, and things finish in relatively good time.
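
For anyone who wants to poke at that pipeline outside of make, here is a
rough sketch of the same rule with sort held to a single thread.  It is
not a reproducer of the hang.  The program names and the EXEEXT value are
made-up stand-ins for $(bin_PROGRAMS), $(bin_SCRIPTS) and $(EXEEXT), and it
assumes a GNU sort new enough to accept --parallel.

```shell
#!/bin/sh
# Hypothetical stand-ins for $(bin_PROGRAMS) $(bin_SCRIPTS) and $(EXEEXT).
programs='ls.exe cat.exe sort.exe ls.exe tr.exe'
EXEEXT=.exe

# Same pipeline as the built_programs.list rule, but with sort
# restricted to one thread via --parallel=1 (GNU sort only).
list=$(echo $programs | tr ' ' '\n' \
  | sed -e "s,${EXEEXT}\$,," | LC_ALL=C sort --parallel=1 -u | tr '\n' ' ')

# Deduplicated, C-locale-sorted names with the suffix stripped.
echo "$list"
```

Dropping --parallel=1 from the sort invocation gives the original
definition, which makes it easy to compare the two side by side.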

I don't have a good stand-alone reproducer yet
and am out of time for today.


