bug-automake

bug#12620: Parallel tests vs fast tests (and beyond)


From: Reuben Thomas
Subject: bug#12620: Parallel tests vs fast tests (and beyond)
Date: Wed, 10 Oct 2012 20:53:46 +0100

With the recent work on parallel tests in automake I thought it was
time to give them a spin, so I did, for the "zee" branch of GNU Zile.
This has about 100 tests, the total wall clock time being around 8s on
my 2.5GHz 4-core Sandy Bridge machine, with the following target:

 check-local: $(builddir)/src/zee
        echo $(TESTS) | $(LUA_ENV) $(LUA_TESTS_ENVIRONMENT) xargs $(RUNLUATESTS)

So, I rewrote it as:

LOG_COMPILER = $(LUA_ENV) $(LUA_TESTS_ENVIRONMENT) $(RUNLUATESTS)

and lo! it ran in 2/3 of the time. Not bad for a few minutes' work.
But then I looked closer, and noticed that although the wall-clock
time had gone down, the user+system time had almost doubled. Oops: I
was only winning because I had four cores. Developers using
single-core or dual-core machines might well lose out with parallel
tests.

The test harness is a Lua script, so the parallel tests were starting
an extra Lua interpreter per test. This is not a huge surprise, but it
is a pity.

In the end, I found I got a much bigger speed up (down to under 3s), with:

check-local: $(builddir)/src/zee
        NPROC=`nproc`; \
        echo $(LUA_TESTS) | $(LUA_ENV) $(LUA_TESTS_ENVIRONMENT) xargs \
          --max-procs=$$NPROC \
          --max-args=$$(( `echo $(LUA_TESTS) | wc -w` / $$NPROC + 1 )) \
          $(RUNLUATESTS) > /dev/null
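
To spell out the batching arithmetic in that recipe: each xargs worker is
handed roughly (number of tests / number of cores) + 1 test files, so the
harness is started about once per core rather than once per test. A minimal
sketch in plain shell, where the function name batch_size is purely
illustrative and not part of Zile or automake:

```shell
#!/bin/sh
# batch_size TOTAL NPROC
# Hypothetical helper: prints how many test files each runner process
# receives, using the same integer arithmetic as the recipe
# (TOTAL / NPROC + 1).
batch_size() {
    echo $(( $1 / $2 + 1 ))
}

# Roughly the situation described in the message: about 100 tests, 4 cores.
batch_size 100 4
```

With about 100 tests and 4 cores, that gives 26 tests per invocation, so
xargs starts four runner processes in total.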

That would be relatively straightforward to make portable (or rather,
to make bail out if non-GNU xargs is used), but it's still much clumsier
than parallel-tests, and of course it uses the old serial-tests. (I'm working
with automake 1.11.6, in case it matters.) The implication of the 1.12
release notes is that serial-tests will be dropped at some point
post-1.13.
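
As a rough idea of what the "bail out if non-GNU xargs is used" check could
look like, here is a sketch in plain shell. It assumes that GNU xargs answers
--version with a banner containing the string "GNU", and that other
implementations either reject the option or print something else; the
function name have_gnu_xargs is made up for this example.

```shell
#!/bin/sh
# have_gnu_xargs
# Hypothetical helper: succeeds only if the xargs found on PATH reports
# itself as GNU. stdin is redirected so xargs never waits for input, and
# stderr is discarded because non-GNU versions may print a usage message.
have_gnu_xargs() {
    xargs --version </dev/null 2>/dev/null | grep -q 'GNU'
}

if have_gnu_xargs; then
    echo "GNU xargs found: --max-procs and --max-args are available"
else
    echo "non-GNU xargs: skipping the batched parallel run" >&2
fi
```

In a Makefile recipe, the else branch could fall back to the plain serial
xargs invocation instead of failing outright.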

This whole problem is an instance of a more general problem, of which
another example is parallel make: for best performance, it should
probably batch up calls to gcc, for example, so that multiple source
files are compiled by each invocation.

I can imagine a view that we're rapidly increasing the number of
cores, so in fact there's little point working on these intermediate
solutions, but it seems to me that it's not just less-well-off
developers who could benefit, but also compile farms.

Comments? Solutions I've missed?

-- 
http://rrt.sc3d.org
