Re: How do parallel builds scale?

From: Ralf Wildenhues
Subject: Re: How do parallel builds scale?
Date: Fri, 4 Mar 2011 20:09:06 +0100
User-agent: Mutt/1.5.20 (2010-08-04)

* Ludovic Courtès wrote on Fri, Mar 04, 2011 at 06:59:45PM CET:
> Ralf Wildenhues writes:
> > * Ludovic Courtès wrote on Thu, Mar 03, 2011 at 04:42:52PM CET:
> >> I ran a series of build time measurements on a 32-core machine, with
> >> make -jX, with X in [1..32], and the results are available at:
> >> 
> >>
> >
> > Thank you!  Would you be so kind as to also describe what we see in
> > the graphs?  I'm sorry but I fail to understand what they are showing,
> > what the axes really mean, and how to interpret the results.
> Y is the number of packages with a speedup <= X.  Does it help?

Well, it helps in that it allows me to understand the graphs, but it
doesn't allow me to interpret them.  How about a histogram with x the
speedup (in some number of intervals) and y the number of packages
exhibiting that speedup?  That would IMVHO be easier to read.
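
To make the difference between the two views concrete, here is a minimal
sketch in Python.  The speedup values and the interval edges are made up
for illustration; they are not the measured data.

```python
from bisect import bisect_right


def cumulative_counts(speedups, xs):
    """For each x, count the packages with speedup <= x (the current graphs)."""
    ordered = sorted(speedups)
    return [bisect_right(ordered, x) for x in xs]


def histogram(speedups, edges):
    """Count the packages in each half-open interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for s in speedups:
        for i in range(len(counts)):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Hypothetical speedups, one per package (illustration only).
speedups = [1.0, 1.2, 1.9, 2.5, 2.6, 3.1, 7.8]
edges = [1, 2, 4, 8]

print(cumulative_counts(speedups, edges[1:]))
print(histogram(speedups, edges))
```

The histogram shows directly how many packages fall into each speedup
range, whereas the cumulative view has to be differenced by eye.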

> > A few of the packages (using an Autotest test suite: Autoconf, Bison)
> > would benefit from you passing TESTSUITEFLAGS=-jN to make.
> Oh, I didn’t know that.  So ‘make -jN’ isn’t enough for Autotest?

No.  Doing that right would require something like
which was never added.  (I don't actually remember whether it was
rejected or just not bug-free yet.)
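
Until then the flag has to be passed by hand.  A minimal sketch of a
helper that a build script could use; the function name is hypothetical,
and reusing the same N for make and for the test suite is just an
assumption, not a requirement.

```python
def check_argv(jobs):
    """Build a 'make check' command line that also parallelizes Autotest.

    -jN alone only parallelizes make's own targets; the Autotest test
    suite needs to be told separately through TESTSUITEFLAGS.
    """
    return ["make", f"-j{jobs}", "check", f"TESTSUITEFLAGS=-j{jobs}"]


print(" ".join(check_argv(8)))
```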

> > FWIW, parallelizability of Automake's own 'make check' has been improved
> > in the git tree (or so at least I hope).
> Yeah, and its ‘make check’ phase already scales relatively well.

Oh, I'm willing to bet we scale to 20 now.

> > I am fairly surprised GCC build times scaled so little.  IIRC I've seen
> > way higher numbers.  Is your I/O hardware adequate?
> I think so.  :-)

OK, so I guess I'd like some details as to how you build it.  Is
bootstrapping enabled (three-stage build), and which languages do you
build?

> > Did you use only -j or also -l for the per-package times?  (I would
> > recommend not using -l.)
> I actually used ‘-jX -lX’.  What makes you think -l shouldn’t be used?

Typically, -lX leads to waves in the load, due to the latency between
measurement and action, and of course the lag introduced by the
measurement interval.  There are long periods in which processes are
already done but the load is still reported as too high.
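
A rough way to see the size of that lag, as a sketch under stated
assumptions: the load figure is modelled as a Linux-style 1-minute load
average, i.e. an exponentially damped average updated every 5 seconds.
How and when make samples the load is not modelled here, and the
function name is made up.

```python
import math


def seconds_until_below(load, threshold, running=0, period=60.0, tick=5.0):
    """Seconds until a damped load average drops below `threshold`.

    `running` is the number of runnable processes that remain while the
    average decays; `period` and `tick` are the assumed averaging window
    and update interval.
    """
    if running >= threshold:
        raise ValueError("load can never fall below the threshold")
    decay = math.exp(-tick / period)
    elapsed = 0.0
    while load >= threshold:
        load = load * decay + running * (1.0 - decay)
        elapsed += tick
    return elapsed


# All jobs have finished, the reported load is 4.0, and -l2 is in effect.
print(seconds_until_below(4.0, 2.0))
```

Under these assumptions the machine sits idle for most of a minute
before the reported load permits new jobs, which is the trough of the
wave.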

> The main problem I’m interested in is continuous integration on a
> cluster.  When building a complete distro on a cluster, there’s
> parallelism to be exploited at the level of package composition (e.g.,
> build GCC and Glibc at the same time, each with N/2 cores), and
> parallelism within a build (‘make -jX’).
> Suppose you’ve scheduled GCC and Glibc on a 4-core machine, you want
> each of them to use 2 cores without stepping on each other’s toes.
> I think -l2 may help with this.

I understand the problem, but I don't think -lX is a real solution to
it.  First off, you want -l4 (or maybe even -l6) on a four-way system,
not -l2.  Then, you still have the waves as above.  It might be simpler
to just use -j3 for each of them.
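
As a minimal sketch of that simpler setup, assuming both packages are
built on the same four-way machine from a Python driver script: the
helper names and the build directories are hypothetical.

```python
import subprocess


def make_argv(jobs):
    """Command line for one package build, with -j only and no -l."""
    return ["make", f"-j{jobs}"]


def start_builds(build_dirs, jobs_each):
    """Start one make per build directory and return the processes."""
    return [subprocess.Popen(make_argv(jobs_each), cwd=d) for d in build_dirs]


# Example use (directories are placeholders):
#     procs = start_builds(["build/gcc", "build/glibc"], jobs_each=3)
#     exit_codes = [p.wait() for p in procs]
print(make_argv(3))
```

Two builds at -j3 slightly oversubscribe four cores, which helps keep
them busy while one of the builds is in a serial phase.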

Do you build GCC and Glibc in separate virtual containers?
(If yes, my -l4 suggestion above might be wrong.)

If not, we could think of providing a high-level job server that just
serves out job cookies to the makes of both projects.  It might require
a bit of adjustment to GNU make to do this in all sorts of interesting
distro build setups.  But it would allow for (more) effective load
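
To illustrate the idea of handing out job cookies from one shared pool,
here is a minimal sketch.  It is not GNU make's jobserver, only similar
in spirit: a pipe holds one byte per cookie, and threads stand in for
the makes of both projects.  All names are made up.

```python
import os
import threading
import time


class TokenPool:
    """A shared pool of job cookies kept as bytes in a pipe."""

    def __init__(self, tokens):
        self._r, self._w = os.pipe()
        os.write(self._w, b"+" * tokens)

    def acquire(self):
        os.read(self._r, 1)  # blocks until a cookie is available

    def release(self):
        os.write(self._w, b"+")  # hand the cookie back

    def close(self):
        os.close(self._r)
        os.close(self._w)


def run_jobs(pool, n_jobs, work):
    """Run n_jobs jobs, each holding a cookie; return the peak concurrency."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def job():
        pool.acquire()
        try:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            work()
        finally:
            with lock:
                state["running"] -= 1
            pool.release()

    threads = [threading.Thread(target=job) for _ in range(n_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


pool = TokenPool(4)  # four cookies for a four-way machine
peak = run_jobs(pool, 12, lambda: time.sleep(0.01))
pool.close()
print(peak)
```

Whichever project has work to do picks up the free cookies, so the
total never exceeds the pool size and no load measurement is involved.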

