Re: How do parallel builds scale?


From: Ludovic Courtès
Subject: Re: How do parallel builds scale?
Date: Mon, 07 Mar 2011 18:16:17 +0100
User-agent: Gnus/5.110013 (No Gnus v0.13) Emacs/23.2 (gnu/linux)

Hi Ralf,

Ralf Wildenhues <address@hidden> writes:

> * Ludovic Courtès wrote on Fri, Mar 04, 2011 at 06:59:45PM CET:
>> Ralf Wildenhues writes:
>> > * Ludovic Courtès wrote on Thu, Mar 03, 2011 at 04:42:52PM CET:
>> >> I ran a series of build time measurements on a 32-core machine, with
>> >> make -jX, with X in [1..32], and the results are available at:
>> >> 
>> >>   http://hubble.gforge.inria.fr/parallel-builds.html
>> >
>> > Thank you!  Would you be so kind and also describe what we see in the
>> > graphs?  I'm sorry but I fail to understand what they are showing, what
>> > the axes really mean, and how to interpret the results.
>> 
>> Y is the number of packages with a speedup <= X.  Does it help?
>
> Well, it helps in that it allows me to understand the graphs, but it
> doesn't allow me to interpret them.  How about a histogram with x the
> speedup (in some number of intervals) and y the number of packages
> exhibiting that speedup?  That would IMVHO be easier to read.

I’ve added that.

FWIW I think the cumulative plots make sense when trying to answer the
question “how many packages have a speedup <= X”.
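
As a rough illustration of the two views discussed above, the sketch below
computes both from the same list of speedups. The package names and numbers
are made up for the example (they are not the measured data), and speedup is
assumed to mean T(-j1) / T(-jX).

```python
from collections import Counter

# Hypothetical per-package speedups, assumed to be T(-j1) / T(-jX).
# These values are invented for illustration, not taken from the measurements.
speedups = {"pkg-a": 1.0, "pkg-b": 1.4, "pkg-c": 1.9, "pkg-d": 2.6, "pkg-e": 7.5}


def cumulative_count(values, x):
    """Number of packages with a speedup <= x (the Y of the cumulative plot)."""
    return sum(1 for v in values if v <= x)


def histogram(values, bin_width=1.0):
    """Number of packages per speedup interval [k*w, (k+1)*w)."""
    counts = Counter(int(v // bin_width) for v in values)
    return {(k * bin_width, (k + 1) * bin_width): n
            for k, n in sorted(counts.items())}


values = list(speedups.values())

# Cumulative view: "how many packages have a speedup <= X?"
for x in (1, 2, 4, 8):
    print(f"speedup <= {x}: {cumulative_count(values, x)} packages")

# Histogram view: "how many packages fall in each speedup interval?"
for (lo, hi), n in histogram(values).items():
    print(f"[{lo:.0f}, {hi:.0f}): {n} packages")
```

The cumulative count answers the "<= X" question directly, while the
histogram shows where the bulk of the packages sits.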

[...]

>> > I am fairly surprised GCC build times scaled so little.  IIRC I've seen
>> > way higher numbers.  Is your I/O hardware adequate?
>> 
>> I think so.  :-)
>
> OK, so I guess I'd like some details as to how you build it.  Is
> bootstrapping enabled (three-stage build),

Yes—that’s the default according to the manual (info "(gccinstall)
Configuration").

> which languages do you build?

 - C and C++ for ‘gcc-4.5’;

 - Fortran only for ‘gfortran-4.5’, though I suspect
   --enable-languages=fortran implies --enable-languages=c.

Unfortunately there’s no data for GNAT and GCJ.

>> > Did you use only -j or also -l for the per-package times (I would
>> > recommend to not use -l).
>> 
>> I actually used ‘-jX -lX’.  What makes you think -l shouldn’t be used?
>
> Typically, -lX leads to waves in the load, due to the latency between
> measure and action, and of course the retardation from the measure
> interval.  There are long periods in which processes are already done
> but the load is still listed as too high.

Right, I see.

I don’t think it hindered scalability though, since the measurements
show that few packages reach a speedup beyond 2, even with ‘-j32 -l32’.

>> The main problem I’m interested in is continuous integration on a
>> cluster.  When building a complete distro on a cluster, there’s
>> parallelism to be exploited at the level of package composition (e.g.,
>> build GCC and Glibc at the same time, each with N/2 cores), and
>> parallelism within a build (‘make -jX’).
>> 
>> Suppose you’ve scheduled GCC and Glibc on a 4-core machine, you want
>> each of them to use 2 cores without stepping on each other’s toes.
>> I think -l2 may help with this.
>
> I understand the problem, but I don't think -lX is a real solution to
> it.  First off, you want -l4 (or maybe even -l6) on a four-way system,
> not -l2.

Yes, here I always used ‘-jX -lX’.

> Then, you still have the waves as above.  It might be simpler to just
> use -j3 for each of them.
>
> Do you build GCC and Glibc in separate virtual containers?

If you mean Linux containers, no.

> If not, we could think of providing a high-level job server that just
> serves out job cookies to the makes of both projects.  It might require
> a bit of adjustment to GNU make to do this in all sorts of interesting
> distro build setups.  But it would allow for (more) effective load
> usage.

Yeah.
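
To make the shared job-server idea concrete, here is a toy sketch: two
concurrent "builds" draw job tokens from one common pool, so together they
never run more jobs than there are tokens. Everything in it is an assumption
made for illustration (the token count, the `build` and `run_job` helpers,
the sleep standing in for a compile step); it is not GNU make's jobserver
and does not talk to make at all.

```python
import threading
import time

TOTAL_TOKENS = 4  # assumption: one job token per core on a 4-core machine

# Shared pool of job tokens, standing in for the high-level job server.
tokens = threading.BoundedSemaphore(TOTAL_TOKENS)

# Bookkeeping used only to observe how many jobs run at once.
lock = threading.Lock()
running = 0
peak = 0


def run_job(project, job_id):
    """Hypothetical build job: runs only while holding a token."""
    global running, peak
    with tokens:  # block until a token is free, release it when done
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for a compile step
        with lock:
            running -= 1


def build(project, n_jobs):
    """Hypothetical 'make' for one project: every job waits for a token."""
    workers = [threading.Thread(target=run_job, args=(project, i))
               for i in range(n_jobs)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


# Two projects built at the same time, sharing the same token pool.
builds = [threading.Thread(target=build, args=(name, 8))
          for name in ("gcc", "glibc")]
for b in builds:
    b.start()
for b in builds:
    b.join()

print("peak concurrent jobs:", peak)
```

Unlike a fixed ‘-j2’ per project, a build that has runnable jobs can pick up
tokens the other one is not using at that moment.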

Thanks,
Ludo’.


