parallel 'automake' execution [0/4]

From: Ralf Wildenhues
Subject: parallel 'automake' execution [0/4]
Date: Sun, 26 Oct 2008 19:57:14 +0100
User-agent: Mutt/1.5.18 (2008-05-17)


Joakim (and others) have complained that automake has become slow
(over the course of versions 1.4 through 1.10).

Well, I took a quick look, and didn't really see much in the way of
low-hanging fruit for optimization; some minor items, but nothing
really standing out and worth throwing effort at.  There are
quite a few changes that caused slow-downs, but they typically add
new functionality, provide better abstractions, or similar, all stuff
that I'd be hard pressed to modify a lot.

So I looked (not quickly) at parallelization using Perl threads.  When
a project uses many makefiles, it should be possible to update most
of them concurrently.  Issues requiring serialization:

* warning and error output varies in order, there are un-eliminated
  duplicate warnings, and the warning message content is unstable,

* the makefile that distributes the scripts in the build-aux directory
  (AC_CONFIG_AUX_DIR) must wait for all others to have installed
  their files there (with --add-missing), to pick all of them up
  (if automake ever installs files elsewhere, this has to be revisited),

* multiple threads trying to install aux directory files at once may
  interfere (this is again mostly an unstable message content issue).

I will reply to this message with a series of patches implementing
thread support that aims to solve these issues.  I'm sure to have
overlooked some more, so feedback is highly desirable.
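
As a rough illustration of the ordering constraint on the aux directory
(in Python rather than Perl, and not taken from the patches): every
makefile except the one that distributes the aux files is processed
concurrently, installs are guarded by a lock, and the distributing
makefile runs only after all workers have finished.  The names
generate, run, and aux_owner are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def generate(makefile, aux_files, lock):
    """Hypothetical stand-in for processing one makefile with --add-missing."""
    with lock:  # serialize installs into the shared aux directory
        aux_files.add(f"{makefile}.helper")
    return makefile

def run(makefiles, aux_owner, jobs):
    """Process everything except aux_owner in parallel, then aux_owner last."""
    aux_files, lock = set(), threading.Lock()
    others = [m for m in makefiles if m != aux_owner]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Consuming the iterator waits for every worker to finish.
        list(pool.map(lambda m: generate(m, aux_files, lock), others))
    # Only now can the aux-directory makefile see every installed file.
    return aux_owner, sorted(aux_files)
```

The cost of this arrangement is that the final makefile is handled
serially, after everything else is done.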

Issues IMO not requiring serialization:

* fatal errors; all they need is to still be recognized as fatal errors,
  and still be output in full.

* verbose output; it is intended to show what's really going on, so will
  necessarily be volatile with multiple threads.

Example timings: OpenMPI, 211 makefiles (yes, I am trying to be
favorable) on an 8-way system (kudos to the GCC compile farm; the last
entry was run with 12 threads):

jobs    time    efficiency
1       36.31
2       18.99   0.96
4       11.44   0.79
8 (12)   7.63   0.59
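
The efficiency column appears consistent with serial time divided by
(jobs times parallel time).  A small check in Python, assuming that
definition and using only the numbers from the table:

```python
serial = 36.31                               # time with 1 job
timings = {2: 18.99, 4: 11.44, 8: 7.63}      # jobs -> time, from the table

def efficiency(serial_time, parallel_time, jobs):
    """Speedup over serial execution, divided by the number of jobs."""
    return serial_time / (parallel_time * jobs)

table = {jobs: round(efficiency(serial, t, jobs), 2)
         for jobs, t in timings.items()}
print(table)
```

Note that the last row is computed against 8 jobs, matching the table,
even though that run used 12 threads.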

For smaller packages which don't have many or large files,
the maximum speedup over serial execution is smaller, of course; e.g.,
coreutils is at about 25%.

Parallelization is not perfect, because:

* the scanning of and *.m4, and the treatment of the last
  makefile are not parallel,
* Perl ithreads are quite expensive beasts,
* the necessary serialization has high overhead (more efficient Perl
  primitives would probably help).

The last point may not be helped by the fact that I wanted output to
have progress (i.e., the master should output messages as they become
available, not just after all files are done), for better interactive
user experience.
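
One possible way to get both stable output and progress, sketched in
Python and not necessarily what the patches do: the master walks the
per-file results in submission order, so it blocks only until the next
file in line is done, prints that file's messages, and drops duplicates
it has already shown.  check_makefile and emit_in_order are hypothetical
names, and the messages are made up.

```python
from concurrent.futures import ThreadPoolExecutor

def check_makefile(name):
    """Hypothetical worker: returns the warnings produced for one makefile."""
    return [f"{name}: warning: example",
            "shared.am: warning: seen by every makefile"]

def emit_in_order(makefiles, jobs, out):
    """Emit messages in input order as soon as each file is finished."""
    seen = set()
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(check_makefile, m) for m in makefiles]
        for fut in futures:            # submission order, not completion order
            for msg in fut.result():   # blocks only until this file is done
                if msg not in seen:    # eliminate duplicate warnings
                    seen.add(msg)
                    out(msg)
```

Calling emit_in_order(files, 4, print) would print warnings while later
files are still being processed, at the price of the master having to
wait on each file in turn.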

Threaded execution is enabled when ithreads are available, suitably many
makefiles are to be processed, and when the user has set the environment
variable AUTOMAKE_JOBS to a positive number.  I'm not yet certain about
the final API to provide, so here's my rationale to use this rather than
a --jobs argument (or both):

automake is run a lot from a command line and from autoreconf, but just
as well from a bootstrap/autogen script, or triggered by the rebuilding
rules hardcoded in files.  Now, where would --jobs arguments
be stored?
* passing AUTOMAKE='automake --jobs...' to autoreconf is ok but not
* likewise for scripts,
* rebuilding rules should not contain them, because their optimal value
  is inherently system-dependent, and should not be used on other
  systems.
Providing both an environment variable and a command line
argument would lead to the question of precedence.  So my line of
thinking is that storing an appropriate value per-system, say in
~/.bashrc, seems like a good idea.  Of course, one downside of
environment variables is their non-obviousness for debugging.

Ideally, of course, no explicit switch would be needed at all;
however, I haven't found a portable Perl module that figures out
the number of processors/cores on a system.  Also, the user should
have a lever to turn off threaded execution in case of unexpected bugs.
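
The enabling conditions described earlier (threads available, suitably
many makefiles, AUTOMAKE_JOBS set to a positive number) can be sketched
as follows.  This is Python for illustration only; the threshold
MIN_MAKEFILES, the function name worker_count, and the cap at the number
of makefiles are assumptions, since the message does not specify them.

```python
import os

MIN_MAKEFILES = 2  # hypothetical; the message only says "suitably many"

def worker_count(environ, n_makefiles, threads_available):
    """Return the number of workers to use; 1 means serial execution."""
    raw = environ.get("AUTOMAKE_JOBS", "")
    try:
        jobs = int(raw)
    except ValueError:
        return 1                      # unset or not a number: stay serial
    if jobs < 1 or not threads_available or n_makefiles < MIN_MAKEFILES:
        return 1
    return min(jobs, n_makefiles)     # assumption: never more workers than files

# Example: worker_count(os.environ, 211, True)
```

Leaving the variable unset, or setting it to 0, keeps execution serial,
which gives the user the lever to turn threading off.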

Review and testing much appreciated; I am aware of the ugliness of some
of the changes, from an internal details POV.  Anything weird in output
or even differences in files are bugs that need fixing.

