Re: Library dependency files

From: Mike Shal
Subject: Re: Library dependency files
Date: Tue, 27 Apr 2010 11:45:48 -0400

On 4/27/10, Todd Showalter <address@hidden> wrote:
> On Mon, Apr 26, 2010 at 11:45 PM, Mike Shal <address@hidden> wrote:
>  > Just curious, what kind of performance increase do you actually see
>  > here? If I understand your solution correctly, instead of including N
>  > dependency files with M dependencies each, you want to include 1
>  > dependency file with N*M dependencies (per directory). I would've
>  > thought most of make's time is spent in parsing out that info into the
>  > DAG, rather than just in opening and closing the files. Is the time
>  > difference all in the system time (I think user time should be about
>  > the same)?
>     As a general rule of thumb, unless you're working on a pig slow
>  processor or a machine with high-end solid state storage these days,
>  finding the bytes and dragging them off the disk is the expensive
>  part.  A single file means two things; the disk doesn't have to do as
>  much seeking (assuming non-pathological filesystem fragmentation
>  levels, of course), and the OS can be clever about readahead.

True, but that really only happens once. I was assuming the timing was
done with a warm disk cache large enough to hold all the dependency
files and Makefiles - I should have stated that assumption. With a warm
cache, the underlying storage technology shouldn't matter.

I just tried the combining-dep-files technique on mplayer (building on
Linux), which has about 490 .d files: I just cat'd them all together
into one .d file and included that instead. With plain 'make', the
savings were about 7%, but running with 'make -Rr', the savings
measured at about 20%. Another thought - some of the time savings may
come from the fact that make treats each included file as a target it
may need to remake.
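For anyone who wants to try the same trick, the combining step itself is trivial. This is a toy sketch with made-up file names and contents (real .d files come from the compiler); the point is just that the Makefile then does a single `-include all.deps` instead of one include per object:

```shell
# Fake a couple of compiler-written .d files (hypothetical contents).
mkdir -p depdemo
printf 'foo.o: foo.c foo.h\n' > depdemo/foo.d
printf 'bar.o: bar.c bar.h\n' > depdemo/bar.d

# Concatenate them into the one file the Makefile would -include.
# (Named .deps rather than .d so it never matches its own input glob.)
cat depdemo/*.d > depdemo/all.deps
cat depdemo/all.deps
```

The Makefile side is then just `-include depdemo/all.deps` in place of `-include $(wildcard depdemo/*.d)`; the rules inside the file are identical either way.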

There is definitely some overhead in opening that many files, though.
I also tried a test program that just opens and reads the one combined
deps file, versus another that opens and reads all 490 files (so make
is not involved - no DAGs were harmed in the making of this program).
I ran each in a loop 1000 times to get a measurable duration, and the
single-file case was ~27x faster (0.081s vs 2.252s). Another way to
look at that, though, is that across 1000 make invocations the OS only
spends about 2 extra seconds on the overhead of multiple files.
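That micro-benchmark is easy to re-create in shell rather than C. This is a scaled-down sketch (50 files, 100 iterations, instead of the 490 files and 1000 iterations in the real run above) so it finishes quickly; the `time` lines are where the one-file/many-files difference shows up:

```shell
# Fabricate 50 small dep files (hypothetical contents).
mkdir -p bench
for i in $(seq 1 50); do
    printf 'obj%s.o: src%s.c common.h\n' "$i" "$i" > "bench/$i.d"
done
cat bench/*.d > bench/all.deps

# One open+read per iteration:
time sh -c 'for n in $(seq 1 100); do cat bench/all.deps > /dev/null; done'
# 50 opens+reads per iteration:
time sh -c 'for n in $(seq 1 100); do cat bench/*.d > /dev/null; done'
echo benchmark-done
```

The gap here will be smaller than the ~27x above, since each iteration also pays a process-spawn cost for `cat` that the C version didn't have, but the many-files loop should still be measurably slower.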

>     Readahead is potentially a big one.  If you read part of a file, a
>  lot of OSs will assume you'll probably want the rest and speculatively
>  read part or all of it before you ask it to.  Even for very large
>  projects, dependency files don't get very big (at least from the
>  filesystem's point of view), so there's a good chance that the whole
>  file will get read in quickly.  The OS typically won't speculatively
>  open new files, though, so if your dependencies are scattered around
>  in multiple files you lose some of the benefit of readahead.

That's a good point, I didn't consider readahead.

>     All of that said, I've never really noticed significant overhead
>  from make itself.  Dependency generation is often expensive, but at
>  least on my projects (even the large ones) all of the time is spent
>  actually inside the dependency generator.  So it's entirely possible
>  that it will all be moot.

What do you mean by the "dependency generator" here? Do you have
something that scans all the source files for their dependencies every
time you run make? Or do you have them generated as a side-effect of
compilation like in Philip's case? I think in Philip's setup, after
everything is built, just running 'make' should parse the Makefiles
(along with included .d files), then run through the DAG without
executing anything. No dependency generation would be done, since no
files are recompiled.

