
[bug #42288] limit parallelism based on available memory

From: Paul D. Smith
Subject: [bug #42288] limit parallelism based on available memory
Date: Sun, 04 May 2014 23:12:16 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.132 Safari/537.36

Follow-up Comment #1, bug #42288 (project make):

This is a tricky feature.  First, even defining "available memory" is
difficult.  Is it physical memory only, not swap?  Is it unused memory, or
total memory?

Second, determining the amount of system memory available is extremely
system-specific: there's no portable function that does it.  On POSIX systems
sysconf() gets you SOME information, but it's not available everywhere.
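To illustrate the portability problem, here is a hedged sketch of querying memory through POSIX sysconf() (shown in Python for brevity; the `SC_PHYS_PAGES` and `SC_AVPHYS_PAGES` names are optional extensions that many systems simply don't define, which is exactly the problem described above):

```python
import os

def physical_memory_bytes():
    """Total physical RAM in bytes, or None if this system doesn't expose it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (ValueError, OSError):
        return None  # sysconf name not supported on this platform

def available_memory_bytes():
    """Currently unused physical RAM (no swap), or None if unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_AVPHYS_PAGES")
    except (ValueError, OSError):
        return None
```

Note that even where both calls work, they answer different questions (total vs. currently free), which is the definitional ambiguity raised in the first point.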

Third, how will the amount of memory required by each target be specified? 
Are you just going to say that the maximum amount per target is X and all
targets are assumed to use the maximum?  It seems like that could result in a
big loss of parallelism if most targets are smaller.

It might be more interesting if make provided a generic method for counting
resources available and used, and the caller would provide the details.

You can imagine that today's parallelism feature is a simplified version of
this: the user provides the amount of the resource (number of jobs that can be
run in parallel), and the cost to run each target is always one.

But suppose we allowed targets to specify they cost two or more resource
elements to run?  Maybe a linker runs in parallel itself and so requires
multiple cores.

Then you can imagine that a resource could represent something other than a
CPU; for example, memory.  Now you can define that certain targets cost more
memory than others, and the person invoking make will provide the total amount
of memory available.
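The generic resource-counting idea above could be sketched like this (a hypothetical model only, not make's actual implementation; the class and method names are invented):

```python
import threading

class ResourcePool:
    """Hypothetical generalization of make's job slots: a pool of identical
    units where each target may cost more than one unit to run."""
    def __init__(self, total):
        self.free = total
        self.cond = threading.Condition()

    def acquire(self, cost=1):
        # Block until `cost` units are free, then take them all at once.
        with self.cond:
            while self.free < cost:
                self.cond.wait()
            self.free -= cost

    def release(self, cost=1):
        with self.cond:
            self.free += cost
            self.cond.notify_all()
```

In this model, today's -j8 is ResourcePool(8) with every target costing 1; a parallel linker might acquire(4); and a pool counted in gigabytes of memory could be ResourcePool(32) with large links costing, say, 8.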

The big problem with this is deadlocks.  Suppose a target costs 5 job slots
but can only get 2 because the others are in use elsewhere.  Then either the
target keeps the 2 and waits for the rest, which reduces parallelism through
the whole system, or it frees the 2 and tries for the entire 5 later, which
means expensive jobs will tend to wait a lot.  Maybe that's not so bad.

Then if you introduce multiple resources (CPU and memory, for example) you
have even bigger problems: what if you get all the CPU you need but not the
memory?  Again you'll have to free everything you got and try again later.
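The "free everything and retry" strategy can be sketched as an all-or-nothing acquire across several pools (again a hypothetical model with invented names, not make's code):

```python
import threading

class Pool:
    """Minimal counting pool with a non-blocking acquire."""
    def __init__(self, total):
        self.free = total
        self.lock = threading.Lock()

    def try_acquire(self, cost):
        with self.lock:
            if self.free >= cost:
                self.free -= cost
                return True
            return False

    def release(self, cost):
        with self.lock:
            self.free += cost

def try_acquire_all(requests):
    """All-or-nothing: grab every (pool, cost) pair; on any shortfall,
    release what was already taken and report failure so the caller can
    retry later instead of deadlocking on a partial set."""
    taken = []
    for pool, cost in requests:
        if not pool.try_acquire(cost):
            for p, c in taken:  # back out the partial acquisition
                p.release(c)
            return False
        taken.append((pool, cost))
    return True
```

Because no job ever sits holding a partial set of resources, two jobs can't block each other forever, at the cost of the retry latency described above.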

It can be done, of course, but requires thought.

And there are some technical issues; for example, on UNIX-y systems, given
today's implementation, the maximum number of "resource items" we can have is
the number of bytes in a pipe, typically 4K.  That's probably enough for now
(even if the resources represented memory you'd use a much coarser unit, like
100M or 1G per item) but maybe not forever.  We'd need multiple pipes, or else
to switch the POSIX-based systems to POSIX semaphores (like Windows), or
something.
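The byte-per-token scheme referred to here can be sketched like this (an assumed, simplified model of the POSIX jobserver idea, not make's actual code; note how the pipe's buffer capacity is what caps the token count):

```python
import os

TOKENS = 8  # e.g. -j8, or 8 items of 1G of memory each

# Preload the pipe with one byte per available resource item.
r, w = os.pipe()
os.write(w, b"+" * TOKENS)

def take_tokens(n):
    """Block until n tokens have been read from the pipe."""
    got = b""
    while len(got) < n:
        got += os.read(r, n - len(got))
    return got

def return_tokens(tok):
    """Put tokens back so other jobs can claim them."""
    os.write(w, tok)

tok = take_tokens(2)  # a job that costs 2 items
# ... run the job ...
return_tokens(tok)
```

Since each item is one byte in the pipe, a total resource count larger than the pipe's buffer would block the initial write, which is why multiple pipes or semaphores come up as alternatives.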


