Re: Limiting parallel when used with recursion

Schweiss, Chip
Mon, 3 Aug 2015 08:12:16 -0500

On Mon, Aug 3, 2015 at 12:03 AM, Ole Tange wrote:
On Sun, Aug 2, 2015 at 2:36 AM, Schweiss, Chip wrote:

> The problem with that is that parallel will start execution on the parent
> folder before the child process has finished.

Ahh. Yes.

One solution is to find the max depth.
Run all for that depth using GNU Parallel.
Do the same for depth-1, and so on down to depth 1.

That sounds like the winner! This has a few inefficiencies, but for the most part it fits the problem quite well.

Thanks for the suggestion!


This way you should have very little time wasted as you will
parallelize over different subdirs at the same level. You will only
have wasted time at the end of each depth. This should work:

# Find the maxdepth
MAX=$(find | perl -ne '$a=s:/:/:g;$max=$a>$max?$a:$max;END{ print $max+1 }')
# For each depth (D) in MAX..1:
#   Find files/dirs at depth D and do_stuff on them in parallel
seq $MAX -1 1 | parallel -j1 -I D 'find . -mindepth D -maxdepth D |
parallel do_stuff {}'
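
For anyone who wants to see the ordering in action, here is a small self-contained sketch of the same depth-by-depth idea. It is only an illustration under a few assumptions: GNU find (for -printf, -mindepth, -maxdepth), a throwaway demo tree made with mktemp, and a hypothetical "do_stuff" that just appends each path to a log file. To keep it runnable without GNU Parallel installed, the inner step uses xargs -P as a stand-in for the inner parallel call, and the outer "parallel -j1" is written as a plain sequential loop. Note that the one-liner above prints max+1, so its first pass finds nothing (which is harmless); this sketch starts at the deepest level that actually exists.

```shell
#!/bin/sh
# Sketch only: bottom-up processing, one depth level at a time.
# Assumes GNU find and an xargs that supports -0, -r and -P.

tree=$(mktemp -d)   # hypothetical demo tree
log=$(mktemp)       # hypothetical log that stands in for do_stuff's work
mkdir -p "$tree/a/b/c" "$tree/a/d" "$tree/e"
cd "$tree" || exit 1

# Deepest level below the starting point (%d is the depth, "." itself is 0).
MAX=$(find . -printf '%d\n' | sort -n | tail -n 1)

# Depths MAX..1 are handled one after another, so every child is done
# before its parent starts. Entries within one depth run concurrently.
D=$MAX
while [ "$D" -ge 1 ]; do
    find . -mindepth "$D" -maxdepth "$D" -print0 |
        xargs -0 -r -n 1 -P 4 sh -c 'echo "$2" >> "$1"' _ "$log"
    D=$((D - 1))
done

cat "$log"
```

The log should list ./a/b/c first, then the two depth-2 directories in either order, then the two depth-1 directories in either order. Swapping the xargs line back to "parallel do_stuff {}" gives the form suggested above.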


