bug-coreutils

Re: [PATCH] core-count: A new program to count the number of cpu cores


From: Pádraig Brady
Subject: Re: [PATCH] core-count: A new program to count the number of cpu cores
Date: Tue, 03 Nov 2009 11:27:15 +0000
User-agent: Thunderbird 2.0.0.6 (X11/20071008)

Paolo Bonzini wrote:
>>> I was thinking of an additional option that would automatically decrease
>>> -n so that the requested number of processes is started (then of course
>>> the load may not be well balanced).
>>
>> So you mean, rather than the current situation of:
>>
>> $ yes . | head -n13 | xargs -n4 -P2
>> . . . .
>> . . . .
>> . . . .
>> .
>>
>> xargs could try to distribute like:
>>
>> $ yes . | head -n13 | xargs -n4 -P2
>> . . . .
>> . . . .
>> . . .
>> . .
> 
> No, more like
> 
> seq 1 13 | xargs --parallel -P4
> 1 5 9 13
> 2 6 10
> 3 7 11
> 4 8 12
> 
> (Note there's no -n).  Same for
> 
> seq 1 13 | xargs --parallel
> 
> on a 4-core machine.  This is _by design_ rearranging files, so it
> requires an option.

Right, you're not auto-decreasing -n. Rather, once we have read all the
args and pass them out round robin, they will be distributed evenly to
each parallel process. Does this really require a new option though?
When -P is used, the arguments could be processed in any order anyway.
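
To make the round-robin idea concrete, here is a minimal sketch in Python
(not xargs's own C code). The function name round_robin is made up for
illustration, and the input mirrors the `seq 1 13` example quoted above
with 4 processes.

```python
def round_robin(args, nproc):
    """Deal args out to nproc buckets in turn, like dealing cards."""
    buckets = [[] for _ in range(nproc)]
    for i, arg in enumerate(args):
        buckets[i % nproc].append(arg)
    return buckets


if __name__ == "__main__":
    # Same input as `seq 1 13`, shared among 4 hypothetical processes.
    for bucket in round_robin([str(n) for n in range(1, 14)], 4):
        print(" ".join(bucket))
```

Each bucket would become the argument list of one process, so the input
order is only preserved within a bucket, not across them.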

Passing args round robin means each process would get MIN(max_args,
num_args/nproc). The downside is that a bit more latency would be
introduced, as max_args*nproc args would need to be read before
starting a process, rather than just max_args. Interleaving
arguments like this might also be undesirable for other reasons.
I think both of these are minor issues. We could of course reduce
max_args to max_args/nproc to address the latency issue. Note that
`find` currently sets a limit of 128KiB of args for each process, which
could be about 2000 files, for example:
    $ find /usr/share/ | head -n2000 | wc -c
    131337
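
As a rough illustration of how a byte limit translates into a number of
args per process, here is a small Python sketch. The name split_by_size
and the accounting (one extra byte per arg for a separator) are
assumptions for illustration only; the real limit handling is more
involved than this.

```python
def split_by_size(args, limit=128 * 1024):
    """Group args into batches whose total size stays within limit bytes."""
    batches, current, used = [], [], 0
    for arg in args:
        cost = len(arg.encode()) + 1  # assumed: one separator byte per arg
        if current and used + cost > limit:
            batches.append(current)   # budget exhausted: start a new batch
            current, used = [], 0
        current.append(arg)
        used += cost
    if current:
        batches.append(current)
    return batches
```

With 2000 names of 65 bytes each, this puts 1985 names in the first
batch and the remaining 15 in a second one.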

If we made a more invasive change, I think we could help latency a lot.
We could set O_NONBLOCK on stdin and, on EWOULDBLOCK, share what we
have out to the available processes and then exec, i.e. auto-reduce -n
to num_args/nproc when we would block. This would both result in less
interleaving of args and mean xargs would exec the processes without delay.
This would be beneficial even without -P, as in the following example,
where we wouldn't wait for all input before displaying output:
    (seq 10; sleep 3; seq 10) | xargs
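
The non-blocking idea could be sketched as follows, in Python rather
than in xargs itself. The generator name batches_until_block is
hypothetical, args are assumed to be plain whitespace-separated tokens
(no quoting), and each yielded batch stands for "what we have" at the
point where a read would block.

```python
import os
import select


def batches_until_block(fd):
    """Yield the args read so far whenever a read on fd would block."""
    os.set_blocking(fd, False)           # sets O_NONBLOCK on the descriptor
    pending, args = b"", []              # pending: possibly incomplete token
    while True:
        try:
            chunk = os.read(fd, 4096)
        except BlockingIOError:          # EAGAIN / EWOULDBLOCK
            if args:
                yield args               # hand out what we have right now
                args = []
            select.select([fd], [], [])  # sleep until more input or EOF
            continue
        if not chunk:                    # EOF: flush whatever is left
            args.extend(t.decode() for t in pending.split())
            if args:
                yield args
            return
        data = pending + chunk
        tokens = data.split()
        # If the data does not end in whitespace, the last token may be cut.
        pending = b"" if data[-1:].isspace() else tokens.pop()
        args.extend(t.decode() for t in tokens)
```

Fed from the pipeline above, this would be expected to yield the first
ten numbers before the sleep and the second ten after it, rather than
all twenty at the end. Each batch could then be divided among the
available processes.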

cheers,
Pádraig.



