Re: parallel sort at fault? [Re: [PATCH] tests: avoid gross inefficiency
From: Jim Meyering
Subject: Re: parallel sort at fault? [Re: [PATCH] tests: avoid gross inefficiency...
Date: Wed, 16 Mar 2011 13:07:35 +0100
Pádraig Brady wrote:
> I've not fully analyzed this yet, and I'm not saying it's wrong,
> but the above change seems to have a large effect on thread
> creation when smaller buffers are used (you hinted previously
> that being less aggressive with the amount of mem used by default
> might be appropriate, and I agree).
>
> Anyway with the above I seem to need a buffer size more
> than 10M to have any threads created at all.
>
> Testing the original 4-line heuristic with the following shows:
> (note I only get > 4 threads after 4M of input, not 7 for 16 lines
> as indicated in NEWS).
>
> $ for i in $(seq 30); do
>> j=$((2<<$i))
>> yes | head -n$j > t.sort
>> strace -c -e clone sort --parallel=16 t.sort -o /dev/null 2>&1 |
>> join --nocheck-order -a1 -o1.4,1.5 - /dev/null |
>> sed -n "s/\([0-9]*\) clone/$j\t\1/p"
>> done
> 4 1
> 8 2
> 16 3
> 32 4
> 64 4
> 128 4
> ...
> 1048576 4
> 2097152 4
> 4194304 8
> 8388608 16
>
> When I restrict the buffer size with '-S 1M', many more threads
> are created (a max of 16 in parallel with the above command)
> 4 1
> 8 2
> 16 3
> 32 4
> 64 4
> 128 4
> 256 4
> 512 4
> 1024 4
> 2048 4
> 4096 4
> 8192 4
> 16384 8
> 32768 12
> 65536 24
> 131072 44
> 262144 84
> 524288 167
> 1048576 332
> 2097152 660
> 4194304 1316
> 8388608 2628
>
> After increasing the heuristic to 128K, I get _no_ threads until -S > 10M
> and this seems to be independent of line length.
Thanks for investigating that.
Could strace -c -e clone be doing something unexpected?
When I run this (without my patch), it uses 8 threads:

  seq 16 > in; strace -ff -o k ./sort --parallel=16 in -o /dev/null

since it creates eight k.PID files:

  $ ls -1 k.*|wc -l
  8

Now (with the patch), for such a small file, it does not call clone at all.
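
For anyone wanting to repeat the file-count check, here is a minimal sketch of it as shell functions. The names count_trace_files and created_threads are hypothetical helpers, not part of strace or coreutils. The sketch assumes that strace -ff -o PREFIX writes one PREFIX.PID file per traced task, including the initial process, so the number of tasks actually created would be one less than the file count.

```shell
#!/bin/sh
# Hypothetical helper: count the per-task trace files that
# "strace -ff -o PREFIX ..." leaves behind as PREFIX.PID.
count_trace_files() {
  prefix=$1
  n=0
  for f in "$prefix".*; do
    # An unmatched glob stays literal, so only count files that exist.
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Hypothetical helper: tasks created beyond the initial process.
# Assumption: the initial process also gets its own PREFIX.PID file.
created_threads() {
  total=$(count_trace_files "$1")
  if [ "$total" -gt 0 ]; then
    echo $((total - 1))
  else
    echo 0
  fi
}

# Example use (remove stale k.* files first so old traces are not counted):
#   rm -f k.*
#   seq 16 > in; strace -ff -o k ./sort --parallel=16 in -o /dev/null
#   count_trace_files k
#   created_threads k
```

If that assumption holds, eight k.PID files would correspond to seven created threads, which may be worth comparing against the clone counts that strace -c reports.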