[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] sort: parallel external sort implementation

From: Pádraig Brady
Subject: Re: [PATCH] sort: parallel external sort implementation
Date: Thu, 04 Mar 2010 09:41:23 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20100216 Thunderbird/3.0.2

On 04/03/10 06:01, Joey Degges wrote:

Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
using pthreads. The threading approach taken here is to break up input files
into groups based on physical device and sort each of those groups in
This allows for storage and processing resources to be more fully utilized
during the sort process.

Thanks for looking at this.
Are you aware that there is a group already working on this?

Below is a set of experimental results and the statistics of the machine
they were run on. The test below show speed improvements from 45% - 64%.

In this patch the method used to determine which device a particular file is
relies on comparisons between struct stat ->  st_dev fields. This leads to
problems in cases where files exist on different partitions of the same

Issues like this are why it might be better to split the data
externally to sort. I've commented before that a threading
solution may only be appropriate for working on a single file:

Preliminary tests have shown that under these situations some performance
improvements are still seen, but they are not as dramatic as in the ideal

system details
CPU: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz
cache size: 4096 KB

disk details:
  Timing cached reads:   7556 MB in  2.00 seconds = 3781.13 MB/sec
  Timing buffered disk reads:  320 MB in  3.00 seconds = 106.50 MB/sec
  Timing cached reads:   7772 MB in  2.00 seconds = 3888.09 MB/sec
  Timing buffered disk reads:  224 MB in  3.03 seconds =  74.00 MB/sec
  Timing cached reads:   7444 MB in  2.00 seconds = 3724.34 MB/sec
  Timing buffered disk reads:  222 MB in  3.01 seconds =  73.77 MB/sec
  Timing cached reads:   7512 MB in  2.00 seconds = 3758.49 MB/sec
  Timing buffered disk reads:  240 MB in  3.02 seconds =  79.50 MB/sec

before each run the cache was dropped with:
   echo 3>  /proc/sys/vm/drop_caches

store temp files in RAM:
   mount -t tmpfs -o size=1000M /ram /ram
   export TMPDIR=/ram

multidisk sort with 4x 50M inputs
$ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null

Could you compare:

time -v sort -m <(sort /q/0/50M) \
                <(sort /q/1/50M) \
                <(sort /q/2/50M) \
                <(sort /q/3/50M) > /dev/null


reply via email to

[Prev in Thread] Current Thread [Next in Thread]