bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] sort: parallel external sort implementation


From: Chen Guo
Subject: Re: [PATCH] sort: parallel external sort implementation
Date: Thu, 4 Mar 2010 01:55:42 -0800 (PST)

Hi Padraig,
    Actually we're in the same class. They were assigned external sort, my group
worked on internal sort. Our patch submission's imminent, looks like you guys
are gonna be busy with code reviews :-)




----- Original Message ----
> From: Pádraig Brady <address@hidden>
> To: Joey Degges <address@hidden>
> Cc: Report bugs to <address@hidden>; Matt Ball <address@hidden>
> Sent: Thu, March 4, 2010 1:41:23 AM
> Subject: Re: [PATCH] sort: parallel external sort implementation
> 
> On 04/03/10 06:01, Joey Degges wrote:
> > Hello,
> >
> > Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
> > using pthreads. The threading approach taken here is to break up input files
> > into groups based on physical device and sort each of those groups in
> > parallel.
> > This allows for storage and processing resources to be more fully utilized
> > during the sort process.
> 
> Thanks for looking at this.
> Are you aware that there is a group already working on this?
> http://lists.gnu.org/archive/html/bug-coreutils/2010-02/msg00223.html
> 
> >
> > Below is a set of experimental results and the statistics of the machine
> > that
> > they were run on. The test below show speed improvements from 45% - 64%.
> >
> > In this patch the method used to determine which device a particular file is
> > on
> > relies on comparisons between struct stat ->  st_dev fields. This leads to
> > some
> > problems in cases where files exist on different partitions of the same
> > disk.
> 
> Issues like this are why it might be better to split the data
> externally to sort. I've commented before that a threading
> solution may only be appropriate for working on a single file:
> http://lists.gnu.org/archive/html/bug-coreutils/2009-10/msg00179.html
> 
> > Preliminary tests have shown that under these situations some performance
> > improvements are still seen, but they are not as dramatic as in the ideal
> > case.
> >
> >
> > system details
> > CPU: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz
> > cache size: 4096 KB
> > RAM: 2G
> >
> > disk details:
> > /q/0:
> >   Timing cached reads:   7556 MB in  2.00 seconds = 3781.13 MB/sec
> >   Timing buffered disk reads:  320 MB in  3.00 seconds = 106.50 MB/sec
> > /q/1:
> >   Timing cached reads:   7772 MB in  2.00 seconds = 3888.09 MB/sec
> >   Timing buffered disk reads:  224 MB in  3.03 seconds =  74.00 MB/sec
> > /q/2:
> >   Timing cached reads:   7444 MB in  2.00 seconds = 3724.34 MB/sec
> >   Timing buffered disk reads:  222 MB in  3.01 seconds =  73.77 MB/sec
> > /q/3:
> >   Timing cached reads:   7512 MB in  2.00 seconds = 3758.49 MB/sec
> >   Timing buffered disk reads:  240 MB in  3.02 seconds =  79.50 MB/sec
> >
> > before each run the cache was dropped with:
> >    echo 3>  /proc/sys/vm/drop_caches
> >
> > store temp files in RAM:
> >    mount -t tmpfs -o size=1000M /ram /ram
> >    export TMPDIR=/ram
> >
> > multidisk sort with 4x 50M inputs
> > $ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
> 
> Could you compare:
> 
> time -v sort -m <(sort /q/0/50M) \
>                  <(sort /q/1/50M) \
>                  <(sort /q/2/50M) \
>                  <(sort /q/3/50M) > /dev/null
> 
> cheers,
> Pádraig.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]