[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] sort: parallel external sort implementation

From: Joey Degges
Subject: Re: [PATCH] sort: parallel external sort implementation
Date: Thu, 4 Mar 2010 03:06:30 -0800

2010/3/4 Pádraig Brady <address@hidden>

> On 04/03/10 09:55, Chen Guo wrote:
>> Hi Padraig,
>>     Actually we're in the same class. They were assigned external sort, my
>> group
>> worked on internal sort. Our patch submission's imminent, looks like you
>> guys
>> are gonna be busy with code reviews :-)
> OK thanks :)
> If we do see a benefit from using threads in the internal sort,
> then it's probably better for sort to manage the threads
> for external sort also.

Thanks for the clarification Chen! My group is also finishing up another
patch that applies a similar approach to the merge process: instead of an
n-way merge, carry out a tree of binary merges from different disks in

Here are the results you asked for using the current sort. I've included the
verbose output but those stats should only apply to the final merge stage.
It appears that the two methods offer similar speedups.

$ time -v sort -m <(sort /q/0/50M) <(sort /q/1/50M) <(sort /q/2/50M) <(sort
/q/3/50M) > /dev/null
        User time (seconds): 1.39
        System time (seconds): 0.12
        Percent of CPU this job got: 22%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.86
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 531
        Voluntary context switches: 5117
        Involuntary context switches: 1
        File system inputs: 552

The patch presented here can be somewhat easily adapted to manage a parallel
internal sort. The general idea is to compute the total size of each device
file group and use the relative difference in total size to assign threads
to each internal sort. This functionality does not exist yet.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]