bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

GSoC idea proposal


From: Glen Lenker
Subject: GSoC idea proposal
Date: Fri, 27 Mar 2009 16:45:40 -0700
User-agent: Mutt/1.5.18 (2008-05-17)

I saw that the GNU project is a mentoring organization for this years
Google Summer of Code. Would it be possible to extend my work on
optimizing sort to this summer?

There are a few ideas on the table:

1) Reducing the amount of time wasted with pthread_join

This idea is outlined in the TODO. This would be an optimization off
of the patch Paul Eggert and I submitted a few days ago. The general
idea is increase the number of threads and use inter-thread
communication so that the merge process is more streamlined than it is
now.

2) Merge Insertion and List Merge sort

This is also outlined in the TODO. I could try my hand at merge
insertion and try to see why list merge sort did not improve performance.

3) Integrating our work with the other groups work

As you may know, we only parallelized the internal sort algorithm, the
other group was in charge of the external sort algorithm. They opted
to implement theirs in OpenMP.

4) "In-between sort"

As Jim suggested here:

http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00119.html

and P?draig Brady a moment later, we should also try doing something
more "UNIXy", where the general idea looks like this:

    sort -m \
        <(sort <(dice -k1,4 INPUT)) \
        <(sort <(dice -k2,4 INPUT)) \
        <(sort <(dice -k3,4 INPUT)) \
        <(sort <(dice -k4,4 INPUT)) \
      > out

where dice split INPUT into 4 slices "dice -k1,4 INPUT" grabbing the
first.

I would be willing to work on all 4 of these this summer. These last
ten weeks I think I've accustomed myself to the sort code base and
these additions seem very doable. Does this sound like a good idea?

-- 
Glen Lenker




reply via email to

[Prev in Thread] Current Thread [Next in Thread]