bug-apl

Re: [Bug-apl] Performance optimisations: Results


From: Juergen Sauermann
Subject: Re: [Bug-apl] Performance optimisations: Results
Date: Sun, 06 Apr 2014 17:55:50 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130330 Thunderbird/17.0.5

Hi,

the current solution seems to be (master == thread-0):

    for (int c = 1; c < core_count; ++c)   thread-0 waits for thread-c

One could instead do something like this:

    for (int dc = 1; dc < core_count; dc += dc)
        {
           parallel(
               thread-n waits for thread-n+dc   (if n+dc < core_count)
                   )
        }
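As a rough illustration of the loop above, here is a minimal C++ sketch of a tree-shaped join. It is not taken from the GNU APL sources: the function name tree_join_sum, the spin-wait on an atomic flag, the dummy per-thread work, and the rule that thread n only waits in round dc while n is a multiple of 2*dc are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical example: every thread n produces a partial result (n + 1),
// and the results are combined along a binary tree instead of by thread 0 alone.
long tree_join_sum(int core_count)
{
    assert(core_count >= 1);
    std::vector<long> partial(core_count, 0);
    std::vector<std::atomic<bool>> done(core_count);
    for (auto& d : done) d.store(false);

    auto worker = [&](int n) {
        partial[n] = n + 1;                    // stand-in for the real per-core work
        for (int dc = 1; dc < core_count; dc += dc) {
            if (n % (2 * dc) != 0) break;      // in this round n is waited for, not waiting
            if (n + dc < core_count) {
                // thread-n waits for thread-n+dc (simple spin-wait, an assumption)
                while (!done[n + dc].load(std::memory_order_acquire))
                    std::this_thread::yield();
                partial[n] += partial[n + dc];
            }
        }
        done[n].store(true, std::memory_order_release);
    };

    std::vector<std::thread> threads;
    for (int n = 1; n < core_count; ++n) threads.emplace_back(worker, n);
    worker(0);                                 // the master is thread 0

    // std::thread objects must be joined before destruction. All workers have
    // already signalled completion here, so this loop only reaps exiting threads.
    for (auto& t : threads) t.join();
    return partial[0];
}
```

With 80 threads, thread 0 waits only for threads 1, 2, 4, 8, 16, 32 and 64; each of those has already collected its own subtree. The start-up in this sketch is still sequential, so it only shows the join half of the idea.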


Same for start-up. In our case the time would be reduced from 80*tsync to 7*tsync which
would give us about 11 times the current performance.
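The 80 versus 7 figures can be checked with a few lines of C++. This is only a counting sketch; the function names are made up for the example, and it counts waits on the master's critical path rather than measuring anything.

```cpp
#include <cassert>

// Sequential join: thread 0 waits for threads 1 .. core_count-1 one after another.
int linear_join_steps(int core_count)
{
    assert(core_count >= 1);
    return core_count - 1;
}

// Tree join: the distance dc doubles every round, so the number of rounds
// is the number of doublings needed to reach core_count.
int tree_join_rounds(int core_count)
{
    assert(core_count >= 1);
    int rounds = 0;
    for (int dc = 1; dc < core_count; dc += dc)
        ++rounds;
    return rounds;
}
```

For core_count = 80 the doubling loop runs for dc = 1, 2, 4, 8, 16, 32, 64, i.e. 7 rounds, against 79 (roughly 80) sequential waits; 80/7 is about 11.4.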

/// Jürgen


On 04/06/2014 05:28 PM, Elias Mårtenson wrote:
What part of the join should be parallel? The join itself is essentially the main thread waiting for all other threads to finish. What is it that can be parallelised?

Regards,
Elias


On 6 April 2014 22:32, Juergen Sauermann <address@hidden> wrote:
Hi,

one more plot that might explain a lot. I have plotted the startup times and the total times
vs. the number of cores (1024×1024 array).

For small core counts (i.e. < 6...10), the startup time is moderate and the total time decreases rapidly.

For more cores, the total time increases again. This is most likely because the time per core becomes negligible
and the join time begins to dominate the total time.

Both start and join times seem to be more-or-less linear with the number of cores which is probably because
the master thread is doing all that. It would have been smarter to do the start and join in parallel which
would then cost O(log P) instead of O(P) for P cores.
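The shape of that curve can be reproduced with a toy model in C++. It assumes perfectly divisible work plus a fixed cost tsync for every start/join step that the master performs sequentially; the function names and the numbers in the note below are hypothetical and are not the measured values from the plot.

```cpp
#include <cassert>

// Toy model: compute time shrinks as work/cores, while the sequential
// start/join overhead grows linearly with the number of cores.
double modelled_total_linear(double work, double tsync, int cores)
{
    assert(cores >= 1);
    return work / cores + tsync * (cores - 1);
}

// Core count (1 .. max_cores) that minimises the modelled total time.
int best_core_count_linear(double work, double tsync, int max_cores)
{
    int best = 1;
    for (int p = 2; p <= max_cores; ++p)
        if (modelled_total_linear(work, tsync, p) <
            modelled_total_linear(work, tsync, best))
            best = p;
    return best;
}
```

With the made-up values work = 100 and tsync = 1, the modelled total time is lowest at 10 cores and rises again towards 80 cores, which is the same qualitative behaviour as described above.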

/// Jürgen




