Re: [Bug-apl] first shot at parallel APL


From: Elias Mårtenson
Subject: Re: [Bug-apl] first shot at parallel APL
Date: Fri, 24 Oct 2014 13:16:56 +0800

OK, I started some tests on my 80-core machine. At first I decided to run the exact same thing as what you ran above.

As you can see, before I set the dyadic threshold, I got the expected results. After setting it, the same command hangs with 200% CPU usage. At the time I'm writing this mail, it's been sitting like that for about 30 minutes or so.

Here's the log of what I did. GNU APL was compiled with CORE_COUNT_WANTED=-3

      ∇Z ← NCPU time LEN;T;X;tmp
[1]       ⎕SYL[26;2] ← NCPU
[2]       X ← LEN⍴2J2
[3]       T ← ⎕TS
[4]       tmp ← X⋆X
[5]       Z←1 1 1 24 60 60 1000⊥⎕TS - T
[6]   ∇
      (⍳8) ∘.time 10⋆⍳7
0 0 1 7 40 414 4180
0 0 0 3 38 409 4178
0 0 0 4 39 412 4212
0 0 1 4 38 416 4204
0 0 0 5 39 417 4225
0 0 0 4 39 417 4232
0 0 0 4 38 417 4245
0 0 0 4 38 417 4241
      )COPY 5 FILE_IO
loading )DUMP file /home/emartenson/src/apl/wslib5/FILE_IO.apl...

      1 FIO∆set_dyadic_threshold  '⋆'
8888888888888888888
      (⍳8) ∘.time 10⋆⍳7
(Hangs here)

Regards,
Elias

On 26 September 2014 20:04, Juergen Sauermann <address@hidden> wrote:
Hi Elias,

if you used a recent SVN version, then you need to set the thresholds (vector sizes) above which
parallel execution is performed:

      (⍳4) ∘.time 10⋆⍳7
0 0 1 3 29 254 2593
0 0 1 2 25 252 2618
0 0 1 2 26 258 2682
0 0 1 2 26 263 2866
     
      )COPY 5 FILE_IO
loading )DUMP file /usr/local/lib/apl/wslib5/FILE_IO.apl...

      1 FIO∆set_dyadic_threshold  '⋆'  
⍝ returns the previous threshold for dyadic ⋆
8070450532247928832

      (⍳4) ∘.time 10⋆⍳7
0 0 0 2 30 250 2590
0 0 0 1 15 149 1580
0 0 0 1 11 113 1225
0 3 0 0 12 103 1120

I am currently working on a benchmark workspace that determines the optimal thresholds
for the different scalar functions (and those thresholds will become the future defaults). Right
now the default thresholds are so high that you will always have sequential execution.
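
Since the call above returns the previous threshold, one way to experiment without permanently changing the setting is to save that value and put it back afterwards. This is only a sketch: it assumes FILE_IO has been copied as shown, that the `time` function from this thread is defined, and that passing the saved value back restores the earlier state. PREV and IGNORED are hypothetical names.

```apl
PREV ← 1 FIO∆set_dyadic_threshold '⋆'         ⍝ lower the threshold, keep the old value
(⍳4) ∘.time 10⋆⍳7                             ⍝ run the benchmark with parallel ⋆ enabled
IGNORED ← PREV FIO∆set_dyadic_threshold '⋆'   ⍝ assumed to restore the earlier threshold
```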

/// Jürgen


On 09/26/2014 07:22 AM, Elias Mårtenson wrote:
I've tested this code, and I don't see much of an improvement as I increase the core count:

Given the following function:

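    ∇Z ← NCPU time LEN;T;X;tmp
      ⎕SYL[26;2] ← NCPU                 ⍝ number of cores to use
      X ← LEN⍴2J2                       ⍝ LEN complex numbers
      T ← ⎕TS                           ⍝ start timestamp
      tmp ← X⋆X                         ⍝ the scalar operation being timed
      Z←1 1 1 24 60 60 1000⊥⎕TS - T     ⍝ elapsed time in milliseconds
    ∇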

I'm running this command on my 8-core workstation:

      (⍳8) ∘.time 10⋆⍳7
0 0 0 2 19 188 2139
0 0 1 2 19 189 2147
0 0 1 2 19 210 2256
0 0 0 2 19 194 2427
0 0 0 3 28 284 3581
0 0 0 3 27 280 3510
0 0 0 3 27 284 3754
0 0 0 3 27 279 3637

Regards,
Elias

On 26 September 2014 13:05, Elias Mårtenson <address@hidden> wrote:
Thanks, I have merged the necessary changes.

Regards,
Elias

On 22 September 2014 23:50, Juergen Sauermann <address@hidden> wrote:
Hi,

I have finished a first shot at parallel (i.e. multicore) GNU APL: SVN 480.

This version computes all scalar functions in parallel if the ravel length of the result exceeds 100.
This can make the computation of small (but still > 100) vectors slower than if they were computed sequentially.
Therefore parallel execution is not yet the default. To enable it:

    ./configure
    make parallel
    make
    sudo make install


The current version uses some Linux-specific features, which will be ported to other platforms later on (if possible).
./configure is supposed to detect this.

Some simple benchmarks are promising:

      X←1000000⍴2J2   ⍝ 1 million complex numbers
     
      ⎕SYL[26;2]←1   ⍝ 1 core
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
246
     
      ⎕SYL[26;2]←2   ⍝ 2 cores
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
136
     
      ⎕SYL[26;2]←3   ⍝ 3 cores
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
102
     
      ⎕SYL[26;2]←4   ⍝ 4 cores
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
91
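
To make the scaling easier to read, the four timings can be turned into speedup and per-core efficiency figures. The snippet below is only an illustration using the numbers reported above; MS, S and E are hypothetical names, not part of GNU APL.

```apl
MS ← 246 136 102 91        ⍝ elapsed ms for 1, 2, 3 and 4 cores (from the runs above)
S ← 246 ÷ MS               ⍝ speedup relative to the 1-core run
E ← S ÷ 1 2 3 4            ⍝ efficiency: speedup divided by the core count
```

With these numbers the speedup comes out at roughly 1.8, 2.4 and 2.7 for 2, 3 and 4 cores, so the efficiency falls from about 0.90 to about 0.68 as cores are added.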

The next step will be to find the break-even points of all scalar functions, so that parallel execution is
only done when it promises some speedup.
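
A rough way to look for such a break-even point for a single function is to time the same operation with 1 core and with several cores over a range of lengths and see where the multi-core run wins. The sketch below assumes the `time` function defined elsewhere in this thread is loaded, that parallel execution is actually enabled for ⋆, and that ⎕IO is 1. LENS, R and FASTER are hypothetical names, and single timings like these are noisy.

```apl
LENS ← 10⋆⍳7               ⍝ vector lengths to try
R ← 1 4 ∘.time LENS        ⍝ row 1: 1 core, row 2: 4 cores (elapsed ms)
FASTER ← R[2;] < R[1;]     ⍝ 1 where the 4-core run beat the 1-core run
FASTER / LENS              ⍝ lengths at which parallel execution paid off
```

Note that `time` sets ⎕SYL[26;2] as a side effect, so the core count stays at the last value used.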

Elias, the PointerCell constructor has got one more argument. I have updated emacs-mode and sql accordingly;
you may want to sync back.

/// Jürgen







