Re: [Bug-apl] ScalarBenchmark for inner and outer products


From: Juergen Sauermann
Subject: Re: [Bug-apl] ScalarBenchmark for inner and outer products
Date: Fri, 17 Oct 2014 18:48:46 +0200

Hi David,

You may see a non-zero startup cost in the summary even though the operation itself shows a cost of 0.
This is because the startup cost is averaged over all monadic or all dyadic operations.
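
To illustrate the averaging, here is a minimal Python sketch. It assumes that each "regression line" is an ordinary least-squares fit of cycles against length and that the summary reports the mean intercept of a group of operations. The function name fit_line, the operation labels, and all cycle counts are hypothetical and chosen only for illustration; this is not the actual ScalarBenchmark.apl logic.

```python
def fit_line(lengths, cycles):
    """Least-squares fit: cycles ~ startup + per_item * length."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, cycles))
    per_item = sxy / sxx if sxx else 0.0
    startup = mean_y - per_item * mean_x
    return startup, per_item


# Hypothetical measurements, not taken from a real benchmark run.
lengths = [25, 16, 9, 4, 1]
measurements = {
    # An operation whose counter works: made-up startup and per-item cost.
    "A + B": [718 + 10 * n for n in lengths],
    # An operation whose counter is read from the wrong slot: all zeros.
    "A +.× B": [0 for _ in lengths],
}

# Per-operation fit: the all-zero operation yields 0 + 0×N.
fits = {op: fit_line(lengths, c) for op, c in measurements.items()}

# Group average of the intercepts (assumed to be what the summary prints).
group_startup = sum(startup for startup, _ in fits.values()) / len(fits)

for op, (startup, per_item) in fits.items():
    print(f"{op}: {startup:.0f} + {per_item:.0f}×N cycles")
print(f"average startup cost for the group: {group_startup:.0f} cycles")
```

Under these assumptions the zero-valued operation still appears next to a non-zero group average, because the average is dominated by the other operations in the same group.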

The zero startup cost for the products is most likely due to a reorganization of the counter numbers.
I forgot to update ScalarBenchmark.apl; this is fixed in SVN 489.

In general, the OP and STAT columns in ScalarBenchmark.apl should match the ]PSTAT command, e.g. if:

     ]pstat 38
╔═════════════════╦════════════╤══════════╤══════════╤══════════╤══════════╗
║ A f.g B         ║          0 │        0 │        0 │        0 │        0 ║
╚═════════════════╩════════════╧══════════╧══════════╧══════════╧══════════╝

then the STAT number for f.g should be 38 in ScalarBenchmark.apl.

/// Jürgen



On 10/17/2014 05:58 PM, David B. Lamkins wrote:
I'm seeing zero start-up costs for inner and outer products when running ScalarBenchmark.apl.

  ===================== Mat1_IRC +.× Mat1_IRC  ===============================

Benchmarking start-up cost for Mat1_IRC +.× Mat1_IRC ...
 Length   Sequ Cycles   Para Cycles   Linear Sequ Linear Para 
 ======   ===========   ===========   =========== =========== 
     25             0             0             0           0 
     25             0             0             0           0 
     25             0             0             0           0 
     25             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      4             0             0             0           0 
      4             0             0             0           0 
      4             0             0             0           0 
      1             0             0             0           0 

regression line sequential:            0 + 0×N cycles
regression line parallel:              0 + 0×N cycles

  ===================== Vec1_IRC ∘.× Vec1_IRC  ===============================

Benchmarking start-up cost for Vec1_IRC ∘.× Vec1_IRC ...
 Length   Sequ Cycles   Para Cycles   Linear Sequ Linear Para 
 ======   ===========   ===========   =========== =========== 
     25             0             0             0           0 
     25             0             0             0           0 
     25             0             0             0           0 
     25             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
     16             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      9             0             0             0           0 
      4             0             0             0           0 
      4             0             0             0           0 
      4             0             0             0           0 
      1             0             0             0           0 

regression line sequential:            0 + 0×N cycles
regression line parallel:              0 + 0×N cycles

But then in the summary section -- just above ]PSTAT -- I see:

-------------- Mat1_IRC +.× Mat1_IRC -------------- 
average sequential startup cost:     359 cycles
average parallel startup cost:       832 cycles
per item cost sequential:              0 cycles
per item cost parallel:                0 cycles
parallel break-even length:          not reached

-------------- Vec1_IRC ∘.× Vec1_IRC -------------- 
average sequential startup cost:     359 cycles
average parallel startup cost:       832 cycles
per item cost sequential:              0 cycles
per item cost parallel:                0 cycles
parallel break-even length:          not reached

Here the startup costs are nonzero, but the per-item costs are all zero.

This doesn't look right... Or am I missing something?

In case it might shed some additional light, here's the final section of the ]PSTAT output. The rest looks reasonable except for epsilon-underbar, which reports all zeroes.

╔═════════════════╦════════════╤══════════╤══════════╤══════════╤══════════╗
║    Function     ║            │        N │  ⌀ VLEN  │ ⌀ cycles │ cyc÷VLEN ║
╟─────────────────╫────────────┼──────────┼──────────┼──────────┼──────────╢
║   f B overhead  ║ 18446744003448130869 │      283 │     1993 │ 34818579233229 │ 17466187239 ║
║ A f B overhead  ║ 18446743954621671206 │     1114 │       84 │ 1447585256996 │ 17221844259 ║
║   scalar B      ║  130198460 │      283 │     3873 │   460065 │      118 ║
║ A scalar B      ║   91680403 │     1114 │      949 │    82298 │       86 ║
║ clone B         ║ 233950109373 │ 75391125 │      131 │     3103 │       23 ║
║ A f.g B         ║ 911702656227 │    40046 │      163 │ 22766385 │   139671 ║
║ A ∘.g B         ║ 9809803882 │      121 │  1000000 │ 81072759 │       81 ║
║ A ⍴ B           ║       9071 │        3 │       27 │     3023 │      111 ║
║ PrintBuffer(B)  ║  135760049 │     1168 │       25 │   116232 │     4649 ║
╚═════════════════╩════════════╧══════════╧══════════╧══════════╧══════════╝
 


    

