Re: [Bug-apl] Fun With Benchmarks


From: Elias Mårtenson
Subject: Re: [Bug-apl] Fun With Benchmarks
Date: Sun, 23 Aug 2015 19:04:02 +0800

Well, I've run the test and I have some results. They were somewhat unexpected, as the time spent in Value::clone() is much lower than it was in other tests. That said, the clone issue is mostly visible when manipulating very large arrays, which this test case does not do. In this case, 9.26% of the time was spent in Value::clone() and its descendants.

The main consumer of CPU time in this test case is the reduction on + in the following command:

    Z←((¯1+⍴Z)⌊+/∧\'0'=Z)↓Z←D[⌽Z]

The +/ reduction uses 70% of the CPU time. This includes 28% spent performing the addition operation (Bif_F12_PLUS::eval_AB()). Another huge contributor was the call to Cell::to_value(), which accounted for 29.21%. Note that the 28% spent in the addition and the almost 30% spent in to_value() are separate.

In other words, the addition and the value conversion together consume almost 60% of the total time (28% + 29.21% ≈ 57%), all of it inside the reduction operation (70%).
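For readers who do not read APL fluently, here is a rough Python rendering of what the expression above appears to do: map the reversed digit vector to characters, count the leading '0' characters with an and-scan followed by a sum, and drop them while always keeping at least one character. This is only a sketch. The contents of D (assumed here to be the string '0123456789') and the name strip_leading_zeros are assumptions, since the function body is not shown in this message.

```python
from itertools import accumulate

DIGITS = "0123456789"  # assumed contents of D; not shown in the thread


def strip_leading_zeros(z):
    """Rough equivalent of ((¯1+⍴Z)⌊+/∧\\'0'=Z)↓Z←D[⌽Z] with ⎕IO←0.

    z is a list of decimal digits, least significant digit first.
    """
    # Z←D[⌽Z]: reverse the digits and map them to characters.
    chars = [DIGITS[d] for d in reversed(z)]
    # '0'=Z: boolean mask of zero characters.
    is_zero = [c == "0" for c in chars]
    # ∧\ : and-scan, true only for the unbroken run of leading zeros.
    leading_mask = list(accumulate(is_zero, lambda a, b: a and b))
    # +/ : the reduction the profile points at; counts the leading zeros.
    leading = sum(leading_mask)
    # (¯1+⍴Z)⌊ : never drop the last remaining character.
    drop = min(len(chars) - 1, leading)
    # ↓ : drop that many characters from the front.
    return "".join(chars[drop:])
```

For example, strip_leading_zeros([0, 2, 1, 0, 0]) yields '120', and an all-zero vector such as [0, 0, 0] yields '0' rather than an empty string.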

Regards,
Elias

On 23 August 2015 at 05:21, fred <address@hidden> wrote:
Mike Duvos

Thank you for the correction. I have timed your code:

      ⎕IO←0

      ∇TIME X;TS

      ∇Z←SHOW X;I

      ∇Z←X TIMES Y;D;I;C

      ∇Z←FACTORIAL N;I

      TIME 'SHOW FACTORIAL 300'
30605751221644063603537046129726862938858880417357
69994167767412594765331767168674655152914224775733
49939147888701726368864263907759003154226842927906
97455984122547693027195460400801221577625217685425
59653569035067887252643218962642993652045764488303
88909753943489625436053225980776521270822437639449
12012867867536830571229368194364995646049816645022
77165001851765464693401122260347297240663332585835
06870150169794168850353752137554910289126407157154
83028228493795263658014523523315693648223343679925
45940952768206080622328123873838808170496000000000
00000000000000000000000000000000000000000000000000
000000000000000
22.977 Seconds.

Now, the code under GNU APL runs in comparable time to the
implementation in SNOBOL4, at least.

This is still not good. Maybe not horrible, though.

I rather suspect that the data copying that Elias Mårtenson alludes to
is dominant in execution. The SNOBOL4 code has (probably) considerably
more interpretation overhead, and is forced to copy the numeric string
on each modification (strings are immutable). It hashes each string
into a global hash on each such modification. If the APL code is forced
into the same contortions (essentially, copying each vector), it should
perform at a similar speed. Given that it does execute at the "speed of
SNOBOL", I suspect that is what is going on.
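To make the copying hypothesis concrete, here is a minimal Python sketch of a digit-vector factorial in which every multiplication builds a brand-new vector, much as an immutable-string implementation would have to. It is not the APL code timed above (those function bodies are not reproduced in this message). The names times_small, factorial_digits and to_string are hypothetical and exist only for this illustration.

```python
import math


def times_small(digits, k):
    """Multiply a little-endian decimal digit tuple by a small integer k.

    A fresh tuple is returned on every call, so the whole number is
    copied at each step, as with immutable strings.
    """
    out = []
    carry = 0
    for d in digits:
        carry, r = divmod(d * k + carry, 10)
        out.append(r)
    # Flush whatever carry remains into new high-order digits.
    while carry:
        carry, r = divmod(carry, 10)
        out.append(r)
    return tuple(out)


def factorial_digits(n):
    """Return n! as a little-endian tuple of decimal digits."""
    x = (1,)
    for i in range(1, n + 1):
        x = times_small(x, i)  # rebinding: the old tuple becomes garbage
    return x


def to_string(digits):
    """Render a little-endian digit tuple most significant digit first."""
    return "".join(str(d) for d in reversed(digits))


def matches_builtin(n):
    """Cross-check the digit-vector result against math.factorial."""
    return to_string(factorial_digits(n)) == str(math.factorial(n))
```

Calling matches_builtin(300) checks the 615-digit result against Python's built-in arbitrary-precision integers. Timing factorial_digits(300) against a variant that mutates one list in place would be one way to estimate how much of the cost is copying, though the numbers would say nothing definite about GNU APL itself.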

Eagerly awaiting Elias' results.

FredW

On Sat, 2015-08-22 at 12:20 -0400, fred wrote:
> Ok, so infinite precision integer arithmetic takes over 50 seconds with
> GNU APL to compute 300!
>
> Um... not good. Actually, this is horrific.
>
> I will attempt to put this into perspective. I use the interpretive
> SNOBOL4 implementation from Griswold. This is code that implements a
> SNOBOL4 interpreter. The implementation is that the code implementing
> the interpreter (which was written in the 1960's) is macro-expanded
> into a C program, which is then compiled and run to actually interpret
> the SNOBOL4 language source.
>
> Ok? This is the SLOWEST SNOBOL4 implementation that I know...
>
> I used an infinite precision arithmetic package written 40 years ago
> (specifically, for education -- not for performance). Now, one of the
> reasons this package is slow is that it REDEFINES '+', '-', '*', '/'
> operators AT RUN TIME... Not only are the data types dynamic, the
> actual functions are also dynamic, and have been redefined.
>
> Now, if you are still with me -- an interpreter that is macro expanded
> to C running a run-time binding operator redefinition program in a
> language where strings are immutable, and must be completely
> hashed/copied on each change... and has complete mark/sweep garbage
> collection -- again, implemented in the macro expanded interpreter.
>
> Let us look at the code:
>
> -include 'INFINIP.INC'
>     infinip_start() ;* redefine basic math functions to work on strings
>     x = '1'
>     i = 1
>     l = 300
>     t = time()
> top gt(i, l) :s(btm)
>     x = x * i
>     i = i + 1 :(top)
> btm t = time() - t
>     output = x
>     output = t ' milliseconds'
> end
>
> Now, I had a problem running the Duvos code (but I haven't attempted
> debugging - 52 seconds seemed extreme) -- but I assume 52 seconds is..
> um.. normal. I will run this code on a 1.5GHz Intel i5 (this is my
> Linux tablet, a three year old Acer Iconia tablet):
>
> $ snobol4 -s ifact
> 30605751221644063603537046129726862938858880417357699941677674125947653
> 31767168674655152914224775733499391478887017263688642639077590031542268
> 42927906974559841225476930271954604008012215776252176854255965356903506
> 78872526432189626429936520457644883038890975394348962543605322598077652
> 12708224376394491201286786753683057122936819436499564604981664502277165
> 00185176546469340112226034729724066333258583506870150169794168850353752
> 13755491028912640715715483028228493795263658014523523315693648223343679
> 92545940952768206080622328123873838808170496000000000000000000000000000
> 00000000000000000000000000000000000000000000000
> 20315.096404 milliseconds
> SNOBOL4 statistics summary-
>           1.080 ms. Compilation time
>       20315.416 ms. Execution time
>        36453943 Statements executed, 19353232 failed
>         1834289 Arithmetic operations performed
>         7899675 Pattern matches performed
>             338 Regenerations of dynamic storage
>        1680.975 ms. Execution time in GC
>               0 Reads performed
>               2 Writes performed
>         557.290 ns. Average per statement executed
>        1794.398 Thousand statements per second
> $
>
> 20.3 seconds total. Since this was running with ONLY 8MB memory
> (default), and EVERY string change needed a new copy, 338 garbage
> collections were needed. That is mark&sweep. GC took 1.7 seconds (of
> that 20.3 seconds total).
>
> FredW
>
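The derived figures in the SNOBOL4 statistics summary can be recomputed from the raw counters it prints. The short Python sketch below does that arithmetic using only numbers copied from the summary. The function and variable names are made up for this illustration.

```python
# Raw figures copied from the SNOBOL4 statistics summary above.
EXECUTION_MS = 20315.416
GC_MS = 1680.975
STATEMENTS = 36453943
GC_RUNS = 338


def derived_stats():
    """Recompute the per-statement and GC figures from the raw counters."""
    # 1 ms = 1e6 ns.
    ns_per_statement = EXECUTION_MS * 1e6 / STATEMENTS
    # Statements per millisecond equals thousands of statements per second.
    kstatements_per_s = STATEMENTS / EXECUTION_MS
    gc_fraction = GC_MS / EXECUTION_MS
    ms_per_gc = GC_MS / GC_RUNS
    return ns_per_statement, kstatements_per_s, gc_fraction, ms_per_gc


if __name__ == "__main__":
    ns, ks, frac, per_gc = derived_stats()
    print(f"{ns:.3f} ns per statement")
    print(f"{ks:.3f} thousand statements per second")
    print(f"{frac:.1%} of execution time in GC")
    print(f"{per_gc:.2f} ms per collection on average")
```

The first two values agree with the summary's own "Average per statement executed" and "Thousand statements per second" lines. The GC share works out to a little over 8% of execution time, which is consistent with the 1.7 of 20.3 seconds quoted above.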


