guile-devel


From: Ken Raeburn
Subject: guile performance - Ackermann function: way slower than emacs, slower still if compiled
Date: Tue, 4 Aug 2009 05:28:33 -0400

I implemented Ackermann's function A(m,n), a recursive function with scary performance characteristics, based on the definition given at Wikipedia, which involves lots of +1 and -1 operations... A(0,n) takes constant time to compute assuming bounded integer sizes, A(1,n) takes linear time, A(2,n) is quadratic, A(3,n) is exponential... A(4,1) takes a really long time, and for anything higher you can just give up now. (A(4,2) is in the neighborhood of 2**65536, and A(4,3) is about 2 raised to that power. And it gets there by adding or subtracting 1 in each invocation.) Optimizations are possible, like A(1,n)=n+2, A(2,n)=2n+3, A(3,n)=2**(n+3)-3, but I didn't encode them.
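
For reference, the plain recursive definition I started from is just this (a sketch of the standard definition, not the code actually benchmarked -- see below):

(define (ackermann m n)
  (cond ((= m 0) (+ n 1))                                  ; A(0,n) = n+1
        ((= n 0) (ackermann (- m 1) 1))                    ; A(m,0) = A(m-1,1)
        (else (ackermann (- m 1) (ackermann m (- n 1)))))) ; A(m,n) = A(m-1, A(m,n-1))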

I wrote both Elisp and Scheme versions; I changed the recursion to an explicitly-managed stack of values, to avoid some of the dynamic-vs-lexical argument binding issues, turn the native stack usage into large numbers of cell allocations and discards, and avoid exceeding the maximum Lisp evaluation depth Emacs permits. (A rough sketch of that stack-based loop follows the Emacs timings below.) Not surprisingly, it's not too fast, and Emacs spends a lot of time in garbage collection passes even with the byte-compiled version. My time-test function here counts seconds of real time (on an 8-core Mac without a lot of other stuff going on, so CPU usage is probably very close):

ELISP> (time-test '(ackermann 3 1))
1.6e-05
ELISP> (time-test '(ackermann 3 6))
0.036018
ELISP> (time-test '(ackermann 3 7))
0.16574899999999998
ELISP> (time-test '(ackermann 3 8))
0.6457170000000001
ELISP> (time-test '(ackermann 3 9))
2.482042
ELISP> (time-test '(ackermann 4 0))
1.4e-05
ELISP> (time-test '(ackermann 4 1))
642.746514
ELISP>
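
For the curious, the stack-based formulation looks roughly like this in the Scheme version -- a sketch only; the attached ack.scm is the authoritative code and may differ in detail, and the Elisp version is the direct analogue:

(define (ackermann m n)
  ;; `stack' holds the pending m arguments; n is the value being
  ;; threaded through.  When the stack is empty, n is the result.
  (let loop ((stack (list m)) (n n))
    (if (null? stack)
        n
        (let ((m (car stack))
              (rest (cdr stack)))
          (cond ((= m 0) (loop rest (+ n 1)))          ; A(0,n) = n+1
                ((= n 0) (loop (cons (- m 1) rest) 1)) ; A(m,0) = A(m-1,1)
                (else                                  ; A(m,n) = A(m-1, A(m,n-1))
                 (loop (cons m (cons (- m 1) rest))
                       (- n 1))))))))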

With the installed guile-1.8.7 binary and the interpreted Scheme version of the code, performance was worse:

% time guile -l ack.scm -c '(ackermann 3 1)'
0.009u 0.003s 0:00.01 0.0%      0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 6)'
0.409u 0.078s 0:00.52 90.3%     0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 9)'
25.300u 4.540s 0:29.85 99.9%    0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 4 1)'
6016.150u 1122.109s 1:58:59.32 99.9%    0+0k 8+4io 149pf+0w
%

Yes, that's just shy of two hours for the last one, where Emacs took less than 11 minutes -- roughly an 11x difference.

(Did I make some silly mistake in translating that causes the Scheme version to be far less efficient?)

It's not a terribly fair comparison, for various reasons, but moving on...

I built and installed 1.9.1, so I could test with more recent code:

% time guile -l ack.scm -c '(ackermann 3 1)'
0.026u 0.006s 0:00.03 66.6%     0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 6)'
0.761u 0.141s 0:00.90 100.0%    0+0k 0+0io 0pf+0w
% time guile -l ack.scm -c '(ackermann 3 9)'
47.242u 8.644s 0:55.89 99.9%    0+0k 0+0io 0pf+0w
%

Since this behaved worse than 1.8.7, I skipped the 4,1 test and went on to the compiled version:

% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 1))'
0.012u 0.004s 0:00.01 100.0%    0+0k 0+0io 0pf+0w
% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 6))'
0.772u 0.333s 0:01.10 100.0%    0+0k 0+0io 0pf+0w
% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 9))'
48.728u 22.016s 1:10.75 99.9%   0+0k 0+0io 0pf+0w
%

Not much better than loading the .scm file, and only better for small values of n... in fact, with m=3 and n>=6 the performance seems worse in the precompiled case:

% time guile -l ack.scm -c '(ackermann 3 10)'
181.997u 34.928s 3:36.94 99.9%  0+0k 0+0io 0pf+0w
% time guile -c '(begin (load-compiled "ack.go") (ackermann 3 10))'
196.678u 85.618s 4:42.34 99.9%  0+0k 0+0io 0pf+0w
%

Poking at it a couple of times with Apple's process sampling tool, I see it does spend a bit of time in garbage collection -- but scm_leave_guile (called from scm_async_click via scm_pthread_mutex_lock) and getrusage (called from scm_c_get_internal_run_time via mytime) also show up on the stack a significant fraction of the time, and each of those appears to be a comparable if not bigger time sink than GC.

According to the execution timing with the compiled code, GC actually doesn't seem to be a huge fraction of the run time, maybe 8-9%:

scheme@(guile-user)> ,t (ackermann 3 8)
2045
clock utime stime cutime cstime gctime
18.43 12.96  5.47   0.00   0.00   1.66
scheme@(guile-user)> ,t (ackermann 3 9)
4093
clock utime stime cutime cstime gctime
73.47 51.61 21.85   0.00   0.00   5.95
scheme@(guile-user)>
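
(As a sanity check, those results match the closed form above: A(3,8) = 2**11 - 3 = 2045 and A(3,9) = 2**12 - 3 = 4093. The GC fractions here are 1.66/18.43 and 5.95/73.47, i.e. the 8-9% mentioned.)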

The latter is the case that took under three seconds in Emacs, remember.

Garbage collection is much more of a factor in the non-compiled version:

scheme@(guile-user)> (load "ack.scm")
scheme@(guile-user)> ,t (ackermann 3 8)
2045
clock utime stime cutime cstime gctime
21.97 19.74  2.21   0.00   0.00   7.76
scheme@(guile-user)> ,t (ackermann 3 9)
4093
clock utime stime cutime cstime gctime
86.26 77.37  8.88   0.00   0.00  30.20
scheme@(guile-user)>
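
(Here GC is roughly a third of the total run time: 7.76 of 21.97 seconds at 3,8, and 30.20 of 86.26 seconds at 3,9.)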

It also looks like the execution-time measurement itself may add more overhead to the non-compiled case, since with these measurements the compiled code was still faster at 3,9.

In retrospect, this is much heavier on the interpretation or execution of some of the simple operations than I intended, and Emacs benefits a lot from opcodes like "add1"; since I'm more interested specifically in exercising allocation and GC at the moment, it's not that useful a test for me after all. But I thought I'd send this along anyway in case someone was interested. In particular, the fact that the performance of the precompiled version gets worse as n grows might be something to investigate, unless the cause is already known....

Ken

Attachment: ack.scm
Description: Binary data

Attachment: ack.el
Description: Binary data

