[Axiom-developer] Value stack overflow bug


From: root
Subject: [Axiom-developer] Value stack overflow bug
Date: Sun, 25 May 2003 02:52:22 -0400

I've been chasing a very difficult bug for several months and
I recently achieved a breakthrough. Unfortunately it isn't a happy
result.

The nature of the bug is that several hundred of the thousand+
algebra compiles die because of a "value stack overflow". I've
been micro-stepping the compiler (you REALLY begin to appreciate
what a billion instructions per second means :-) ). Since this
is "known correct" code in some sense, the bug must have something
to do with the Lisp, but that has to be proven.

Originally I tried playing with the memory allocations using 
init-memory-config but that had no effect. Schelter had introduced
init-memory-config for Axiom because we wrestled this problem before.

Later I tried using (setq si::*multiply-stacks* 16) at system build
time but that also had no effect. However, the "no effect" was
subtle. I read the code and it appears that this setq can only be done
at the top level command prompt (for obvious reasons). The larger
stacks are NOT saved by save-system, so they cannot be saved into an
image. Nor can they be automatically expanded in an Axiom image, as you
never enter the top level read-eval-print loop.

I worked under the assumption that the larger stacks WERE saved
in the image and thus never saw the benefit, which led me to the
conclusion that it was still an Axiom bug, so I continued to
chase it by hand.

Tonight I set out to rebuild Axiom's databases to try to eliminate
them as the source of the failure. The database build also failed
with a Value stack overflow. So I built a complete system by 
hand-loading each routine into a clean lisp. It still failed with
a value stack overflow. So I restarted lisp, setq'd the *multiply-stacks*
variable at the command prompt and reloaded Axiom one routine at a time.
This time the database build died with a Segmentation Violation.

Ah Ha! Rebuilding the databases is a stand-alone program that is
completely independent of the compiles. Now there are two paths that
fail due to Value stack overflow. Expanding the stacks (successfully
this time) causes a segfault.

The same behavior happens in GCL 2.4.1 and GCL 2.5.2.

So, the conclusions? 
(1) GCL's default value stack size is too small to handle Axiom
(2) The (setq si::*multiply-stacks* n) method
    (a) can't be used to build Axiom's images
    (b) causes a memory failure with a segfault
(3) I need some way to hardcode a larger value stack size during
    GCL image build. It looks like setting VSSIZE is the correct
    method but I can't (yet) figure out where this should be
    changed as it is #included in some obscure way.
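
To make (3) concrete, here is a minimal sketch of the general C pattern
by which a compile-time size such as VSSIZE can be overridden from the
compiler command line. This is NOT GCL's source: the #ifndef guard, the
default of 8192, and the names demo_stack, demo_capacity and demo_push
are all hypothetical, and whether GCL's own definition of VSSIZE is
guarded this way is exactly what still has to be found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical default; a build could override it with -DVSSIZE=<n>.
   GCL's real default and where it is defined are not shown here. */
#ifndef VSSIZE
#define VSSIZE 8192
#endif

/* Reject a nonsensical override at compile time (C11). */
static_assert(VSSIZE > 0, "VSSIZE must be positive");

/* Stand-in for a value stack whose size is fixed when the image is built. */
static long demo_stack[VSSIZE];
static size_t demo_top = 0;

/* Number of slots compiled into this build. */
size_t demo_capacity(void)
{
    return sizeof demo_stack / sizeof demo_stack[0];
}

/* Push one value; returns 0 on success, -1 on "value stack overflow". */
int demo_push(long v)
{
    if (demo_top >= demo_capacity())
        return -1;
    demo_stack[demo_top++] = v;
    return 0;
}
```

With a guard like this, compiling with something like -DVSSIZE=131072
bakes the larger stack into the binary, so nothing has to be set at the
top level prompt afterwards. If the real definition is unguarded, the
header itself would have to be edited instead.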

Please help.

Tim





