Re: [Axiom-developer] Value stack overflow bug

Camm Maguire
Subject: Re: [Axiom-developer] Value stack overflow bug
Date: 04 Jun 2003 17:42:05 -0400
Greetings!  Just wanted to check how this is going.  I've just tried
multiplying the VSSIZE in h/ by four, and I seem to be
getting further along.  I forgot to do the *print-readably*, so am
retrying with that now.
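
For anyone following along, here is a rough sketch of why a larger
compile-time VSSIZE lets a deeply nested computation get further before
the overflow check trips. This is NOT GCL's actual code: the baseline
value of 1024, the value_stack struct, and the helpers vs_init, vs_push
and frames_before_overflow are all made up for illustration. Only the
name VSSIZE comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical baseline; the real VSSIZE is defined in GCL's headers. */
#define VSSIZE 1024

/* Illustrative fixed-capacity value stack (not GCL's real layout). */
typedef struct {
    long *base;   /* first slot */
    long *top;    /* next free slot */
    long *limit;  /* one past the last slot */
} value_stack;

/* Allocate a stack with the given number of slots; 0 on success. */
static int vs_init(value_stack *vs, size_t slots) {
    vs->base = malloc(slots * sizeof *vs->base);
    if (vs->base == NULL) return -1;
    vs->top = vs->base;
    vs->limit = vs->base + slots;
    return 0;
}

/* Push one value; -1 stands in for a "value stack overflow". */
static int vs_push(value_stack *vs, long v) {
    if (vs->top >= vs->limit) return -1;
    *vs->top++ = v;
    return 0;
}

/* Count how many complete frames of frame_slots values fit into a
   stack of `slots` slots before a push overflows. */
static size_t frames_before_overflow(size_t slots, size_t frame_slots) {
    value_stack vs;
    size_t frames = 0;
    assert(frame_slots > 0);
    if (vs_init(&vs, slots) != 0) return 0;
    for (;;) {
        size_t i;
        for (i = 0; i < frame_slots; i++) {
            if (vs_push(&vs, 0) != 0) {
                free(vs.base);
                return frames;
            }
        }
        frames++;
    }
}
```

In this toy model, frames_before_overflow(4 * VSSIZE, 8) allows four
times the nesting depth of frames_before_overflow(VSSIZE, 8), because
the capacity is fixed when the stack is created and nothing at run time
can grow it.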

Take care,

root <address@hidden> writes:

> I've been chasing a very difficult bug for several months and
> I recently achieved a breakthrough. Unfortunately it isn't a happy
> result.
> The nature of the bug is that several hundred of the thousand+
> algebra compiles die because of a "value stack overflow". I've
> been micro-stepping the compiler (you REALLY begin to appreciate
> what a billion instructions per second means :-) ). Since this
> is "known correct" code in some sense, the bug must have something
> to do with the lisp, but that has to be proven.
> Originally I tried playing with the memory allocations using 
> init-memory-config but that had no effect. Schelter had introduced
> init-memory-config for Axiom because we wrestled this problem before.
> Later I tried using (setq si::*multiply-stacks* 16) at system build
> time but that also had no effect. However the "no effect" was
> subtle. I read the code and it appears that this setq can only be done
> at the top level command prompt (for obvious reasons). The larger
> stacks are NOT saved using save-system so they cannot be saved into an
> image. Nor can they be automatically expanded in an Axiom image as you
> never enter the top level REP loop.
> I worked under the assumption that the larger stacks WERE saved
> in the image, and thus never saw the benefit. That led me to the
> conclusion that it was still an Axiom bug, so I continued to
> chase it by hand.
> Tonight I set out to rebuild Axiom's databases to try to eliminate
> them as the source of the failure. The database build also failed
> with a Value stack overflow. So I built a complete system by 
> hand-loading each routine into a clean lisp. It still failed with
> a value stack overflow. So I restarted lisp, setq'd the *multiply-stacks*
> variable at the command prompt and reloaded Axiom one routine at a time.
> This time the database build died with a Segmentation Violation.
> Ah Ha! Rebuilding the databases is a stand-alone program that is
> completely independent of the compiles. Now there are two paths that
> fail due to Value stack overflow. Expanding the stacks (successfully
> this time) causes a segfault.
> The same behavior happens in GCL 2.4.1 and GCL 2.5.2.
> So, the conclusions?
> (1) GCL's default value stack size is too small to handle Axiom
> (2) The (setq si::*multiply-stacks* n) method
>     (a) can't be used to build Axiom's images
>     (b) causes a memory failure with a segfault
> (3) I need some way to hardcode a larger value stack size during
>     GCL image build. It looks like setting VSSIZE is the correct
>     method but I can't (yet) figure out where this should be
>     changed as it is #included in some obscure way.
> Please help.
> Tim
Camm Maguire
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

