[Gcl-devel] Re: strange, strange bug

From: Camm Maguire
Subject: [Gcl-devel] Re: strange, strange bug
Date: 26 May 2005 19:01:58 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Greetings, and thanks for the report!  I believe it is now fixed.  A
new package is building -- I'll try to install it tonight.

This was difficult due to the randomness involved.  The issue went
away when using gdb, for example.  I'd like to build you a debugging
version of the CLtL1 image, say under /u/camm/gcl-2.6.6twc, and wonder
whether, if you find another bug, you might be able to boil it down to
a reproducible example in the gdb version.  I.e., perhaps you could
test first with the fully optimized build as before (as the
optimization itself may introduce bugs), and then if you find one, try
to reproduce or find a similar one which arises under gdb on the
debugging version.  I know this sounds like extra work, so if it is
too much, please don't hesitate to say so -- it would just speed
things up from this end quite a lot.

To use gdb, do for example

gdb /u/camm/gcl-2.6.6twc/unixport/saved_gcl
GNU gdb 5.3
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb) r -dir /p/lib/gcl-2.6.6twc/usr/lib/gcl-2.6.6/unixport/ -libdir /p/lib/gcl-2.6.6twc/usr/lib/gcl-2.6.6/ -eval '(setq si::*allow-gzipped-file* t)' -eval '(setq si::*tk-library* "/p/lib/gcl-2.6.6twc/usr/lib/tk8.3")'

The stuff after the 'r' (for 'run') above is just to reproduce the
command line given by the gcl wrapper script installed under /p.  Most
likely, a simple 'r' to start the program will suffice.  Then you get
a normal lisp prompt, and can test as usual.

The experimental fixnum_times function which accelerated your
benchmark is in this build, as well as some stack clearing code which
enables collection of huge lists.  This is a rather large patch, but it
appears stable, building maxima, acl2, axiom, and all self tests
including the ansi ones.  I need to clean it up before committing for
sure -- please let me know if you therefore feel that testing now is a
waste of time.  If you do not feel so, it is of course very helpful
for me.  This is just a note to say that more changes are likely ahead
before this gets pushed into a released version.

Feedback is also appreciated on the question of, assuming bugs are all
closed, how useful the improvements herein are.  At some point, we'll
have to make a decision on whether they are too substantial to include
in a 2.6.x version quickly and must therefore wait for 2.7.0 at around
summer's end, or whether they are so critical that they are needed

Finally, it appears quite easy to make use of the address space below
0x8000000 for immediate fixnums.  This can be a compile time setting,
if I understand the -T switch to ld correctly, which should allow the
user to set the starting text address of the executable.  This would
give (using the x86 default for example) 1/32 of the fixnum range as
immediate, avoiding all allocation, memory indirection and gc costs,
but adding a comparison and a branch to every fixnum indirection
outside this range.  Your thoughts on if/when this is a win are most
welcome.
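
The immediate-fixnum idea above can be sketched roughly as follows, in C.
This is only an illustration under stated assumptions, not the actual GCL
code: the names IMM_LIMIT, IMM_BIAS, fits_immediate, encode_immediate,
decode_immediate and is_immediate are hypothetical, and centring the
immediate range on zero with a bias is one possible choice among several.
It assumes no heap object can ever live below 0x8000000, so any pointer-sized
word below that value can be read as a small integer rather than dereferenced.
The boxed fallback for values outside the range is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the executable's text starts at or above this address, so
   no real object pointer is ever smaller than IMM_LIMIT. */
#define IMM_LIMIT ((uintptr_t)0x8000000)      /* 2^27 = 1/32 of 2^32 */
/* Assumption: centre the immediate range on zero. */
#define IMM_BIAS  ((long)(IMM_LIMIT / 2))     /* 2^26 */

/* Does n fit in the immediate range [-2^26, 2^26)? */
static int fits_immediate(long n)
{
    return n >= -IMM_BIAS && n < IMM_BIAS;
}

/* Is this word an immediate fixnum rather than a pointer to a boxed one?
   This is the extra comparison and branch paid on every fixnum access. */
static int is_immediate(uintptr_t w)
{
    return w < IMM_LIMIT;
}

/* Encode a small integer as a word below IMM_LIMIT: no allocation,
   no memory indirection, nothing for the garbage collector to trace. */
static uintptr_t encode_immediate(long n)
{
    assert(fits_immediate(n));
    return (uintptr_t)(n + IMM_BIAS);
}

/* Recover the integer from an immediate word. */
static long decode_immediate(uintptr_t w)
{
    assert(is_immediate(w));
    return (long)w - IMM_BIAS;
}
```

A caller would test fits_immediate first and fall back to an ordinary
heap-allocated fixnum otherwise; readers of a fixnum would branch on
is_immediate before deciding whether to decode the word or follow it as
a pointer.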

Take care,

Robert Boyer <address@hidden> writes:

> Here is one of the strangest bugs I have ever encountered.  I think it is a
> gbc bug in twc.
> I have tried to package this up in a way that you could ftp and reproduce at
> your site, but I have failed at that several times.  I think that this bug
> may be sensitive to something crazy like the length of some pathname.
> I hope that I have set the protections (to gcl) so that you can do the
> following when logged into UTCS on any public linux machine.
>   cd /u/boyer/nqthm-2nd/nqthm-1993
>   time /p/bin/gcl-2.6.6twc < fi.text > fi.out
>   tail fi.out
> The existence of the bug is the
>   #<FREE OBJECT 09bd1dd0>
> that is printed out by the final tail command.
> I believe that before the final gc (in fi.text) 
>   (symbol-plist '*1*OS-WAITING-INPUT-HANDLER-PATH2)
> has a reasonable value, but after the (si::gbc t)
> we get the nonsense value #<FREE OBJECT 09bd1dd0>.  I don't think
> that the bug is actually caused by this particular final gc.  I'm
> not sure but I think that memory has crapped out before then.
> It takes about 73 seconds on elgin.cs.utexas.edu.  I'm not sure how long it
> will take on other linux boxes.
> Bob
> P. S.  This is such a very weirdly sensitive bug that I enclose below my PATH.
> % printenv
> PWD=/u/boyer/nqthm-2nd/nqthm-1993
> TERM=dumb
> mypathset=1
> PRINTER=tower
> METAMAIL_TMPDIR=/u/boyer/metamail
> EDITOR=emacs
> REMOTEHOST=cpe-24-28-64-156.austin.res.rr.com
> HOST=elgin.cs.utexas.edu
> GROUP=prof
> OSTYPE=linux
> VENDOR=intel
> HOSTTYPE=i386-linux
> DISPLAY=localhost:14.0
> SSH_TTY=/dev/pts/5
> SSH_CONNECTION= 52614 22
> SSH_CLIENT= 52614 22
> SHELL=/bin/csh
> MAIL=/u/boyer/mailbox
> PATH=/u/boyer/bin-override:/lusr/bin:/lusr/X11R6/bin:/lusr/tex/bin:/lusr/transcript/bin:/lusr/ssl/bin:/lusr/java5/bin:/sbin:/bin:/usr/sbin:/usr/bin:/lusr/kde/bin:/lusr/gnome2/bin:/lusr/gnome/bin:/lusr/share/hosts:/lusr/netpbm/bin:/lusr/gnu/bin:.:/u/boyer/bin:/lusr/gnu/bin:/projects/hvg/compBio/programs:/p/bin
> HOME=/u/boyer
> LOGNAME=boyer
> USER=boyer
> % 

Camm Maguire                                            address@hidden
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah
