[gnugo-devel] Memory corruption?


From: Daniel Bump
Subject: [gnugo-devel] Memory corruption?
Date: Fri, 14 Dec 2001 12:11:21 -0800

Here is a problem that is mysterious to me. To see the
problem you need the current snapshot of the semeai code
which I put up as a patch called semeai_1_17.1 linked
from devel.html. This is not in the cvs.

After compiling this code, execute:

gnugo -l [file] --quiet --decide-semeai G4/B3

on the following file:

(;GM[1]FF[4]RU[Japanese]SZ[19]
PW[GNU Go 3.1.16 (level 10) load and print]PB[Unknown]HA[0]KM[0.0]GN[GNU Go 3.1.16 load and print Random Seed 1008361938]
AW[ab][bb][eb][cc][ec][fc][ad][ed][be][ce][ei][ej][ek][el][bm][fm][im][jm][km][bn][cn][dn][en][fn][hn][ln][mn][nn][on][pn][bo][co][fo][go][ho][ko][lo][qo][ro][fp][bq][cq][eq][lq][oq][qq][cr][dr][er][fr][gr][hr][lr][mr][nr][qr][cs]
AB[ba][ca][ea][fb][hc][kc][nc][dd][gd][pd][de][ee][qe][bf][cg][eh][ih][jh][di][hi][ki][cj][gj][jj][qj][bk][dk][fk][ik][bl][dl][fl][hl][kl][am][cm][dm][em][gm][om][qm][an][rn][ao][mo][bp][cp][dp][ep][gp][kp][lp][aq][dq][fq][gq][hq][jq][mq][sq][br][ir][rr][es][fs][gs][hs]
PL[W]
IL[ap]
)

I get a segmentation fault. There seems to be no reason for this,
so there is probably some memory corruption.

The problem shows up at the two calls to catalog_goal()
at the top of do_owl_analyze_semeai(). After applying the
patch and compiling without optimization, I execute the following
commands from gdb:

set args -l zot.sgf --quiet --decide-semeai G4/B3 --semeai-variations 1000
del
b silent_examine_position
run
fin
b owl.c:333
c
p owlb.goal[21]

Here 21 is A19. We get:

$2 = 0 '\000'

You see that all is well at this point. We are about to execute:

wormsa = catalog_goal(owla, goal_wormsa);

After stepping over this command, we find that owlb is
corrupted, even though the command that was just executed
should have nothing to do with owlb.

(gdb) p owlb.goal[21]
Cannot access memory at address 0x11a.

The next command,

wormsb = catalog_goal(owlb, goal_wormsb);

causes a segmentation fault. I see nothing in
catalog_goal() that would cause a problem here.
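
One observation that may or may not be relevant: the bad address
0x11a reported by gdb is 282 decimal, small enough to be a 1D board
index. The sketch below decodes a value under that reading. The
macro definitions are assumptions, reconstructed so that index 21 is
A19 as stated above; they are not copied from the source tree and
should be checked against the real headers. decode_as_pos() is a
hypothetical helper.

```c
/* Assumed 1D board layout; reconstructed for illustration only. */
#define MAX_BOARD 19
#define POS(i, j) ((MAX_BOARD + 2) + (i) * (MAX_BOARD + 1) + (j))
#define I(pos)    ((pos) / (MAX_BOARD + 1) - 1)
#define J(pos)    ((pos) % (MAX_BOARD + 1) - 1)

/* Return 1 if value, read as a 1D index, names a point on a board of
 * the given size, else 0.  The coordinates are stored in *i and *j
 * and are meaningful only when 1 is returned. */
static int decode_as_pos(unsigned long value, int board_size, int *i, int *j)
{
  int pos;

  if (value > (unsigned long) POS(MAX_BOARD - 1, MAX_BOARD - 1))
    return 0;
  pos = (int) value;
  *i = I(pos);
  *j = J(pos);
  return *i >= 0 && *i < board_size && *j >= 0 && *j < board_size;
}
```

If these definitions match the real ones, 0x11a decodes to (13,1),
which would be B6 when (0,0) is A19. That would be consistent with a
board position having been stored on top of the owlb pointer, but it
is only a guess and proves nothing by itself.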

Incidentally the crash goes away if you replace the (m,n)
loop by a pos loop in catalog_goal():

   for (k = 0; k < MAX_WORMS; k++)
     goal_worm[k] = NO_MOVE;
 
-  for (m = 0; m < board_size; m++)
-    for (n = 0; n < board_size; n++) {
-      int pos = POS(m, n);
+  for (pos = BOARDMIN; pos < BOARDMAX; pos++)
+    if (ON_BOARD(pos)) {
       if (owl->goal[pos] && board[pos]) {
       int origin = find_origin(pos);
       int found_one = 1;
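
To check whether the two loop forms in the diff above could differ
in which points they visit, here is a small standalone sketch. It
assumes the 1D board macros shown (reconstructed, not copied from
the headers) and uses a hypothetical on_board() that tests
coordinates rather than looking at the board array as the real
ON_BOARD() may do.

```c
/* Assumed 1D board layout; reconstructed for illustration only. */
#define MAX_BOARD 19
#define BOARDMIN  (MAX_BOARD + 2)
#define BOARDMAX  ((MAX_BOARD + 1) * (MAX_BOARD + 1))
#define POS(i, j) ((MAX_BOARD + 2) + (i) * (MAX_BOARD + 1) + (j))

/* Hypothetical stand-in for ON_BOARD(): tests coordinates only. */
static int on_board(int pos, int board_size)
{
  int i = pos / (MAX_BOARD + 1) - 1;
  int j = pos % (MAX_BOARD + 1) - 1;
  return i >= 0 && i < board_size && j >= 0 && j < board_size;
}

/* Count the indices that one loop form visits and the other does not. */
static int count_loop_mismatches(int board_size)
{
  char by_mn[BOARDMAX] = {0};
  char by_pos[BOARDMAX] = {0};
  int m, n, pos, mismatches = 0;

  /* Old form: nested (m,n) loop. */
  for (m = 0; m < board_size; m++)
    for (n = 0; n < board_size; n++)
      by_mn[POS(m, n)] = 1;

  /* New form: single pos loop filtered by on_board(). */
  for (pos = BOARDMIN; pos < BOARDMAX; pos++)
    if (on_board(pos, board_size))
      by_pos[pos] = 1;

  for (pos = 0; pos < BOARDMAX; pos++)
    if (by_mn[pos] != by_pos[pos])
      mismatches++;
  return mismatches;
}
```

Under these assumptions count_loop_mismatches(19) returns 0, so both
forms visit the same points. One possible reading is that the change
alters the local variables and stack layout of catalog_goal() rather
than its logic.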

This change may just be sweeping under the rug a real problem
that needs to be found. According to http://web.mit.edu/ghudson/info/corruption,

> Bugs in any kind of program can be divided into two categories: bugs
> which cause visibly incorrect behavior as soon as the incorrect code
> executes, and bugs which corrupt state (variable values, data
> structures, files, etc.) such that correct code behaves incorrectly
> later on.  Bugs in the former category are usually easy to find and
> fix, since you can simply trace the execution of the code up to the
> point of the incorrect behavior and see which piece of code failed.
> Bugs in the second category are often much harder to find, since there
> is no simple way of determining where the state of the program was
> corrupted.

Since the function which causes owlb to mysteriously change does not 
look suspicious, we may be faced with a problem of this type. There
are a couple of programs that might help us. One is Purify,
which is a commercial program. The other is Checker, which
is free software. It is available in the alpha directory at
ftp.gnu.org, but so far I've been unable to compile it.
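
Failing such a tool, a much cruder manual check is possible. The
sketch below puts guard zones filled with a known byte on either
side of a buffer and tests later whether they are still intact. It
is an illustration under assumptions, not a description of how
Purify or Checker work; guard_init(), guard_intact(), GUARD_SIZE and
GUARD_BYTE are all hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define GUARD_SIZE 8
#define GUARD_BYTE 0xA5

/* Fill the first and last GUARD_SIZE bytes of block (total bytes
 * long) with the guard pattern and return the usable region. */
static unsigned char *guard_init(unsigned char *block, size_t total)
{
  memset(block, GUARD_BYTE, GUARD_SIZE);
  memset(block + total - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
  return block + GUARD_SIZE;
}

/* Return 1 if both guard zones are intact, 0 if something wrote to them. */
static int guard_intact(const unsigned char *block, size_t total)
{
  size_t k;

  for (k = 0; k < GUARD_SIZE; k++)
    if (block[k] != GUARD_BYTE || block[total - 1 - k] != GUARD_BYTE)
      return 0;
  return 1;
}
```

Something along these lines could be placed around a suspect array
such as the goal_worm arrays and checked after each catalog_goal()
call. It only reports that an overrun has happened since the last
check, not which statement caused it.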

Dan








