I've found that I can consistently reproduce the problem by commenting out
all services in the config file and simply starting monit on my Ubuntu desktop.
To get a better backtrace I turned -O2 off and rebuilt so the functions
weren't inlined:
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7c1a770 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7c1bef3 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7c4fd0b in __fsetlocking () from /lib/tls/i686/cmov/libc.so.6
#4 0xb7c578bd in mallopt () from /lib/tls/i686/cmov/libc.so.6
#5 0xb7c57a44 in free () from /lib/tls/i686/cmov/libc.so.6
#6 0x0805095a in _gc_inf (i=0x80c4328) at gc.c:427
#7 0x0804fe88 in _gc_service (s=0x80a24f4) at gc.c:173
#8 0x0804fda3 in _gc_service_list (s=0x80a24f4) at gc.c:148
#9 0x0804fb38 in gc () at gc.c:92
#10 0x08053b29 in do_exit () at monitor.c:442
#11 0x08053421 in main (argc=1244508, argv=0xb7d25000) at monitor.c:120
Nothing else calls _gc_inf, so this may just be where glibc happens to
notice the heap corruption caused by an earlier double-free.
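
For what it's worth, here is a minimal sketch of the kind of guard that makes a repeated cleanup pass harmless. It is not monit's code: `FREE_AND_NULL`, `struct info_sketch` and `info_sketch_release` are names I made up for illustration, and I'm only assuming that the real cleanup ends up freeing the same pointer twice somewhere.

```c
#include <stdlib.h>

/* Free a pointer and clear it, so a repeated cleanup pass becomes a no-op
 * (free(NULL) is defined to do nothing). Hypothetical helper. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical record, standing in for whatever the gc code releases. */
struct info_sketch {
    char *path;   /* owned by the record */
};

/* Release the members of a record; safe to call more than once. */
static void info_sketch_release(struct info_sketch *inf) {
    if (inf == NULL)
        return;
    FREE_AND_NULL(inf->path);
}
```

This only protects against freeing through the same slot twice. If two different structures hold copies of the same pointer, clearing one of them won't help, and something like valgrind would be a better way to find the first bad free.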
Will
On 12/21/06, Will Bryant <address@hidden> wrote:
Hi guys,
I eventually managed to reproduce the double-free bug I mentioned on
the general list, which happened on one of my production servers and
more or less knocked it over. Here's the stack trace:
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7ca1770 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7ca2ef3 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7cd6d0b in __fsetlocking () from /lib/tls/i686/cmov/libc.so.6
#4 0xb7cdf1cd in free () from /lib/tls/i686/cmov/libc.so.6
#5 0xb7ce053e in calloc () from /lib/tls/i686/cmov/libc.so.6
#6 0x08060d18 in xcalloc (count=1, nbytes=16) at xmalloc.c:93
#7 0x0806d6f2 in addeventaction (_ea=0x80ad5a0, failed=1, passed=1)
at p.y:2334
#8 0x0806dc77 in createservice (type=3, name=0x811b308 "renderd",
value=0x80e6260 "/var/run/modellure/renderd.pid",
check=0x805f010 <check_process>) at p.y:1799
#9 0x08071909 in yyparse () at p.y:705
#10 0x08071e3a in parse (controlfile=0x8098410 "/etc/monit/monitrc")
at p.y:1633
#11 0x080525d4 in main (argc=7, argv=0xbfbd9614) at monitor.c:225
Is that enough info to diagnose the problem? It looks to me like it
could just be due to an uninitialized structure, but I don't
understand enough of the code yet to say.
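
To illustrate what I mean by an uninitialized structure, here is a rough sketch, not taken from monit: the type and function names are hypothetical. If a record is allocated without being zeroed, its pointer members hold garbage, and a later cleanup that frees them corrupts the heap. Allocating it zero-filled avoids that.

```c
#include <stdlib.h>

/* Hypothetical record with an owned pointer member; not monit's real type. */
struct action_sketch {
    int   id;
    char *exec;   /* owned; may legitimately stay unset */
};

/* Allocate a zero-filled record so every pointer member starts out NULL
 * (assuming the usual platforms where NULL is all-bits-zero). */
static struct action_sketch *action_sketch_new(void) {
    return calloc(1, sizeof(struct action_sketch));
}

/* Free the record and what it owns; an unset member is simply skipped,
 * because free(NULL) does nothing. */
static void action_sketch_free(struct action_sketch *a) {
    if (a == NULL)
        return;
    free(a->exec);
    free(a);
}
```

With plain malloc instead of calloc, `exec` would be indeterminate and the `free(a->exec)` call could abort inside glibc, much like the trace above.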
Will
------------------------------------------------------------------------
_______________________________________________
monit-dev mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/monit-dev