help-cfengine

Re: (resolved) Cfservd coredumps


From: Frank Ranner
Subject: Re: (resolved) Cfservd coredumps
Date: Sat, 03 Dec 2005 00:14:33 +1100
User-agent: Mozilla Thunderbird 1.0.6 (Windows/20050716)

Martin, Jason H wrote:
After troubleshooting it some more, I determined that the problem occurred due 
to a corrupted cfservd ChecksumDatabase.  The backtrace from the 2.1.14 version 
was:

#0  0x40045d95 in __bam_split () from /usr/local/cfengine-test/BerkeleyDB.4.3.27/lib/libdb-4.3.so
#1  0x4003a719 in __bam_c_put () from /usr/local/cfengine-test/BerkeleyDB.4.3.27/lib/libdb-4.3.so
#2  0x400827c3 in __db_c_put () from /usr/local/cfengine-test/BerkeleyDB.4.3.27/lib/libdb-4.3.so
#3  0x4007cfb4 in __db_put () from /usr/local/cfengine-test/BerkeleyDB.4.3.27/lib/libdb-4.3.so
#4  0x40088704 in __db_put_pp () from /usr/local/cfengine-test/BerkeleyDB.4.3.27/lib/libdb-4.3.so
#5  0x080723e8 in ChecksumChanged (filename=0x441fc91c "/SOMEFILE",
    digest=0x441fd91c "»-½ï{buåÏEì?÷2% ", warnlevel=2, refresh=1, type=109 'm') at misc.c:384
#6  0x0804fbe5 in CompareLocalChecksum (conn=0x4097c580, sendbuffer=0x442029ac "",
    recvbuffer=0x442039ac "MD5 /SOMEFILE") at cfservd.c:2858
#7  0x0804d0b0 in BusyWithConnection (conn=0x4097c580) at cfservd.c:1486
#8  0x0804c559 in HandleConnection (conn=0x4097c580) at cfservd.c:1135
#9  0x400edc6f in pthread_start_thread (arg=0x44204be0) at manager.c:279

Removing the checksumdatabase file resolved the problem.  Is there any way to 
modify CFE to be insulated from dbm corruption?

Thank you,
-Jason Martin

I tried to sort out the checksum db corruption problem some time ago. The problem is that the database is opened and closed for every checksum, and in a multi-CPU, multi-threaded setup corruption is inevitable.

What I tried was to open the db file once at startup with the correct multi-threaded environment and let each thread use the global handle. This seemed promising, and the performance was good, but it was still not stable. Eventually I turned off the checksum database.

I am about to try again, this time dumping Berkeley DB and using SQLite for the checksum database. SQLite is an embedded SQL database, which handles multi-threading natively. It is very lightweight and fast, especially for simple queries.

One enhancement to the checksum process is to store the file's mtime along with the checksum and path. If the source file is modified, its mtime no longer matches the stored value, so a new checksum is computed and stored. Indeed, an external program could periodically read the database, stat the files, compare mtimes and update checksums for any changed files.
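
To make the mtime idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration, not cfengine code: the table layout and the function names (open_db, cached_checksum, refresh_all) are made up for this example, and MD5 is used only because that is the digest type in the backtrace above.

```python
import hashlib
import os
import sqlite3

# Hypothetical schema: one row per file, keyed by path.
SCHEMA = """
CREATE TABLE IF NOT EXISTS checksums (
    path   TEXT PRIMARY KEY,
    mtime  INTEGER NOT NULL,
    digest TEXT NOT NULL
)
"""


def open_db(db_path):
    """Open (or create) the checksum database and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def md5_of(path):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_checksum(conn, path):
    """Return (digest, recomputed).

    The stored digest is reused only while the file's mtime matches the
    stored mtime; otherwise the digest is recomputed and the row replaced.
    """
    mtime = os.stat(path).st_mtime_ns
    row = conn.execute(
        "SELECT mtime, digest FROM checksums WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == mtime:
        return row[1], False
    digest = md5_of(path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO checksums (path, mtime, digest) "
            "VALUES (?, ?, ?)",
            (path, mtime, digest),
        )
    return digest, True


def refresh_all(conn):
    """Re-stat every stored path and update stale rows.

    Returns the list of paths whose checksum was recomputed. Paths that no
    longer exist are skipped rather than deleted.
    """
    paths = [r[0] for r in conn.execute("SELECT path FROM checksums")]
    updated = []
    for p in paths:
        if not os.path.exists(p):
            continue
        _, recomputed = cached_checksum(conn, p)
        if recomputed:
            updated.append(p)
    return updated
```

The server would call something like cached_checksum() for each request, and the external program mentioned above would run something like refresh_all() periodically. In this sketch each thread should open its own connection with open_db(), since Python's sqlite3 module by default refuses to use a connection from a thread other than the one that created it.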

Regards,
Frank Ranner

