From: Sebastian Reitenbach
Subject: Re: opengroupware and gnustpe-base WAS: [OGo-Developer] gsmake2 branch segmentation fault
Date: Fri, 07 Mar 2008 07:16:58 +0100
Hi Richard,
Richard Frith-Macdonald <richard@tiptree.demon.co.uk> wrote:
>
> On 19 Feb 2008, at 06:42, Sebastian Reitenbach wrote:
>
> > Hi,
> >
> > Richard Frith-Macdonald <richard@tiptree.demon.co.uk> wrote:
> >>
> >> On 18 Feb 2008, at 09:01, Sebastian Reitenbach wrote:
> >>
> >>> However, OGo, which on Linux and *BSD is usually compiled against
> >>> libFoundation, seems to have a problem when compiled against
> >>> gnustep-base. Stable or unstable doesn't matter; it ends up with the
> >>> same segmentation fault, see the forwarded mail below. This segfault
> >>> was observed on OpenBSD (i386) and Linux.
> >>>
> >>> There seem to be differences between libFoundation and gnustep-base
> >>> with regard to NSAutoreleasePool, but I have no real idea what the
> >>> actual problem is.
> >>>
> >>> Any idea what the problem could be, or how to investigate further?
> >>> Is there any more information I could provide?
> >>
> >> Looking at the code, there only appear to be a few possibilities, and
> >> none of them are really issues with NSAutoreleasePool itself:
> >> 1. somehow the pool is being deallocated recursively so that it's
> >> trying to release a nil object ... but unless I'm missing something,
> >> that seems to be ruled out by the fact that we don't see it in the
> >> stacktrace ... unless one of the objects being deallocated releases
> >> the pool in another thread so that it doesn't appear in the trace ...
> >> hard to see how that would happen.
> >> 2. an object being deallocated returns a bad value from
> >> instanceMethodForSelector: or methodForSelector: ... obviously rather
> >> unlikely as this would really need the object to be of a class which
> >> overrides the method in a buggy way, or would need to be corrupt.
> >> 3. by far the most likely would be if the pool is releasing an object
> >> which has already been deallocated ... all that's needed for this to
> >> happen is a retain/release bug somewhere in the code.
> >>
> >> To check for (1) you could add code to test to see if anObject is
> >> nil ... if it is, then the pool is being emptied recursively.
> >> To check for (2) you could add code to test the result of those calls
> >> to see if it is zero.
> >> To check for (3) you could try replacing the fast
> >> GSObjCClass(anObject) with the slower [anObject class]. This will
> >> effectively remove the method lookup optimisation from the code here,
> >> but will more reliably catch the lookup of the class of a deallocated
> >> object, making it clearer what the underlying problem is.
> >>
> >> Approaching the issue from another direction, you could try calling
> >> [NSObject enableDoubleReleaseCheck: YES] right at the start of the
> >> program ... this will get the autorelease system to spot cases where
> >> an object is autoreleased too often (though it can't spot cases where
> >> the object is released too often after being added to a pool).
> >>
> >> You could also try running with the environment variable setting
> >> 'NSZombieEnabled=YES' ... which will effectively leak all memory (so
> >> you need to find out how to reproduce the crash after running the
> >> program for as short a time as possible before you try this) but will
> >> log when you try to deallocate an object which was already
> >> deallocated.
> >>
> >>
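A minimal sketch of where the double-release check suggested above could
be switched on, assuming a plain Foundation tool built against
gnustep-base with manual retain/release. The main() skeleton and the
placeholder comment are hypothetical and not taken from the OGo sources.
+enableDoubleReleaseCheck: is a gnustep-base extension on NSObject, so
this will not build against libFoundation, and depending on the
gnustep-base version an additional GNUstep-specific header may be needed.

```objc
#import <Foundation/Foundation.h>

int main(int argc, char **argv)
{
  NSAutoreleasePool *pool;

  /* gnustep-base extension: ask the autorelease machinery to report
     objects that are autoreleased more often than they are retained.
     It has to be enabled before any objects are autoreleased. */
  [NSObject enableDoubleReleaseCheck: YES];

  pool = [[NSAutoreleasePool alloc] init];

  /* ... application start-up would go here (placeholder) ... */

  /* Emptying the pool sends -release to every object added to it;
     this is the point where an over-released object crashes. */
  [pool release];
  return 0;
}
```

The zombie check needs no code change: it is enough to set
NSZombieEnabled=YES in the environment of the process before starting it.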
> > Thanks for these suggestions, I think they will keep me busy for a
> > while.
>
> Any news on this?
>
> If you haven't had time to look at it, perhaps you could send me the
> test software you are using and some instructions on how to reproduce
> the problem and I'll have a look.
There has not been much progress, only a little, as I got distracted by
other things, and my knowledge of memory management and autorelease
pools is not the best. See this thread on the OGo developer mailing list:
http://mail.opengroupware.org/pipermail/developer/2008-February/003427.html
That is where I am stuck now.

My system is a Xen image that uses a remote database, LDAP and IMAP, but I
could easily add those services to the image so that everything is in one
place. Alternatively, I could give you shell and web access to this image
and create a user for you so that you can try logging in to OGo; the bug
occurs after a login attempt.

Your offer is very welcome.
cheers
Sebastian