[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Making guardians a module?

From: Dirk Herrmann
Subject: Re: Making guardians a module?
Date: Thu, 7 Dec 2000 14:22:33 +0100 (MET)

On 6 Dec 2000, Marius Vollmer wrote:

> Dirk Herrmann <address@hidden> writes:
> > On Wed, 6 Dec 2000, Dirk Herrmann wrote:
> > 
> > > With respect to guardians one should know, that guardians are not a very
> > > well thought out thing at all (IMO).  The interface is nice, but the
> > > semantics are quite strange.  Assume for example, that every port that is
> > > created is placed into a guardian, to allow for a close operation if the
> > > file object gets lost.  Further, assume that the user puts a pair
> > > consisting of a string and a port into a guardian, with the intention to
> > > print out the string to the port as soon as the pair gets lost.  Now
> > > assume that the pair actually gets lost and with it the port.  Then, the
> > > pair can be fetched from the pair's guardian, and the port object can be
> > > fetched from the guardian that stores all ports on creation.  
> > > Unfortunately, there is no protection agains the case that the port is
> > > fetched first by some code, that then performs finalization (i. e. closes
> > > the port) and only later the pair is fetched from the other guardian.  
> > > The attempt to print out the pair's string on the port will fail, because
> > > the port is already closed.
> > 
> > To see this effect, here is an example program:
> I think we should fix this.  References from objects that are about to
> be returned from a guardian should not be considered dead, the
> referenced objects should be marked.
> In your example, the port would not be returned by the port-guardian
> as long as the cell is in the cell-guardian.
> Is this possible?

It is possible, but requires some tricks.  I have thought out an algorithm
that I hope will solve the problem without additional time overhead.  I
have only thought about a version that works with the current gc.  I will
think about whether it can be adapted to a generational gc.

Let's first assume that we have a function scm_gc_mark_successors, which
for an object X calls scm_gc_mark on every successor of X, but not X
itself.  This means, if we call scm_gc_mark_successors on an unmarked
object, but the object is marked after that, then the object references
itself.  Otherwise, if after the call to scm_gc_mark_successors the object
is still unmarked, it does _not_ reference itself.

No to the idea of the algorithm:

Every guardian has a zombie-list that contains objects, which are proven
to be only referenced by guardians, but not by any other object (including
itself).  Thus, when a zombie gets extracted from the guardian, it is
known that there can not possibly be another reference to that zombie.  
Thus, finalizing that zombie is safe.  During garbage collection, the
zombie objects can already be marked using scm_gc_mark during the normal
mark phase.

If we don't consider cyclic references, things are quite simple:  Besides
the zombies, there is a live-list of objects.  Every object that is handed
to the guardian is first placed into the live-list.  After the mark phase
of the gc, everything that can be reached from the heap is already marked.  
And, since zombies are also marked, everything that is reachable from the
known zombies is also marked.  In the live lists, there may now still
exist objects, that are not marked yet.  These objects could potentially
also be zombies.  We iterate through all these objects and apply
scm_gc_mark_successors to each of them.  Objects that are reachable to any
other object will thus become marked.  After all objects in the live list
have been treated like this, the only unmarked objects that are then to be
found in the live list are those, which are not reachable through any
other object.  These are finally marked and placed into the zombie list.  
Objects that are self-referencing, btw., will also be marked and thus
won't become zombies.

This algorithm is quite simple.  There are, however, some problems with
it:  It requires to implement scm_gc_mark_successors, which will hold a
lot of duplicate code with scm_gc_mark.  Further, circular references will
not be treated very elegantly.  On the positive side, this algorithm has
almost no performance overhead at all, since the marking process itself is
used to determine the topological ordering.  The algorithm could even
issue a warning if during gc a cycle of references among the guarded
objects is encountered.  Such a warning, however, would in exact terms be 
too conservative:  If objects that lie within a cycle can be reached from
a zombie, it was still possible that the code that retrieves the zombie
might somehow break up the cycle.

If one also wants to threat cyclic references, one would have to extend
the above algorithm by, for example, collecting objects that lie within
cycles into a cycle list.  After everything else (zombies and live
objects) has been marked, there may appear marked objects in the cycle
list.  These are the cycles that can be reached from the outside and may
potentially get broken.  Remaining unmarked objects in the cycle list can
not be reached from anywhere else.  Only for these objects warnings should
be issued, since these objects will never be retrieved from a
guardian.  One could even think about a special guardian, that provides
such objects in cyclic references to the user to allow debugging or such.

Best regards,
Dirk Herrmann

reply via email to

[Prev in Thread] Current Thread [Next in Thread]