guile-devel

Re: Making guardians a module?


From: Dirk Herrmann
Subject: Re: Making guardians a module?
Date: Wed, 6 Dec 2000 12:37:40 +0100 (MET)

On 5 Dec 2000, Mikael Djurfeldt wrote:

> Dirk Herrmann <address@hidden> writes:
> 
> > I just realized that it would be easily possible to extract guardians into
> > a module that was only initialized if necessary.  Below is a possible
> > implementation for it.  However, I am not sure whether it is a good idea
> > at all, and whether the solution below is good/portable at all.  At least,
> > it works here :-)
> 
> Probably they should belong (and be initialized) in the libguile
> library.  (If not, they should probably be broken out completely into
> a shared library of their own rather than just initialized.)
> 
> The reason why I suggest them continuing to be a part of libguile is
> that they feel like a very fundamental tool which we might want to
> have available also when implementing Guile itself.  It's the kind of
> tool which can make the structure of the code cleaner.  If they are a
> separate library, we won't have that freedom, so there will be a
> temptation to write nasty code instead.

I agree that we should provide guardians as a part of guile.  However, I
don't think that everything that guile provides has to be initialized at
startup, provided that later initialization does not have any negative
consequences.  The reason is that everything that is initialized in guile
costs computation time and memory.  In the case of guardians this cost is
probably negligible, since it is just one binding (make-guardian) and two
additional functions that are run with each garbage collection.  Still, a
lot of code probably does not need guardians at all.

Some thoughts about guardians:

With respect to guardians, one should know that they are not a very well
thought out concept (IMO).  The interface is nice, but the semantics are
quite strange.  Assume, for example, that every port that is created is
placed into a guardian, to allow for a close operation if the port object
gets lost.  Further, assume that the user puts a pair consisting of a
string and a port into a second guardian, with the intention of printing
the string to the port as soon as the pair gets lost.  Now assume that the
pair actually gets lost, and with it the port.  Then the pair can be
fetched from the pair's guardian, and the port object can be fetched from
the guardian that stores all ports on creation.  Unfortunately, there is
no protection against the case that some code fetches the port first and
finalizes it (i.e. closes the port), and only later the pair is fetched
from the other guardian.  The attempt to print the pair's string to the
port will then fail, because the port is already closed.
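
To make the ordering problem concrete, here is a small toy model in
Python.  It is only a sketch: the Port and Guardian classes and the names
port_guardian and pair_guardian are invented for illustration and are not
Guile's API, and the "collection" is simulated by hand.

```python
class Port:
    """Toy stand-in for a port; not Guile's port type."""
    def __init__(self):
        self.closed = False
        self.written = []

    def write(self, text):
        if self.closed:
            raise ValueError("write to closed port")
        self.written.append(text)

    def close(self):
        self.closed = True


class Guardian:
    """Toy guardian: objects the simulated GC found dead queue up here."""
    def __init__(self):
        self.zombies = []

    def retrieve(self):
        # Hand back one dead object, or None when nothing is pending.
        return self.zombies.pop(0) if self.zombies else None


port_guardian = Guardian()   # holds every port, so that it can be closed
pair_guardian = Guardian()   # holds (string, port) pairs for a last write

port = Port()
pair = ("goodbye", port)

# Simulated collection: the pair and its port became unreachable together,
# so each guardian independently gets its object queued.
port_guardian.zombies.append(port)
pair_guardian.zombies.append(pair)

# Some code drains the port guardian first and finalizes the port...
port_guardian.retrieve().close()

# ...and only later is the pair fetched.  Its port is already closed.
text, dead_port = pair_guardian.retrieve()
try:
    dead_port.write(text)
    outcome = "written"
except ValueError as err:
    outcome = "failed: " + str(err)
print(outcome)
```

Nothing in the two guardians relates the pair to its port, so whichever
consumer runs first decides whether the final write can still succeed.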

In other words, if you receive an object X from a guardian, you should not
access or use any further objects that can be reached from X, because
these other objects may already have been finalized if they were stored in
a different guardian.  Actually, I don't even understand the point given in
Dybvig's paper about objects that are registered more than once in more
than one guardian:  Any object can only be finalized once.  If an object
can be retrieved from more than one guardian or several times from the
same one, the whole point of guardians, namely providing a finalization
method, is broken.

This makes the concept of guardians far less useful than it appears at
first glance.  Guile's current mechanism, which for each type provides
finalization code that is called at the very point of garbage collection,
is much cleaner, though more limited.

I have thought about how the nice idea of guardians could be implemented
more cleanly, but this is quite difficult:  During garbage collection,
objects turn up that are protected only by guardians.  These objects,
however, may reference other objects, some of which are also protected by
guardians.  The 'safe' solution would be to find the objects that are
referenced _only_ by guardians, and then put them into the zombie list of
the guardian.  Further, an object should be received exactly once from
exactly one guardian.  (It may, however, be placed into a guardian again
after it was received.)

Why is this difficult?  Well, during gc you would have to determine a
reachability matrix between all objects that are potential zombies.  Smobs
are a special problem in this context.  More difficulties arise in cases
where an object can be reached through itself.  Such objects would never
become zombies unless the cycle were broken from the outside.

An implementation, however, could work as follows:  All objects that are
stored in guardians are recorded in a hash table, together with the list
of guardians they are stored in.  Receiving an object from a guardian
means removing that object from all other guardians.  After the gc mark
phase, the zombies are determined according to the following scheme:  Any
unmarked object X that is reached from a reachable guardian is passed to
the function scm_gc_mark2, which differs from scm_gc_mark in the following
way:  scm_gc_mark2 does _not_ mark X itself, but only the other objects
that are reached from X.  For these objects scm_gc_mark2 simply calls
scm_gc_mark.  Thus, after a call to scm_gc_mark2 all objects that are
reached from X are marked.  X itself will only be marked if it can be
reached through itself.  After this has been done for all objects that are
reached through all reachable guardians, the only guarded objects that
remain unmarked are those that were reachable only through the guardian
itself.  These objects can then safely be put into the zombie lists.
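
The marking scheme just described can be tried out on a toy heap.  The
following Python sketch is a model only, not libguile code: Obj, gc_mark,
gc_mark2 and find_zombies are hypothetical stand-ins, and the set of
guarded objects is passed in as a plain list instead of being found
through reachable guardians.

```python
class Obj:
    """Toy heap object; `refs` lists the objects it points to."""
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def gc_mark(obj):
    # Ordinary mark: mark obj and everything reachable from it.
    stack = [obj]
    while stack:
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.refs)


def gc_mark2(obj):
    # Do not mark obj itself, only what obj points to.
    # obj still ends up marked if it can be reached through itself.
    for child in obj.refs:
        gc_mark(child)


def find_zombies(roots, guarded):
    """Return the guarded objects that only a guardian keeps alive."""
    for r in roots:
        gc_mark(r)                       # the normal mark phase
    candidates = [o for o in guarded if not o.marked]
    for o in candidates:
        gc_mark2(o)                      # the extra pass over guarded objects
    return [o for o in candidates if not o.marked]


live, pair, port, loop = Obj("live"), Obj("pair"), Obj("port"), Obj("loop")
pair.refs.append(port)     # the pair points to its port
loop.refs.append(loop)     # an object that can be reached through itself

zombies = find_zombies(roots=[live], guarded=[live, pair, port, loop])
print([o.name for o in zombies])
```

In this run only the pair is reported as a zombie.  The port stays
protected because the pair still reaches it, and the self-referencing
object is marked through itself, so it never becomes a zombie.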

With this implementation (assuming that it works and that I have not made
any mistakes :-) receiving an object from a guardian has the
semantics:  'This object cannot be reached any more from any other
position, not even from a different guardian, thus it can be safely
finalized'.  It requires implementing scm_gc_mark2, which has a lot of
similarities with scm_gc_mark.  And, since it does not really build a
reachability matrix, it cannot solve the problem with circular
structures.
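
The bookkeeping side of this proposal (a table from each guarded object to
its guardians, and retrieval that removes the object from every other
guardian) might look like the Python sketch below.  The Registry class,
its method names and the guardian names "ports" and "logs" are all
assumptions made for illustration; declare_dead stands in for whatever the
collector would do once it has identified a zombie.

```python
class Registry:
    """Toy table: guarded object -> guardians it is registered with."""
    def __init__(self):
        self.guardians_of = {}   # id(obj) -> list of guardian names
        self.zombies = {}        # guardian name -> list of dead objects

    def protect(self, guardian, obj):
        # Register obj with a guardian (possibly more than once).
        self.guardians_of.setdefault(id(obj), []).append(guardian)
        self.zombies.setdefault(guardian, [])

    def declare_dead(self, obj):
        # Simulated collector step: queue obj with each of its guardians.
        for g in self.guardians_of.get(id(obj), []):
            self.zombies[g].append(obj)

    def retrieve(self, guardian):
        pending = self.zombies.get(guardian, [])
        if not pending:
            return None
        obj = pending.pop(0)
        # Handing obj out removes it from all guardians that still hold it,
        # so it can be received only once.
        for g in self.guardians_of.pop(id(obj), []):
            self.zombies[g] = [z for z in self.zombies[g] if z is not obj]
        return obj


reg = Registry()
port = object()
reg.protect("ports", port)
reg.protect("logs", port)
reg.declare_dead(port)

first = reg.retrieve("logs")     # the object is handed out here...
second = reg.retrieve("ports")   # ...and is no longer available here
print(first is port, second)

# The object may be placed into a guardian again after it was received.
reg.protect("ports", port)
reg.declare_dead(port)
again = reg.retrieve("ports")
```

The sketch keys the table on id(obj) only because the toy objects stay
alive for the whole run; a real implementation would need a table that
does not itself keep the objects reachable.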

With a full reachability matrix, one could even completely drop objects
that are known never to be returned from a guardian.  For example, two
objects that can reach each other form a circle.  As long as the circle
exists, neither of them will be retrievable.  If, however, any of the
objects within a circle can be reached from the outside (namely from some
object that is not itself in some circle), the circle could potentially be
broken.  Objects within circles that cannot be broken any more can simply
be dropped from the guardians.
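
One possible reading of this rule, again as a Python toy: compute the
reachability sets among the potential zombies, treat an object as being in
a circle if it can reach itself, and call a circle breakable if some
guarded object outside any circle reaches into it.  The functions
reach_sets and droppable and the object names are hypothetical, and the
graph is assumed to contain only objects that the normal mark phase left
unmarked.

```python
def reach_sets(refs):
    """refs maps a name to the names it points to.
    Return a dict: name -> set of all names reachable from it."""
    closure = {}
    for start in refs:
        seen = set()
        stack = list(refs[start])
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(refs.get(n, ()))
        closure[start] = seen
    return closure


def droppable(refs, guarded):
    """Guarded objects in a circle that no circle-free guarded object reaches."""
    reach = reach_sets(refs)
    in_circle = {n for n in refs if n in reach[n]}
    # Guarded objects outside any circle will eventually be returned by
    # their guardian, so code holding them could still break a circle.
    breakers = [g for g in guarded if g not in in_circle]
    breakable = set()
    for g in breakers:
        breakable |= reach.get(g, set())
    return sorted(g for g in guarded if g in in_circle and g not in breakable)


# a <-> b is an isolated circle; c <-> d is a circle that e points into.
refs = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"], "e": ["c"]}
guarded = ["a", "b", "c", "d", "e"]
result = droppable(refs, guarded)
print(result)
```

Here a and b can be dropped, while c and d are kept because e, which is
not in a circle, will be returned first and could still break their
circle.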

Best regards,
Dirk Herrmann



