Re: RFC: Foreign objects facility

From: Doug Evans
Subject: Re: RFC: Foreign objects facility
Date: Tue, 29 Apr 2014 09:08:54 -0700

On Mon, Apr 28, 2014 at 9:08 AM, Andy Wingo <address@hidden> wrote:
> [...]
> 1.4 Foreign Object Memory Management
> ------------------------------------
> Once a foreign object has been released to the tender mercies of the
> Scheme system, it must be prepared to survive garbage collection.  In
> the example above, all the memory associated with the foreign object is
> managed by the garbage collector because we used the 'scm_gc_'
> allocation functions.  Thus, no special care is needed: the garbage
> collector automatically scans that memory and reclaims whatever is unused.
>    However, when data associated with a foreign object is managed in
> some other way--e.g., 'malloc''d memory or file descriptors--it is
> possible to specify a "finalizer" function to release those resources
> when the foreign object is reclaimed.
>    As discussed in *note Garbage Collection::, Guile's garbage collector
> will reclaim inaccessible memory as needed.  This reclamation process
> runs concurrently with the main program.  When Guile analyzes the heap
> and determines that an object's memory can be reclaimed, that memory is
> put on a "free list" of objects that can be reclaimed.  Usually that's
> the end of it--the object is available for immediate re-use.  However,
> some objects can have "finalizers" associated with them--functions that
> are called on reclaimable objects to effect any external cleanup
> actions.
>    Finalizers are tricky business and it is best to avoid them.  They
> can be invoked at unexpected times, or not at all--for example, they are
> not invoked on process exit.  They don't help the garbage collector do
> its job; in fact, they are a hindrance.  Furthermore, they perturb the
> garbage collector's internal accounting.  The GC decides to scan the
> heap when it thinks that it is necessary, after some amount of
> allocation.  Finalizable objects almost always represent an amount of
> allocation that is invisible to the garbage collector.  The effect can
> be that the actual resource usage of a system with finalizable objects
> is higher than what the GC thinks it should be.
>    All those caveats aside, some foreign object types will need
> finalizers.  For example, if we had a foreign object type that wrapped
> file descriptors--and we aren't suggesting this, as Guile already has
> ports--then you might define the type like this:
>      static SCM file_type;
>      static void
>      finalize_file (SCM file)
>      {
>        int fd = scm_foreign_object_signed_ref (file, 0);
>        if (fd >= 0)
>          {
>            scm_foreign_object_signed_set_x (file, 0, -1);
>            close (fd);
>          }
>      }
>      static void
>      init_file_type (void)
>      {
>        SCM name, slots;
>        scm_t_struct_finalize finalizer;
>        name = scm_from_utf8_symbol ("file");
>        slots = scm_list_1 (scm_from_utf8_symbol ("fd"));
>        finalizer = finalize_file;
>        file_type =
>          scm_make_foreign_object_type (name, slots, finalizer);
>      }
>      static SCM
>      make_file (int fd)
>      {
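>        /* Widen via intptr_t (from <stdint.h>) so the int-to-pointer
>           cast is well-defined and keeps the sign.  */
>        return scm_make_foreign_object_1 (file_type, (void *) (intptr_t) fd);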
>      }
>    Note that the finalizer may be invoked in ways and at times you might
> not expect.  In particular, if the user's Guile is built with support
> for threads, the finalizer may be called from any thread that is running
> Guile.  In Guile 2.0, finalizers are invoked via "asyncs", which
> interleaves them with running Scheme code; *note System asyncs::.  In
> Guile 2.2 there will be a dedicated finalization thread, to ensure that
> the finalization doesn't run within the critical section of any other
> thread known to Guile.
>    In either case, finalizers run concurrently with the main program,
> and so they need to be async-safe and thread-safe.  If for some reason
> this is impossible, perhaps because you are embedding Guile in some
> application that is not itself thread-safe, you have a few options.  One
> is to use guardians instead of finalizers, and arrange to pump the
> guardians for finalizable objects.  *Note Guardians::, for more
> information.  The other option is to disable automatic finalization
> entirely, and arrange to call 'scm_run_finalizers ()' at appropriate
> points.  *Note Foreign Objects::, for more on these interfaces.
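
To make the second option concrete, here is a rough sketch in plain C, not written against libguile. It only models the idea of queueing cleanup work and running it at points the application chooses. The names `defer_finalizer`, `run_pending_finalizers`, `count_finalizer`, and `MAX_PENDING` are hypothetical; in Guile the corresponding step would be the call to 'scm_run_finalizers ()' mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size queue of cleanup actions.  This models
   "manual" finalization only; it is not the Guile implementation. */
#define MAX_PENDING 16

typedef void (*finalizer_fn) (void *data);

struct pending
{
  finalizer_fn fn;
  void *data;
};

static struct pending queue[MAX_PENDING];
static size_t n_pending = 0;

/* Record a cleanup action instead of running it right away.
   Returns 0 on success, -1 if the queue is full. */
static int
defer_finalizer (finalizer_fn fn, void *data)
{
  assert (fn != NULL);
  if (n_pending == MAX_PENDING)
    return -1;
  queue[n_pending].fn = fn;
  queue[n_pending].data = data;
  n_pending++;
  return 0;
}

/* Run every queued action.  The application calls this only at
   points it knows to be safe, e.g. once per main-loop iteration.
   Actions queued while this runs are picked up by the same loop.
   Returns the number of actions that were run. */
static size_t
run_pending_finalizers (void)
{
  size_t i;
  for (i = 0; i < n_pending; i++)
    queue[i].fn (queue[i].data);
  n_pending = 0;
  return i;
}

/* Sample action for experimenting: bumps an int counter. */
static void
count_finalizer (void *data)
{
  ++*(int *) data;
}
```

Nothing runs until the application asks for it, so the cleanup code never interleaves with code that is not prepared for it. The cost is that resources stay held until the next safe point.
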
>    Finalizers are allowed to allocate memory, access GC-managed memory,
> and in general can do anything any Guile user code can do.  This was not
> the case in Guile 1.8, where finalizers were much more restricted.  In
> particular, in Guile 2.0, finalizers can resuscitate objects.  We do not
> recommend that users avail themselves of this possibility, however, as a
> resuscitated object can re-expose other finalizable objects that have
> been already finalized back to Scheme.  These objects will not be
> finalized again, but they could cause use-after-free problems to code
> that handles objects of that particular foreign object type.  To guard
> against this possibility, robust finalization routines should clear
> state from the foreign object, as in the above 'finalize_file' example.
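
The "clear state, then release" pattern can be tried outside Guile. The following is a minimal sketch in plain C, assuming a hypothetical `struct file_wrapper` that stands in for a foreign object with a single "fd" slot; `finalize_wrapper` and `wrapper_is_open` are illustrative names and not part of any Guile API.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a foreign object with one "fd" slot. */
struct file_wrapper
{
  int fd;   /* -1 once the descriptor has been released */
};

/* Release the descriptor at most once.  The slot is cleared before
   close() is called, so a repeated call, or code that reaches a
   resuscitated object later, sees -1 instead of a stale descriptor.
   Returns 1 if a descriptor was closed, 0 if there was nothing to do. */
static int
finalize_wrapper (struct file_wrapper *w)
{
  int fd;

  assert (w != NULL);
  fd = w->fd;
  if (fd < 0)
    return 0;
  w->fd = -1;
  close (fd);
  return 1;
}

/* Users of the object check the slot before touching the descriptor. */
static int
wrapper_is_open (const struct file_wrapper *w)
{
  return w->fd >= 0;
}
```

Calling `finalize_wrapper` twice on the same object closes the descriptor only once, and any later `wrapper_is_open` check fails cleanly rather than operating on a descriptor number that may have been re-used.
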
>    One final caveat.  Foreign object finalizers are associated with the
> lifetime of a foreign object, not of its fields.  If you access a field
> of a finalizable foreign object, and do not arrange to keep a reference
> on the foreign object itself, it could be that the outer foreign object
> gets finalized while you are working with its field.
>    For example, consider a procedure to read some data from a file, from
> our example above.
>      SCM
>      read_bytes (SCM file, SCM n)
>      {
>        int fd;
>        SCM buf;
>        size_t len, pos;
>        scm_assert_foreign_object_type (file_type, file);
>        fd = scm_foreign_object_signed_ref (file, 0);
>        if (fd < 0)
>          scm_wrong_type_arg_msg ("read-bytes", SCM_ARG1,
>                                  file, "open file");
>        len = scm_to_size_t (n);
>        buf = scm_c_make_bytevector (len);
>        pos = 0;
>        while (pos < len)
>          {
>            char *bytes = SCM_BYTEVECTOR_CONTENTS (buf);
>            ssize_t count = read (fd, bytes + pos, len - pos);
>            if (count < 0)
>              scm_syserror ("read-bytes");
>            if (count == 0)
>              break;
>            pos += count;
>          }
>        scm_remember_upto_here_1 (file);
>        return scm_values (scm_list_2 (buf, scm_from_size_t (pos)));
>      }
>    After the prelude, only the 'fd' value is used and the C compiler has
> no reason to keep the 'file' object around.  If 'scm_c_make_bytevector'
> results in a garbage collection, 'file' might not be on the stack or
> anywhere else and could be finalized, leaving 'read' to read a closed
> (or, in a multi-threaded program, possibly re-used) file descriptor.
> The use of 'scm_remember_upto_here_1' prevents this, by creating a
> reference to 'file' after all data accesses.  *Note Garbage Collection
> Functions::.
>    'scm_remember_upto_here_1' is only needed on finalizable objects,
> because garbage collection of other values is invisible to the program--
> it happens when needed, and is not observable.  But if you can, save
> yourself the headache and build your program in such a way that it
> doesn't need finalization.

Hi. fwiw, this is all great stuff (and welcome!), but I think it's in
the wrong place in the docs.
The issue comes up in multiple places, so I would write it ("it" being
the general prose regarding issues with finalizers) once and refer to
it from all the places that use finalizers.
That would make this section shorter (and so more likely to be read),
and *every* place that has to deal with finalizers would benefit from
all this great text being far easier to find.
