Re: debugging guile runtime

From: rixed
Subject: Re: debugging guile runtime
Date: Thu, 1 Sep 2011 13:32:07 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

> > #1 ports are not thread safe (and any other thread safety issues) ;
> In general I think this issue needs to be split up between issues with
> port buffers and other issues; while it might be helpful to you to have
> a tracker bug, it's not helpful to me to conflate things that require
> different fixes.
> So!  As you say, not thread-safe.  But can we fix it in 2.0?  I am not
> sure.  We can add a mutex onto the end of scm_t_port.  But it seems that
> ignoring ABI compatibility might allow us to focus on the solution more
> easily.

I can't see what solution you envision that requires breaking the
API. Are you thinking about a more functional-style API for ports?
I confess that the only solution I have envisaged so far is simply to add a
global lock on all port operations (I'm a little afraid of a
per-port mutex, but that's certainly because I spent the last 2 days
hunting a race condition in my C program :-))
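
To make the global-lock idea concrete, here is a minimal sketch in plain C.
It is not Guile code: toy_port, toy_port_write, toy_port_flush and
all_ports_lock are hypothetical names invented for the illustration, and the
real scm_t_port is far richer. The only point is that every operation touching
a port buffer goes through one process-wide mutex.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical: one process-wide lock guarding every port operation. */
static pthread_mutex_t all_ports_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a port: a descriptor plus a write buffer. */
struct toy_port { int fdes; char buf[64]; size_t used; };

/* Write out the buffer; the caller must hold all_ports_lock. */
static int flush_locked(struct toy_port *p)
{
  size_t off = 0;
  while (off < p->used) {
    ssize_t n = write(p->fdes, p->buf + off, p->used - off);
    if (n < 0)
      return -1;
    off += (size_t) n;
  }
  p->used = 0;
  return 0;
}

/* Buffer len bytes, flushing whenever the buffer fills up. */
int toy_port_write(struct toy_port *p, const char *data, size_t len)
{
  int rc = 0;
  pthread_mutex_lock(&all_ports_lock);
  for (size_t i = 0; i < len && rc == 0; i++) {
    if (p->used == sizeof p->buf)
      rc = flush_locked(p);
    if (rc == 0)
      p->buf[p->used++] = data[i];
  }
  pthread_mutex_unlock(&all_ports_lock);
  return rc;
}

/* Flush explicitly, under the same global lock. */
int toy_port_flush(struct toy_port *p)
{
  pthread_mutex_lock(&all_ports_lock);
  int rc = flush_locked(p);
  pthread_mutex_unlock(&all_ports_lock);
  return rc;
}
```

The obvious cost is that a slow write on one port stalls every other port,
which is the trade-off against a per-port mutex.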

Anyway, as I'm not very familiar with the runtime, I thought you (or
someone who is) would suggest the best solution to me; so I have not
tried anything yet :-)

> What is your target?  How much are you willing to do yourself?

Ideally, I'd like the list to point me toward a quick fix that I could
implement myself, so that the problem is at least solved and the program
I work on can go to production, thus effectively testing the fix before
committing it into guile. The alternative being me rewriting some 15 lines
of scheme into 150 lines of C (and suffering sarcasm from my colleagues).
I might have a week or so to devote to this matter before my team favors the
other solution (do it in C). A full week of paid work on the runtime!
Too bad I'm a guile and scheme newbie, so unfortunately it's probably roughly
equivalent to 2 hours for any of you.

> > #2 fork may freeze in some occurrence ;
> I assume this is because of the port-table mutex bug that you posted
> earlier?  We should be able to fix this with an atfork.

I was referring to the problem I posted a while ago about this small
scheme program that was deadlocking in open-pipe:

(use-modules (ice-9 popen)
             (ice-9 threads))

(define (repeat n f)
  (if (> n 0)
      (begin
        (f)
        (repeat (- n 1) f))))

(define (forever f)
  (f)
  (forever f))

(display "Spawn a thread that performs some writes\n")
(make-thread forever (lambda ()
                       (display "write...\n")))

(display "Now exec some processes...\n")
(forever (lambda ()
           (let ((pipe (open-input-pipe "sleep 0")))
             (close-pipe pipe))))

I can't reproduce the bug with guile 2, and honestly I can't say for sure
that I have seen it in the wild with guile 2 (although it's frequent
when the app runs with guile 1.8), but I have not performed many tests
with guile 2 yet. I'm about to upgrade one of our most used test servers
to guile 2 to see how it behaves, so we will quickly know if it's
still relevant or not.
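
For reference, the "atfork" approach mentioned in the quoted reply can be
sketched in plain C with pthread_atfork. This is only an illustration under
assumptions: table_lock stands in for whatever internal lock the runtime
holds, and install_fork_handlers and child_can_lock_after_fork are
hypothetical helper names, not Guile functions.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a runtime-internal lock (e.g. a port table). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take the lock before fork so no other thread holds it at fork time. */
static void before_fork(void) { pthread_mutex_lock(&table_lock); }

/* Release it again, in parent and child, once fork has returned. */
static void after_fork(void) { pthread_mutex_unlock(&table_lock); }

/* Register the handlers; returns 0 on success. */
int install_fork_handlers(void)
{
  return pthread_atfork(before_fork, after_fork, after_fork);
}

/* Fork and report whether the child could take the lock:
   1 if yes, 0 if no, -1 if fork or waitpid failed. */
int child_can_lock_after_fork(void)
{
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    int ok = pthread_mutex_trylock(&table_lock) == 0;
    _exit(ok ? 0 : 1);
  }
  int status;
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Without such handlers, a child forked while another thread holds the lock
inherits it in the locked state with nobody left to release it, which is one
way a fork followed by port operations can freeze.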

> > #3 the use of select prevent the extended app to open more than 1024
> >    files ;
> I recall something about this; can you give a link to a bug?  If there
> isn't one, can you file one?

I did not file a bug on Savannah, but posted a patch here.
For the record, here is the only part of the patch that's still relevant
for v2.0.2:

diff --git a/libguile/fports.c b/libguile/fports.c
index 0b84d44..f19d291 100644
--- a/libguile/fports.c
+++ b/libguile/fports.c
@@ -49,7 +49,9 @@
 #include <sys/stat.h>
+#ifdef HAVE_POLL_H
+#include <poll.h>
+#endif
 #include <errno.h>
 #include <sys/types.h>
@@ -585,7 +587,14 @@ scm_fdes_to_port (int fdes, char *mode, SCM name)
 static int
 fport_input_waiting (SCM port)
 {
+#ifdef HAVE_POLL
+  int fdes = SCM_FSTREAM (port)->fdes;
+  struct pollfd pollfd = { fdes, POLLIN, 0 };
+  if (poll(&pollfd, 1, 0) < 0)
+    scm_syserror ("fport_input_waiting");
+  return pollfd.revents & POLLIN ? 1 : 0;
+#elif defined(HAVE_SELECT)
   int fdes = SCM_FSTREAM (port)->fdes;
   struct timeval timeout;
   SELECT_TYPE read_set;

The patch for 1.8 was much heavier though, since select was used here and there
within the runtime (by fport_wait_for_input, which has vanished, and by scm_accept,
which no longer uses select). With the above patch a guile 2 user can open
more than 1024 files (as long as they do not call select explicitly, of course).

I'm using this and it seems to work (again, not heavily tested with guile 2).
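
The same check can be tried outside the runtime. The sketch below is a
standalone version of the poll branch of the patch; input_waiting is a
hypothetical name, and it returns -1 on error instead of calling
scm_syserror. Unlike select, poll takes the descriptor by value rather than
as an index into a fixed-size fd_set, so it is not limited by FD_SETSIZE
(typically 1024).

```c
#include <poll.h>

/* Return 1 if fdes has data ready to read, 0 if not, -1 on error.
   The zero timeout makes poll return immediately. */
int input_waiting(int fdes)
{
  struct pollfd pfd = { .fd = fdes, .events = POLLIN, .revents = 0 };
  if (poll(&pfd, 1, 0) < 0)
    return -1;
  return (pfd.revents & POLLIN) ? 1 : 0;
}
```

Calling it on the read end of an empty pipe should give 0, and 1 once
something has been written to the other end.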

> > #4 fork does not close all open files.
> This won't change in 2.0.  You can do something in an atfork, but... I'm
> not sure this is the right thing.  The POSIX behavior was
> well-considered, and we should be hesitant to change it without a good
> reason.

Yes, this was discussed already: I was wrong and the current behavior is
correct (I still think POSIX is weird here, but that's another matter).

> > #5 new syntax definitions are not loaded by compiler
> Hmm?

Also, already discussed. Mark convinced me that this is not a bug and I should
stop using load for loading code but use the module system instead.
