guile-devel

wip-ports-refactor


From: Andy Wingo
Subject: wip-ports-refactor
Date: Wed, 06 Apr 2016 22:46:28 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Hi,

I have been working on a refactor to ports.  The goal is to have a
better concurrency story.  Let me tell that story then get down to the
details.

So, right now Guile has a pretty poor concurrency story.  We just have
pthreads, which is great in many ways, but nobody feels like
recommending this to users.  The reason is that when pthreads were
originally added to Guile, they were done in such a way that we could
assume that data races would just be OK.  It's amazing to reflect upon
this, but that's how it is.  Many internal parts of Guile are vulnerable
to corruption when run under multiple kernel threads in parallel.
Consider loading a module from two threads at the same time.  What
happens?  What should happen?  Should it be possible to load two
modules in parallel?  The system hasn't really been
designed as a whole.  Guile has no memory model, as such.  We have
patches over various issues, ad-hoc locks, but it's not in a state where
we can recommend that users seriously use threads.

That said, I've used threads and had good results :) And Guix uses them
in places, though they had to avoid loading modules from threads.  One of
the areas I patched over in 2.2 was ports: I just added a lock to almost
every port.  But that slowed things down especially on non-x86, and it
would be nice to find a better solution.

Anyway, one part of fixing Guile's concurrency story is to be able to
tell users, "yes, just use kernel threads".  That would be nice.  It
will take some design but it's possible I guess.

However!  Concurrency doesn't necessarily mean parallelism.  By that I
mean, it's possible to have concurrent requests to a web server, for
example, but without involving kernel threads.  But with Guile we have
no story here.  Racket has something it calls "threads", which are in
user-space; Go has its goroutines; Node has a whole story around
callbacks and the main loop; but for Guile there is not much.
Guile-GNOME does this in a way, but not very well, and not in a
nice-to-program way.  More appropriate is 8sync, a new project by Chris
Webber that is designed to be a kind of user-space threading library for
Guile.

I did give a try at prototyping such a thing a long time ago,
"ethreads".  Ethreads are user-space threads, which are really delimited
continuations with a scheduler.  If the thread -- the dynamic extent of
a program that runs within a prompt -- if the thread would block on I/O,
it suspends itself, returning to the scheduler, and then the scheduler
resumes the thread when I/O can continue.  There's an epoll loop
underneath.
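
To make the ethreads idea concrete, here is a minimal sketch in Python
rather than Guile, so it is only an analogy: Python generators stand in
for delimited continuations, and the standard `selectors` module stands
in for the epoll loop.  The names `run` and `demo` are hypothetical and
nothing here is taken from the actual ethreads code.  A task yields a
socket when a read on it would block, and the scheduler resumes the task
once that socket is readable.

```python
import selectors
import socket
from collections import deque


def run(tasks):
    """Run generator-based tasks until all of them have finished.

    A task yields a socket to mean "suspend me until this is readable",
    or None to mean "just let the other tasks run for a while".
    """
    sel = selectors.DefaultSelector()      # uses epoll where available
    ready = deque(tasks)
    waiting = 0
    while ready or waiting:
        # Run each currently ready task until it suspends or finishes.
        for _ in range(len(ready)):
            task = ready.popleft()
            try:
                wanted = next(task)
            except StopIteration:
                continue                   # task finished
            if wanted is None:
                ready.append(task)         # cooperative yield
            else:
                sel.register(wanted, selectors.EVENT_READ, task)
                waiting += 1
        if waiting:
            # Only block in the poll when nothing else is runnable.
            timeout = 0 if ready else None
            for key, _ in sel.select(timeout):
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)     # resume once I/O can continue
    sel.close()


def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    received = []

    def reader():
        while True:
            try:
                data = a.recv(1024)
            except BlockingIOError:
                yield a                    # would block: back to scheduler
                continue
            if not data:
                return                     # peer closed the connection
            received.append(data)

    def writer():
        for msg in (b"hello ", b"world"):
            b.sendall(msg)
            yield None
        b.close()

    run([reader(), writer()])
    a.close()
    return b"".join(received)
```

The point of the sketch is that the reader never blocks the whole
process; it hands control back and is resumed by the scheduler.  A real
version would also need write readiness, timeouts and error handling.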

That hack seemed to work; I even got the web server working on it, and
ran my web site on it for a while.  The problem was, though, that it
completely bypassed ports.  It made its own port types and buffers and
everything.  That's not really Guile -- that's a library.

                            *  *  *

Which brings us to the port refactor.  Ultimately I see ports as all
having buffers.  These buffers can be accessed from Scheme.  Normal I/O
goes to the buffer first.  When the buffers need filling or emptying,
Scheme code can call Scheme code to do that.  There could be Scheme
dynamic parameters defining whether filling/emptying blocks -- if it is
set not to block and a read would block, the port code could call out to
some context to suspend the thread.  Since it's all Scheme code, that
continuation can be resumed as well -- the delimited continuation does
not capture a trampoline through C.  The buffer itself is represented as
a bytevector with a couple of cursors, which gives us some basic
threadsafety without locks on the Scheme side -- Scheme always checks
that accesses are within bounds.
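
As an illustration of the buffer-plus-cursors representation, here is a
small Python sketch, again an analogy and not the layout used in the
branch.  A `bytearray` plays the role of the bytevector, and the names
`PortBuffer`, `read_byte` and `read_all` are made up for the example.
Valid data lives between the two cursors, and the buffer is refilled
through an unbuffered read function when it runs empty.

```python
import io


class PortBuffer:
    """A byte buffer with two cursors; valid data is buf[cur:end]."""

    def __init__(self, size=8):
        self.buf = bytearray(size)
        self.cur = 0    # index of the next byte to hand out
        self.end = 0    # one past the last valid byte


def read_byte(pb, read_into):
    """Return the next byte as an int, or None at end of input.

    read_into(buffer) is the unbuffered read: it stores bytes into
    buffer and returns how many it stored, 0 meaning end of input.
    """
    if pb.cur == pb.end:
        # Buffer is empty: ask the underlying store to refill it.
        pb.cur = 0
        pb.end = read_into(pb.buf)
        if pb.end == 0:
            return None
    byte = pb.buf[pb.cur]   # indexing is bounds-checked by the runtime
    pb.cur += 1
    return byte


def read_all(data, size=4):
    """Drain an in-memory source through a small PortBuffer."""
    src = io.BytesIO(data)
    pb = PortBuffer(size)
    out = bytearray()
    while True:
        byte = read_byte(pb, src.readinto)
        if byte is None:
            return bytes(out)
        out.append(byte)
```

Because every access goes through a bounds-checked index, a stale cursor
can at worst raise an error or return the wrong byte; it cannot read
outside the buffer.  That is the limited kind of safety meant above, not
full thread safety.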

But, currently in Guile 2.0 and in master, buffering is handled by the
port implementation.  That means that there is no buffer to expose to
Scheme, and no real path towards implementing efficient I/O operators
that need to grovel in a buffer from Scheme.  It also means that there's
no easy solution for non-blocking I/O, AFAIU.

The wip-ports-refactor branch is a step towards centralizing buffer
management within the generic ports code.  It thins the interface to
port implementations: the read/write functions are now assumed to
operate on unbuffered mutable stores, since Guile itself handles the
buffering.  I've documented what I can in the branch.
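
The division of labour described here can be sketched for the output
side as follows.  This is Python, not the C interface in the branch, and
`Port`, `raw_write` and `demo` are hypothetical names: the port
implementation supplies only an unbuffered write function, and the
generic code owns the buffer and decides when to flush it.

```python
class Port:
    """Generic buffered output over an unbuffered write function."""

    def __init__(self, raw_write, size=8):
        # raw_write(bytes-like) -> number of bytes accepted; this is
        # all that a port implementation has to provide in this sketch.
        self.raw_write = raw_write
        self.buf = bytearray(size)
        self.end = 0    # number of buffered bytes not yet written

    def write(self, data):
        for byte in data:
            if self.end == len(self.buf):
                self.flush()            # buffer full: empty it first
            self.buf[self.end] = byte
            self.end += 1

    def flush(self):
        start = 0
        view = memoryview(self.buf)
        while start < self.end:
            # The store may accept fewer bytes than offered.
            start += self.raw_write(view[start:self.end])
        self.end = 0


def demo():
    chunks = []

    def raw_write(view):
        chunks.append(bytes(view))      # pretend this is a write(2) call
        return len(view)

    port = Port(raw_write, size=4)
    port.write(b"hello, ports")
    port.flush()
    return chunks
```

With a four-byte buffer the twelve bytes reach the store in three
chunks, and the store never sees or manages any buffering of its own.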

The commits before the HEAD are fairly trivial I think; it's the last
one that's a doozy.  It doesn't yet remove locks; there's still a lot of
locks, and it's hard to know what we can do without locks given the
leeway given to modern C compilers.  But it's a step.

Going forward we need to define a Scheme data type for ports, and to
allow the read/write procedures to be called from Scheme, and to allow
Scheme implementations of those procedures.  We also need to figure out
how to do non-blocking I/O, both on files and non-files; should we set
all our FD's to O_NONBLOCK?  How does it affect our internal
interfaces?  I do not know yet.
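
For reference, this is roughly what O_NONBLOCK changes at the system
call level, shown with a pipe in Python on a POSIX system (the helper
names `try_read` and `demo` are made up).  A read that would otherwise
block fails with EAGAIN instead, which Python surfaces as
`BlockingIOError`.  Note that on POSIX systems the flag generally has no
effect on regular files, which is part of why the files versus non-files
question is open.

```python
import os


def try_read(fd, n=1024):
    """Return the bytes read, b"" at end of file, or None if the read
    would have blocked."""
    try:
        return os.read(fd, n)
    except BlockingIOError:
        return None


def demo():
    r, w = os.pipe()
    os.set_blocking(r, False)   # sets O_NONBLOCK on the read end
    first = try_read(r)         # nothing written yet: would block
    os.write(w, b"data")
    second = try_read(r)        # data is available now
    os.close(w)
    third = try_read(r)         # writer closed: end of file
    os.close(r)
    return first, second, third
```

Any internal interface built on this has to carry three outcomes (data,
end of file, would block) instead of two, which is one concrete way the
choice leaks into the read/write procedures.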

There's still space for different schedulers.  I wouldn't want to
include a scheduler and a thread concept in Guile 2.2.0 I don't think --
but if we can build it in such a way that it seems natural, on top of
ports, then it sounds like a good idea.

Review welcome, especially from Mark, Ludovic, and Chris.

Cheers,

Andy
