l4-hurd
Re: Reliability of RPC services


From: Marcus Brinkmann
Subject: Re: Reliability of RPC services
Date: Sun, 23 Apr 2006 22:09:33 +0200

At Sun, 23 Apr 2006 21:15:08 +0200,
Tom Bachmann <address@hidden> wrote:
> 
> Marcus Brinkmann wrote:
> > At Sat, 22 Apr 2006 19:00:54 +0200,
> > Marcus Brinkmann wrote:
> > Here is, in an informal manner, one of the invariants I mean: When a
> > process is in a call, and waiting on a reply (send-once) capability,
> > from a global system perspective one can identify a process "on which"
> > the caller is waiting: Namely the process holding the reply
> > capability.  This is true of course because the reply capability can
> > only be moved around (or invalidated by generating a message on it via
> > invocation or implicitly by dropping it).
> > 
> > This sounds like a useful property to have, because now one can, in
> > principle, always find a task responsible for another task waiting on
> > a call.
> 
> I think I understand this. But what can we gain from it? I mean, in
> practice, we do not have a global view of the system.

The user has a view of the system that is global enough for this to be
useful.

For example, in the current Hurd, if a user starts a translator, and
the translator has a bug (or is hostile) that causes all calls to the
server to hang indefinitely, the user can identify the translator
process and kill it.  This then unblocks all waiters and lets them
proceed.
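
To make the bookkeeping behind this concrete, here is a toy model in
Python.  It is only a sketch: ToyKernel, DeadServer and all method
names are made up for illustration and do not correspond to any Hurd,
Mach or L4 interface.  It assumes that each blocked caller has exactly
one reply capability, that the capability has exactly one holder at a
time, and that killing a process drops every capability it holds.

```python
class DeadServer(Exception):
    """Raised in a caller whose reply capability was dropped unanswered."""


class ToyKernel:
    """Toy bookkeeping for send-once reply capabilities (hypothetical API)."""

    def __init__(self):
        self.holder = {}   # caller -> process holding its reply capability
        self.outcome = {}  # caller -> ("reply", value) or ("dropped", None)

    def call(self, caller, server):
        # The caller blocks; its single reply capability goes to the server.
        self.holder[caller] = server

    def move(self, caller, new_holder):
        # A reply capability can be moved but not copied: one holder at a time.
        self.holder[caller] = new_holder

    def reply(self, caller, value):
        # Invoking the capability consumes it and wakes the caller.
        del self.holder[caller]
        self.outcome[caller] = ("reply", value)

    def waiting_on(self, caller):
        # The invariant: every blocked caller has an identifiable holder.
        return self.holder.get(caller)

    def kill(self, process):
        # Killing a process drops every capability it holds, so each
        # affected caller is woken with an error instead of hanging.
        for caller in [c for c, h in self.holder.items() if h == process]:
            del self.holder[caller]
            self.outcome[caller] = ("dropped", None)

    def result(self, caller):
        # What the caller sees once it has been woken up.
        kind, value = self.outcome.pop(caller)
        if kind == "dropped":
            raise DeadServer(caller)
        return value


k = ToyKernel()
k.call("ls", "buggy-translator")
k.call("cat", "buggy-translator")
print(k.waiting_on("ls"))      # the process to blame: buggy-translator
k.kill("buggy-translator")     # the user kills the translator
for caller in ("ls", "cat"):
    try:
        k.result(caller)
    except DeadServer:
        print(caller, "was unblocked with an error")
```

The point of the model is waiting_on(): as long as a caller is blocked,
the lookup names some process, and killing that process is enough to
wake the caller.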

I don't know.  Maybe I just feel uneasy about programs blocking on an
event that semantically cannot occur anymore, with no process in the
system that can be identified and blamed for it.  I like call semantics.

Say you do the equivalent of "kill -9" in a system that does not have
this guarantee.  Then _every_ process that is currently in an
invocation to the killed program will be stuck indefinitely.  It seems
to me that you need _something_ in the system to compensate for that.
Otherwise you rely on all programs (that you potentially call) to be
always terminable in a friendly manner, i.e. with SIGTERM.  That seems
optimistic, given that systematic and probabilistic failures can
easily cause SIGTERM to stop working.

What is the _something_?  Garbage collection?  Watchdogs?  Timers?
Send-once capabilities?  Send-on-destroy capabilities?  Real-time
semantics (i.e., all programs can cope with timeout issues)?  Sessions
that are destroyed by whoever kills the invokee?  I am not sure.  I am
most familiar with send-once capabilities, so that is what I pursued
first.
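
For comparison, here is roughly what the "timers" option looks like
from the caller's side, again as a hedged Python sketch.  The function
call_with_timeout and the use of a queue as the reply channel are
assumptions made for illustration; only queue.Queue and its
get(timeout=...) method come from the Python standard library.

```python
import queue


def call_with_timeout(reply_queue, timeout_s):
    """Wait for a reply, giving up after timeout_s seconds.

    reply_queue stands in for whatever channel the reply would arrive
    on; it is a hypothetical placeholder, not a real IPC primitive.
    """
    try:
        return reply_queue.get(timeout=timeout_s)
    except queue.Empty:
        # The caller learns only that no reply arrived in time.  It
        # cannot tell a dead server from one that is merely slow.
        raise TimeoutError("no reply within %.2f s" % timeout_s) from None


# A reply that has already arrived is returned immediately.
answered = queue.Queue()
answered.put("pong")
print(call_with_timeout(answered, 0.1))

# A server that was killed never replies, so the caller times out.
try:
    call_with_timeout(queue.Queue(), 0.1)
except TimeoutError as err:
    print("gave up:", err)
```

With this approach every caller has to pick a timeout and handle its
expiry itself, whereas a dropped send-once capability tells the caller
definitely that no reply will ever come.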

What I do not want to overstate is the applicability to hostile
programs.  So far, my main concern here is buggy programs.  For
hostile or potentially hostile programs one needs additional
precautions: For example, if the potentially hostile program is run by
a different user, one may not have the authority to kill it.  However,
there is a range of problems here, and I feel that we have to look at
each case individually.

Thanks,
Marcus
