Persistence (was: Future Direction of GNU Hurd?)
Sat, 20 Mar 2021 19:07:35 +0100
On Tue, Mar 16, 2021 at 11:39:39AM -0700, Jonathan S. Shapiro wrote:
> On Sun, Mar 14, 2021 at 11:23 AM Olaf Buddenhagen <email@example.com> wrote:
> The Coyotos "Endpoint" object contains a *process* capability to the
> receiving process (note: *not* an entry capability!). It's a scheduler
> activation design, so the effect of message arrival is (a) mark in the
> shared page that a message is waiting and (b) if the process is
> sleeping, wake it up so that it notices. The tricky part in scheduler
> activations on a multiprocessor is that these two things can be in a
> race. Anyway, the receiving process typically holds the "master"
> capability to the endpoint, *so it is in a position to change the
> process capability*. If it does so, the recipient process changes.
> This is very similar to the notion of a receive port or receive
That's interesting: I was indeed vaguely considering the use of some
sort of "master" capabilities, instead of passing copies of actual
receiver capabilities: this is probably more in line with what is actually
needed for the persistence mechanism. (Not entirely sure -- I haven't
really thought it through...)
One of my constraints however is that my initial implementation will be
on top of legacy systems (mostly GNU/Linux; and hopefully also the
existing Mach-based Hurd) -- and while top efficiency isn't paramount on
such a non-native implementation, it needs to be at least reasonably
efficient (especially on Linux) to be useful at all... I'm not sure
whether a master capability can be implemented efficiently with existing
Having said that, it could still be interesting for a later native
implementation, as long as the differences in the underlying mechanism
can be transparent to applications...
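The retargeting idea from the quoted description can be illustrated with a
small Python toy model. This is only a sketch under loud assumptions: it is
not Coyotos code, and every name in it (Process, Endpoint, MasterCap,
retarget) is hypothetical. Senders hold only the endpoint; whoever holds the
"master" capability can swap the process behind it without the senders
noticing.

```python
# Toy model only: all class and method names here are hypothetical.

class Process:
    def __init__(self, name):
        self.name = name
        self.inbox = []  # stands in for the shared page plus the wakeup

    def deliver(self, message):
        self.inbox.append(message)


class Endpoint:
    """Holds a reference to whichever process currently receives."""

    def __init__(self, receiver):
        self._receiver = receiver

    def send(self, message):
        # Senders name the endpoint, never the process behind it.
        self._receiver.deliver(message)


class MasterCap:
    """Hypothetical 'master' capability: may change the receiver."""

    def __init__(self, endpoint):
        self._endpoint = endpoint

    def retarget(self, new_receiver):
        self._endpoint._receiver = new_receiver


old = Process("old")
new = Process("new")
ep = Endpoint(old)
master = MasterCap(ep)

ep.send("first")      # delivered to the original receiver
master.retarget(new)  # same effect as 'old' forwarding later messages
ep.send("second")     # delivered to the new receiver
```

From the sender's side nothing changes between the two send() calls, which
is the sense in which retargeting is equivalent to forwarding.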
> The reason this is OK is that the *original* recipient process could
> equally well implement this by forwarding the message to the *new*
> recipient process. That is: changing the process capability in the
> endpoint is logically equivalent to forwarding the message.
> Note that this would not be possible in the Mach reply capability
> design, because that capability cannot be forwarded. It requires an
> explicit reply capability that can be forwarded.
> If I remember correctly (hey, it's only been 38 years), Mach is even
> weirder, because a reply port is part of the *process* state rather
> than the *thread* state. A message received by one thread can be
> replied by a different thread in the same process, but cannot be
> replied by a different process. This creates a strange asymmetry.
Not sure what you mean? In the Mach variants I'm familiar with, the only
special thing about reply capabilities is that they are *typically*
implemented using send-once rights: which can only be moved, not copied.
I don't see how this would affect the ability to transparently forward
either the reply capability itself, or manually forward individual
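The "can only be moved, not copied" property of send-once rights can be
modelled in a few lines of Python. This is a minimal sketch, not Mach's
actual interface: SendOnceRight, move() and send() are hypothetical names,
and a plain list stands in for the client's reply port.

```python
import copy


class SendOnceRight:
    """Hypothetical model of a right that can be moved or used exactly once."""

    def __init__(self, reply_queue):
        self._queue = reply_queue

    def _take(self):
        # Invalidate this handle and hand back the underlying queue.
        if self._queue is None:
            raise RuntimeError("right was already moved or used")
        queue, self._queue = self._queue, None
        return queue

    def move(self):
        # Moving yields a fresh handle and kills the old one.
        return SendOnceRight(self._take())

    def send(self, message):
        # Sending consumes the right.
        self._take().append(message)

    def __copy__(self):
        raise TypeError("send-once rights cannot be copied")

    def __deepcopy__(self, memo):
        raise TypeError("send-once rights cannot be copied")


replies = []                          # stands in for the client's reply port
right_at_a = SendOnceRight(replies)   # server A receives the request
right_at_b = right_at_a.move()        # A hands the reply right on to B
right_at_b.send("reply from B")       # B answers the client directly
```

In this model, moving the right to another holder is enough for that holder
to answer the original request, while the first holder's handle becomes
unusable.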
> > That's funny: the thing that (I think) I need receiver capabilities
> > for, is actually for implementing a (not quite orthogonal)
> > persistence mechanism :-)
> Feel free to steal what we did.
I don't think there is much I can steal from either EROS or Coyotos,
when it comes to persistence... You probably don't remember this: but
the concept of transparent orthogonal persistence by persisting the
entire memory image of each process never sat well with me. Rather, I
intend to distinguish between transient memory and persistent state --
as I said, it's not orthogonal.
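As a rough illustration of that distinction (and only that: this is not the
design discussed in this thread, and Counter, save() and restore() are
hypothetical names), a service might explicitly serialize its persistent
state and rebuild everything else after a restart:

```python
import json


class Counter:
    """Hypothetical service separating persistent from transient state."""

    def __init__(self, total=0):
        self.total = total  # persistent: must survive a restart
        self.cache = {}     # transient: may be lost and rebuilt

    def add(self, n):
        self.total += n
        self.cache["last"] = n

    def save(self):
        # Only the persistent part is written out.
        return json.dumps({"total": self.total})

    @classmethod
    def restore(cls, blob):
        # Transient state starts out empty again after a restore.
        return cls(**json.loads(blob))


before = Counter()
before.add(5)
blob = before.save()
after = Counter.restore(blob)
```

Contrast this with persisting the entire memory image, where the cache would
come back along with everything else.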
> One problem with orthogonal persistence is that it doesn't
> actually simplify much in networked systems. Two processes running on
> the same machine will be restored in a mutually consistent way, but
> processes running on *different* machines will not. This tends to mean
> that communications across the machine perimeter behave very
> differently, and a lot of processes need to know about it.
> In abstract, we know how to build a multi-machine cluster that acts
> *as if* it were a single failure domain
> But even if you do this, there will still be "foreign" systems you
> need to talk to. The problem of independent failure domains isn't
> going to go away, and once you have to deal with it *anywhere* the
> incentive to expand individual failure domains is greatly reduced.
Exactly my thinking as well... I thus have no intention of attempting
any kind of persistence beyond machine boundaries -- except through
mirroring, for different purposes.
> The other problem is that you sometimes *need* to violate
> orthogonality. For example, you don't want to lose a committed banking
> transaction if the system has to restart before the next checkpoint.
> KeyKOS, EROS, and Coyotos *all* have ways to bypass the checkpoint
> rules for this kind of situation.
Is it possible to briefly explain how this bypassing works?
> So far as I know, Coyotos did not borrow from Viengoos.
I didn't think it did :-) In my understanding however, you did make some
changes to the IPC mechanism in Coyotos, after Neal and Marcus discussed
async requirements with you?... So my thinking was that you probably
discussed this until you came up with a design that felt right to both
parties -- which then ended up both in Coyotos and later in Viengoos...
That's just speculation on my part though, since I'm not familiar with
the contents of your discussions beyond those on this list. Frankly, I
don't think we need an archeological examination of this -- it was just
a thought that I found interesting :-)