
Why kernel REVOCABLE COPY is difficult


From: Jonathan S. Shapiro
Subject: Why kernel REVOCABLE COPY is difficult
Date: Sun, 09 Oct 2005 15:37:02 -0400

[This topic is fairly simple, so I will deal with it first.]

On Sun, 2005-10-09 at 15:45 +0200, Matthieu Lemerre wrote:

> So we want to have both operations available: the servers would COPY
> their capabilities to clients, and clients would REVOCABLE COPY their
> capabilities to servers.

Clients do not have to do this. A capability can safely be transferred
to an untrusted server as long as the capability itself does not permit
revocation.

What I want to focus on here is your implied suggestion that a kernel
should implement *both* COPY and REVOCABLE COPY. I would like to explain
why this turns out to be very complicated.


The in-kernel COPY implementation is very simple. Bits are copied from
one memory location to another. Depending on the capability
implementation, a linked list may be updated. The important issue is
that no storage is allocated by a COPY operation. This lets the
operation be very fast. In EROS, the limiting factor in capability copy
is the need to update some linked lists. In Coyotos this will be
eliminated, and capability copy will be a simple bit copy within the
kernel.
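
To make the contrast concrete, here is a minimal sketch of what COPY
amounts to inside the kernel. The structure layout and the names are
invented for illustration; they are not the actual EROS or Coyotos
definitions.

    #include <stddef.h>
    #include <stdint.h>

    struct object;

    /* Hypothetical capability: a word of type and permission bits plus
     * a pointer to the object it names.  In EROS-like designs every
     * capability naming an object is also threaded onto a per-object
     * list so that the kernel can find all of them. */
    typedef struct capability {
      uint32_t           type_and_perms;
      struct object     *target;
      struct capability *next_on_object;
      struct capability *prev_on_object;
    } capability;

    struct object {
      struct capability *cap_chain;  /* all capabilities naming this object */
    };

    /* COPY allocates nothing.  The copy itself is a bit copy into an
     * empty slot; the only real cost is relinking the per-object chain,
     * which is the part Coyotos eliminates. */
    void cap_copy(capability *dst, const capability *src)
    {
      *dst = *src;                   /* dst is assumed to be a void slot */

      if (dst->target != NULL) {
        dst->prev_on_object = NULL;
        dst->next_on_object = dst->target->cap_chain;
        if (dst->target->cap_chain != NULL)
          dst->target->cap_chain->prev_on_object = dst;
        dst->target->cap_chain = dst;
      }
    }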

The difficulty with REVOCABLE COPY is the need for book-keeping. In
order to perform the revocation later, we must store a record somewhere
that lets us find and revoke the copied capabilities. In L4, this
record is the mapping database node, and it is allocated by the kernel.
In EROS, it is the Node, and it is allocated by user-mode code.
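
Continuing the same invented sketch: REVOCABLE COPY differs in exactly
one way, namely that something must be allocated and retained so that
the revocation can be found and performed later. The record below
stands in, very loosely, for an L4 mapping database node or an EROS
Node.

    struct pool;                                    /* some storage pool */
    void *pool_alloc(struct pool *p, size_t size);  /* hypothetical */

    /* Book-keeping that must outlive the copy.  Revocation later walks
     * this record and everything derived beneath it. */
    typedef struct revoke_record {
      capability           *derived;       /* the copy that was handed out */
      struct revoke_record *parent;        /* what it was derived from */
      struct revoke_record *first_child;   /* copies derived from this one */
      struct revoke_record *next_sibling;
    } revoke_record;

    int cap_revocable_copy(struct pool *storage, capability *dst,
                           const capability *src, revoke_record *parent)
    {
      revoke_record *r = pool_alloc(storage, sizeof *r);
      if (r == NULL)
        return -1;          /* no storage, so the operation cannot proceed */

      cap_copy(dst, src);   /* the copy itself is as cheap as before */

      r->derived      = dst;
      r->parent       = parent;
      r->first_child  = NULL;
      r->next_sibling = parent ? parent->first_child : NULL;
      if (parent != NULL)
        parent->first_child = r;
      return 0;
    }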

The essential issue is: who pays for the storage? If the storage
allocated is kernel storage, then after some number of REVOCABLE COPY
operations we will exhaust this storage, and even privileged
applications will no longer be able to perform them.

To solve a similar problem in EROS, we once considered introducing
quotas on thread creation. We concluded that (a) this was possible, and
(b) the resulting system was too complicated to use in practice. The
fundamental problem is that the allocator cannot tell when the using
threads are done with the resource and cannot rely on the using threads
to acknowledge when they are done (either because of failure or
hostility; it doesn't matter). From a practical design perspective, the
effect is that the allocator's resource becomes hostage to client
behavior. This is true even if -- in theory -- the server has the
necessary tools to revoke. The problem is that the server cannot revoke
without breaking contract.

Adding reference counts does not solve this. It provides reliable notice
when all clients have disappeared, but it does so at significant cost.
For an in-memory system the cost is acceptable. For a persistent system
(like EROS) it is a serious problem. Ultimately, however, a client that
is willing to spend some resource (the storage for an inactive thread)
can continue to make the REVOCABLE COPY overhead structure unreclaimable
(because there is an outstanding reference), and the allocator remains
hostage to the client.
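
To see why, suppose the record in the sketch above carried a reference
count (again, invented code):

    /* Reference-counted variant of the record (invented).  The count
     * tracks outstanding derived capabilities, and the kernel drops it
     * when one of them is overwritten or destroyed. */
    typedef struct revoke_record_rc {
      unsigned refs;
      /* ...revocation book-keeping as before... */
    } revoke_record_rc;

    void record_free(revoke_record_rc *r);   /* return storage to its pool */

    /* This gives reliable notice when the last holder disappears.  But
     * a client can park one derived capability in storage it is willing
     * to pay for (an inactive thread, say) and never let the count
     * reach zero, so the allocator's storage stays pinned anyway. */
    void record_deref(revoke_record_rc *r)
    {
      if (--r->refs == 0)
        record_free(r);
    }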

In EROS/Coyotos, we do not consider any of these outcomes acceptable. We
have a foundational, common-sense design rule:

  The party who pays must always be permitted to deallocate. If
  this results in a violation of behavior contract, then either
  the contract is mis-designed or the wrong party is paying.

In practice, the problem usually turns out to be that the wrong party is
paying. This is why we have first-class storage allocators (space banks)
that a client can pass to a server.
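
For readers who have not seen them: a space bank is, roughly, a
capability-invocable storage allocator. The rendering below is
schematic and the names are invented -- the real interface differs in
detail -- but the property that matters is visible: the client pays for
the storage, and the client can always take it back by destroying the
bank.

    #include <stdbool.h>

    typedef struct cap *cap_t;   /* opaque capability handle */

    /* Client side: carve a sub-bank out of the client's own storage and
     * pass it to the server along with the request. */
    cap_t bank_create_subbank(cap_t parent_bank);

    /* Server side: whatever per-client book-keeping the server needs is
     * allocated from the client's bank, so the server's own storage is
     * never held hostage to the client. */
    cap_t bank_alloc_node(cap_t bank);
    cap_t bank_alloc_page(cap_t bank);

    /* Client side, later: destroying the sub-bank reclaims everything
     * allocated from it, whether or not the server cooperates.  The
     * party who paid is the party permitted to deallocate; the server
     * must be written to survive this. */
    void bank_destroy(cap_t bank, bool destroy_objects);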


The main problem with the EROS wrapper design is performance: it
requires a number of IPCs. Fortunately, it appears to be a very rare
design pattern in our system.
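
For concreteness, here is roughly where those IPCs come from. This
glosses over the real EROS wrapper machinery; take it only as a
schematic of the interposition pattern, with invented names. The
granting party pays for a small forwarder, hands out a capability to
the forwarder instead of to the real object, and revokes by destroying
the forwarder. The price is an extra hop on every invocation.

    /* Hypothetical IPC primitives, declared only for this sketch. */
    typedef struct cap *cap_t;   /* opaque handle, as in the sketch above */
    typedef struct message {
      int opcode;                /* plus arguments, reply capabilities, ... */
    } message;

    message ipc_receive(cap_t endpoint);
    message ipc_call(cap_t target, message m);
    void    ipc_reply(cap_t endpoint, message m);

    /* The forwarder lives on storage the granting party paid for.  Each
     * invocation of the revocable capability becomes holder ->
     * forwarder -> real object -> back, i.e. extra IPCs on every call.
     * Revocation is simply destroying the forwarder (or its storage),
     * after which the holder's capability no longer leads anywhere. */
    void forwarder_loop(cap_t my_endpoint, cap_t real_object)
    {
      for (;;) {
        message req   = ipc_receive(my_endpoint);
        message reply = ipc_call(real_object, req);
        ipc_reply(my_endpoint, reply);
      }
    }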

Both L4sec and Coyotos have *considered* a design where there would be
in-kernel heaps that were backed by an out-of-kernel storage allocator.
This would allow many cases of these allocations to be implemented
directly by the kernel but still be charged to the user. I am not sure
if the L4sec design today retains this idea.

In Coyotos, we still think that this is a good idea for performance, but
we are not convinced that we can formally verify a kernel that uses this
idea. The problem is that *any* heap implementation violates the type
system, and this makes things difficult for provers. We have decided for
now to continue to rely on a wrapper-like design, mainly because there
does not appear to be a compelling performance motivation for change.
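
To be clear about what "violates the type system" means here, consider
a perfectly ordinary heap (generic C, not Coyotos code). The allocator
hands back untyped bytes, and the caller imposes a type on them by
cast; at that point a prover loses the connection between the
allocation and the object's type, and must reason about raw memory
rather than about well-typed kernel objects.

    #include <stddef.h>

    static unsigned char heap_space[4096];
    static size_t heap_used;

    /* A trivial bump allocator (alignment ignored for brevity).  Note
     * the return type: the heap hands back plain bytes. */
    static void *heap_alloc(size_t n)
    {
      if (n > sizeof heap_space - heap_used)
        return NULL;
      void *p = &heap_space[heap_used];
      heap_used += n;
      return p;
    }

    struct mapping_node { struct mapping_node *next; int rights; };
    struct endpoint     { int badge; };

    void demo(void)
    {
      /* The same bytes could equally well have become either object;
       * nothing in the types records which.  This is the step that is
       * hard to carry through a prover. */
      struct mapping_node *m = (struct mapping_node *)heap_alloc(sizeof *m);
      struct endpoint     *e = (struct endpoint *)heap_alloc(sizeof *e);
      (void)m;
      (void)e;
    }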

However: this means that we have deferred the issue, not abandoned it. I
believe that the right model is to think of these heaps as in-kernel,
caching front-ends to space banks. This is conceptually straightforward,
and it would drop into the existing architecture fairly easily. At the
moment, however, I want to stick to what we absolutely know how to do so
that we can get a kernel working quickly.

The one thing that makes me put these in-kernel heaps on the "deferred"
list rather than the "forget" list is my concern that hardware context
switch performance is getting steadily worse, and that microkernel
designs may have to compromise purity to remain competitive. This is
disturbing, but it may be inevitable.

shap




