l4-hurd

Re: Comparing "copy" and "map/unmap"


From: Jonathan S. Shapiro
Subject: Re: Comparing "copy" and "map/unmap"
Date: Sun, 09 Oct 2005 14:50:00 -0400

On Sun, 2005-10-09 at 15:45 +0200, Matthieu Lemerre wrote:
> > SOME COMMENTS
> >
> > The *good* part about the COPY model is that it makes object creation
> > and capability copy cheap. The *bad* part is that selective revocation
> > requires some application-level planning. In Coyotos, the way you do
> > selective revocation is that you insert a transparent forwarding object
> > (this is a kernel-implemented object) in front of the real capability,
> > and then pass the forwarding capability to the receiver instead of the
> > real capability. Later, you can destroy the forwarding object, which
> > revokes their capability.
> 
> OK.  So you have to create a new forwarding object for each new
> client, but I assume this is a relatively cheap operation.

In EROS, here is how this works:

You hold some existing capability. You wish to give it to me, but in a
way that you can revoke. You already hold a space bank, which is a
source of storage. Even better, the client may provide the space bank,
so you can make *them* pay in advance for the right to be revoked.

You go to this space bank and allocate a Node.

You insert the real capability into this Node.

You perform an operation on the node capability which gives you a
"wrapper" capability to the same node.

You now hand the wrapper capability to me.

TO REVOKE

In order to revoke, you use the original node capability and simply
deallocate the node by returning it to the space bank. This destroys the
Node, with the side effect that the client's capability is effectively
severed.
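The grant-and-revoke sequence above can be sketched in Python. This is a toy model, not the actual EROS API: `SpaceBank`, `Node`, and `Wrapper` are illustrative stand-ins, and the "real capability" is modeled as a plain callable.

```python
class RevokedError(Exception):
    """Raised when an invocation traverses a severed wrapper."""
    pass

class SpaceBank:
    """Source of storage; whoever holds the bank pays for the Node."""
    def __init__(self, quota):
        self.quota = quota

    def allocate_node(self):
        if self.quota <= 0:
            raise MemoryError("space bank exhausted")
        self.quota -= 1
        return Node(self)

    def reclaim(self, node):
        self.quota += 1

class Node:
    """Node holding the real capability in a slot."""
    def __init__(self, bank):
        self.bank = bank
        self.slot = None
        self.live = True

    def destroy(self):
        # Returning the node to the space bank severs any wrapper over it.
        self.live = False
        self.slot = None
        self.bank.reclaim(self)

class Wrapper:
    """The capability handed to the client; forwards through the Node."""
    def __init__(self, node):
        self.node = node

    def invoke(self, *args):
        if not self.node.live:
            raise RevokedError("capability severed")
        return self.node.slot(*args)

# Grant: allocate a Node, insert the real capability, hand out a wrapper.
bank = SpaceBank(quota=4)
node = bank.allocate_node()
node.slot = lambda x: x + 1          # the "real" capability, modeled as a callable
client_cap = Wrapper(node)
assert client_cap.invoke(41) == 42   # client uses the object normally

# Revoke: deallocate the Node. The client's wrapper is now dead, and the
# storage goes back to the space bank.
node.destroy()
try:
    client_cap.invoke(0)
except RevokedError:
    pass
```

Note that revocation here never touches the client: severing happens entirely on the granter's side, which is what makes the pattern cheap per-client.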

In the special case where "you" are the server, there is a mechanism
that makes it unnecessary to actually retain the node capability in most
cases.


It is not yet clear how we will do this in Coyotos. The details will be
different, but the end mechanism should be similar.

One difference between this approach and UNMAP is that the EROS approach
interacts well with persistence, while the UNMAP approach does not
appear to (to me). My question about unmap is that I do not understand
how it scales successfully when the mapping database grows large enough
to become an out-of-memory data structure, and I have already described
in a previous note why flushing and reconstructing the mappings doesn't
work without severe restrictions on system architecture.

> What we originally planned was to have the cap server also be a
> reference counter, and thus the place where resource accounting would
> have been done.

Yes. Neal and Marcus described this. Reference counting is sometimes
unavoidable, but it is a very bad approach to resource management
because it provides direct support for resource denial of service
attacks. I will take this up in a separate thread of discussion. I do
not know of any solution here that is completely satisfactory.

>   Thus the cap server could easily allocate storage for
> its client and charge this against the amount of resources allocated by
> the client. (I guess I should use the word "principal" here instead of
> client).  Maybe I have too simplistic a view of the problem,
> though.

You assume that the cap server knows what resources the client is
entitled to use. In order for this to be true, the cap server must be in
bed with and trusted by the resource policy mechanism. Carrying this
design out in the logical way will lead to an effectively centralized
system architecture, and managing untrusted code will become
progressively more complex.

> The other possibility is that storage necessary for storing the
> capabilities can be given in what we call "memory containers" by the
> client.  But this imposes some restrictions on the cap server
> implementation (you can't create linked lists between objects in
> different containers, because they could be revoked and you would lose
> part of the list).

The design you are sketching is close to one that we use extensively. I
will describe it shortly.

> So, I think that in the general case, the steps 2. and 3. are
> problems, but they seem to be solved when the cap server is also the
> resource accounter, which is what we planned in 2.

I hope that this question is now re-opened. What you propose will work
(functionally) but it will not lead to a flexible system structure or a
scalable system design. Or at least, it has never been done successfully
in the past. There are 40 years worth of examples of systems designed
this way that support my assertion.

> > 1. Imagine trying to build robust programs in a Java system where
> >    you needed to say in advance how many total objects you planned
> >    to allocate.
> 
> This is a consequence of restricted memory in the cap server, which (I
> think) is no longer true in our very particular case with our
> work-around.

The last time I checked, neither the available virtual address space nor
the available disk space is infinite. If you have found a source of
infinite-capacity disk drives, I would be very interested in buying one.

What do you do in the capserver when you run out of virtual space? Where
does it obtain the backing pages for the memory that it *does* allocate,
and who pays for the allocation of these pages?

> Unmapping [endpoint] objects is not that harmful: upon RPC attempt, the
> client only has to be prepared for an error.

I think that you are not thinking about this hard enough. For example,
your design assumptions have unintentionally conceded that you will never
need to build a shared-memory interface. Further, you have not addressed
the problem of reconstructing these endpoint mappings. Remember that L4
revokes in two situations:

  1. When the endpoint is intended to be revoked.
  2. When memory must be reclaimed for other use.

In the second case, mapping reconstruction is required. The protocol
needed to do this is complex, and the security and robustness
consequences are somewhere between unpleasant and a complete disaster.

> Also, I don't see how capabilities with copy, which can be selectively
> (or collectively) revoked, solve this problem.  Does the client
> receive a notification upon revocation?

No. This goes back to the topic of reference counts and storage
reclamation, which I will take up separately (as I said above).

> > 3. Now imagine that the revoking thread doesn't even need to be hostile.
> >    Imagine that exit of a thread revokes any capability it holds
> >    (the analogy is that L4sec task exit reclaims its address space).
> >    Now try to answer the question: "Given that the exiting thread
> >    and the using thread are supposed to be isolated from each other,
> >    how do I design a protocol that allows the exiting thread to know
> >    when it is safe to exit?"
> 
> I assume here that you don't want to take "emulating capability copy
> by a trusted third party" into account, because this would be cheating
> a bit.

Not cheating. If you can find a way to make this work with a design
where none of the storage is global system storage (which includes any
storage dynamically allocated by the cap server), it's a perfectly
acceptable design. I think it will be slower than hell, and you still
need to resolve the problem of race conditions in the capability
exchange protocol, but it may be a functionally feasible design.

But take a second here. Go back and look at the chart on page 73 of
Liedtke's "Toward Real Microkernels" paper:

        http://www.l4ka.org/publications/1996/towards-ukernels.pdf

What the chart basically says is that with incredibly careful IPC design
it is possible to get a factor of (approximately) 18 advantage over
Mach. What the L4Linux measurements say is that this is just *barely*
good enough to compete directly with Linux.

You now seem to propose that we should give up a factor of three to four
in order to run a CapServer for the majority of capability transfers.

Further, you appear to assume that most of these transfers are only one
layer deep. As components become hierarchical, you will do many of these
transfers in the course of a single operation. Some of these will happen
inside loops, where the cost impact will be magnified.

Obviously, I think that the COPY operation really needs to be in the
kernel, but I am making my own assumptions about the nature of system
design, and these are motivating my requirements. I only suggest that
you need to look very carefully at whether COPY or REVOKABLE COPY is the
right primitive, and you should not ignore the measured performance
results while you do so.

Further: you should remember that the numbers in that chart were
measured on a 486-DX. The relative cost of IPC is at least a factor of
four higher on the Pentium 4. So let's see, 4x3 is... yes, thankfully,
we will still be almost as good as Mach if we work at it. :-)

> But the problem stays with COPY if the thread providing
> the resource wants to exit.

We need to be careful what we mean by "providing" here. Do we mean:

1. The thread that originally asked the object server to create
   the object, or
2. The object server itself?

Obviously, if the object server goes away then the object goes away. But
the thread creating the object is just part of the transport, and is no
different from any other thread in the transport.

> > 4. Once you have designed that protocol, now imagine that the thread
> >    *using* the capability is hostile. One way to be hostile would be
> >    never to admit to the allocating thread that you are done with a
> >    capability. This prevents a well-behaved allocating thread from
> >    exiting, which is a direct denial of resource attack.
> >
> Same as for 3., emulating capability copy would be a solution.

No no! This is exactly what you must NOT do. In your design, you only
have three options:

 1. The CapServer pays for all storage. This will lead directly to
    system-wide denial of resource, and I believe this is unacceptable.

 2. The original creator of the object is paying for the storage used
    by the CapServer. In this case, the hostile thread is holding
    the creator's resource hostage, and we have a *local* denial of
    resource attack.

 3. Everyone who makes a copy provides storage, in which case we
    are back to the "transport can revoke" problem.

Local denial is better than global denial, but the design goal should be
*zero* denial. We will not always achieve this, but this should be the
goal.

> What is the allocating thread?  Is it the thread that responds to calls
> on the capability?  If so, the problem is the same with COPY: if this
> thread exits, no client will have access to the capability anymore.

Let me suggest that we use the following terms:

  serving thread: the thread *implementing* the object. This thread
     is part of the object server.
  allocating thread: the thread that initially requested the allocation
     of the object, and (I am assuming) provided the storage resource
     that the object occupies.

So the real question is: "Do the bindings to the storage resource
disappear when the allocating thread disappears?" In L4sec, the answer
appears (to me) to be "yes", but it is possible that they have some
clever way to avoid this problem. In EROS/Coyotos, the answer is that
the lifespan of a storage allocator is different from the lifespan of a
thread. As long as the storage allocator survives, the object that lives
in the storage survives.
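The lifespan rule in the last paragraph can be sketched as follows. This is a toy model under my own naming assumptions (`SpaceBank` and `Thread` here are hypothetical, not EROS interfaces): the object's lifetime tracks the allocator, not the thread that requested the allocation.

```python
class SpaceBank:
    """The storage allocator; object lifetime is bound to it."""
    def __init__(self):
        self.objects = set()
        self.alive = True

    def allocate(self, obj):
        self.objects.add(obj)

    def destroy(self):
        # Destroying the bank reclaims everything allocated from it.
        self.objects.clear()
        self.alive = False

class Thread:
    def __init__(self, name):
        self.name = name
        self.running = True

    def exit(self):
        self.running = False

bank = SpaceBank()
t = Thread("allocator")
bank.allocate("obj")            # t requests the allocation, bank pays

t.exit()                        # the allocating thread goes away...
assert "obj" in bank.objects    # ...but the object survives with its bank

bank.destroy()                  # only destroying the allocator reclaims it
assert "obj" not in bank.objects
```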

We also have a (paper) design that would allow the receiver to do a
"storage swap" with the creator. The idea is to do an exchange of empty
pages for occupied pages, such that the original creator gets back some
number of unused pages and nodes equivalent to what they allocated, and
the receiver now owns the storage that the object occupies. Think of
this as similar to a "dollar exchange" (or perhaps a Euro exchange,
though personally I would prefer dollars at the moment :-).

Implementing this would be trivial. There is no difficulty with it in
principle. We simply haven't gotten to it yet.

> If we have direct access to capability copy semantics, we would still
> have to get something for the two other functionalities.

I will suggest shortly that you do not, but this depends on some design
choices, and you may not agree with my opinions about the best choices.

> * I know that deallocating resources when nobody is using them can be a
> denial of resource attack.  But the problem is minimized by these two
> facts:
> 
> -Resources are allocated on behalf of the client. So the client can
>  only exhaust its own resources.

Nonsense. Any receiver of an object created by a client can exhaust the
client's resources in a reference-counted design.

> -Even if we deallocate a resource when the reference counter reaches 0,
>  we can still force revocation.

Then why use reference counts at all?

> So the reference counter is much more a convenience, but I think we
> still need it.  What we need in fact is a notification to the server
> when nobody is using the resource (if the server is allowed to
> receive this type of notification).

> How do you do such a thing in EROS?

We don't, because it is unnecessary and foundationally insecure. Topic
for next note.

> > Finally, note that there is a race condition in any capability
> > exchange protocol. As far as I know the L4sec team has not defined an
> > efficient, race-free protocol to accomplish this.
> 
> Are you speaking about capability exchange protocol between a server
> and a client, or between 2 clients?

Doesn't matter. An exchange is an exchange is an exchange.

> I would be interested to know more about this race.  What could a
> malicious client do?

I send you a capability. During the window of time when you are trying
to exchange it, I revoke it. If I do this fast enough in the MAP/UNMAP
design, your attempt to invoke the CapServer will take a memory fault.
Note that this memory fault can occur at any place where your
application receives a capability, which includes EVERY RPC!!! Now what?

> The only race I see is if A unmaps the capability before B calls C,
> but B would be able to detect that when the IPC fails.

Yes. Definitely a strong design, when the underlying architecture
introduces a need for millions of simple interactions to add lots of
error checking code in order to do simple things correctly. REVOCABLE
COPY guarantees this. Let's do this!

I am reminded of an "Opus" cartoon, during the time when Opus is
learning to talk:

   Person: "Opus: this is Senator Kravitz. Can you say 'public servant'"
   Opus: Bozo!

> > A lot of experience in KeyKOS/EROS/Coyotos suggests that three types of
> > capability transfers account for the overwhelming majority of transfers
> > in real systems:
> >
> >   1. Transfers between an allocating server and a client, where the
> >      client is not going to be the exclusive user of the object.
> >   2. Transfers between mutually trusting components where neither
> >      is going to revoke the other.
> >   3. Transfers between a client and a server, where the server will
> >      hold the capability temporarily, but the client trusts the server
> >      to handle that capability correctly.
> >
> > All of these are cases where the operation you want is COPY, not
> > REVOCABLE COPY.

Crud. I mis-typed item (1) above. I meant that the client IS going to be
the exclusive user of the object.

> For 3., in the Hurd most often the client does not fully trust its
> server.  So it wouldn't want to give it resources that it couldn't
> revoke later.  So I think the operation we would want here is more
> REVOCABLE COPY.

Reclaiming resource is completely orthogonal to revocable copy.

        
> What is sure is that we don't want the server to be able to revoke the
> capability provided by the client.  So it would be better to provide
> at least an "unrevocable" capability.

I think that we are confusing two things here:

  REVOCABLE COPY means that the sender can revoke any capability
    that it sends.
  A revocable object is an object that honors the "destroy()"
    operation.

In a pure capability system, there is no such thing as a revocable
capability. There are only revocable objects.
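The distinction can be made concrete with a minimal sketch (`Obj` and its `destroy()` are illustrative, not any real kernel interface): in a pure capability system, every COPY designates the same object, so destruction hits all copies at once, and there is no operation that revokes one copy while leaving another live.

```python
class Destroyed(Exception):
    pass

class Obj:
    """A revocable *object*: it honors destroy()."""
    def __init__(self):
        self.alive = True

    def destroy(self):
        self.alive = False

    def ping(self):
        if not self.alive:
            raise Destroyed("object was destroyed")
        return "pong"

obj = Obj()
cap_a = obj                      # plain COPY: both names designate
cap_b = obj                      # the very same object
assert cap_b.ping() == "pong"

obj.destroy()                    # destroying the *object*...
try:
    cap_b.ping()                 # ...kills every copy at once
except Destroyed:
    pass
# There is no per-copy revoke: nothing can sever cap_b alone.
```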


> > In fact, there is a hierarchy problem in L4.x2 today in the memory
> > manager. Consider two process A, B with respective pagers A', B'. Now:
> >
> >     A' maps to A
> >     A maps to B
> >     A' revokes
> >     B' knows nothing and cannot reconstruct the mapping.
>
> I have to object here:
> 
> -First, why would B' reconstruct the mapping?  The best would be to
>  cancel the current operation...

If the revocation was intentional, the reconstruction will not succeed
and it does not matter. But if the revocation was an artifact of paging
the object out, then surely you want the object to be paged back in and
the capabilities to be reconstructable?

> -Second, are there pagers for mappable objects other than fpages?  Is
>  there a need for them?  And, as I wrote above, it's much easier to
>  recover from an unmap for these objects than for memory pages.

We think that they need to exist, because you can run out of main memory
storage for objects other than pages.

> OTOH, I agree that the fear of an unmap at any time is a problem.  I
> don't know if Neal finds his solution satisfactory, and I don't know it
> very well.

I would say rather that mixing revocation with the object paging system
is a mistake. This is a minor disagreement between Coyotos and L4sec.
The L4 group believes that paging is policy and should occur entirely
outside the kernel. We mostly agree, but we think that the kernel can
and should support the continuity of object names (capabilities) in
spite of object paging.

> > THE EMULATION FALLACY
> >
> > It is being said that either system can be emulated on top of the other.
> > This is true only in the very narrow sense that it is possible to build
> > a library API with a functional interface that can be implemented by
> > both systems. Unfortunately, it is NOT true in two important regards:
> >
> >   There is a fundamental difference in performance, which
> >   impacts the set of feasible system architectures.
> >
> >   COPY cannot be emulated on top of REVOCABLE COPY without
> >   a centralized CapServer. The CapServer must allocate storage
> >   for every capability that is created, and can therefore be
> >   subjected to denial of resource attacks.
> >
> > The second issue is critical. One of the most basic design principles in
> > EROS/Coyotos is:
> >
> >   No free rides! The party who allocates must pay!
> >
> > This is a design principle because in EVERY case where it has failed we
> > have been able to identify successful attacks on the overall system
> > design. I have not seen (and I have been unable to design) a CapServer
> > that satisfies this design principle.
> >
> 
> What is so special with the cap server?

The cap server is not special. That is the point. The cap server must
not provide free rides either.

>   The problem I have seen is
> that if the client provides revocable memory to the cap server for its
> storage, then you can't have pointers between the different memory
> regions.  But it seems to be solved if the cap server can allocate
> memory for itself and charge it back to the client (or at least,
> it can use trusted memory accounted to the client, which is supposed
> to have a limited amount of memory).
> 
> Is there any problem with this approach?

I am not sure, because I am not sure that we are thinking about this
issue from the same perspective. I suggest that we should re-open this
question after my note about storage management (to follow).

> I'll try to make a little summary of this discussion:
> 
> * There are two sets of primitive operations, MAP/UNMAP and COPY.
>   Each can be emulated by the other (with problems), so it is better
>   to name them COPY and REVOCABLE COPY.

Yes.

> * Emulating COPY requires a cap server, which is problematic for
>   performance reasons and opens a possible denial of resource attack.
>   I tried to propose solutions to avoid this denial of resource
>   attack.  There is also a race problem when copying a capability
>   between two clients.

Yes. I do not agree that you have successfully resolved the denial of
resource issues.

> * The fact that memory can be unmapped at anytime is a real problem,
>   but I think this isn't the case for other capabilities.

I believe that it is the case for other capabilities as well.

> * When the only operation is MAP, all processes in the chain of
>   transport must still be alive when the client is using the
>   capability.  The problem becomes more evident when capabilities can
>   be authenticated.

Yes, unless a capability exchange protocol is introduced, and there are
problems with this that we have now discussed.

> * The cap server was planned to have three roles: a trusted third
>   party used to reestablish capability copy semantics, a point for
>   accounting resources, and a reference counter.  If we have copy
>   semantics, how can we have the two other functionalities?

I think that you do not want them. This is the next discussion.

> * Depending on the type of capability transfer, it is better to have
>   COPY or REVOCABLE COPY.  We don't know what the majority of
>   transfers in the Hurd will be yet.

Yes.

> * Programs written with copy seem more robust, but I don't know if
>   we often encounter the problem you gave in the Hurd.

Give it time. You will! Just about the time you have enough code to make
a change of design impossibly painful! :-)

> * MAP/UNMAP leads to a hierarchical organisation, and this is a
>   security issue.

Yes.

> I'd like to add something to our discussion.  Why wouldn't we want to
> have both operations?  Revocable copy seems to be useful in the
> following cases:

Okay. Good question. This is easy, but let me do it in a separate note.

> * What is important is that upon revocation of a REVOCABLE COPY, all
>   capabilities copied from the revocably-copied capability are also
>   revoked.

Yes.
  
>   This is the case with EROS' forwarding object.

Yes.
  
>   If we extended L4 to support copy operations...

I have had this discussion with them. They have rejected this extension,
because extending the mapping database to handle this turns out to be
very complicated and requires new storage allocation in the kernel
(which is REALLY a denial of resource opportunity).

>   Thus, both operations have to be cheap, and we shouldn't privilege
>   one over the other.

And neither operation can allocate storage that nobody pays for. For
COPY this is easy. For REVOCABLE COPY, this is *very* difficult to
achieve if the operation is performed by the kernel. For example, the
MAP operation must allocate a kernel mapping database node.

> * I also have a question on the implementation of EROS' forwarding
>   objects: if we have to establish a hierarchy of REVOCABLE COPIES, we
>   thus have to create a new forwarding object each time.  How does the
>   lookup of the real object occur?  Is it constant in time, or
>   dependent on the depth of the mapping tree?

Linear in the depth of the chain of wrappers. In our experience, this
case is VERY rare, and can usually be architected around.

But it *is* an issue.

EROS imposes a constant bound on this depth in order to ensure that all
kernel operations execute in bounded time.


shap




