[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: On hierarchy

From: ness
Subject: Re: On hierarchy
Date: Fri, 21 Oct 2005 20:01:29 +0200
User-agent: Mozilla Thunderbird 1.0.6 (X11/20050813)

Jonathan S. Shapiro wrote:
Espen has asked me to explain why I am unhappy about hierarchy. This has
been a very useful question. It has forced me to examine things more
carefully, and this is always useful. It took some thinking.

Summary: 1. Hierarchy is important and useful, but there are ways to
            go wrong with it.

         2. I believe (and I will explain why) that the L4 architecture
            includes operations that cannot meet real-time requirements.

            My concern here is not that there may be an imperfect
            implementation. My concern is that I do not see how
            to preserve the current specification of UNMAP within the
            requirements of real-time implementations.

         3. I would be VERY happy to be wrong about L4, and *if* I am
            wrong, it is most likely because of:
              a) a difference of opinions about requirements, or

              b) some feasible implementation of the L4 UNMAP
                 operation that has not occurred to me. None of
                 the current L4 implementations that I have examined
                 satisfy the requirements that I will identify

         4. EROS has *also* gone wrong (in a different place), but
            in both of the cases I can think of it is an issue of
            implementation rather than architecture. A third case
            that existed in KeyKOS was removed in EROS. That one
            *was* an architectural issue.


I assume that we want a microkernel that is capable of supporting
real-time activity, and that we can secure with confidence. This creates
several requirements:

1. System calls are only preemptable between well-defined "units of

2. Because they are non-preemptable, units of operation must complete in
   a small *constant* number of steps.

  2a) in practice, because the sizes of memories remain relatively
      small, an operation that is log(n) where n is the number of
      in-memory objects, may be considered to satisfy this

  2b) nested loops that have the effect of expanding the steps in a
      unit of operation by a multiplicative factor must be avoided.

  => In my *opinion*, the L4 string bound is too large to satisfy
     this requirement, but this is definitely a matter of opinion
     and preference. In effect, I want a smaller constant bound.

3. For any microkernel operation, the portion of the operation that
   performs semantically observable mutations to system state must
   be atomic.

   => This is the most likely place that the L4 designers will
      have a different view.

   => The EROS/Coyotos team believes this is essential in order to
      reason successfully about security and time bounds.

   => Cache invalidation, where the state invalidated is PURELY
      a cache, can be usually broken down into units of operation
      that satisfy Requirement (2). These invalidations can be
      viewed as occurring *before* the mutating operation, and
      it MAY safe to implement them incrementally, but NOT

4. Locks may not be held across preemptions. This is necessary to avoid
priority inversion.

Okay. Back to hierarchy:

In a *balanced* tree, several operations satisfy these requirements:

   1. Traversals upwards toward the root can always be done
      in log(n) time.

   2. Lookup of a single item can be done in log(n) time.

   3. Insert and delete operations can be done in log(n) time.

   4. Subtree delete can be divided into atomic units that
      satisfy the requirement, PROVIDED the deletes are not
      semantically observable. That is: only when the tree
      is purely a cache.


   5. Visiting all members of a subtree can NOT be done within
      the time bound requirement.

Based on my current understanding of the mapping database -- which is
incomplete -- I have three concerns about the design:

1. The hierarchy is unbalanced. In consequence, no time bounds on
traversals less than O(n) can be stated, and these do not satisfy the
stated requirements.

This is not entirely correct, IMHO. The hierarchy _may_ be unbalanced.
The point is, that the system affects (or better: defines) the
hierarchy. This means, if the system relies on a balanced hierarchy, it
can create one.

2. The UNMAP operation requires an "all children" visit. The question
is: can this operation be divided successfully into constant-bounded
units of operation? The answer appears to me to be "no". There are two
versions of UNMAP to consider:

   2a) UNMAP invalidating all entries. This version CAN be divided
       into constant-bounded atomic units of operation, because
       entries can be removed from the mapping database as they
       are invalidated.

       => This relies on the assertion that the mapping database
          is purely a cache. I do not think this is entirely correct.
          More importantly, Kevin Elphinstone (who knows the design
          much better than I do) has expressed doubts about this as

I'm don't know the design very good, too, but the mdb is purely a cache,
AFAIK. In 2 papers I read ([1],[2]) is mentioned that the mdb state
doesn't have to be exported, as it can be reconstructed.

   2b) UNMAP that *downgrades* all entries -- e.g. removing write
       permission while leaving read permission intact.

       I do not see any way to subdivide this into atomic units
       of operation. The difficulties are:

         2a1) you need some sort of index that keeps track
              of your position in the traversal, and

         2a2) your unmap is interleaved with other operations
              that may grow the subtree unless long-held locks
              are introduced. In consequence, there is no
              guarantee that the index makes monotonic forward
              progress. In consequence, the operation may never

So: my concern is not with hierarchy, but with L4's use of *unbalanced*
hierarchy and L4's architectural exposure of operations that do not
appear to lend themselves to implementation by atomic units.

Ultimately, this is the reason that Coyotos and L4.sec did not merge.

My previously stated concerns about the difficulty of establishing
software contracts when REVOCABLE COPY is the default primitive are
unrelated to these issues.

EROS has related problems in two cases:

1. Destroying an object potentially requires revoking O(n) capabilities
by traversing a linked list.

2. The "merge with parent" variation of space bank destruction requires
an all-child visit.

Case (2) is obscure, and requires more explanation of the data
structures than I have time to give right now. Bottom line: it's purely
an implementation problem, and it isn't a kernel issue at all.

Case (1) is also an implementation issue. In abstract, it is possible to
eliminate the linked list, allowing the capabilities to be invalidated
in unit time. This modification was discussed extensively on the EROS
list at one point. The impediment is that EROS resume capabilities get
in the way. One of the reasons I like the move to endpoints is that
resume capabilities are going away and many things will get simplified
as a result.

So: those are my concerns about hierarchy, such as they are. If there
are algorithmic ways to avoid the concerns I have stated, that would be
wonderful. If there are differing opinions about the requirements, then
either we will converge or we will at least understand why we do not.
This would also be wonderful.


[1] User-level Checkpointing Through Exportable Kernel State
[2] User-level Management of Kernel Memory


reply via email to

[Prev in Thread] Current Thread [Next in Thread]