[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: On hierarchy
Re: On hierarchy
Fri, 21 Oct 2005 20:01:29 +0200
Mozilla Thunderbird 1.0.6 (X11/20050813)
Jonathan S. Shapiro wrote:
Espen has asked me to explain why I am unhappy about hierarchy. This has
been a very useful question. It has forced me to examine things more
carefully, and this is always useful. It took some thinking.
Summary: 1. Hierarchy is important and useful, but there are ways to
go wrong with it.
2. I believe (and I will explain why) that the L4 architecture
includes operations that cannot meet real-time requirements.
My concern here is not that there may be an imperfect
implementation. My concern is that I do not see how
to preserve the current specification of UNMAP within the
requirements of real-time implementations.
3. I would be VERY happy to be wrong about L4, and *if* I am
wrong, it is most likely because of:
a) a difference of opinions about requirements, or
b) some feasible implementation of the L4 UNMAP
operation that has not occurred to me. None of
the current L4 implementations that I have examined
satisfy the requirements that I will identify
4. EROS has *also* gone wrong (in a different place), but
in both of the cases I can think of it is an issue of
implementation rather than architecture. A third case
that existed in KeyKOS was removed in EROS. That one
*was* an architectural issue.
I assume that we want a microkernel that is capable of supporting
real-time activity, and that we can secure with confidence. This creates
1. System calls are only preemptable between well-defined "units of
2. Because they are non-preemptable, units of operation must complete in
a small *constant* number of steps.
2a) in practice, because the sizes of memories remain relatively
small, an operation that is log(n) where n is the number of
in-memory objects, may be considered to satisfy this
2b) nested loops that have the effect of expanding the steps in a
unit of operation by a multiplicative factor must be avoided.
=> In my *opinion*, the L4 string bound is too large to satisfy
this requirement, but this is definitely a matter of opinion
and preference. In effect, I want a smaller constant bound.
3. For any microkernel operation, the portion of the operation that
performs semantically observable mutations to system state must
=> This is the most likely place that the L4 designers will
have a different view.
=> The EROS/Coyotos team believes this is essential in order to
reason successfully about security and time bounds.
=> Cache invalidation, where the state invalidated is PURELY
a cache, can be usually broken down into units of operation
that satisfy Requirement (2). These invalidations can be
viewed as occurring *before* the mutating operation, and
it MAY safe to implement them incrementally, but NOT
4. Locks may not be held across preemptions. This is necessary to avoid
Okay. Back to hierarchy:
In a *balanced* tree, several operations satisfy these requirements:
1. Traversals upwards toward the root can always be done
in log(n) time.
2. Lookup of a single item can be done in log(n) time.
3. Insert and delete operations can be done in log(n) time.
4. Subtree delete can be divided into atomic units that
satisfy the requirement, PROVIDED the deletes are not
semantically observable. That is: only when the tree
is purely a cache.
5. Visiting all members of a subtree can NOT be done within
the time bound requirement.
Based on my current understanding of the mapping database -- which is
incomplete -- I have three concerns about the design:
1. The hierarchy is unbalanced. In consequence, no time bounds on
traversals less than O(n) can be stated, and these do not satisfy the
This is not entirely correct, IMHO. The hierarchy _may_ be unbalanced.
The point is, that the system affects (or better: defines) the
hierarchy. This means, if the system relies on a balanced hierarchy, it
can create one.
2. The UNMAP operation requires an "all children" visit. The question
is: can this operation be divided successfully into constant-bounded
units of operation? The answer appears to me to be "no". There are two
versions of UNMAP to consider:
2a) UNMAP invalidating all entries. This version CAN be divided
into constant-bounded atomic units of operation, because
entries can be removed from the mapping database as they
=> This relies on the assertion that the mapping database
is purely a cache. I do not think this is entirely correct.
More importantly, Kevin Elphinstone (who knows the design
much better than I do) has expressed doubts about this as
I'm don't know the design very good, too, but the mdb is purely a cache,
AFAIK. In 2 papers I read (,) is mentioned that the mdb state
doesn't have to be exported, as it can be reconstructed.
2b) UNMAP that *downgrades* all entries -- e.g. removing write
permission while leaving read permission intact.
I do not see any way to subdivide this into atomic units
of operation. The difficulties are:
2a1) you need some sort of index that keeps track
of your position in the traversal, and
2a2) your unmap is interleaved with other operations
that may grow the subtree unless long-held locks
are introduced. In consequence, there is no
guarantee that the index makes monotonic forward
progress. In consequence, the operation may never
So: my concern is not with hierarchy, but with L4's use of *unbalanced*
hierarchy and L4's architectural exposure of operations that do not
appear to lend themselves to implementation by atomic units.
Ultimately, this is the reason that Coyotos and L4.sec did not merge.
My previously stated concerns about the difficulty of establishing
software contracts when REVOCABLE COPY is the default primitive are
unrelated to these issues.
EROS has related problems in two cases:
1. Destroying an object potentially requires revoking O(n) capabilities
by traversing a linked list.
2. The "merge with parent" variation of space bank destruction requires
an all-child visit.
Case (2) is obscure, and requires more explanation of the data
structures than I have time to give right now. Bottom line: it's purely
an implementation problem, and it isn't a kernel issue at all.
Case (1) is also an implementation issue. In abstract, it is possible to
eliminate the linked list, allowing the capabilities to be invalidated
in unit time. This modification was discussed extensively on the EROS
list at one point. The impediment is that EROS resume capabilities get
in the way. One of the reasons I like the move to endpoints is that
resume capabilities are going away and many things will get simplified
as a result.
So: those are my concerns about hierarchy, such as they are. If there
are algorithmic ways to avoid the concerns I have stated, that would be
wonderful. If there are differing opinions about the requirements, then
either we will converge or we will at least understand why we do not.
This would also be wonderful.
 User-level Checkpointing Through Exportable Kernel State
 User-level Management of Kernel Memory
Re: On hierarchy, Espen Skoglund, 2005/10/21
Re: On hierarchy, ness, 2005/10/21
Re: On hierarchy, Jonathan S. Shapiro, 2005/10/23
- Re: On hierarchy,