[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: On hierarchy

From: Jonathan S. Shapiro
Subject: Re: On hierarchy
Date: Sun, 23 Oct 2005 13:12:58 -0400

On Fri, 2005-10-21 at 23:02 +0200, Udo A. Steinberg wrote:
> The following web page attempts to shed some light on the issue
> http://www.linuxdevices.com/articles/AT5136316996.html
> and the arguments by Bill Weinberg make sense to me. But I'm by no means an
> English linguist, so maybe you can provide some insight why you think it
> should be "preemptable".

Because the word is ultimately a reconstruction of the phrase "able to
be preempted". However, English is very inconsistent about whether the
'a' is changed to an 'i' when this rearrangement is done. If you try a
Google search, you will find MANY hits under both spellings. The only
reasonable conclusion is that both spellings are actively in use.

> JSS> Yes. I was aware of this. There are two cases in the API that I do not
> JSS> understand how to implement correctly if operations are preemptable. Can
> JSS> you explain them:
> JSS> 
> JSS> 1. Two threads exist in a task. One performs an unmap from a location
> JSS> while the other (simultaneously) executes a loop performing a map. Is
> JSS> the unmap guaranteed to terminate in any predictable amount of time?
> The thread invoking unmap will lock down the affected (sub)tree of the page
> frame in question and the thread invoking map can no longer add new nodes
> in that subtree. It can, however, add new nodes upstream, i.e., in a
> different subtree (from the root node's POV).

Yes. Given your later comments below, this is the answer that I
expected. Thank you.

> JSS> 2. One thread performs a long unmap that is preempted after making
> JSS> partial progress. Because it is preempted, the scheduler permits other
> JSS> threads to run. In particular, a debugger thread performs a "stop"
> JSS> operation on the thread that is performing the UNMAP. I do not see any
> JSS> outcome in which this is possible within the requirements I have stated
> JSS> without leaving the system overall in an inconsistent state.
> The thread with a preempted unmap is inside the kernel at the time of the
> "stop" operation. It can run the unmap to completion but it must not return
> to user mode in order to maintain the "stop" semantics for the user.

This is the answer that I expected. We now have a system architecture in
which real-time guarantees appear to be impossible to assure in the
presence of debuggers. Note that variable timeouts also start to fail in
this case.

I am not sure how big a problem this is in practice, and it certainly
isn't a problem with debuggers per se. The bigger issue is that the
debugger implements its behavior using certain authorities, and any
process that holds these authorities can similarly halt the process. In
order to state guarantees about process behavior, we must be in a
position to know that this authority is never held by an untrusted third

Also, note that you have just halted the system in an uncheckpointable
state unless you are prepared to checkpoint kernel state -- which is
extremely difficult if you then want to upgrade the kernel. You also
cannot terminate this operation if something goes wrong, because in that
situation you are unable to go forward (because something is wrong) and
unable to go backward (because undo information has not been recorded).

> JSS> Unfortunately, none of these mechanisms work in the general case -- a
> JSS> discussion we should take up separately. The problem is that there is no
> JSS> way in general to know which thread to escalate if the IPC mechanism is
> JSS> general-purpose. Both the L4 and EROS IPC mechanisms are general
> JSS> purpose.
> I don't understand your concern here. Both with locks and with IPC you can
> have one thread depend on another. Both issues can employ some sort of
> helping or time/priority donation to prevent priority inversion.

Here is the problem, but we are getting way off topic.

Two threads High Priority Thread and Low Priority Thread (resp. HPT,
LPT) are calling a common service. LPT calls first, and the service
grabs a lock. HPT now calls the service and needs the lock.

Question: which thread to escalate in order to get the lock released?
Answer: whichever thread will ultimately cause the lock to be released.

Question: which thread is that?
Answer: not necessarily LPT. In fact, if calls and returns are
interleaved by the service it could be any thread in the system at all;
even one that has not been created yet.

> JSS> Setting this aside, long-held locks also create opportunities for kernel
> JSS> deadlock that we would prefer to avoid, and they make analysis of kernel
> JSS> correctness about two orders of magnitude harder. I don't know if formal
> JSS> analysis is an active objective for L4.sec, but for Coyotos this is
> JSS> definitely a concern.
> Short-held locks will deadlock the same way long-held ones do, if you don't
> preclude deadlocks by careful system design.

The atomicity contract guarantees that this is incorrect. First, by
"short held" I mean "locks that are only held for the duration of a
single, atomic system call". Within such a system, no deadlock is
possible. All locks must be grabbed before the commit point, and the
process that would deadlock simply backs out and starts over.

> JSS> If I understand you, this is the case where the unmap is a total unmap:
> JSS> the PTE's are going to invalid. I had in mind a different algorithm, but
> JSS> I agree that yours works. Also, I *think* that your solution resolves my
> JSS> "map during unmap" question.
> JSS> 
> JSS> The case that I do not currently understand how to do is the *partial*
> JSS> unmap -- e.g. taking away write permission while leaving read
> JSS> permission. For this case, I think you will need to add some form of
> JSS> index to keep track of progress. Is there a way around this?
> It's late and I fail to see how downgrading rights from read/write to
> read-only is any different from downgrading rights from present to
> non-present, except that we don't deallocate the affected nodes.
> Can you provide an example that illustrates your point?

I think so.

The case where you deallocate the nodes is much simpler. A node that has
been deallocated obviously does not need to be visited again.

However, if you are merely downgrading the nodes, you may have an
arbitrarily large number of nodes to visit. Let is imagine that there
exist 100,000 nodes to visit, and that we have set our arbitrary
constant bound at 100 steps. The algorithm must now proceed by doing 100
nodes at a time, and it therefore needs some way to keep track of where
it is in the tree (how much progress it has made).

We have already discussed that a debugger can make this operation take a
long time. Now consider:

   A initiates unmap
   B halts A (using debugging interface)
   C destroys A

Note that C is now blocked by B, who may not be a collaborator. This
violates the design rule that the party paying for storage should always
be able to promptly reclaim it.

Also note that there is no good resolution of this situation. Either we
cancel the operation mid-way, leaving the system in an inconsistent
state, or we stall C, violating our design rule and leading to denial of

Or at least I do not *see* a good resolution. Perhaps you have one.

> JSS> Do the performance measurements that have been published for the map
> JSS> operation include this locking, or are they measuring an incorrect
> JSS> implementation?
> Which performance numbers are you referring to?

You indicated that the UNMAP operation could be made restartable by
marking the mapping nodes "busy" in some way. This introduces checks in
both the map and the unmap code. Are these checks included in the
published performance numbers?


reply via email to

[Prev in Thread] Current Thread [Next in Thread]