From: Jonathan S. Shapiro
Subject: Asynchronous messaging (was: Comparing "copy" and "map/unmap")
Date: Mon, 10 Oct 2005 14:50:09 -0400

On Mon, 2005-10-10 at 16:09 +0200, Matthieu Lemerre wrote:

> This is an example, maybe we don't need a message server in EROS (we
> planned to use this for blocking RPCs)....

There is an overwhelming amount of evidence in the literature that
blocking RPCs are something you *want*, not something you want to avoid.
However, there are rare circumstances where buffering and/or asynchrony
becomes necessary. Let me deal with buffering first.

There are exactly three reasons to buffer:

  1. To reduce the *likelihood* of blocking.
  2. To reduce intermediate storage consumption, as in gcc -pipe.
  3. When there is an exactly known, bounded amount of data to send,
     and buffering is cheaper than keeping the sender active.

The last is a very unusual case. In practice, I have only seen it used
in drivers to reduce interrupt rates on outbound I/O.

The first is nearly always a bad idea. What it *actually* accomplishes
is to obscure the conditions under which the blocking occurs and make
those conditions harder to test. It really only works when you have
reason to know that the receiver will mostly keep up (possibly with
minor variance). From a systems perspective, a better solution is to
work on receiver performance.

Note that Linux recently did a bunch of work on this -- the Linux pipe
implementation now works very hard to be zero copy, and it usually
succeeds. What this indicates is that most of that buffering was never
needed in the first place. In many cases the remaining need would be
eliminated by a thread-migrating IPC operation.

The second case is a mistake too. The assumption there is that the
receiver will read faster than the sender will send. If that is so, why
buffer at all?

Finally, have a look at the graph on page 5 of:

        http://www.eros-os.org/papers/iwooos96-ipc.ps

What this graph tells you is that once you hit the D-cache size, you are
better off context switching than buffering.

We have certainly implemented user-level buffering objects in EROS and
KeyKOS, but it was always a bad workaround for some situation where we
needed to fix the *real* problem.


Asynchronous notification is a very different situation. This arises
when you have some sort of shared buffer, the sender must not block, and
the receiver wants to get multiple messages per round if possible (in
boxcar fashion). Also, it is helpful in a situation where you are very
near the capacity of the machine and the cost of context switching for
each communication becomes a significant factor. The packet ring buffers
in the EROS ethernet subsystem are a good example.
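
To make the shape of this concrete, here is a minimal sketch of such a
ring in C: single producer, single consumer, the sender never blocks
(it drops on overrun, the way a NIC does), and the receiver drains
whatever has accumulated in one pass. This is not the EROS code --
every name here (struct ring, ring_put, ring_drain) is invented for
illustration, and I am using C11 atomics purely for brevity:

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 256            /* power of two, so head - tail works */

struct pkt {
    size_t        len;
    unsigned char data[1514];
};

struct ring {
    _Atomic size_t head;          /* next slot the sender fills   */
    _Atomic size_t tail;          /* next slot the receiver reads */
    struct pkt     slot[RING_SLOTS];
};

/* Sender side: never blocks. Returns false on overrun, the way a
 * NIC drops a frame when its ring is full. */
bool ring_put(struct ring *r, const struct pkt *p)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return false;             /* full: drop rather than block */

    r->slot[head % RING_SLOTS] = *p;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Receiver side: drains everything available in one pass, so a
 * single wakeup can carry many packets -- the "boxcar" effect. */
size_t ring_drain(struct ring *r, void (*deliver)(const struct pkt *))
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    size_t n    = 0;

    while (tail != head) {
        deliver(&r->slot[tail % RING_SLOTS]);
        tail++;
        n++;
    }
    atomic_store_explicit(&r->tail, tail, memory_order_release);
    return n;
}

The head/tail distance test is what lets the sender detect overrun
without ever waiting on the receiver.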

In this case, you need a mechanism where the sender can "post" a notify
to the receiver without performing a full context switch. At a minimum,
the notification must atomically post the action and schedule the
recipient to run if the recipient is idle. It should NOT (in my opinion)
preempt the recipient -- this was the problem with UNIX signals.
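
In the same invented terms, the kernel-side post path might look
roughly like this. The names (notify_post, sched_make_ready, struct
process) are hypothetical, not the Coyotos API, and a real kernel
would do the state test under the scheduler lock rather than with a
bare compare-and-swap:

#include <stdatomic.h>
#include <stdint.h>

enum { PROC_RUNNING, PROC_READY, PROC_IDLE };

struct process {
    _Atomic uint32_t pending;     /* mask of posted notifications */
    _Atomic int      state;       /* PROC_RUNNING/READY/IDLE      */
};

extern void sched_make_ready(struct process *p);  /* assumed scheduler hook */

void notify_post(struct process *rcpt, uint32_t bit)
{
    /* Record the notification atomically. Posting the same bit
     * twice before delivery coalesces into a single notification. */
    atomic_fetch_or(&rcpt->pending, bit);

    /* Wake an idle recipient, but never preempt a running one,
     * and never block the sender. */
    int idle = PROC_IDLE;
    if (atomic_compare_exchange_strong(&rcpt->state, &idle, PROC_READY))
        sched_make_ready(rcpt);
}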

In EROS, we did not have a mechanism to do this, and we were forced to
resort to extra threads. The ethernet subsystem that used this
workaround is described in:

        http://www.eros-os.org/papers/usenix-net-2004.ps

To summarize: the extra context switches used to achieve separation cost
us about 15% of throughput on Gbit networks.

This led to one of the architectural changes in Coyotos: a sender
holding the right type of endpoint capability can "post" a notification.
The sender is guaranteed not to block.

Think of this as a non-preemptive signal. The next time that the
receiver becomes ready to receive a message, the kernel will synthesize
a message delivering the mask of active notifications.
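
Continuing the same hypothetical sketch (it reuses struct process,
struct ring, and ring_drain from above), the receiver's side comes out
roughly like this. In the real design the kernel performs the
take-and-clear when it synthesizes the message; the effect is the
same:

#define NOTIFY_RX_RING (1u << 0)  /* hypothetical bit assignments */
#define NOTIFY_TX_DONE (1u << 1)

extern void handle_packet(const struct pkt *p);
extern void reclaim_tx_slots(void);

/* Take and clear the whole mask atomically: one synthesized message
 * can report many distinct notifications at once. */
static uint32_t notify_collect(struct process *self)
{
    return atomic_exchange(&self->pending, 0);
}

void receiver_loop(struct process *self, struct ring *rx)
{
    for (;;) {
        /* In the real design the receiver blocks in its receive
         * operation and the kernel hands it the mask as a
         * synthesized message; the blocking is elided here. */
        uint32_t mask = notify_collect(self);

        if (mask & NOTIFY_RX_RING)
            ring_drain(rx, handle_packet);
        if (mask & NOTIFY_TX_DONE)
            reclaim_tx_slots();
    }
}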

Our measurements on the Gbit net suggest that this is enough to get the
performance difference back.

At the risk of immodesty, that network design was good work. It can be
beaten, but it is *much* harder to beat than the Linux or BSD networking
stacks, and we subsequently built the necessary infrastructure to
protect it further:

        http://www.eecs.harvard.edu/~prash/papers/wbia2005.pdf


I recognize that this note will require some discussion. Let me pause
here to see what reactions and questions people have.

shap




