l4-hurd

Re: Questions


From: Espen Skoglund
Subject: Re: Questions
Date: Mon, 27 Oct 2003 14:40:27 +0100

[Martin Schaffner]
> I have two questions:
> 1) Performance:

> On 06/04/2001, "sjames" wrote on Slashdot
> (http://slashdot.org/comments.pl?sid=11531&cid=310196):

>> The biggest problem for microkernels is that they have to switch
>> contexts far more frequently than a monokernel (in general).
>> 
>> For a simple example, consider a user app making a single system
>> call.  In a monokernel, the call is made, the process transitions
>> from the user ring to the kernel ring (ring 3 to ring 0 on x86),
>> the kernel copies any data and parameters (other than what would
>> fit into registers) from userspace, handles the call, possibly
>> copies results back to userspace, and transitions back to ring 3
>> (returns).
>> 
>> In a microkernel, the app makes a call, switches to ring 0, copies
>> data, changes contexts, copies the data to the daemon, and
>> transitions to ring 3 in the server daemon's context; the server
>> daemon handles the call, transitions to ring 0, the data is copied
>> to kernelspace, contexts are switched back to the user process, the
>> results are copied into user space, and it transitions back to
>> ring 3 (return).
>> 

In short, the guy is saying that in a microkernel the single
user-to-kernel switch for syscalls is translated into an RPC from
user-task to kernel-task (I don't understand why everyone has to get
all tangled up in this ring-terminology).  And yes, this will incur
some overhead.  You should take the data copying part of his argument
with a grain of salt, though, since a) there might be no arguments
that need copying (e.g., they may reside in registers), and b)
arguments may be copied directly from user-task to kernel-task without
using a temporary kernel buffer.
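
To make the syscall-as-RPC point concrete, here is a minimal sketch of
what a libc-side read() stub could look like on top of a generic IPC
primitive.  The names (ipc_call, msg_t, FS_SERVER_TID) are hypothetical
placeholders, not the actual L4 or Hurd API; the point is only that
small arguments can travel in message "registers" and that the caller's
buffer can be handed to the kernel as the receive area, so the reply
data lands there directly without an intermediate kernel buffer.

    /* Hypothetical IPC primitives -- stand-ins, not the real L4/Hurd API. */
    #include <stddef.h>

    typedef unsigned long word_t;

    typedef struct msg {
        word_t  words[8];      /* small "register" arguments          */
        void   *buf;           /* receive buffer for bulk string data */
        size_t  buf_len;
    } msg_t;

    /* Assumed to trap into the kernel, which transfers the message
     * directly from sender to receiver (no bounce buffer) and blocks
     * until the reply arrives. */
    extern long ipc_call(int server_tid, const msg_t *in, msg_t *out);

    #define FS_SERVER_TID 42   /* hypothetical thread id of the fs server */
    #define OP_READ        1

    /* A read() stub: the opcode, fd, and count travel in message words,
     * and the caller's buffer is registered as the receive area. */
    long my_read(int fd, void *buf, size_t count)
    {
        msg_t in  = { .words = { OP_READ, (word_t)fd, (word_t)count } };
        msg_t out = { .buf = buf, .buf_len = count };

        return ipc_call(FS_SERVER_TID, &in, &out);  /* bytes read or -errno */
    }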

> The app has previously acquired a capability for the file it wants to
> read, and allocated a buffer. When it calls "read", the glibc
> function makes an RPC directly to the filesystem translator (no
> going to ring 0, assuming the translator is owned by the same user
> as the calling process),

The RPC will still need to enter kernel level unless the translator
resides in the same address space as the application.

> which in turn RPCs the driver of the backing store (will probably
> reside in ring 0) for the data.

Except that the driver will probably *not* reside in kernel-land.

> Assuming zero-copy, there is a minimum of data copying, but there
> are still four context switches, which can't be done with super-fast
> IPCs, since they concern three different processes (app, translator,
> driver).

I'm by no means a Hurd expert, but I suspect that the intermediate IPC
(i.e., the one to the translator) can probably be circumvented for
common-case operations (e.g., read/write).
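
One way such circumvention could work for reads is to map the file into
the client once and then satisfy subsequent reads from memory, so the
translator is only involved when pages are faulted in.  The sketch
below uses plain POSIX mmap() purely as an illustration of that idea;
whether a Hurd-on-L4 libc would do this internally, and through which
interface, is my assumption, not something stated above.

    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Read up to `count` bytes at `offset` from `path` without issuing
     * a read RPC per call: map the file once, then copy from memory.
     * Page faults, not explicit RPCs, bring the data in. */
    ssize_t read_via_map(const char *path, void *buf, size_t count,
                         off_t offset)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;

        struct stat st;
        if (fstat(fd, &st) < 0 || offset >= st.st_size) {
            close(fd);
            return -1;
        }

        void *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);                      /* the mapping outlives the fd */
        if (map == MAP_FAILED)
            return -1;

        size_t n = st.st_size - offset;
        if (n > count)
            n = count;
        memcpy(buf, (const char *)map + offset, n);

        munmap(map, st.st_size);
        return (ssize_t)n;
    }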

> Is it likely that l4/hurd will be slower than linux, for things like
> filesystem operations?

The major overhead of filesystem operations tends to lie in the
hardware itself.  The overhead of IPC will of course still be present.
However, considering a 1 GHz processor, an IPC time of 200 cycles, and
a syscall every millisecond, the pure overhead of the syscall RPC (a
call/reply pair, i.e., 400 cycles or 400 ns) amounts to about 0.04%.
Other factors, such as cache working sets and TLB flush operations due
to untagged TLBs, will then have a higher impact.
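
Spelling the arithmetic out (the one-syscall-per-millisecond period is
my assumption; it is what makes the 0.04% figure come out with a
200-cycle IPC and a call/reply pair per RPC):

    #include <stdio.h>

    int main(void)
    {
        const double clock_hz       = 1e9;    /* 1 GHz processor             */
        const double cycles_per_ipc = 200.0;  /* one-way IPC cost            */
        const double ipcs_per_rpc   = 2.0;    /* call + reply                */
        const double syscall_period = 1e-3;   /* one syscall per millisecond */

        double rpc_seconds = ipcs_per_rpc * cycles_per_ipc / clock_hz; /* 400 ns */
        double overhead    = rpc_seconds / syscall_period;

        printf("IPC overhead: %.2f%%\n", overhead * 100.0);  /* prints 0.04% */
        return 0;
    }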

        eSk





