Re: 64-bit virtual adresses and registers


From: Espen Skoglund
Subject: Re: 64-bit virtual adresses and registers
Date: Tue, 7 Aug 2007 16:17:42 +0200

[Jonathan S Shapiro]
> On Mon, 2007-08-06 at 16:52 -0300, Fortes Marcelo wrote:
>> By the way (pardon my ignorance), is EROS/Coyotos IPC message
>> passing synchronous like L4 or asynchronous like Mach?

> Synchronous. We did discuss asynchronous IPC for a while, but the
> idea was much too complicated. It was eventually dropped.

I assume that you might still support some sort of asynchronous
notification mechanism?  This is very useful when one needs an
efficient and reliable signal delivery mechanism.  Some time ago I
implemented something similar to the async notification mechanism
described in the NICTA Nx APIs.  The following numbers (in cycles)
were measured on a quad-core Intel Xeon E5310 @ 1.60GHz:

                                           SMP-kern  SMP-kern  SMP-kern
                       Inter-AS  Intra-AS  Inter-AS  Intra-AS    XCPU

Send async notify:       99.13     97.15    113.67    113.66    123.13
Poll async notify:        1.55      1.55      1.55      1.55      1.55
Pingpong async notify:  543.07    233.57    618.21    315.42   2837.07
Pingpong (single MR):   523.26    207.33    453.07    208.28   8184.09

As one can see, polling is extremely cheap since it can be done
completely in user mode.  Sending a notification enters the kernel,
updates a bitmask in the destination, and checks whether the
destination needs to be woken up (in this case no wakeup).  Async
pingpong is two threads using the async notification mechanism to wake
each other up, and single MR pingpong is the standard pingpong
measurement for a single message register.
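
To make the send and poll paths concrete, here is a minimal sketch in
C11 of a bitmask-based notification word.  This is not the code that
was measured; the names notify_state_t, notify_init, notify_send and
notify_poll are hypothetical, and the real kernel entry and the actual
wakeup are left out.  It only illustrates why polling needs no system
call while sending needs a locked read-modify-write on SMP.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread notification state (illustrative only). */
typedef struct {
    _Atomic uint32_t pending;  /* one bit per notification source */
    atomic_bool      blocked;  /* true while the receiver waits in the kernel */
} notify_state_t;

static inline void notify_init(notify_state_t *s)
{
    atomic_init(&s->pending, 0);
    atomic_init(&s->blocked, false);
}

/*
 * Send path (would run inside the kernel): OR the bits into the
 * destination's bitmask, then report whether the destination has to
 * be woken up.  The atomic OR is the bus-locked instruction on SMP.
 */
static inline bool notify_send(notify_state_t *dst, uint32_t bits)
{
    assert(dst != NULL);
    atomic_fetch_or(&dst->pending, bits);
    return atomic_load(&dst->blocked);
}

/*
 * Poll path (user mode, no kernel entry): a plain load in the common
 * "nothing pending" case; only when bits are set do we pay for an
 * atomic exchange to fetch and clear them.
 */
static inline uint32_t notify_poll(notify_state_t *self)
{
    if (atomic_load_explicit(&self->pending, memory_order_relaxed) == 0)
        return 0;
    return atomic_exchange(&self->pending, 0);
}
```

A receiver would call notify_poll() in its loop and only enter the
kernel to block when it returns 0; a sender that gets true back from
notify_send() would go on to schedule the destination.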

Some observations:

  - Sending a notification (i.e., with kernel entry and checking for
    whom to schedule) takes less than 100 cycles (some more on SMP
    because we have to use bus-locking instructions and do some more
    tests).
  - Sending a notification has pretty much the same cost whether it is
    to a local or a remote CPU.  The reason for the small difference
    is that the remote thread is constantly polling for notifications,
    so the cache line has to transition from modified to shared all
    the time.
  - Async pingpong (single CPU) is typically a little more costly than
    sending a 0-byte synchronous IPC.  This is to be expected because
    there is some extra work to be done.
  - Async pingpong between CPUs is about 1/3 the cost of synchronous
    IPC.  This is due to not having to wait for the other CPU to synch
    up so that one can do a rendezvous between the threads.
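
The last two observations are just ratios of rows in the table above.
As a quick check, here is a small C sketch that recomputes them; the
constant names and the cost_ratio helper are made up for this example,
and the values are copied from the Pingpong rows.

```c
#include <assert.h>

/* Cycle counts copied from the two Pingpong rows of the table. */
static const double ASYNC_PINGPONG_INTER_AS = 543.07;
static const double SYNC_PINGPONG_INTER_AS  = 523.26;
static const double ASYNC_PINGPONG_INTRA_AS = 233.57;
static const double SYNC_PINGPONG_INTRA_AS  = 207.33;
static const double ASYNC_PINGPONG_XCPU     = 2837.07;
static const double SYNC_PINGPONG_XCPU      = 8184.09;

/* Cost of the async variant relative to the synchronous one. */
static inline double cost_ratio(double async_cycles, double sync_cycles)
{
    assert(sync_cycles > 0.0);
    return async_cycles / sync_cycles;
}
```

On a single CPU the ratio comes out slightly above 1 (roughly 1.04
inter-AS and 1.13 intra-AS), while across CPUs it is roughly 0.35,
i.e. about one third.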

I've also done some measurements on an AMD box.  These are quite a bit
faster (especially for inter-AS due to the TLB flush filter), but I
don't have the numbers at hand right now.

        eSk



