Re: Objective-C and Smalltalk; speed of message send

From: Alexander Malmberg
Subject: Re: Objective-C and Smalltalk; speed of message send
Date: Tue, 10 Aug 2004 13:15:44 +0200

Well, to add some real numbers to this ;), try the attached program.
ix86 only, although there should be equivalents to rdtsc on other
platforms (it returns the raw clock cycle count from the processor).

Compile with -O2, with -fomit-frame-pointer to reduce overhead in the
called dummies, and with -fPIC, since that's what we use for all GNUstep
code (and, amazingly, it gives faster code here).

My values are for a PII, using a gcc 3.5 snapshot with my optimized
message lookup patch. Jeff Teunissen measured with an unpatched gcc on
an Athlon XP system:

                                            My           Jeff's
                             Loop overhead:  6 cycles    ?, likely 6
                    Normal c function call:  6 cycles    6 cycles
             C call, two args (self, _cmd):  8 cycles    7 cycles
Indirect call, two args (aka. IMP caching):  9 cycles    7 cycles
                              Message send: 24 cycles   37 cycles

This is the GNU runtime (of course :). Receiver and message are constant,
although that's irrelevant since the GNU runtime does lookups in
constant time (modulo the necessary stuff from memory being in the
cache).

Thus, C calls are essentially free, IMP caching costs 1-3 cycles, and a
message send costs ~31 cycles normally, ~18 cycles with my patch.
Excluding loop overhead, on a 1GHz system, I'd expect around 30 million
message sends/second with normal gcc, ~60 million/second with my patch.

All this is assuming that the callee doesn't do anything. Without
-fomit-frame-pointer, you get an extra 2-3 cycles of frame setup in all
methods, and e.g. the common case in -characterAtIndex: for 8-bit strings
(in range, character is an ASCII character) is ~14 cycles.

As a final note, recent oprofile data from Matt Rice puts
objc_msg_lookup at 15%-20% of execution time in messaging heavy -gui
code. In other words, if message lookup was free, our programs would be
~20% faster. Significant, yes, but 20% really isn't that much.

Thus, I maintain that message sending is really quite cheap. :) Except
for performance critical code, I'd rather take a 20% performance hit
than uglify my code with IMP caching, and I wouldn't shy away from
messaging heavy code. :)

- Alexander Malmberg
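
#include <stdio.h>
#include <objc/Object.h>

/* Read the processor's raw cycle counter (ix86 only). */
static inline unsigned long long int llclock(void)
{
        unsigned long long int a;
        asm volatile ("rdtsc" : "=&A" (a));
        return a;
}

/* Empty callees; presumably weak so the calls cannot be inlined away. */
void foo(void) __attribute__ ((weak));
void foo(void)
{
}

void foo2(id foo, SEL s) __attribute__ ((weak));
void foo2(id foo, SEL s)
{
}

int i=1000;

void (*foo2_id)(id,SEL)=foo2;

@implementation Object (foo)
-(void) foo
{
}
@end

int main(int argc, char **argv)
{
        unsigned long long int t1,t2;
        id self=[Object alloc];
        SEL cmd=@selector(foo);
        void (*foo3)(id,SEL)=foo2_id;

        /* The timing calls are missing from the archived copy; placing
           them around the loop like this is a reconstruction. */
        t1=llclock();
        while (i--)
        {
//              foo(); /* ~6 cycles/call */
//              foo2(self,cmd); /* ~8 cycles/call */
//              foo3(self,cmd); /* ~9 cycles/call */
                [self foo]; /* ~24 cycles/call */
        }
        t2=llclock()-t1;

        /* Total for all 1000 iterations; divide by 1000 for cycles/call. */
        printf("%llu clock cycles\n",t2);

        return 0;
}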
