[avr-gcc-list] Re: C vs. assembly performance

From: David Brown
Subject: [avr-gcc-list] Re: C vs. assembly performance
Date: Sat, 28 Feb 2009 21:26:15 +0100
User-agent: Thunderbird (Windows/20081209)

Nicholas Vinen wrote:
OK, I only spent a few minutes looking at old code and I found some obviously sub-optimal results. It distills down to this:

#include <avr/io.h>

int main(void) {
  unsigned long packet = 0;

  while(1) {
    if( !(PINC & _BV(PC2)) ) {
      packet = (packet<<1)|(((unsigned char)PINC>>1)&1);
    }
    PORTB = packet;
  }
}

Did you write the code like this just to test the optimiser? It certainly gives it more of a challenge than most code. It uses 32-bit data, whereas the compiler writers put more emphasis on generating good code for the far more common 8-bit and 16-bit data, and the compiler must also work around the C rules for integer promotion to generate ideal code.

Try re-writing your code like this (which I think is clearer anyway):

int main(void) {
  unsigned long packet = 0;

  while (1) {
    if (!(PINC & _BV(PC2))) {
      packet <<= 1;
      if (PINC & 0x02) {
        packet |= 0x01;
      }
    }
    PORTB = packet;
  }
}

This generates:

  77                    main:
  78                    /* prologue: frame size=0 */
  79                    /* prologue end (size=0) */
  80 0032 80E0                  ldi r24,lo8(0)   ;  packet,
  81 0034 90E0                  ldi r25,hi8(0)   ;  packet,
  82 0036 A0E0                  ldi r26,hlo8(0)  ;  packet,
  83 0038 B0E0                  ldi r27,hhi8(0)  ;  packet,
  84                    .L7:
  85 003a 9A99                  sbic 51-0x20,2   ; ,
  86 003c 00C0                  rjmp .L8         ;
  87 003e 880F                  lsl r24  ;  packet
  88 0040 991F                  rol r25  ;  packet
  89 0042 AA1F                  rol r26  ;  packet
  90 0044 BB1F                  rol r27  ;  packet
  91 0046 9999                  sbic 51-0x20,1   ; ,
  92 0048 8160                  ori r24,lo8(1)   ;  packet,
  93                    .L8:
  94 004a 88BB                  out 56-0x20,r24  ; , packet
  95 004c 00C0                  rjmp .L7         ;

You may note that this code is in fact one instruction and one cycle shorter than your hand-written assembly...

I'm not disputing the fact that avr-gcc's "optimiser" does not always generate "optimal" code. And there are certainly types of code which can be written smaller and faster in assembly than using any realistic compiler, simply because you can use techniques that are virtually impossible in C or which would require a totally different way of compiling code (using dedicated registers is a prime example).

However, avr-gcc constantly surprises me in the quality of its code generation - it really is very good, and it has got steadily better through the years. Sometimes it pays to think a bit about the way your source code is structured, and maybe test out different arrangements.
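
When trying out different arrangements, it can help to confirm on a PC that they still compute the same thing before comparing the generated assembly. What follows is only a rough sketch of that idea, not anything from the original code: the function names are made up for illustration, the `pinc` argument is a hypothetical stand-in for a read of PINC, and it assumes the pin state does not change between the two reads that each version performs on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins: 'pinc' plays the role of a PINC read. */

/* Original arrangement: shift and OR in a single expression.
   (unsigned char)pinc is promoted to int before the >> and & are applied. */
static uint32_t step_single_expression(uint32_t packet, uint8_t pinc)
{
    if (!(pinc & (1u << 2))) {
        packet = (packet << 1) | (uint32_t)(((unsigned char)pinc >> 1) & 1);
    }
    return packet;
}

/* Rewritten arrangement: shift first, then conditionally set bit 0. */
static uint32_t step_shift_then_test(uint32_t packet, uint8_t pinc)
{
    if (!(pinc & (1u << 2))) {
        packet <<= 1;
        if (pinc & 0x02) {
            packet |= 0x01;
        }
    }
    return packet;
}

/* Returns the number of inputs on which the two arrangements disagree. */
static unsigned count_mismatches(void)
{
    static const uint32_t packets[] = {
        0u, 1u, 0x80000000u, 0xFFFFFFFFu, 0x12345678u
    };
    unsigned mismatches = 0;

    for (unsigned i = 0; i < sizeof packets / sizeof packets[0]; i++) {
        for (unsigned pinc = 0; pinc < 256; pinc++) {
            if (step_single_expression(packets[i], (uint8_t)pinc) !=
                step_shift_then_test(packets[i], (uint8_t)pinc)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}

/* Aborts if the two arrangements ever disagree on the sampled inputs. */
static void check_arrangements_agree(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_arrangements_agree() from a small test program on the PC covers every possible 8-bit pin value for a handful of packet values. It says nothing about code size or speed, of course; for that you still have to look at the listing from avr-gcc.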


