[avr-gcc-list] Re: C vs. assembly performance

From: David Brown
Subject: [avr-gcc-list] Re: C vs. assembly performance
Date: Sun, 01 Mar 2009 01:00:25 +0100
User-agent: Thunderbird (Windows/20081209)

Georg-Johann Lay wrote:
David Brown wrote:
Nicholas Vinen wrote:

OK, I only spent a few minutes looking at old code and I found some obviously sub-optimal results. It distills down to this:

#include <avr/io.h>

int main(void) {
  unsigned long packet = 0;

  while(1) {
    if( !(PINC & _BV(PC2)) ) {
      packet = (packet<<1)|(((unsigned char)PINC>>1)&1);
    }
    PORTB = packet;
  }
}

Did you write the code like this just to test the optimiser? It

As far as I understand, it's a stripped-down example to demonstrate the code bloat in a reproducible way (compilable source).

Yes, I understand - it's just bad luck that it happens to be particularly tough code for the optimiser.

However, avr-gcc constantly surprises me in the quality of its code generation - it really is very good, and it has got steadily better through the years. Sometimes it pays to think a bit about the way your source code is structured, and maybe test out different arrangements.

Source code structure is a concern of the project, not of the compiler.
Even for braindead code that comes from a code generator, a compiler is supposed to yield good results.

That's true in theory - but embedded programmers are used to the difference between theory and practice (there's an interesting discussion about the theory and practice of "volatile" on comp.arch.embedded at the moment). In theory, the compiler should generate good code no matter how the source code is structured. In practice, the experienced programmer can do a lot to help the tools. avr-gcc *does* do a good job with most code - I do much less re-structuring of my source code for avr-gcc than I do for most other compilers (I use a lot of compilers for a lot of different targets).

I am inspecting the produced asm in some of my AVR projects with hard realtime requirements, too. But I would not encourage anyone to dig in the generated asm and try to get best code by re-arranging it or trying to find other algebraic representations. That takes a lot of time, and a compiler should care for the sources it gets, not the other way round. And if your code is intended to be cross-platform, you are stuck. If your code changes some 100 source lines away from the critical code, the inefficient code can return and you have to rewrite your code again to find another representation that avoids the bad code.

It is certainly true that you want to keep such compiler-helpful structuring to a minimum. But if you are trying to write efficient code (rather than emphasising portability or development speed or other priorities), you *must* be familiar with your compiler and the types of code it generates for particular sequences of input. You can very quickly learn some basic tricks that can make a great difference to the generated code with very little re-structuring of the source code. A prime example is to use 8-bit data rather than traditional C "int" where possible. Another case in point is to prefer explicit "if" conditionals rather than trying to calculate a conditional expression, such as was done here (if you are using a heavily pipelined processor, the opposite is true).
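As a rough illustration of those two tricks, here is a minimal sketch that compiles on a host as well as on the AVR. The function names and the DATA_BIT constant are made up for this example, and the sampled port value is passed in as a plain argument so that nothing here depends on <avr/io.h>. Whether the explicit "if" form really gives tighter code depends on the avr-gcc version and options, so treat it as something to check in the generated assembly rather than as a guarantee.

```c
#include <stdint.h>

/* Hypothetical: bit 1 of the sampled port value carries the data line. */
#define DATA_BIT 1u

/* Computed form: shift and mask the port value to extract the data bit. */
static uint32_t shift_in_computed(uint32_t packet, uint8_t pins)
{
    return (packet << 1) | (uint32_t)((pins >> DATA_BIT) & 1u);
}

/* Explicit "if" form: test the bit, then set the low bit of the result. */
static uint32_t shift_in_branch(uint32_t packet, uint8_t pins)
{
    packet <<= 1;
    if (pins & (1u << DATA_BIT)) {
        packet |= 1u;
    }
    return packet;
}

/* 8-bit index and counter instead of "int": on an 8-bit core each
   of these can live in a single register. */
static uint8_t count_set_samples(const uint8_t *samples, uint8_t n)
{
    uint8_t count = 0;
    for (uint8_t i = 0; i < n; i++) {
        if (samples[i] & (1u << DATA_BIT)) {
            count++;
        }
    }
    return count;
}
```

Both shift-in functions return the same value for every input; the only difference is how the intent is presented to the compiler.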

But I fully agree that you should not be hand-optimising all your source code and studying the generated assembly - the readability of the source code is more important than the tightness of the generated code in all but the most time-critical sections (there's no point in writing fast code if you can't be sure it's correct!).

However, in this case, I believe that my re-write is better source code, although I'm aware that's a personal preference. I think it is much clearer what the code is doing, and it is far more obvious which pins are being used - it would also be much easier for proper code (rather than this example code) in which the pins would normally have defined symbolic names rather than "magic numbers" in the code.
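To make the point about symbolic names concrete, here is a minimal sketch of the polling step with the pins named. It is an illustration under assumptions, not the rewrite referred to above: sim_pinc and sim_portb are stand-ins for the real PINC and PORTB registers so that it builds on a host, and DATA_MASK, CLOCK_MASK and poll_once are hypothetical names. It also reads the port once per pass, which differs from the original example, where PINC is read twice.

```c
#include <stdint.h>

/* Stand-ins for the real PINC and PORTB registers (assumption, so the
   sketch builds without <avr/io.h>). */
static volatile uint8_t sim_pinc;
static volatile uint8_t sim_portb;

/* Hypothetical signal names; on the AVR these would be built from PC1 and PC2. */
#define DATA_MASK  (1u << 1)
#define CLOCK_MASK (1u << 2)

static uint32_t packet;

/* One pass of the polling loop: while the clock line is low,
   shift the state of the data line into the packet. */
static void poll_once(void)
{
    uint8_t pins = sim_pinc;      /* sample the port once */

    if (!(pins & CLOCK_MASK)) {
        packet <<= 1;
        if (pins & DATA_MASK) {
            packet |= 1u;
        }
    }
    sim_portb = (uint8_t)packet;  /* only the low 8 bits reach the port */
}
```

With the masks named, changing which pin carries the clock or the data means editing one definition rather than hunting for shift counts in the expression.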
