Re: [avr-gcc-list] [bug] cbi optimization error for 8-bit AVRs


From: Joern Rennecke
Subject: Re: [avr-gcc-list] [bug] cbi optimization error for 8-bit AVRs
Date: Sun, 9 Nov 2014 21:00:11 +0000

On 8 November 2014 00:32, Szikra István <address@hidden> wrote:
> Hi everyone!
>
> My problem in short: I’m getting
>         in      r24, 0x18
>         ldi     r25, 0x00
>         andi    r24, 0xEF
>         out     0x18, r24
> instead of
>         cbi     0x18, 4
> .
>
> I’m trying to write efficient modern C/C++ code for multiple platforms
> including AVR 8 bit controllers.
>
> Unfortunately GCC support for AVR (among other things) is not always
> flawless. And it changes from version to version (and not always for the
> better).
> Since I’m a long time AVR developer I have a lot of compiler versions
> installed (WinAVR 2006-2010, and Atmel Studio 6.2 with GCC 4.8.1), but I
> could test my code with only one (the latest).
>
> I run into some trouble with clearing port bits not translating from C into
> cbi in assembler. It is caused by my bits template, but I do not know why.
> It seems to me like a bug in GCC. Maybe someone here can shed some light on
> the reason, or suggest a fix.

The transformation would seem prima facie best suited for the combine pass.
When we see the store, we know it's only 8 bits wide, thus we can perform the
arithmetic in 8 bits.  However, the store is to a volatile (actually, I/O)
address.
Now, gcc is not very good at optimizing code involving volatile (the
target-independent code would likely already have simplified the arithmetic
if these weren't in the way - but you'd really need a completely different
test case without volatile to keep the computation relevant).
These memory / I/O accesses require extra ordering constraints, so
large parts of the optimizers just punt when they encounter a volatile
reference.
There is some scope to tweak this - I've attached a proof-of-concept patch to
optimize your code (based on gcc 5.0).  However, this opens the possibility
that it'll break something with respect to volatile ordering now - or
even later down the line.
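
To illustrate why the narrowing itself is safe once volatile is out of the
picture, here is a minimal host-side sketch. The function names are made up
for illustration and are not part of GCC or the attached patch. It compares
the int-width computation that the C promotion rules ask for with the same
operation done in 8 bits, over every byte value and bit position:

```c
#include <assert.h>
#include <stdint.h>

/* What the C source asks for: the uint8_t operand is promoted to int,
   the AND is done at int width, and the store keeps only the low 8 bits.
   (int is 16 bits on AVR and usually 32 on a host; the argument is the same.) */
static uint8_t clear_bit_promoted(uint8_t port, unsigned bit)
{
    int wide = port;                 /* integer promotion */
    wide = wide & ~(1 << bit);       /* arithmetic at int width */
    return (uint8_t)wide;            /* 8-bit store truncates the result */
}

/* The narrowed form: mask and AND both confined to 8 bits. */
static uint8_t clear_bit_narrow(uint8_t port, unsigned bit)
{
    uint8_t mask = (uint8_t)~(1u << bit);
    return (uint8_t)(port & mask);
}

/* Check all 256 values against all 8 bit positions; returns pairs checked. */
static int check_narrowing(void)
{
    int checked = 0;
    for (unsigned bit = 0; bit < 8; ++bit) {
        for (unsigned v = 0; v < 256; ++v) {
            assert(clear_bit_promoted((uint8_t)v, bit) ==
                   clear_bit_narrow((uint8_t)v, bit));
            ++checked;
        }
    }
    return checked;
}
```

The upper bits of the int-width result never reach the 8-bit store, which is
why the ldi of the high byte in the generated code is dead weight. The
difficulty described above is not the arithmetic but the volatile accesses
around it.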

ISTR we have some more detailed dependency checks, at least in some places.
In fact, if there weren't, we should already see the existing cbi pattern
misbehaving.
Consider:
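typedef unsigned char uint8_t;

int main (void)
{
  /* Read I/O address 0x18 (data-memory address 0x18 + 0x20). */
  int i = (*(volatile uint8_t *)((0x18) + 0x20));
  /* Intervening volatile write to a different register, I/O address 0x17. */
  (*(volatile uint8_t *)((0x17) + 0x20)) = 0x0f;
  /* Write the earlier value back to 0x18 with bit 2 cleared. */
  (*(volatile uint8_t *)((0x18) + 0x20)) = i & ~4;
  return 0;
}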
If the compiler were oblivious to the intervening write, it could
generate a cbi here - but it doesn't - well, at least not at -O2 ...
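
A rough way to see what would go wrong is to model the accesses on a host.
The sketch below is hypothetical throughout: io[], trace and the function
names are illustrative only and come from neither GCC nor avr-libc. It logs
each access in order, once as the test case is written and once as it would
look if the read and the final write were fused into a single
read-modify-write, which is how a cbi is modelled here:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: io[] stands in for I/O space, and trace records
   every access in the order it happens. */
static uint8_t io[0x40];
static char trace[64];

static uint8_t io_read(uint8_t addr)
{
    strcat(trace, addr == 0x18 ? "R18 " : "R17 ");
    return io[addr];
}

static void io_write(uint8_t addr, uint8_t value)
{
    strcat(trace, addr == 0x18 ? "W18 " : "W17 ");
    io[addr] = value;
}

/* Access order as written in the test case. */
static const char *as_written(void)
{
    trace[0] = '\0';
    uint8_t i = io_read(0x18);
    io_write(0x17, 0x0f);
    io_write(0x18, (uint8_t)(i & ~4));
    return trace;
}

/* Access order if the read and the final write were fused into one
   read-modify-write: the read of 0x18 moves past the write to 0x17. */
static const char *fused_into_cbi(void)
{
    trace[0] = '\0';
    io_write(0x17, 0x0f);
    io_write(0x18, (uint8_t)(io_read(0x18) & ~4));   /* models cbi 0x18, 2 */
    return trace;
}
```

In this model as_written() logs the read of 0x18 before the write to 0x17,
while fused_into_cbi() logs it after. That reordering of volatile accesses is
what the dependency checks have to rule out.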

Another possible approach would be to use a peephole2 pattern or a
target-specific optimization pass to do the 16->8 bit arithmetic
transformation.
However, if you work after combine (as is definitely the case with peephole2),
you need to

Attachment: tmp.diff
Description: Text document

