Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os

From: Dave N6NZ
Subject: Re: [avr-gcc-list] Avr-gcc Produces Incorrect Code with -Os
Date: Fri, 16 May 2008 14:12:35 -0700
User-agent: Thunderbird 1.5 (X11/20051201)

John Regehr wrote:
Well, isn't the net effect of volatile simply a more fine-grained clobbering lock?

Almost but not quite:

- volatile says nothing about the atomicity of any given access

- volatile does not suppress reordering (except with other volatiles)

- volatile has no effect on caches and out-of-order memory subsystems (not an issue for AVR obviously)
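
To make the reordering point concrete, here is a minimal sketch, assuming GCC-style inline asm as avr-gcc provides. The names COMPILER_BARRIER, buffer, ready, publish and consume_sum are made up for illustration and are not from this thread. The flag is volatile but the payload is not, so the compiler is free to move the payload accesses across the flag access unless something like a "memory" clobber stops it.

```c
#include <stdint.h>

/* GCC-style compiler barrier (hypothetical macro name). It emits no
   instructions; the "memory" clobber tells the compiler that any memory
   may have been read or written, so it may not cache or move ordinary
   memory accesses across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static uint8_t buffer[4];       /* plain, non-volatile payload */
static volatile uint8_t ready;  /* volatile flag, e.g. shared with an ISR */

/* Fill the payload, then raise the flag. */
void publish(const uint8_t *src)
{
    for (uint8_t i = 0; i < 4; i++)
        buffer[i] = src[i];

    /* Without the barrier the stores to buffer[] could legally be sunk
       below the volatile store, because buffer[] is not volatile. */
    COMPILER_BARRIER();
    ready = 1;
}

/* Return the sum of the payload if the flag is set, otherwise 0. */
uint8_t consume_sum(void)
{
    if (!ready)
        return 0;

    /* Keep the loads from buffer[] after the volatile load of the flag. */
    COMPILER_BARRIER();

    uint8_t sum = 0;
    for (uint8_t i = 0; i < 4; i++)
        sum += buffer[i];

    ready = 0;
    return sum;
}
```

This only constrains the compiler. It does nothing about atomicity: a single-byte flag is read or written in one access on AVR, but a 16-bit volatile variable takes two byte accesses, and an interrupt can land between them.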

Well, true, but lack of cache coherency is a hardware bug. But I agree it throws in some interesting ordering issues w.r.t. the volatile keyword. In any case, if the CPU writes its local data cache, that should invalidate all other copies. (Well, all other copies in data caches. Instruction caches often require explicit invalidates.)

(I spent quite a few years as a CPU logic designer, starting about 1980. Pretty much every machine I worked on was out-of-order to some extent, and had caches of some flavor.)

Also volatile is usually too fine-grained, ensuring consistency-always instead of what you want (consistency on lock release) and this can easily lead to inefficiencies.

True in the case of trying to make volatile into a critical section. Not true for its original use as a way to talk to PDP-11 memory mapped I/O.

That's a counter-intuitive result. The "No idea why." part makes me a little squinty-eyed. It certainly *could* be a generalizable result, but then again it might be an artifact of your code structure.


My off-the-wall guess was that the clobbers reduced register pressure. I could not think of an easy way to test that hypothesis.

Forcing early spills helps? Does that say that the optimizer should be more aggressive about spilling? I should probably get quiet about now since I'm wandering outside my expertise talking about register allocators.

Finally I'll just add a random plug for a piece of work that a colleague and I recently completed where we found that most compilers have problems implementing the volatile qualifier:


Interesting. I read over section 2. Will have to go back and read the whole thing.


OT shaggy dog story about caches: In the 1980s I remember reading the weekly highlights from another CPU project in our same design center. During checkout they had discovered a performance bug in the cache invalidate equation which required a single OR-gate to fix, and boosted the performance of an important database benchmark by 9% on 4-CPU systems. One senior logic designer on our project, who was a bit of a wag, said: "9% from just one OR-gate! We need to get some of those OR-gates for *our* project!"
