avr-libc-commit
[Top][All Lists]

## [avr-libc-commit] [2158] Submitted by Jan Waclawek:

 From: Joerg Wunsch Subject: [avr-libc-commit] [2158] Submitted by Jan Waclawek: Date: Thu, 10 Jun 2010 15:48:28 +0000

Revision: 2158
http://svn.sv.gnu.org/viewvc/?view=rev&root=avr-libc&revision=2158
Author:   joerg_wunsch
Date:     2010-06-10 15:48:28 +0000 (Thu, 10 Jun 2010)
Log Message:
-----------
Submitted by Jan Waclawek:
Document issues around compiler optimizations and code reordering
* doc/api/optimize.dox: New file.
* include/avr/cpufunc.h (_MemoryBarrier): xref to optim_code_reorder
* include/avr/interrupt.h: (Ditto.)
* include/util/atomic.h: (Ditto.)

Modified Paths:
--------------
trunk/avr-libc/ChangeLog
trunk/avr-libc/doc/api/doxygen.config.in
trunk/avr-libc/include/avr/cpufunc.h
trunk/avr-libc/include/avr/interrupt.h
trunk/avr-libc/include/util/atomic.h

-----------
trunk/avr-libc/doc/api/optimize.dox

Modified: trunk/avr-libc/ChangeLog
===================================================================
--- trunk/avr-libc/ChangeLog    2010-06-10 15:48:07 UTC (rev 2157)
+++ trunk/avr-libc/ChangeLog    2010-06-10 15:48:28 UTC (rev 2158)
@@ -1,5 +1,19 @@

+       Submitted by Jan Waclawek:
+       Document issues around compiler optimizations and code reordering
+       * doc/api/optimize.dox: New file.
+       * include/avr/cpufunc.h (_MemoryBarrier): xref to optim_code_reorder
+       * include/avr/interrupt.h: (Ditto.)
+       * include/util/atomic.h: (Ditto.)
+
+
+       * doc/api/rel-method.dox: Adapt to our new release numbering scheme.
+
+
* tests/simulate/avr/eeprom-1.c: Remove the old AT90S2313 simulator
bug workaround, as the workaround doesn't work anymore now while the
original code does.

Modified: trunk/avr-libc/doc/api/doxygen.config.in
===================================================================
--- trunk/avr-libc/doc/api/doxygen.config.in    2010-06-10 15:48:07 UTC (rev
2157)
+++ trunk/avr-libc/doc/api/doxygen.config.in    2010-06-10 15:48:28 UTC (rev
2158)
@@ -596,6 +596,7 @@
@top_srcdir@/doc/api/faq.dox \
@top_srcdir@/doc/api/tools-install.dox \
@top_srcdir@/doc/api/using-tools.dox \
+                         @top_srcdir@/doc/api/optimize.dox \
@top_srcdir@/doc/api/using-avrprog.dox \
@top_srcdir@/doc/api/rel-method.dox \
@top_srcdir@/doc/api/acknowledge.dox \

===================================================================
--- trunk/avr-libc/doc/api/optimize.dox                         (rev 0)
+++ trunk/avr-libc/doc/api/optimize.dox 2010-06-10 15:48:28 UTC (rev 2158)
@@ -0,0 +1,137 @@
+/* Copyright (c) 2010 Jan Waclawek
+   Copyright (c) 2010 Joerg Wunsch
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are met:
+
+   * Redistributions of source code must retain the above copyright
+     notice, this list of conditions and the following disclaimer.
+   * Redistributions in binary form must reproduce the above copyright
+     notice, this list of conditions and the following disclaimer in
+     the documentation and/or other materials provided with the
+     distribution.
+
+  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+  POSSIBILITY OF SUCH DAMAGE. */
+
+/* $Id$ */
+
+/** \page optimization Compiler optimization
+
+\section optim_code_reorder Problems with reordering code
+\author Jan Waclawek
+
+Programs contain sequences of statements, and a naive compiler would
+execute them exactly in the order as they are written. But an
+optimizing compiler is free to \e reorder the statements - or even
+parts of them - if the resulting "net effect" is the same. The
+"measure" of the "net effect" is what the standard calls "side
+effects", and is accomplished exclusively through accesses (reads and
+writes) to variables qualified as \c volatile. So, as long as all
+volatile reads and writes are to the same addresses and in the same
+order (and writes write the same values), the program is correct,
+regardless of other operations in it. (One important point to note
+here is, that time duration between consecutive volatile accesses is
+not considered at all.)
+
+Unfortunately, there are also operations which are not covered by
+volatile accesses. An example of this in avr-gcc/avr-libc are the
+cli() and sei() macros defined in <avr/interrupt.h>, which convert
+directly to the respective assembler mnemonics through the __asm__()
+statement. These don't constitute a variable access at all, not even
+volatile, so the compiler is free to move them around. Although there
+is a "volatile" qualifier which can be attached to the __asm__()
+statement, its effect on (re)ordering is not clear from the
+documentation (and is more likely only to prevent complete removal by
+the optimiser), as it (among other) states:
+
+<em>Note that even a volatile asm instruction can be moved
+relative to other code, including across jump instructions. [...]
+Similarly, you can't expect a sequence of volatile asm instructions to
+remain perfectly consecutive.</em>
+
+\sa http://gcc.gnu.org/onlinedocs/gcc-4.3.4/gcc/Extended-Asm.html
+
+There is another mechanism which can be used to achieve something
+similar: <em>memory barriers</em>. This is accomplished through adding a
+special "memory" clobber to the inline \c asm statement, and ensures that
+all variables are flushed from registers to memory before the
+statement, and then re-read after the statement. The purpose of memory
+barriers is slightly different than to enforce code ordering: it is
+supposed to ensure that there are no variables "cached" in registers,
+so that it is safe to change the content of registers e.g. when
+switching context in a multitasking OS (on "big" processors with
+out-of-order execution they also imply usage of special instructions
+which force the processor into "in-order" state (this is not the case
+of AVRs)).
+
+However, memory barrier works well in ensuring that all volatile
+accesses before and after the barrier occur in the given order with
+respect to the barrier. However, it does not ensure the compiler
+moving non-volatile-related statements across the barrier. Peter
+Dannegger provided a nice example of this effect:
+
+\code
+#define cli() __asm volatile( "cli" ::: "memory" )
+#define sei() __asm volatile( "sei" ::: "memory" )
+
+unsigned int ivar;
+
+void test2( unsigned int val )
+{
+  val = 65535U / val;
+
+  cli();
+
+  ivar = val;
+
+  sei();
+}
+\endcode
+
+compiles with optimisations switched on (-Os) to
+
+\verbatim
+00000112 <test2>:
+ 112:  bc 01           movw    r22, r24
+ 114:  f8 94           cli
+ 116:  8f ef           ldi     r24, 0xFF       ; 255
+ 118:  9f ef           ldi     r25, 0xFF       ; 255
+ 11a:  0e 94 96 00     call    0x12c   ; 0x12c <__udivmodhi4>
+ 11e:  70 93 01 02     sts     0x0201, r23
+ 122:  60 93 00 02     sts     0x0200, r22
+ 126:  78 94           sei
+ 128:  08 95           ret
+\endverbatim
+
+where the potentially slow multiplication is moved across cli(),
+resulting in interrupts to be disabled longer than intended. Note,
+that the volatile access occurs in order with respect to cli() or
+sei(); so the "net effect" required by the standard is achieved as
+intended, it is "only" the timing which is off. However, for most of
+embedded applications, timing is an important, sometimes critical
+factor.
+
+\sa https://www.mikrocontroller.net/topic/65923
+
+Unfortunately, at the moment, in avr-gcc (nor in the C standard),
+there is no mechanism to enforce complete match of written and
+executed code ordering - except maybe of switching the optimization
+completely off (-O0), or writing all the critical code in assembly.
+
+To sum it up:
+
+\li memory barriers ensure proper ordering of volatile accesses
+\li memory barriers don't ensure statements with no volatile accesses to be
reordered across the barrier
+
+*/

Property changes on: trunk/avr-libc/doc/api/optimize.dox
___________________________________________________________________
+ text/plain
+ Author Id Date
+ native

Modified: trunk/avr-libc/include/avr/cpufunc.h
===================================================================
--- trunk/avr-libc/include/avr/cpufunc.h        2010-06-10 15:48:07 UTC (rev
2157)
+++ trunk/avr-libc/include/avr/cpufunc.h        2010-06-10 15:48:28 UTC (rev
2158)
@@ -71,6 +71,9 @@
registers beyond the barrier.  This can sometimes be more effective
than blocking certain optimizations by declaring some object with a
\c volatile qualifier.
+
+   See \ref optim_code_reorder for things to be taken into account
+   with respect to compiler optimizations.
*/
#define _MemoryBarrier()
#else  /* real code */

Modified: trunk/avr-libc/include/avr/interrupt.h
===================================================================
--- trunk/avr-libc/include/avr/interrupt.h      2010-06-10 15:48:07 UTC (rev
2157)
+++ trunk/avr-libc/include/avr/interrupt.h      2010-06-10 15:48:28 UTC (rev
2158)
@@ -51,7 +51,16 @@
/** \name Global manipulation of the interrupt flag

The global interrupt flag is maintained in the I bit of the status
-    register (SREG).
+    register (SREG).
+
+    Handling interrupts frequently requires attention regarding atomic
+    access to objects that could be altered by code running within an
+    interrupt context, see <util/atomic.h>.
+
+    Frequently, interrupts are being disabled for periods of time in
+    order to perform certain operations without being disturbed; see
+    \ref optim_code_reorder for things to be taken into account with
+    respect to compiler optimizations.
*/

#if defined(__DOXYGEN__)

Modified: trunk/avr-libc/include/util/atomic.h
===================================================================
--- trunk/avr-libc/include/util/atomic.h        2010-06-10 15:48:07 UTC (rev
2157)
+++ trunk/avr-libc/include/util/atomic.h        2010-06-10 15:48:28 UTC (rev
2158)
@@ -184,6 +184,8 @@
entering the ATOMIC_BLOCK, it should be executed with the
parameter ATOMIC_RESTORESTATE rather than ATOMIC_FORCEON.

+    See \ref optim_code_reorder for things to be taken into account
+    with respect to compiler optimizations.
*/

/** \def ATOMIC_BLOCK(type)