[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [avr-gcc-list] Linker script patch with __flashN size checking. [Was
From: |
Georg-Johann Lay |
Subject: |
Re: [avr-gcc-list] Linker script patch with __flashN size checking. [Was: Handling __flash1 and .trampolines] |
Date: |
Sun, 16 Dec 2012 15:03:38 +0100 |
User-agent: |
Thunderbird 2.0.0.24 (Windows/20100228) |
Erik Christiansen schrieb:
On 14.12.12 23:00, Georg-Johann Lay wrote:
The 64-bit ranges come from the ELPM instruction that takes the
_______^^^^^^
16-bit or 64KiB, of course
16-bits of Z-reg and concatenates RAMPZ as bits 16..23 to get a
24-bit address.
The common computer terms "paged" and "page aligned" seem to fit
perfectly here. AIUI, the Z-reg is a page register, providing base 2^16
page addresses, and RAMPZ provides an address within the page.
It is "page aligned", because any RAMPZ == 0 will always be 2^16 aligned
in flash.
The new linker script observes that constraint by starting each __flashN
aligned to 0x10000 page boundaries, i.e. on 0xN0000.
Attached you find my latest draft patch. Attached because it is easy to
mess up a patch by copying a line too much or too less. Even if that
line is empty or you miss one blank, patch will barf.
== Problems it solves ==
- .progmemN is located correctly or will raise an assertion. The
assertion will only be thrown if .progmemN is non-empty.
- Sanity checking for: .lowtext, .trampolines, .jumptables
- There is .progmemx at the end of .text where __memx data will go.
- It seems to work with, say, -Ttext=0x20000 which is not uncommon with
bootloaders.
== Caveats, new problems and problems it does not solve ==
- It is not possible to bump the location only if there is a need to do
so, i.e. if .progmemN is non-empty. This wastes memory and blows the
unconditionally blows binaries as it bumps data and text orphans to
0x30000 or higher, cf. [1].
- The .progmemN stuff is handled within .text so that there is no need
to specify -j .flashN with objcopy.
- If the case of an overflow there is just a fixed text massage without
an indication on how much the sections overflow. The user must
trial-and-error change the source until she finds a configuration that
passes all assertions, cf. [1] again.
- There is no test for progmem. I don't know how .progmem is used. If
it's common practice to use GET_FAR_ADDRESS or similar kludge with
progmem, there must be no assertion or otherwise it will break existing
code. I guess Jan does know?
- size will report insanely high values because the location is bumped
unconditionally, i.e. even for a trivial program the size will be at
least 0x30000.
- It is no more possible to (with the new flash.sx)
-Ttext=0x20000 -DBOOT=10 -Wl,--section-start=.bootloader=0x28000
if one wants to place an orphan. The linker will throw:
ld.exe: section .bootloader loaded at [00028000,00028009]
overlaps section .text loaded at [00020000,0002ffff]
and similar without -Ttext=
- Diagnostic messages are printed twice.
The script moves a comment up that says ctors / dtors is needed in the
lower 128Ki. I don't see a requirement for that. ctors goes like that:
__do_global_ctors reads stubs locations from .ctors, i.e. .ctors
contains an array of gs() stubs. __do_global_ctors handles RAMPZ
correctly while it reads from .ctors (by CALLing __tablejump_elpm__ so
that the ctor will RET to __do_global_ctors. __tablejump_elpm__
performs an EIJMP that targets a stub in .trampolines which JUMPs to the
ctor function.
Ditto for dtors.
== Conclusion ==
All in all, the current draft script is too intrusive and has too many
negative side effects on existing code.
.progmemx will work smooth and can be added without problems, same for
the more restrict .progmem pattern. avr-gcc can be extended to put
__memx into .progmemx.data instead of in .progmem.data which will both
work with the old and a new script (but with a new script it will work
better because no low precious locations are consumed).
The linker script language appears to be too weak to express what's
needed and without disturbing existing code in an inappropriate way.
I extended and attached the flash.sx torture source with the following
defines:
PX : Put data into .progmemx (prior to .progmem1)
BOOT : Put data into some "ax" orphan section (.bootloader)
CTORS : Add some constructors
Many thanks for the "stubs" explanation. I hadn't caught on to its
relationship to trampolines.
.trampolines is subset of [EIND * 2^17, (1+EIND) * 2^17)
.trampolines is subset of [EIND * 2^17, (1+EIND)] ?
No, subset of
[EIND * 2^17, (1+EIND) * 2^17)
which is the same as
[EIND * 2^17, (1+EIND) * 2^17 -1]
The ) at the right matches the [ at the left and means that the
rightmost value is omitted from the range / interval.
Most of the stuff I know (or believe to know) comes from
reverse engineering when I tried to clean up EIND usage in avr-gcc
(PR50820) and to add some notes and caveats to the docs.
Your knowledge is directly from the coalface, then - the most
hard-earned and real kind. I'll try to make the linker scripts dance, to
give us whatever you need to make that job easier.
NACK, the "real kind" of knowledge will not come from reverse
engineering, it will come from proper documentation written by the
original authors.
Reverse-engineering is very error prone and will lead to
misunderstandings and will trick your intuition and experience to the
wrong track. "real" knowledge is by design, not be digging in the dark.
file: avr6.patch
--- avr6.x-2.23 2012-12-13 13:42:18.000000000 +1100
+++ avr6.x-new 2012-12-15 21:55:31.000000000 +1100
@@ -70,13 +70,12 @@
.rel.plt : { *(.rel.plt) }
.rela.plt : { *(.rela.plt) }
/* Internal text space or external memory. */
- .text :
+ .lowtext :
Why that? I don't think we suppose people to objcopy -j .lowtext or
otherwise their binaries will miss essential parts.
{
*(.vectors)
KEEP(*(.vectors))
/* For data that needs to reside in the lower 64k of progmem. */
*(.progmem.gcc*)
- *(.progmem*)
. = ALIGN(2);
__trampolines_start = . ;
/* The jump trampolines for the 16-bit limited relocs will reside here. */
@@ -98,6 +97,28 @@
__dtors_end = . ;
KEEP(SORT(*)(.ctors))
KEEP(SORT(*)(.dtors))
+ *(.progmem.data*) /* Explicitly page 0 input sections */
+ _elowtext = . ;
+ x = ASSERT (. <= 0x10000, "Error: .lowtext (128KiB limit) overflow. Try
shrinking .progmem?") ;
Message does not fit assertion.
Anyway, I really don't know if users abuse .progmem to drop their
non-volatile data that exceeds 64Ki and use GET_FAR_ADDRESS on that.
morepgmspace.h says nothing about how to locate data and is not
supported by avr-libc, see avr-libc!/include/avr which only brings
pgmspace.h.
The GCC docs [3] promise that
"This attribute [progmem] works similar to the section attribute
but adds additional checking."
so that an additional sanity checking should be in order.
+ } > text
+
+ .flash1 0x10000 :
Again this needs objcopy -j .flash1. It's okay for me because it is a
new feature and we don't have to be compatible with old stuff (Makefiles
and alike). However, it's likely people will miss the -j .flash1 and
complain because thy don't read the docs...
+ { *(.progmem1.data*) /* Page 1 */
+ x = ASSERT (. <= 0x10000, "Error: __flash1 (64KiB limit) overflow. Need to
shrink it.") ;
+ } > text
+
+ .flash2 0x20000 :
+ { *(.progmem2.data*) /* Page 2 */
+ _eflash2 = . ;
+ x = ASSERT (_eflash2 <= 0x10000, "Error: __flash2 (64KiB limit) overflow. Need
to shrink it.") ;
+ } > text
+
+ .flash3 0x30000 :
+ { *(.progmem3.data*) /* Page 3 */
+ } > text
+
+ .hightext ALIGN(2) :
Again, need for -j .hightext breaks existing applications.
+ {
/* From this point on, we don't bother about wether the insns are
below or above the 16 bits boundary. */
*(.init0) /* Start here after reset. */
@@ -146,7 +167,8 @@
KEEP (*(.fini0))
_etext = . ;
} > text
- .data : AT (ADDR (.text) + SIZEOF (.text))
+
+ .data : AT (_etext)
Why _etext? This does not take into account orphans, or does it?
Johann
--
[1]
http://sourceware.org/ml/binutils/2012-12/msg00151.html
[2]
http://savannah.nongnu.org/patch/?6352#comment0
[3]
http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html#index-g_t_0040code_007bprogmem_007d-AVR-variable-attribute-2743
diff --git a/avr6.x b/avr6.x
index 3da58e9..57782e0 100644
--- a/avr6.x
+++ b/avr6.x
@@ -76,7 +76,8 @@ SECTIONS
KEEP(*(.vectors))
/* For data that needs to reside in the lower 64k of progmem. */
*(.progmem.gcc*)
- *(.progmem*)
+ *(.progmem)
+ *(.progmem.*)
. = ALIGN(2);
__trampolines_start = . ;
/* The jump trampolines for the 16-bit limited relocs will reside here. */
@@ -85,11 +86,17 @@ SECTIONS
__trampolines_end = . ;
/* For future tablejump instruction arrays for 3 byte pc devices.
We don't relax jump/call instructions within these sections. */
+ __jumptables_start = . ;
*(.jumptables)
*(.jumptables*)
+ __jumptables_end = . ;
/* For code that needs to reside in the lower 128k progmem. */
+ __lowtext_start = . ;
*(.lowtext)
*(.lowtext*)
+ __lowtext_end = . ;
+ /* From this point on, we don't bother about wether the insns are
+ below or above the 16 bits boundary. */
__ctors_start = . ;
*(.ctors)
__ctors_end = . ;
@@ -98,8 +105,6 @@ SECTIONS
__dtors_end = . ;
KEEP(SORT(*)(.ctors))
KEEP(SORT(*)(.dtors))
- /* From this point on, we don't bother about wether the insns are
- below or above the 16 bits boundary. */
*(.init0) /* Start here after reset. */
KEEP (*(.init0))
*(.init1)
@@ -144,6 +149,26 @@ SECTIONS
KEEP (*(.fini1))
*(.fini0) /* Infinite loop after program termination. */
KEEP (*(.fini0))
+ __progmemx_start = . ;
+ *(.progmemx)
+ *(.progmemx.*)
+ __progmemx_end = . ;
+ . = MAX (1 << 16, ABSOLUTE(.)) ;
+ __progmem1_start = . ;
+ *(.progmem1)
+ *(.progmem1.*)
+ __progmem1_end = . ;
+ . = MAX (2 << 16, ABSOLUTE(.)) ;
+ __progmem2_start = . ;
+ *(.progmem2)
+ *(.progmem2.*)
+ __progmem2_end = . ;
+ . = MAX (3 << 16, ABSOLUTE(.)) ;
+ __progmem3_start = . ;
+ *(.progmem3)
+ *(.progmem3.*)
+ __progmem3_end = . ;
+ . = ALIGN(2);
_etext = . ;
} > text
.data : AT (ADDR (.text) + SIZEOF (.text))
@@ -234,3 +259,47 @@ SECTIONS
/* DWARF Extension. */
.debug_macro 0 : { *(.debug_macro) }
}
+
+
+/* Some sanity checking: .trampolines, .lowtext and .jumptables must be
+ reachable with EIND = 0, i.e. must be located withing the first
+ 64Ki word chunk of flash. */
+
+__assert_jumptables
+ = ASSERT (__jumptables_start == __jumptables_end
+ || __jumptables_end <= 1 << 17,
+ "section overflow: .jumptables");
+
+__assert_lowtext
+ = ASSERT (__lowtext_start == __lowtext_end
+ || __lowtext_end <= 1 << 17,
+ "section overflow: .lowtext");
+
+/* Startup code from avr-libc initialized EIND with hh8(pm(.vectors))
+ thus the trampolines will work if they are located in such a way that
+ EIND = hh8(pm(.trampolines)) */
+
+__assert_stubs
+ = ASSERT (__trampolines_start == __trampolines_end
+ || (ADDR (.text) >> 17 == (__trampolines_end - 1) >> 17
+ && ADDR (.text) >> 17 == __trampolines_start >> 17),
+ "section overflow: .trampolines (linker stubs)");
+
+/* Sanity check the progmem stuff: .progmemN must be located within the
+ N-th 64Ki byte chunk. The low end needs no checking because the MAX
+ with .progmemN sets the right start address. */
+
+__assert_flash1
+ = ASSERT (__progmem1_start == __progmem1_end
+ || __progmem1_end <= 2 << 16,
+ "section overflow: .progmem1 (__flash1)");
+
+__assert_flash2
+ = ASSERT (__progmem2_start == __progmem2_end
+ || __progmem2_end <= 3 << 16,
+ "section overflow: .progmem2 (__flash2)");
+
+__assert_flash3
+ = ASSERT (__progmem3_start == __progmem3_end
+ || __progmem3_end <= 4 << 16,
+ "section overflow: .progmem3 (__flash3)");
#ifndef TEXT
#define TEXT 0
#endif
#ifndef LOWTEXT
#define LOWTEXT 0
#endif
#ifndef STUBS
#define STUBS 0
#endif
#ifndef CTORS
#define CTORS 0
#endif
#ifndef P0
#define P0 0
#endif
#ifndef P1
#define P1 0
#endif
#ifndef P2
#define P2 0
#endif
#ifndef P3
#define P3 0
#endif
#ifndef PX
#define PX 0
#endif
#ifndef BOOT
#define BOOT 0
#endif
.macro Flash n num
.section .progmem\n\().data.foo\n\(),"a",@progbits
.global flash\n\()_start
.type flash\n\()_start,@object
flash\n\()_start:
.fill \num
.size flash\n\()_start, .-flash\n\()_start
.endm
Flash , P0
Flash 1, P1
Flash 2, P2
Flash 3, P3
Flash x, PX
.text
.global main
main:
.rept TEXT / 2
nop
.endr
.rept STUBS
0:
ldi r16, lo8(gs(0b))
ldi r17, hi8(gs(0b))
.endr
.size main, .-main
.section .lowtext,"ax",@progbits
.global lowfunc
lowfunc:
.rept LOWTEXT / 2
nop
.endr
.size lowfunc, .-lowfunc
#if CTORS > 0
.global __do_global_ctors
.section .ctors,"a",@progbits
.p2align 1
.rept CTORS / 2
.word gs(main+16)
.endr
#endif /* CTORS */
#if BOOT > 0
.section .bootloader,"ax",@progbits
.global bootloader
bootloader:
.rept BOOT / 2
nop
.endr
#endif /* BOOT */