Re: [avr-gcc-list] Handling __flash1 and .trampolines [Was: .trampolines location]
Fri, 14 Dec 2012 23:00:22 +0100
Erik Christiansen wrote:
The goal now seems to be to let .text grow contiguously across pages, if
not obstructed by a __flashN page occupied by .progmemN.data. That means
that any occupied __flashN must be higher than _etext, to avoid overlap.
(And aligned to the 0x10000 page boundary.)
Alignment is neither required nor enough. The 2nd byte (the one
loaded into RAMPZ) must match. See the "subset" condition in the
IIUC, that constraint is met cooperatively. From your first example, I
understand that gcc loads RAMPZ with N for .progmemN.data. So long as
the new linker script sets the VMA for .progmemN.data to 0xN0000, they
match, do they not?
The "subset" constraint seems to say that the page must only span from
0xN0000 to 0xNffff, which I'd already assumed. I'm beginning to suspect
that there's a hidden meaning.
It's the common meaning of subset as you already know from maths:
.progmemN.data must be a subset of [N * 2^16, (N+1) * 2^16)
Notice that the empty set is a subset of any other set.
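To make the subset condition concrete, here is a minimal sketch of the check in C. The function name progmem_n_fits and its parameters are made up for illustration; nothing like it exists in avr-gcc or binutils. It tests whether the byte range [start, start + size) lies inside [N * 2^16, (N+1) * 2^16).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an avr-gcc or binutils API): true if the byte
   range [start, start + size) is a subset of [n * 2^16, (n + 1) * 2^16). */
bool progmem_n_fits(uint32_t start, uint32_t size, unsigned n)
{
    if (size == 0)
        return true;                    /* the empty set is a subset of any set */

    uint32_t lo = (uint32_t)n << 16;    /* n * 2^16, inclusive lower bound */
    uint32_t hi = lo + 0x10000u;        /* (n + 1) * 2^16, exclusive upper bound */

    /* Compare size against the remaining room so start + size cannot overflow. */
    return start >= lo && start < hi && size <= hi - start;
}
```

With this sketch, progmem_n_fits(0x10100, 0x20, 1) holds although the start is not aligned to 0x10000, while progmem_n_fits(0x20000, 4, 1) fails although the start is aligned. That is the "neither required nor enough" point about alignment.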
Even if it is not possible (with reasonable effort) to support a
feature, the linker should complain as much as possible if some
assertion like ".lowtext must be in the first 17-bit page" is violated.
That sounds like a new one. OK, a 17-bit "page" is two 16-bit physical
Not sure if "page" is helpful wording; "address range" fits better. But
we can use "page" if there is no reason for confusion.
The 64 KiB ranges come from the ELPM instruction, which takes the 16 bits
of the Z register and concatenates RAMPZ as bits 16..23 to get a 24-bit
address. That way, the Atmel engineers did not need to extend the AVR core;
it was
sufficient to draw 8 lines from RAMPZ (just an SFR, not a core register)
to the address bus.
When the compiler generates code to read from address space __flash1, it
loads '1' into RAMPZ. This '1' is literally loaded, i.e. the assembler
sees a '1' and not some modifier like hlo8(var) that is not known before
link time and fixed up by the linker / locater.
This means the compiler assumes that var is located in such a way that
all of its bytes are elements of [0x10000, 0x20000). The compiler puts
var into section .progmem1.data to express this.
If var is not located appropriately, the compiler still loads RAMPZ with
1 and loads Z with the lower 16 bits of &var, but if var is at 0x23456
the compiler will access 0x13456 instead and read garbage.
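The address arithmetic can be sketched in C as follows. This only models the address formation described above (RAMPZ as bits 16..23, Z as bits 0..15); the function names elpm_address and flash_n_read_address are invented for this illustration.

```c
#include <stdint.h>

/* Byte address read by ELPM: RAMPZ supplies bits 16..23, Z bits 0..15. */
uint32_t elpm_address(uint8_t rampz, uint16_t z)
{
    return ((uint32_t)rampz << 16) | z;
}

/* Address actually read for an object in __flashN that the linker placed at
   byte address addr: RAMPZ is the literal n, Z is the low 16 bits of addr. */
uint32_t flash_n_read_address(unsigned n, uint32_t addr)
{
    return elpm_address((uint8_t)n, (uint16_t)(addr & 0xffffu));
}
```

For a __flash1 object wrongly located at 0x23456, flash_n_read_address(1, 0x23456) yields 0x13456, the address from the example above. If the object is at 0x13456, the result equals the real address and the read is correct.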
Similar issues arise with EIJMP / EICALL and EIND. Again, the Atmel
engineers drew some lines from EIND to an address bus -- this time
the bus to fetch code from flash.
As AVR instructions are 2-byte aligned, 16 bits can hold 2^16 word
locations, i.e. the address range for IJMP /ICALL can span 2 * 2^16 =
If code shall be generated for indirect jumps, the compiler is stuck
because addresses are 16 bits wide and thus can only address the first
128KiB of flash.
The implemented solution is that whenever an address of a code label is
taken, the compiler emits a gs(.L) "generate stub" to get a 16-bit value
for .L. If .L is located in the low 128KiB, nothing happens and the
indirect jump that uses gs(.L) in the remainder will work as intended.
If .L is outside the first 128KiB, the linker will generate a stub in
section .trampolines that performs JMP .L and will deliver the address
of the stub in gs(.L). An indirect call EICALL will then call the stub
which jumps to .L.
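As a rough model of that decision, the following C sketch shows what value ends up in gs(.L). It assumes EIND = 0, that addresses are expressed as word addresses, and that the stub address is already known; the name gs_value and both parameters are hypothetical, and the real linker relaxation is more involved.

```c
#include <stdint.h>

/* Hypothetical model of gs(label), in word addresses and assuming EIND = 0.
   label_word: word address of the label.
   stub_word:  word address of a stub in .trampolines that performs
               JMP label (only used when a stub is needed). */
uint16_t gs_value(uint32_t label_word, uint16_t stub_word)
{
    if (label_word < 0x10000u)          /* low 128 KiB = 2^16 words */
        return (uint16_t)label_word;    /* no stub: use the label itself */

    return stub_word;                   /* indirect jump goes via the stub */
}
```

A label at word address 0xffff (byte address 0x1fffe) is returned unchanged, while a label at word address 0x10000 (byte address 0x20000) resolves to the stub address instead.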
This works because JMP can address all flash words and because
.trampolines is located appropriately, namely in a way so that
.trampolines is a subset of [EIND * 2^17, (1+EIND) * 2^17)
EIND is set to 0 during startup. You can also set it to 1 in the
startup code and locate .trampolines appropriately. Again, notice that
the subset condition is trivially satisfied if there are no stubs, i.e.
.trampolines is empty. BTW: The compiler does not set EIND in the
program, and setting EIND by hand is not supported. A user who sets
EIND by hand after the init stage is likely to break the code.
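The condition on .trampolines can be written as a range check in the same spirit. This is only a sketch with byte addresses; trampolines_fit and its parameters are names made up here, not an existing linker facility.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check (not a binutils API): true if the byte range
   [start, start + size) of .trampolines is a subset of
   [eind * 2^17, (1 + eind) * 2^17). */
bool trampolines_fit(uint32_t start, uint32_t size, unsigned eind)
{
    if (size == 0)
        return true;                    /* no stubs: trivially satisfied */

    uint32_t lo = (uint32_t)eind << 17; /* eind * 2^17, inclusive */
    uint32_t hi = lo + 0x20000u;        /* (1 + eind) * 2^17, exclusive */

    /* Compare size against the remaining room so start + size cannot overflow. */
    return start >= lo && start < hi && size <= hi - start;
}
```

With EIND = 0, a stub section at 0x1fff0 of size 0x20 fails the check because it crosses the 128 KiB boundary. The same section placed at 0x20000 passes only if the startup code sets EIND to 1.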
The situation with the stubs is even more complicated because as the
linker generates more stubs, it will push .text to a higher address, and
that may require more stubs because more gs() carry labels outside the
As said, it's already helpful to have a comprehensible diagnostic if
.trampolines is pushed across a 17-bit boundary.
Can you express that constraint arithmetically? Or is it that the tail
end of .trampolines must be on the same 2^17 "page". I'll have to try to
understand why it is not a 2^16 page, like __flashN.
See above. Maybe .trampolines needs some margin at the high end because
it can grow as the linker generates new stubs. I actually don't know.
Most of the stuff I know (or believe I know) comes from reverse
engineering when I tried to clean up EIND usage in avr-gcc (PR50820) and
to add some notes and caveats to the docs.
- Re: [avr-gcc-list] Handling __flash1 and .trampolines [Was: .trampolines location],
Georg-Johann Lay <=