avr-gcc-list

From: Georg-Johann Lay
Subject: Re: [avr-gcc-list] Handling __flash1 and .trampolines [Was: .trampolines location]
Date: Fri, 14 Dec 2012 23:00:22 +0100
User-agent: Thunderbird 2.0.0.24 (Windows/20100228)

Erik Christiansen wrote:
The goal now seems to be to let .text grow contiguously across pages, if
not obstructed by a __flashN page occupied by .progmemN.data. That means
that any occupied __flashN must be higher than _etext, to avoid overlap.
(And aligned to the 0x10000 page boundary.)
Alignment is neither required nor enough.  The 2nd byte (the one
loaded into RAMPZ) must match.  See the "subset" condition in the
other post.

IIUC, that constraint is met cooperatively. From your first example, I
understand that gcc loads RAMPZ with N for .progmemN.data. So long as
the new linker script sets the VMA for .progmemN.data to 0xN0000, they
match, do they not?

The "subset" constraint seems to say that the page must only span from
0xN0000 to 0xNffff, which I'd already assumed. I'm beginning to suspect
that there's a hidden meaning.

It's the common meaning of subset as you already know from maths:

.progmemN.data must be a subset of [N * 2^16, (N+1) * 2^16)

Notice that the empty set is a subset of any other set.
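
The subset condition can be written out as a small host-side check. This is only a sketch in Python, not anything the linker provides; the function name fits_flash_n and its parameters are made up for illustration.

```python
PAGE_BITS = 16  # Z supplies 16 address bits, so each __flashN range is 2^16 bytes


def fits_flash_n(start, size, n):
    """True if [start, start+size) is a subset of [n * 2^16, (n+1) * 2^16)."""
    if size == 0:
        return True  # the empty set is a subset of any set
    lo = n << PAGE_BITS
    hi = (n + 1) << PAGE_BITS
    return lo <= start and start + size <= hi
```

A section that starts aligned at 0x10000 but is 0x10001 bytes long fails the check for N = 1, while an unaligned section at 0x10123 of 0x100 bytes passes, which is why alignment is neither required nor enough.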

Even if it is not possible (with reasonable effort) to support a
feature, the linker should complain as much as possible if some
assertion like ".lowtext must be in the first 17-bit page" is violated.

That sounds like a new one. OK, a 17-bit "page" is two 16-bit physical
pages?

Not sure if "page" is helpful wording; "address range" fits better. But we can use "page" if there is no reason for confusion.

The 64 KiB ranges come from the ELPM instruction, which takes the 16 bits of the Z register and concatenates RAMPZ as bits 16..23 to form a 24-bit address.
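
As a sketch of that concatenation (host-side Python that models the address formation only; elpm_address is a hypothetical helper name, not an avr-libc or binutils function):

```python
def elpm_address(rampz, z):
    """24-bit flash byte address read by ELPM: RAMPZ is bits 16..23, Z is bits 0..15."""
    return ((rampz & 0xFF) << 16) | (z & 0xFFFF)
```

With RAMPZ fixed, Z can only sweep one 2^16-byte range, which is where the per-N address ranges come from.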

That way, the Atmel engineers did not need to extend the AVR core; it was sufficient to run 8 lines from RAMPZ (just an SFR, not a core register) to the address bus.

When the compiler generates code to read from address space __flash1, it loads '1' into RAMPZ. This '1' is loaded literally, i.e. the assembler sees a '1' and not some modifier like hlo8(var) whose value is not known before link time and is fixed up by the linker / locator.

This means the compiler assumes that var is located in such a way that all of its bytes are elements of [0x10000, 0x20000). The compiler puts var into section .progmem1.data to express this.

If var is not located appropriately, the generated code still loads RAMPZ with 1 and Z with the lower 16 bits of &var. If var is at 0x23456, the code will access 0x13456 instead and read garbage.
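
The mismatch can be reproduced with a few lines of Python. This is a model of the behaviour described above under the assumption that RAMPZ always holds the literal N; the names flash_n_access and is_read_correct are invented for this sketch.

```python
def flash_n_access(n, var_addr):
    """Address actually read when code compiled for __flashN reads a variable at var_addr."""
    z = var_addr & 0xFFFF   # Z gets the low 16 bits of &var
    return (n << 16) | z    # RAMPZ holds the literal N, not bits 16..23 of &var


def is_read_correct(n, var_addr):
    """True if the access hits the variable, i.e. var_addr lies in the range for N."""
    return flash_n_access(n, var_addr) == var_addr
```

Here flash_n_access(1, 0x23456) yields 0x13456, the wrong location from the example, and is_read_correct(1, 0x23456) is False.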

Similar issues arise with EIJMP / EICALL and EIND. Again, the Atmel engineers drew some lines from EIND to an address bus -- this time the bus used to fetch code from flash.

As AVR instructions are 2-byte aligned, 16 bits can hold 2^16 word locations, i.e. the address range for IJMP / ICALL spans 2 * 2^16 = 2^17 bytes.

If code shall be generated for indirect jumps, the compiler is stuck because code addresses are 16 bits wide and thus can only address the first 128 KiB of flash.

The implemented solution is that whenever the address of a code label is taken, the compiler emits gs(.L) ("generate stub") to get a 16-bit value for .L. If .L is located in the low 128 KiB, nothing special happens and an indirect jump that later uses gs(.L) works as intended. If .L is outside the first 128 KiB, the linker generates a stub in section .trampolines that performs JMP .L and delivers the address of the stub as gs(.L). An indirect call via EICALL will then call the stub, which jumps to .L.
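
The decision the linker makes for gs(.L) can be modelled roughly as follows. This is a Python sketch under stated assumptions, not the binutils implementation: EIND is 0, .trampolines sits in the low 128 KiB, each stub is one 4-byte JMP, and a label gets at most one stub. The function resolve_gs and the stubs list are hypothetical bookkeeping.

```python
LOW_LIMIT = 1 << 17  # bytes reachable with a 16-bit word address and EIND = 0
JMP_SIZE = 4         # assumption: each stub is a single 4-byte JMP


def resolve_gs(label_addr, stubs, trampolines_start):
    """Return the 16-bit word address that gs(label) would yield in this model.

    stubs lists the JMP targets already placed in .trampolines, in order;
    it is extended when a label needs a new stub.
    """
    if label_addr < LOW_LIMIT:
        return label_addr >> 1        # reachable directly, no stub needed
    if label_addr not in stubs:
        stubs.append(label_addr)      # hypothetical: one stub per distinct label
    stub_addr = trampolines_start + JMP_SIZE * stubs.index(label_addr)
    return stub_addr >> 1             # word address of the stub, not of the label
```

A label in the low 128 KiB leaves stubs untouched; a label at 0x20000 or above appends a stub and returns the stub's word address instead.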

This works because JMP can address all flash words and because .trampolines is located appropriately, namely in a way so that

    .trampolines is subset of [EIND * 2^17, (1+EIND) * 2^17)

EIND is set to 0 during startup. You can also set it to 1 in the startup code and locate .trampolines appropriately. Again, notice that the subset condition is trivially satisfied if there are no stubs, i.e. .trampolines is empty. BTW: The compiler does not set EIND in the program, and setting EIND by hand is not supported. A user who sets EIND by hand after the init stage will likely break the program.
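
Expressed arithmetically, a check for the .trampolines condition could look like the sketch below. It is Python for illustration only; the function name trampolines_diagnostic and the message text are invented and do not correspond to any existing linker diagnostic.

```python
WINDOW_BITS = 17  # a 16-bit word address covers 2^17 bytes per EIND value


def trampolines_diagnostic(start, size, eind=0):
    """Return None if [start, start+size) lies in the window for eind, else a message."""
    if size == 0:
        return None  # no stubs: the subset condition holds trivially
    lo = eind << WINDOW_BITS
    hi = (eind + 1) << WINDOW_BITS
    end = start + size
    if lo <= start and end <= hi:
        return None
    return (f".trampolines [{start:#x}, {end:#x}) is not inside "
            f"[{lo:#x}, {hi:#x}) required for EIND = {eind}")
```

A range that starts at 0x1FFF0 with 0x20 bytes of stubs crosses the 2^17 boundary and produces a message for EIND = 0; the same stubs placed at 0x20000 pass with eind=1.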

The situation with the stubs is even more complicated because as the linker generates more stubs, it will push .text to a higher address, and that may require more stubs because more gs() operands then refer to labels outside the EIND range.
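
The feedback effect can be illustrated with a toy fixed-point iteration. This is a Python sketch that assumes a simplified layout (.text follows .trampolines directly, EIND = 0, one 4-byte JMP per label that needs a stub); how the real linker iterates is not something this model claims to know. The name count_stubs and its parameters are hypothetical.

```python
LOW_LIMIT = 1 << 17  # bytes reachable with EIND = 0
JMP_SIZE = 4         # assumption: each stub is a single 4-byte JMP


def count_stubs(tramp_start, label_offsets, max_rounds=100):
    """Iterate until the number of stubs no longer changes.

    label_offsets are offsets into .text of labels whose address is taken
    with gs(); .text is assumed to start right after the stubs.
    """
    n = 0
    for _ in range(max_rounds):
        text_start = tramp_start + JMP_SIZE * n
        needed = sum(1 for off in label_offsets
                     if text_start + off >= LOW_LIMIT)
        if needed == n:
            return n      # stable: no label moved across the limit
        n = needed        # more stubs push .text further up, so recheck
    raise RuntimeError("stub count did not converge")
```

With tramp_start = 0x100 and labels at offsets 0x1FEFC and 0x1FF00, the first pass finds one label at or above 0x20000; its stub shifts .text by 4 bytes, which moves the other label to 0x20000 as well, so the iteration settles at two stubs.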

As said, it's already helpful to have a comprehensible diagnostic if
.trampolines is pushed across a 17-bit boundary.

Can you express that constraint arithmetically? Or is it that the tail
end of .trampolines must be on the same 2^17 "page"? I'll have to try to
understand why it is not a 2^16 page, like __flashN.

See above. Maybe .trampolines needs some margin at the high end because it can grow as the linker generates new stubs. I actually don't know. Most of the stuff I know (or believe I know) comes from reverse engineering when I tried to clean up EIND usage in avr-gcc (PR50820) and to add some notes and caveats to the docs.

Johann