bug-binutils

[Bug ld/22903] New: [AArch64] Insufficient veneer stub alignment


From: pexu at sourceware dot mail.kapsi.fi
Subject: [Bug ld/22903] New: [AArch64] Insufficient veneer stub alignment
Date: Wed, 28 Feb 2018 15:50:37 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=22903

            Bug ID: 22903
           Summary: [AArch64] Insufficient veneer stub alignment
           Product: binutils
           Version: 2.31 (HEAD)
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P2
         Component: ld
          Assignee: unassigned at sourceware dot org
          Reporter: pexu at sourceware dot mail.kapsi.fi
  Target Milestone: ---

Hi.

It is not currently possible to specify an alignment requirement for the
generated veneer stubs (i.e. the far-call stubs emitted for -fpic, -fpie, etc.
builds).

Currently, the alignment for the stubs is 4 bytes. While this works just fine
on the majority of systems, it works only because several prerequisites have
usually been satisfied beforehand (and with a hint of luck, too).

The problematic veneer template (aarch64_long_branch_stub in
bfd/elfnn-aarch64.c) uses LDR to load the far address. The address itself is
stored after the veneer code block, which computes the target address (via
LDR/ADR/ADD) and branches to it. The template looks like this:

  ldr ip0, 1f # <-- ip0, i.e. X16, i.e. 64-bit register
  adr ip1, #0
  add ip0, ip0, ip1
  br  ip0
  1: .xword <address>

While the address is 8-byte aligned within the stub itself, it will be
misaligned unless the veneer lands on an 8-byte (or stricter) boundary. The
ARMv8-A ARM clearly states that unless an access is aligned to the size of the
data element being accessed (i.e. N-byte accesses must be N-byte aligned),
either an Alignment fault is generated or an unaligned access is performed.

It is possible to disable the alignment check, and thus perform unaligned
accesses, via the system register SCTLR_ELx.A (as Linux does, for example).
However, there is a small catch-22 buried in the ARM's fine print. If stage 1
address translation is disabled (e.g. the MMU is off), the Device-nGnRnE
memory type is assigned to all data accesses (or the address may simply be
mapped as some type of Device memory, which is nothing unusual on SoCs).
Unlike the Normal memory type, all accesses to any type of Device memory
*must* be aligned, period.

So, if the code has to deal with a large memory area and cannot use the MMU
(say, because it is unavailable or still being set up), and thus no address
translation is enabled, or for whatever reason uses a Device memory type, LD's
current approach will generate code that is highly prone to intermittent
failures. These can be difficult to track down (without proper JTAG tools),
since no matter how carefully the user does their part, the generated code
itself is the source of the failure. It should also be understood that trying
to recover from this sort of exception would be overkill and highly complex
(the handler would have to decode the faulting instruction, perform the access
as aligned loads, perhaps patch the code, etc.), while the proper thing to do
is simply not to perform unaligned accesses when such accesses are not
possible.

Obviously, one can always generate the long branches by hand, or perhaps use
static linking where possible, so this is by no means a roadblock. But as the
subject is rather undocumented and a patch is apparently readily available,
this should be fixed. Perhaps there is no need to change the default alignment
(without further study), but it should nevertheless be possible to change the
alignment.
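As a sketch of the hand-written workaround (assuming GAS syntax; the label and symbol names here are made up), forcing the stub onto an 8-byte boundary with .balign guarantees that the literal, which sits 16 bytes past the stub base, can never be misaligned:

```
        .balign 8               /* 8-byte aligned stub base; the literal */
my_veneer:                      /* at +16 is then 8-byte aligned, too    */
        ldr     x16, 1f         /* load PC-relative offset to target     */
0:      adr     x17, 0b         /* address of the adr itself             */
        add     x16, x16, x17   /* offset + base = absolute target       */
        br      x16
1:      .xword  far_target - 0b /* 8-byte literal, naturally aligned     */
```

This mirrors the linker's own template; the only difference is the explicit alignment directive that the generated stubs currently lack.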

I hope I provided enough background information for this rare but indeed
curious case!


