From: jakub at redhat dot com
Subject: [Bug binutils/31454] New: Add constant tracking to disassembly (objdump -d, gdb disas)
Date: Thu, 07 Mar 2024 10:26:11 +0000
https://sourceware.org/bugzilla/show_bug.cgi?id=31454
Bug ID: 31454
Summary: Add constant tracking to disassembly (objdump -d, gdb disas)
Product: binutils
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: binutils
Assignee: unassigned at sourceware dot org
Reporter: jakub at redhat dot com
Target Milestone: ---
Consider
unsigned foo (void) { return 0xdeadbeefU; }
unsigned long long bar (void) { return 0xdeadbeefcafebabeULL; }
static int p;
int *baz (void) { return &p; }
int main () {}
When compiled and linked on x86_64 with -O2 -fpic, objdump -d and gdb
disassemble already do some immediate visualization to help users read the
code:
0000000000401140 <baz>:
401140: 48 8d 05 d9 2e 00 00 lea 0x2ed9(%rip),%rax # 404020 <__TMC_END__>
401147: c3 ret
or
Dump of assembler code for function baz:
0x0000000000401140 <+0>: lea 0x2ed9(%rip),%rax # 0x404020 <p>
0x0000000000401147 <+7>: ret
Both know how to handle lea with an immediate and (%rip): they add 0x2ed9 to
the address of the end of the instruction and print the resulting address,
plus a symbolic rendering of it where one is available, in the comment.
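The arithmetic behind that comment can be sketched as follows. This is only an illustration, not code from binutils or gdb; the function name rip_relative_target is hypothetical.

```c
#include <stdint.h>

/* Effective address of an x86-64 RIP-relative operand.  The displacement
   is relative to the address of the *next* instruction, i.e. the end of
   the current one.  Hypothetical helper, not a binutils function.  */
static uint64_t
rip_relative_target (uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
  /* Sign-extend the 32-bit displacement before adding.  */
  return insn_addr + insn_len + (uint64_t) (int64_t) disp;
}
```

For the lea in baz, the instruction is at 0x401140, is 7 bytes long and has displacement 0x2ed9, so the helper yields 0x404020, the address shown in both comments.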
The 0xdeadbeef and 0xdeadbeefcafebabe immediates are clearly shown in the
assembly, so there is no need to help users reading that.
Now, let's try the same on other arches, e.g. aarch64:
400140: 5297dde0 mov w0, #0xbeef // #48879
400144: 72bbd5a0 movk w0, #0xdead, lsl #16
in foo,
400160: d29757c0 mov x0, #0xbabe // #47806
400164: f2b95fc0 movk x0, #0xcafe, lsl #16
400168: f2d7dde0 movk x0, #0xbeef, lsl #32
40016c: f2fbd5a0 movk x0, #0xdead, lsl #48
in bar and
400180: f00000e0 adrp x0, 41f000 <baz+0x1ee80>
400184: 913fa000 add x0, x0, #0xfe8
in baz. It would be helpful if the disassembler could propagate constants
through the small set of instructions that are usually involved in creating
constants in GPRs. For each GPR, it would remember whether the register is set
to a known constant and, if so, the constant's value. At the start of a
function (a new symbol?) it would reset this knowledge. It might also reset it
at possible conditional/unconditional jump destinations within the same
function, though computing those might require another pass through the
instructions. When a handled instruction sets a GPR to a constant, remember
that constant. When a handled instruction has known constant values for all of
its inputs, try to evaluate the instruction and remember the resulting
constant, then show the immediate, plus a symbolic rendering if any, in a
comment like in the lea case above. And when an unhandled instruction sets or
clobbers some GPR (or might do so), forget the value of that register.
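A minimal sketch of such per-register state, assuming AArch64 with 31 GPRs and handling only movk as the evaluated instruction. All names here (const_state, track_movk and so on) are hypothetical and do not exist in binutils or gdb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_GPRS 31  /* x0..x30; hypothetical sizing for AArch64 */

/* Per-register knowledge: is the value a known constant, and which one.  */
struct const_state
{
  bool known[NUM_GPRS];
  uint64_t value[NUM_GPRS];
};

/* Forget everything, e.g. at a function start or a jump destination.  */
static void
const_state_reset (struct const_state *s)
{
  memset (s, 0, sizeof *s);
}

/* A handled instruction loaded a constant into REG ("mov w0, #0xbeef").  */
static void
const_state_set (struct const_state *s, unsigned reg, uint64_t v)
{
  assert (reg < NUM_GPRS);
  s->known[reg] = true;
  s->value[reg] = v;
}

/* An unhandled instruction wrote, or may have written, REG.  */
static void
const_state_clobber (struct const_state *s, unsigned reg)
{
  assert (reg < NUM_GPRS);
  s->known[reg] = false;
}

/* Evaluate "movk REG, #IMM16, lsl #SHIFT".  movk replaces one 16-bit
   field and keeps the rest, so the result is only known if REG already
   was.  Returns true and stores the new constant in *RESULT when it
   could be computed.  */
static bool
track_movk (struct const_state *s, unsigned reg, uint16_t imm16,
            unsigned shift, bool is_32bit, uint64_t *result)
{
  assert (reg < NUM_GPRS && shift < 64 && shift % 16 == 0);
  if (!s->known[reg])
    return false;
  uint64_t v = s->value[reg] & ~(UINT64_C (0xffff) << shift);
  v |= (uint64_t) imm16 << shift;
  if (is_32bit)
    v &= UINT64_C (0xffffffff);  /* writes to wN zero the upper half */
  s->value[reg] = v;
  *result = v;
  return true;
}
```

A disassembler would call const_state_reset at each function symbol, const_state_set or track_movk for the handled instructions, and const_state_clobber for everything else that writes a GPR. Whenever track_movk returns true, the value in *result is what would go into the comment.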
So, for foo above, remember that w0 is set to 0xbeef, evaluate the movk
instruction to find that the result is 0xdeadbeef, and tell that to the user.
Do the same for bar. Similarly, remember the value set by adrp in baz and
handle the add too, printing 41ffe8 <p> there.
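The adrp plus add evaluation for baz could look roughly like this. It is a sketch with hypothetical helper names; it takes already-decoded operands (the page delta 0x1f and the immediate 0xfe8) rather than decoding the instruction words.

```c
#include <stdint.h>

/* adrp: the 4 KiB page containing PC, plus a signed number of pages.
   Hypothetical helper; PAGE_DELTA is the decoded 21-bit immediate.  */
static uint64_t
aarch64_adrp (uint64_t pc, int64_t page_delta)
{
  return (pc & ~UINT64_C (0xfff)) + (uint64_t) page_delta * 4096;
}

/* add (immediate): known base constant plus the 12-bit immediate.  */
static uint64_t
aarch64_add_imm (uint64_t base, uint32_t imm12)
{
  return base + imm12;
}
```

With pc = 0x400180 and a page delta of 0x1f, aarch64_adrp gives 0x41f000, the value objdump already prints for the adrp. Feeding that into aarch64_add_imm with 0xfe8 gives 0x41ffe8, which is the address that would then be looked up for a symbol.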
Now, repeat this on other arches, powerpc{,64,64le}, sparc{,64}, ...
On s390x, one can also see that the code loads some constants from
.rodata/.data.rel.ro* and similar sections; those too would be nice to track
and print.
This would spare users from scratching their heads interpreting the
instructions, or from having to watch the code at runtime to find out what it
actually computes.
In gdb, one sometimes disassembles just part of a function, not the whole one.
I think it would be perfectly fine to start with no known state at the
start of such a block and print only what is discovered in that block.