libunwind-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Libunwind-devel] Mixed stack unwinding bug on ppc64le


From: Stephen Roberts
Subject: [Libunwind-devel] Mixed stack unwinding bug on ppc64le
Date: Thu, 16 Aug 2018 09:26:17 +0000

Hi libunwind developers,

I'd like some advice about fixing a bug on ppc64 which prevents libunwind from 
unwinding through mixed C/C++ stacks. Unwinding fails on this platform whenever 
libunwind tries to unwind from a callee stack frame without Call Frame 
Information (CFI) into caller frames with CFI information. This is a common 
scenario on ppc64 because gcc does not produce CFI by default on this platform 
while g++ still produces CFI to support exception handling. Any C++ application 
which calls in to C library code compiled with the default flags can trigger 
this issue.

This bug comes from the unw_step function in ppc64/Gstep.c. This function first 
tries to unwind using CFI via dwarf_step and if this fails with -UNW_ENOINFO it 
falls back to following the backchain and clobbers all registers in the 
process. Any dwarf_step failure other than -UNW_ENOINFO causes unw_step to bomb 
out immediately.

The mixed frame bug happens because when libunwind unwinds through the non-CFI 
frame it will fall back to the backchain and clobber all registers. Then, when 
it encounters a frame with CFI it will try to use this information to unwind 
further, which promptly fails because all of the registers have been clobbered. 
The return value of dwarf_step in this case is not -UNW_ENOINFO because 
information was found, so unwinding is abandoned. This bug doesn't impact other 
platforms because they make some attempt to guess the correct values of certain 
registers after they are clobbered, which in most (but not all) cases will be 
sufficient to allow CFI dwarf unwinding to resume.

I have two candidate solutions, both of which solve the issue but make 
different compromises. My first solution is to include an "ignore_cfi" flag 
inside the struct cursor from tdep-ppc64/libunwind_i.h. 
This flag is initially false but set to true whenever dwarf_step fails with 
-UNW_ENOINFO. It has the effect of skipping the dwarf_step logic entirely, 
causing libunwind to stick to using back chain unwinding after it encounters 
its first frame without CFI. This is a robust and simple solution, but I'm not 
sure if this is the libunwind way to approach the issue.

My second solution is a little more invasive, but has the potential to provide 
more information. The dwarf_step function uses dwarf_apply_registers internally 
to 1) compute the new Canonical Frame Address (CFA) and 2) update the register 
values. My fix effectively breaks the logic in dwarf_apply_registers out into 
two helper functions, something like "dwarf_advance_cfa" and 
"dwarf_update_registers".

The dwarf_step function still works exactly as before, but we can now use 
"dwarf_update_registers" to update registers for a user-specified CFA. Given we 
have a valid CFA from backchain unwinding, we call "dwarf_update_registers" 
with our backchain-derived CFA to leave us with some register values even when 
we've unwound from a non-CFA frame. The next time we unwind then we will 
*probably* have the registers we need to continue to unwind successfully using 
CFI. The caveat with this approach is it is possible for compilers to save 
volatile registers in non-volatile registers rather than on the stack.
In this case, dwarf_update_registers would fail to retrieve the register-saved 
value and would fail. This is rare in practice, and is still much better than 
never being able to unwind. If we wanted we could be extra-safe and use both 
solutions, with the ignore_cfi flag being set only when dwarf_update_registers 
failed.

My first question is this: which approach is more in-keeping with the libunwind 
philosophy? Would either (or both) of them be accepted upstream? I've tested 
both approaches and found they both solve my issues on ppc64le, so either way 
I'm happy.

A second question involves writing test cases. I have a test case that I run 
outside the libunwind infrastructure, but it's proving difficult to make it run 
within libunwind's own testing infrastructure. This bug only manifests when a 
frame has no CFI, however the libunwind make system turns on -fexceptions 
everywhere so this can't be tested easily. I know there are test-specific 
CFLAGS, however these come first in the compile command so they are clobbered 
by the global options. I think it makes sense to be able to vary the 
exception/unwind compiler flags as part of testing as these control which 
libunwind code paths are executed. Did I miss something? Is there some other 
mechanism to set these flags for test cases?

Many thanks in advance,
Stephen Roberts


reply via email to

[Prev in Thread] Current Thread [Next in Thread]