The emacs_backtrace "feature"
Fri, 21 Sep 2012 12:49:17 +0300
Based on my experience, I expect this "feature" to be hated by users
and Emacs maintainers alike.
My experience is based on years of working with the DJGPP development
environment. DJGPP (www.delorie.com/djgpp/) is a Posix-compliant
development environment, based on ported GNU tools and an
independently written standard C library, for developing 32-bit
protected-mode programs that run on MS-DOS and compatible systems. In
particular, the MS-DOS build of Emacs uses DJGPP.
In DJGPP, displaying the backtrace on fatal errors is the default,
because core files are not supported. So, when a DJGPP-compiled
program crashes, it displays a register dump and a backtrace. Here's
a typical example (I deliberately truncated the backtrace at the end,
which was much longer in reality):
Exiting due to signal SIGABRT
Raised at eip=0012f2a6
eax=002ee7fc ebx=00000120 ecx=00000000 edx=00000000 esi=003a533d edi=002f4cc0
ebp=002ee8a8 esp=002ee7f8 program=H:\test\emacs-djgpp\emacs\src\temacs.exe
cs: sel=0257 base=02c30000 limit=0104ffff
ds: sel=025f base=02c30000 limit=0104ffff
es: sel=025f base=02c30000 limit=0104ffff
fs: sel=022f base=0001d580 limit=0000ffff
gs: sel=027f base=00000000 limit=0010ffff
ss: sel=025f base=02c30000 limit=0104ffff
App stack: [002eed94..002d5d94] Exceptn stack: [002d5c68..002d3d28]
Call frame traceback EIPs:
A companion utility program captures the addresses and the executable
file name from the screen, and adds the corresponding function name
plus offset to each line (if the executable was not stripped), and
also the source file/line information, if that info is found.
Call frame traceback EIPs:
0x0001039f execute_builtin+191, file c:/djgpp/gnu/bash-2.03/execute_cmd.c,
0x00010840 execute_builtin_or_function+176, file
c:/djgpp/gnu/bash-2.03/execute_cmd.c, line 3173
0x0001011b execute_simple_command+659, file
c:/djgpp/gnu/bash-2.03/execute_cmd.c, line 2745
0x0000de00 execute_command_internal+1876, file
c:/djgpp/gnu/bash-2.03/execute_cmd.c, line 824
0x0000d459 execute_command+69, file c:/djgpp/gnu/bash-2.03/execute_cmd.c,
As nice as this looks, it has several disadvantages:
. Many real-life backtraces are long and quickly scroll off the
screen. Unless you made a point of setting up very large screen
buffers for your shell windows, or redirected standard error to a
file, you'll lose precious information. Since these precautions
are only taken when one expects a crash, guess how many times these
measures are in place when they are needed.
. Many calls to emacs_backtrace in the current sources limit the
number of backtrace frames to 10, but that is an arbitrary
limitation which will be too small in most, if not all, situations.
Check out the crash backtraces posted to the bug tracker. As an
extreme (but quite frequent) data point, crashes in GC tend to have
many hundreds, and sometimes many thousands, of frames in them. In
reality, there's no way of knowing how many frames will be there,
and how many of them will be needed to get enough useful
information for finding the problem. I predict that more often
than not we will be looking at useless backtraces, while users who
reported those backtraces will rightfully expect us to find the bug
and fix it.
. The backtrace is written to the standard error file handle. Is
that handle always guaranteed to be available and connected to a
screen or a disk file that the user can find afterwards? E.g., if
Emacs is invoked from an environment which redirects that handle to
the null device, the information will be lost. (On MS-Windows, GUI
applications launched by clicking a desktop icon have this handle
closed, so anything written to it disappears without a trace; I
don't know if Posix desktops have something similar.)
. Last, but not least, even if the drawbacks described above are not
an issue in some particular crash report, using the limited
information it provides can be quite difficult, especially if the
crash happened in a binary compiled by a different compiler version
than yours, let alone on an architecture different from the one
used by the person who tries to get some sense out of it. Here's
an example of what emacs_backtrace will produce (slightly edited
from what you see on
It doesn't even show the source line info, like DJGPP did.
Translating myfunc1+0x1a etc. into source-level info is not an easy
task, unless you are lucky and there's only one place where it
calls myfunc2. If not, you are left with guesswork. Making sense
of the backtrace without being able to get at the corresponding
source lines is not for the faint of heart. More often than not,
the Emacs maintainers will be tempted to ignore such a report, and
ask for a GDB backtrace instead.
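To make the frame-limit point above concrete, here is a minimal sketch
of a fixed-size backtrace dump. It assumes a system that provides
backtrace and backtrace_symbols_fd in <execinfo.h> (glibc does); the
helper names dump_backtrace and recurse are hypothetical and are not
taken from the Emacs sources.

```c
#include <execinfo.h>   /* backtrace, backtrace_symbols_fd (glibc extension) */
#include <unistd.h>     /* STDERR_FILENO */

/* Hypothetical helper, not the Emacs implementation: capture at most
   LIMIT frames (capped at 64) and write them to standard error.
   Returns the number of frames actually captured.  */
static int
dump_backtrace (int limit)
{
  void *frames[64];
  int n;

  if (limit > 64)
    limit = 64;
  if (limit < 1)
    return 0;

  /* backtrace never stores more than LIMIT entries; any deeper
     frames are dropped without notice.  */
  n = backtrace (frames, limit);
  backtrace_symbols_fd (frames, n, STDERR_FILENO);
  return n;
}

/* Hypothetical driver: build a call chain DEPTH levels deep, then
   dump the stack with the given frame limit.  */
static int
recurse (int depth, int limit)
{
  volatile int n;   /* volatile discourages tail-call elimination */

  if (depth == 0)
    return dump_backtrace (limit);
  n = recurse (depth - 1, limit);
  return n;
}
```

A call such as recurse (50, 10) creates a call chain far deeper than
the limit, yet the dump contains at most 10 lines, all from the
innermost end of the stack. Whatever called into that chain is simply
not in the output.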
So, given all of the above, I'm asking: why do we want this feature?
Why not use the good old core dump files? They have all the
information that is needed for debugging the crash, while the above
falls short of that mark by a large measure. It seems like a step
backward. I always thought that the lack of core files in DJGPP was a
serious limitation, so I'm amazed to see modern environments actually
_wanting_ that limited debug feature in preference to core dumps and
real debuggability. Until now, the only uses I saw for the 'backtrace'
function were when a debugger couldn't be used at all, or the core
file couldn't be produced due to system-level requirements, such as
limited disk space or some stringent time constraints. But here we do
that voluntarily and by default. Why?
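For comparison, on Posix systems a program can ask for core files
itself when the user's shell has them disabled. The following is only
a sketch under that assumption; enable_core_dumps is a hypothetical
helper, not something in the Emacs sources, and whether a core file
is actually written still depends on system policy.

```c
#include <sys/resource.h>   /* getrlimit, setrlimit, RLIMIT_CORE */

/* Hypothetical helper: raise the soft limit on core file size to the
   hard limit, so that a later fatal signal can produce a core file.
   Returns 0 on success, -1 on failure.  */
static int
enable_core_dumps (void)
{
  struct rlimit rl;

  if (getrlimit (RLIMIT_CORE, &rl) != 0)
    return -1;

  /* An unprivileged process may always raise its soft limit up to
     the hard limit.  */
  rl.rlim_cur = rl.rlim_max;
  return setrlimit (RLIMIT_CORE, &rl);
}
```

If the hard limit is itself zero, the call succeeds but no core file
will be produced.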
Having said all that, I'm not really interested in disputing these
points. I wanted to communicate my own, mostly negative, experience
of many years using a similar feature. If more information is
required, in particular about DJGPP and how it created and used the
backtraces, I will gladly provide answers to any questions.
Otherwise, I guess we will find out soon enough whether this is a great
feature or not.