bug-gnu-emacs



From: Alan Mackenzie
Subject: bug#48337: Fwd: 28.0.50; Emacs crashing randomly (possibly minibuffer activity related)
Date: Tue, 11 May 2021 19:45:23 +0000

Hello, Eli.

On Tue, May 11, 2021 at 16:42:26 +0300, Eli Zaretskii wrote:
> > From: Alex Bennée <alex.bennee@linaro.org>
> > Date: Tue, 11 May 2021 13:54:02 +0100
> > Cc: 48337@debbugs.gnu.org, Alan Mackenzie <acm@muc.de>

> > (gdb) pp Vminibuffer_list
> > (#<buffer  *Minibuf-0*> #<buffer  *Minibuf-1*>)

> Thanks.

> Alan, the code in nth_minibuffer and its callers is unsafe.  First,
> Fnthcdr can return nil, and then XCAR of that in nth_minibuffer
> crashes.  I fixed that now on the master branch, ....

That Fnthcdr call "can't possibly" return nil, unless there's a bug
somewhere.  Clearly there's a bug somewhere, and the fact it triggered
an abort is a good thing, since it should enable us to find that bug
more easily.

nth_minibuffer is called only with argument DEPTH set to 0 or
minibuf_level.  minibuf_level is initialised to 0 and thereafter only
altered at exactly 2 places, a minibuf_level++ when entering a new MB,
and minibuf_level-- when exiting it.

Vminibuffer_list, the list of minibuffers, is extended by one element
when a new minibuffer level is entered for the first time.  This is done
by function get_minibuffer.  Once *Minibuf-2* has been created, it is
reused every time a recursive MB call at that level happens, and it is
never garbage collected.

My hypothesis at the moment is that minibuf_level++ has happened
(setting its value to 2), but get_minibuffer(2) hasn't happened yet, so
Vminibuffer_list is only 2 elements long, ( *Minibuf-0*  *Minibuf-1*).
Something is trying to call nth_minibuffer (minibuf_level) in that
inconsistent state.  There is a window of ~115 lines of code in
read_minibuf where that could happen.
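
To make the suspected sequence concrete, here is a toy model in plain C.
It is not the Emacs source: the list is a plain array of names, and every
identifier beginning with toy_ is a hypothetical stand-in for the
corresponding real function or variable.  It only shows that a lookup made
after the level has been incremented, but before the list has been
extended, comes back empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_DEPTH 8

/* Toy stand-ins (assumptions, not Emacs code): the list of minibuffers
   is an array of names, the recursion depth is a plain int.  */
static char toy_names[TOY_MAX_DEPTH][32];
static int toy_list_length;     /* number of minibuffers created so far */
static int toy_minibuf_level;   /* current recursion depth */

/* Like get_minibuffer: extend the list up to DEPTH on first use.  */
static const char *
toy_get_minibuffer (int depth)
{
  assert (depth >= 0 && depth < TOY_MAX_DEPTH);
  while (toy_list_length <= depth)
    {
      snprintf (toy_names[toy_list_length], sizeof toy_names[0],
                " *Minibuf-%d*", toy_list_length);
      toy_list_length++;
    }
  return toy_names[depth];
}

/* Like nth_minibuffer: look up DEPTH without creating anything.
   Returns NULL (standing in for nil) if the list is too short.  */
static const char *
toy_nth_minibuffer (int depth)
{
  return (depth >= 0 && depth < toy_list_length) ? toy_names[depth] : NULL;
}

/* Replay the suspected sequence.  Returns 1 if the lookup fails inside
   the window and succeeds once the list has been extended.  */
static int
toy_replay_window (void)
{
  toy_list_length = 0;
  toy_get_minibuffer (1);       /* list: ( *Minibuf-0*  *Minibuf-1*) */
  toy_minibuf_level = 1;

  toy_minibuf_level++;          /* level is 2, list still has 2 elements */
  const char *early = toy_nth_minibuffer (toy_minibuf_level);

  toy_get_minibuffer (toy_minibuf_level);   /* list now has 3 elements */
  const char *late = toy_nth_minibuffer (toy_minibuf_level);

  toy_minibuf_level--;
  return early == NULL && late != NULL;
}
```

Calling toy_replay_window () returns 1 in this model: the "early" lookup
finds nothing, the "late" one finds the buffer for level 2.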

However, Alex's dump doesn't say what the current position in
read_minibuf is.  Instead it says "lisp.h:1008", which is unhelpful in
the extreme.  Why does GDB have to be so "clever"?  Is there any way to
stop GDB doing this and make it report the actual position in the prime
source code as well as the position in some inline function?

I'm going to write to Alex asking him to provide more details - his
posts are lacking a lisp backtrace, a recipe, and so much needed
information is <optimized out>.  Why does GDB fail to display this
information?  Surely it should know what processor registers the
arguments and local variables are stored in, and where in the stack
frame they have been pushed?

> .... but there're more problems: some of the callers of nth_minibuffer
> don't seem to be protected from it returning nil.  For example, we
> have this in read_minibuf_unwind:

>   Fset_buffer (nth_minibuffer (minibuf_level));

This, I think, can be justified - if read_minibuf_unwind can't find the
minibuffer it's unwinding, we've got a serious problem and ought to
abort Emacs ASAP.  Should that, perhaps, be an explicit assert?

> and this in minibuffer_unwind:

>       set_window_buffer (window, nth_minibuffer (0), 0, 0);

This is similar: If we're unwinding a minibuffer call,  *Minibuf-0* is
"bound" to exist.  Perhaps there should be an explicit assert here, too?

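As a rough illustration of what such an explicit assert could look like,
here is a self-contained sketch using the standard C assert.  All the
sketch_ names are hypothetical, and this is not a proposed patch; in Emacs
itself the check would presumably use eassert on the Lisp object rather
than a NULL test on a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object.  */
struct sketch_buffer { const char *name; };

static struct sketch_buffer sketch_minibuf0 = { " *Minibuf-0*" };
static struct sketch_buffer *sketch_list[4] = { &sketch_minibuf0 };
static int sketch_list_length = 1;

/* Unchecked lookup: NULL (standing in for nil) if DEPTH is out of range.  */
static struct sketch_buffer *
sketch_nth_minibuffer (int depth)
{
  return (depth >= 0 && depth < sketch_list_length)
         ? sketch_list[depth] : NULL;
}

/* Checked lookup for callers that cannot continue without the buffer:
   abort at the point of the inconsistency instead of crashing later.  */
static struct sketch_buffer *
sketch_nth_minibuffer_checked (int depth)
{
  struct sketch_buffer *buf = sketch_nth_minibuffer (depth);
  assert (buf != NULL);
  return buf;
}
```

The unwinding code would then call the checked variant, so that a missing
minibuffer is reported where it is first noticed.
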
> In other cases you compare windows' buffers [EZ's textual correction
> incorporated] with nil, which can never be true, so a preliminary test
> for nil would be nice to avoid a loop that can never find anything
> useful.

> Please make this code more robust.

OK.  I will do this.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).




