bug-gnu-emacs

bug#33014: 26.1.50; 27.0.50; Fatal error after re-evaluating a thread's function


From: Eli Zaretskii
Subject: bug#33014: 26.1.50; 27.0.50; Fatal error after re-evaluating a thread's function
Date: Sat, 13 Oct 2018 09:23:38 +0300

> From: Gemini Lasswell <gazally@runbox.com>
> Cc: 33014@debbugs.gnu.org
> Date: Fri, 12 Oct 2018 13:02:56 -0700
> 
> I've tried to do that without success.  The bug won't reproduce if I put
> all the code added to thread.el by the patch into its own file and load
> it with C-u M-x byte-compile-file, and it also doesn't work to put the
> resulting .elc on my load-path and load it with require.

Did you try loading it as a .el file?

It's too bad that the reproduction is so Heisenbug-like.  It
probably won't reproduce on my system anyway.

> I've determined today that having -O2 in CFLAGS is necessary to
> reproduce the bug, and that -O1 or -O0 won't do it.

One more reason why reproduction elsewhere is probably hard.

> The Lisp backtrace is really short:
> 
> Thread 7 (Thread 0x7f1cd4dec700 (LWP 21837)):
> "erb--benchmark-monitor-func" (0x158ec58)

If you succeed in reproducing this when this code is loaded
uncompiled, the backtrace might be more helpful.

> >> #2  0x00000000006122b5 in XHASH_TABLE (a=...) at lisp.h:2241
> >
> > and what was its parent object in the calling frame?
> 
> Those are both optimized out with -O2.  I recompiled bytecode.c with
> "volatile" on the declaration of jmp_table, and got this:
> 
> (gdb) up 3
> #3  exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., 
>     nargs=nargs@entry=0, args=<optimized out>, 
>     args@entry=0x16eacf8 <bss_sbrk_buffer+9926232>) at bytecode.c:1403
> 1403              struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
> (gdb) p jmp_table
> $1 = make_number(514)
> (gdb) p *top
> $3 = XIL(0x42b4d0)
> (gdb) pp *top
> remove

Which one of these is the one that triggers the assertion violation?

> Thread 1 "monitor" hit Hardware watchpoint 7: *(EMACS_INT *) 0x16eac38
> 
> Old value = 60897760
> New value = 24075314
> setup_on_free_list (v=v@entry=0x16eac30 <bss_sbrk_buffer+9926032>, 
>     nbytes=nbytes@entry=272) at alloc.c:3060
> 3060    total_free_vector_slots += nbytes / word_size;
> (gdb) bt 10
> #0  setup_on_free_list (v=v@entry=0x16eac30 <bss_sbrk_buffer+9926032>, 
>     nbytes=nbytes@entry=272) at alloc.c:3060
> #1  0x00000000005a9a24 in sweep_vectors () at alloc.c:3297
> #2  0x00000000005adb2e in gc_sweep () at alloc.c:6872
> #3  garbage_collect_1 (end=<optimized out>) at alloc.c:5860
> #4  Fgarbage_collect () at alloc.c:5989
> #5  0x00000000005ca478 in maybe_gc () at lisp.h:4804
> #6  Ffuncall (nargs=4, args=args@entry=0x7fff210a3bc8) at eval.c:2838
> #7  0x0000000000611e00 in exec_byte_code (bytestr=..., vector=..., maxdepth=...,
>     args_template=..., nargs=nargs@entry=2, args=<optimized out>, 
>     args@entry=0x9bd128 <pure+781288>) at bytecode.c:632
> #8  0x00000000005cdd32 in funcall_lambda (fun=XIL(0x7fff210a3bc8), 
>     nargs=nargs@entry=2, arg_vector=0x9bd128 <pure+781288>, 
>     arg_vector@entry=0x7fff210a3f00) at eval.c:3057
> #9  0x00000000005ca54b in Ffuncall (nargs=3, args=args@entry=0x7fff210a3ef8)
>     at eval.c:2870
> (More stack frames follow...)

Can you show the Lisp backtrace for the above?

> Note that just as was happening when we were working through bug#32357,
> the thread names which gdb prints are wrong, which I verified with:

Looks like a bug in the pthreads version of sys_thread_create: it
calls prctl with PR_SET_NAME as the first argument, but my reading of
the documentation is that such a call names the _calling_ thread, not
the thread just created.  We should instead call pthread_setname_np,
I think (but I'm not an expert on pthreads).
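
To illustrate the difference, here is a minimal standalone sketch,
assuming Linux with glibc.  It is not the Emacs code: worker_main,
name_threads_demo, the gate mutex and the names "creator" and "worker"
are all made up for the example.  The creator calls prctl with
PR_SET_NAME, which should rename the creator itself, and then calls
pthread_setname_np with the new thread's ID, which should rename the
new thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for pthread_setname_np and pthread_getname_np */
#endif
#include <errno.h>
#include <pthread.h>
#include <sys/prctl.h>

enum { NAME_LEN = 16 };   /* Linux limit: 15 characters plus the NUL */

/* Held by the creator so the worker stays alive while it is named.  */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical thread function: block on the gate, then exit.  */
static void *
worker_main (void *arg)
{
  (void) arg;
  pthread_mutex_lock (&gate);
  pthread_mutex_unlock (&gate);
  return NULL;
}

/* Create a worker, name both threads, and read the names back.
   Return 0 on success, otherwise an errno-style error code.  */
int
name_threads_demo (char creator_name[NAME_LEN], char worker_name[NAME_LEN])
{
  pthread_t worker;
  int err;

  pthread_mutex_lock (&gate);
  err = pthread_create (&worker, NULL, worker_main, NULL);
  if (err != 0)
    {
      pthread_mutex_unlock (&gate);
      return err;
    }

  /* prctl acts on the calling thread, i.e. the creator.  */
  if (prctl (PR_SET_NAME, "creator", 0, 0, 0) != 0)
    err = errno;

  /* pthread_setname_np takes an explicit thread ID, so the creator
     can name the thread it has just created.  */
  if (err == 0)
    err = pthread_setname_np (worker, "worker");

  if (err == 0)
    err = pthread_getname_np (pthread_self (), creator_name, NAME_LEN);
  if (err == 0)
    err = pthread_getname_np (worker, worker_name, NAME_LEN);

  pthread_mutex_unlock (&gate);   /* let the worker finish */
  pthread_join (worker, NULL);
  return err;
}
```

The alternative would be to keep prctl but call it from inside the new
thread's own start function; either way the name ends up on the thread
it is meant for.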

> Am I correct that the next step is to figure out why the garbage
> collector is not marking this vector?  Presumably it's no longer
> attached to the function definition for erb--benchmark-monitor-func by
> the time the garbage collector runs, but it's supposed to be found by
> mark_stack when called from mark_one_thread for Thread 7, right?

Is this vector the byte-code of erb--benchmark-monitor-func?  If so,
how come it is no longer attached to the function while the function
still exists?

And if this vector isn't the byte-code of erb--benchmark-monitor-func,
then what is it?

IMO, we cannot reason about what GC does or doesn't do until we
understand what data structure it processes, and what is the relation
of that data structure to the symbols in your program and in Emacs.

Thanks.




