bug#26932: 25.1; Crash triggered a few times a day with network process

From: Vivek Dasmohapatra
Subject: bug#26932: 25.1; Crash triggered a few times a day with network process
Date: Mon, 19 Jun 2017 01:57:54 +0100 (BST)
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)

> The faulty object would be a very important clue.  Once we find that
> out, we should look at how that object is created and modified.

Not sure I'm doing this right, but here's what I have so far:

#0 mark_object; alloc.c:6450
last_marked_index=12, therefore last_marked[11] must be what we are looking at.
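
A minimal sketch of the index arithmetic used above, assuming the history
buffer is written at the current index, which is then incremented and
wrapped. The capacity of 500 and the helper name most_recent_slot are
assumptions for illustration, not taken from alloc.c:

```c
#include <assert.h>

/* Assumed capacity of the history buffer; the real value may differ.  */
#define LAST_MARKED_SIZE 500

/* Given the index that will be written next, return the slot that was
   written most recently.  Wraps to the last slot when next_index is 0.  */
static int
most_recent_slot (int next_index)
{
  assert (next_index >= 0 && next_index < LAST_MARKED_SIZE);
  return (next_index + LAST_MARKED_SIZE - 1) % LAST_MARKED_SIZE;
}
```

With a next index of 12 this gives slot 11, matching the reasoning above;
the wrap only matters when the next index is 0.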

(gdb) p XTYPE(last_marked[11])
$41 = Lisp_Symbol
(gdb) p XSYMBOL(last_marked[11])
$44 = (struct Lisp_Symbol *) 0x3c382c0
(gdb) p *XSYMBOL(last_marked[11])
Cannot access memory at address 0x3c382c0

which matches alloc.c:6450, which is inside a `case Lisp_Symbol:' section.

The line that fails is:

if (ptr->gcmarkbit)

So we have a duff symbol?

#1 mark_object; alloc.c:6543

inside a case Lisp_Cons:

mark_object (ptr->car);

assuming this is the previous marked object:

(gdb) p XTYPE(last_marked[10])
$45 = Lisp_Cons
(gdb) p XCONS(last_marked[10])
$46 = (struct Lisp_Cons *) 0x2eaf810
(gdb) p *XCONS(last_marked[10])
$47 = {car = 50828624, u = {cdr = 48953347, chain = 0x2eaf803}}

(gdb) p XTYPE(50828624)
$49 = Lisp_Symbol
(gdb) p *XSYMBOL(50828624)
Cannot access memory at address 0x3c382c0 // curses, foiled again
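
The XTYPE results above can be checked by hand. This is a rough sketch,
assuming the build keeps the type tag in the low 3 bits of each object
word, with tag 0 for symbols and 3 for conses. Those tag values and all
the sketch_ names are assumptions for illustration, not the real macros:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 3 low tag bits, as on a typical 64-bit build.  */
enum { TAG_BITS = 3, TAG_MASK = (1 << TAG_BITS) - 1 };

/* Assumed tag values; only the two seen so far in this session.  */
enum sketch_type { SKETCH_SYMBOL = 0, SKETCH_CONS = 3 };

/* Extract the type tag from an object word.  */
static int
sketch_xtype (intptr_t word)
{
  return (int) (word & TAG_MASK);
}

/* Clear the tag bits, leaving the aligned address part of the word.  */
static uintptr_t
sketch_untag (intptr_t word)
{
  return (uintptr_t) word & ~(uintptr_t) TAG_MASK;
}
```

For the cdr 48953347 (0x2eaf803) this yields tag 3 and address 0x2eaf800.
The car 50828624 (0x3079550) has tag 0, but gdb reports XSYMBOL of it as
0x3c382c0, so in this build a symbol word is evidently not just an
untagged pointer; the sketch only models the cons case.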

#2 mark_maybe_object; alloc.c:4743

Not interesting: it checks for aliveness and calls mark_object.

#3  mark_memory (end=0x7fffffffe218, start=<optimized out>); alloc.c:4895

for (pp = start; (void *) pp < end; pp += GC_POINTER_ALIGNMENT)
  {
    mark_maybe_pointer (*(void **) pp);
    mark_maybe_object (*(Lisp_Object *) pp); // ← this is the entry point
  }

(gdb) p XTYPE(*(Lisp_Object *)pp)
$63 = Lisp_Cons
(gdb) p *XCONS(*(Lisp_Object *)pp)
$65 = {car = 48892595, u = {cdr = 49727219, chain = 0x2f6c6f3}}

This doesn't seem to match what we encounter two frames down in mark_object.
Maybe I've misinterpreted something? Anyway, following the earlier call to
mark_maybe_pointer:

(gdb) call mem_find(*((void **) pp))
$86 = (struct mem_node *) 0x2cce8a0
(gdb) p *(struct mem_node *) 0x2cce8a0
$87 = {left = 0x2cce8e0, right = 0x2e816e0, parent = 0x2d5cb00,
  start = 0x2f6c400, end = 0x2f6c7f0, color = MEM_BLACK, type = MEM_TYPE_CONS}
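
The lookup above amounts to asking whether a word found on the stack falls
inside a known heap block. A simplified sketch of that test follows,
assuming the end address is exclusive and using a linear search rather
than the tree that the left/right/parent fields imply. struct sketch_range
and sketch_find are hypothetical names, not the real mem_node API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a heap block record: just its address range.  */
struct sketch_range
{
  uintptr_t start;   /* first byte of the block */
  uintptr_t end;     /* one past the last byte (assumed exclusive) */
};

/* Return the range containing P, or NULL if P is in none of them.  */
static const struct sketch_range *
sketch_find (const struct sketch_range *ranges, size_t n, uintptr_t p)
{
  for (size_t i = 0; i < n; i++)
    if (ranges[i].start <= p && p < ranges[i].end)
      return &ranges[i];
  return NULL;
}
```

With the block printed above (0x2f6c400 to 0x2f6c7f0), the untagged cdr
0x2f6c6f0 from the cons at *pp is inside the range, while 0x2eaf810, the
cons from last_marked[10], is not.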

and later on:

    case MEM_TYPE_CONS:
      if (live_cons_p (m, p) && !CONS_MARKED_P ((struct Lisp_Cons *) p))
        XSETCONS (obj, p);

so obj now holds a tagged reference to the cons cell (I think).

And then finally:

    if (!NILP (obj))
        mark_object (obj);

so maybe last_marked[9] is involved?

Index 9 seems to be a list with every car being the string "DEAD":

(gdb) p last_marked[9]
$120 = 48953379
(gdb) p XTYPE(last_marked[9])
$121 = Lisp_Cons
(gdb) p *XCONS(last_marked[9])
$122 = {car = 8760836, u = {cdr = 48953363, chain = 0x2eaf813}}
(gdb) p XTYPE(8760836)
$123 = Lisp_String
(gdb) p *XSTRING(8760836)
$124 = {size = 4, size_byte = -1, intervals = 0x0,
  data = 0xb374bb <pure+2999995> "DEAD"}

So... a reaped list? Not helpful anyway. Nothing identifiable here.

#4 mark_stack

Nothing of note here

Going up the backtrace, all I find is that we're in the mode-line display
code and we're _about_ to eval mode-line-frame-identification:

(:eval (mode-line-frame-control))

But GC happens before we actually call it.

Not sure where to go from here: Any advice?

