bug-gnu-emacs

bug#60220: 29.0.60; macOS 13.1 crash shortly after starting Emacs


From: Aaron Jensen
Subject: bug#60220: 29.0.60; macOS 13.1 crash shortly after starting Emacs
Date: Tue, 20 Dec 2022 22:47:46 -0500

On Tue, Dec 20, 2022 at 12:22 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Aaron Jensen <aaronjensen@gmail.com>
> > Date: Tue, 20 Dec 2022 10:59:20 -0500
> > Cc: 60220@debbugs.gnu.org
> >
> > On Tue, Dec 20, 2022 at 10:40 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > >
> > > AFAIU, it says that Emacs was _loading_ rng-loc.  That doesn't mean
> > > the problem is in rng-loc's code.  The fatal signal comes from the
> > > macOS implementation of dlopen, so I suspect that the way we restart
> > > Emacs messes up some OS data structures regarding loaded shared
> > > libraries or something.
> > >
> > > Note that the previous crash you posted also crashes inside dlopen.
> > >
> > > So I think it's safe to say that restarting Emacs makes loading of
> > > *.eln files (and maybe shared libraries in general) fragile and tending
> > > to crash, for some reason.  Maybe we should explicitly unload all the
> > > *.eln files when we restart?
> >
> > Interesting. If I restart while launched from lldb, I get the below.
> > It happens right away and it doesn't actually restart. This is not the
> > behavior I see if I launch it normally or in xcode. I should note that
> > occasionally when I restart Emacs it just quits and does not restart.
> > I have to restart it manually. Perhaps this is all connected to the
> > issue you're suggesting.
> >
> > (lldb) process launch
> > Process 19414 launched: 'src/emacs' (arm64)
> > Process 19414 stopped
> > * thread #8, stop reason = exec
> >     frame #0: 0x0000000100b2c950 dyld`_dyld_start
> > dyld`:
> > ->  0x100b2c950 <+0>:  mov    x0, sp
> >     0x100b2c954 <+4>:  and    sp, x0, #0xfffffffffffffff0
> >     0x100b2c958 <+8>:  mov    x29, #0x0
> >     0x100b2c95c <+12>: mov    x30, #0x0
> > Target 0: (emacs) stopped.
> > (lldb) thread list
> > Process 19414 stopped
> > * thread #8: tid = 0x3026c, 0x0000000100b2c950 dyld`_dyld_start, stop reason = exec
> > (lldb) thread backtrace
> > * thread #8, stop reason = exec
> >   * frame #0: 0x0000000100b2c950 dyld`_dyld_start
> > (lldb) continue
> > Process 19414 resuming
> > Process 19414 exited with status = 14 (0x0000000e) Terminated due to signal 14
>
> What is "signal 14" on macOS?

This is all I could find:

     14    SIGALRM      terminate process    real-time timer expired

I'm able to reproduce the above without native compilation as well.
This particular failure only happens when running under lldb and doesn't
affect me in practice.

> Anyway, look at the code: we restart by calling execvp.  You or
> someone who knows macOS internals should take a look at what that
> means for shared libraries which were loaded by the program that calls
> execvp -- what happens with those libraries in the execvp'ed process.
> I'm guessing that they are not being unloaded and re-loaded by the new
> process, or something to that effect.
>
> Or maybe the way we load the *.eln files causes this, triggered by
> 'execvp'?

This is out of my depth. I did a tiny bit of digging and didn't find
anything. I'll keep looking but if someone is more familiar w/ this
they'd have more luck than me I'm sure.
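
To make the question concrete, here is a rough C sketch of what a restart
through execvp looks like at the system-call level. It is not the Emacs
code, and the helper name is hypothetical. It runs the exec in a forked
child so the caller survives to inspect the result. As far as I understand
POSIX, a successful exec replaces the whole process image, so execvp
returns only on failure; what dyld does with previously loaded libraries
on macOS is exactly the part I could not confirm.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not Emacs code.
   Run argv[0] via execvp in a forked child.
   Returns the child's exit status, 127 if execvp failed,
   or -1 on fork/wait errors or abnormal termination. */
int run_replaced_image(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* On success this call never returns: the old program text,
           heap and mappings are gone and the new image starts fresh. */
        execvp(argv[0], argv);
        _exit(127); /* reached only if execvp failed */
    }

    /* Parent: collect the exit status of the replaced image. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real restart would call execvp directly, without the fork, which is why
nothing in the old process gets a chance to run cleanup code afterwards.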

> Can you try running for a while Emacs built without native compilation
> and restarting it?  That could tell us whether the *.eln files are the
> problem.

I wasn't able to reproduce it w/o native compilation. I'm going to try
running w/ native compilation for a while *without* doing any restarts
and see if I can get it to crash. I've seen crashes take an hour+
after a restart (though most happen w/in 30 seconds). I can't say
definitively that all crashes have happened in a restarted process.

Aaron




