bug-gnu-emacs

bug#33174: 27.0.50; Dump fails on GNU/Linux ppc64le


From: Thomas Fitzsimmons
Subject: bug#33174: 27.0.50; Dump fails on GNU/Linux ppc64le
Date: Tue, 30 Oct 2018 05:30:47 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Paul Eggert <eggert@cs.ucla.edu> writes:

> Thomas Fitzsimmons wrote:
>> BTW, let me know if you don't think it's useful to debug this further.
>> I'm OK just disabling randomization when I build Emacs for the time
>> being and waiting until the portable dumper work lands, but I'm happy to
>> continue if you think it will lead to a general fix.
>
> It's not clear when the portable dumper will land; it might not ever
> land, unfortunately. So I would like to work on bug#33174 a bit
> longer, if only so that we can put something intelligible into the
> PROBLEMS file.

OK.

>> It seems like it's crashing when trying to memcpy over the BSS area, on
>> this line in unexelf.c (see below):
>
> By the time the memcpy is run the damage has already been done: the
> memory layout is messed up and we can't fix that simply by passing
> different arguments to memcpy. We have to prevent the memory layout
> from being messed up in the first place by disabling undesirable
> address space layout randomization and doing this very early in
> execution.

Ah, OK, so the goal is to programmatically do something similar to
echoing 0 to /proc/sys/kernel/randomize_va_space, but just for the
temacs process.
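
For reference, a minimal sketch of that per-process approach on
GNU/Linux, using personality(2).  The helper name try_disable_aslr is
made up for illustration; this is not the code in sysdep.c, only the
same pattern as the personality calls in the strace output quoted below.

```c
#include <stdbool.h>
#include <sys/personality.h>

/* Hypothetical helper (not the Emacs function): ask the kernel to stop
   randomizing this process's address space.  Return true if the
   ADDR_NO_RANDOMIZE flag was already set or was set successfully.  */
static bool
try_disable_aslr (void)
{
  /* 0xffffffff queries the current persona without changing it.  */
  int pers = personality (0xffffffff);
  if (pers < 0)
    return false;

  if (pers & ADDR_NO_RANDOMIZE)
    return true;		/* Randomization is already off.  */

  /* On success personality returns the previous persona; -1 on error
     (for example when a sandbox forbids the call).  */
  if (personality (pers | ADDR_NO_RANDOMIZE) == -1)
    return false;

  return true;
}
```

As far as I understand, the flag is inherited across exec, so a process
that wants a non-randomized layout for itself would set the flag and
then re-exec.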

> The key question for me is in this set of system calls:
>
>> 58215 personality(0xffffffff)           = 0 (PER_LINUX)
>> 58215 personality(PER_LINUX|ADDR_NO_RANDOMIZE) = 0 (PER_LINUX)
>> 58215 personality(0xffffffff)           = 0x40000 (PER_LINUX|ADDR_NO_RANDOMIZE)
>> 58215 brk(NULL)                         = 0x27070000
>> 58215 dup2(0, 0)                        = 0
>> 58215 dup2(1, 1)                        = 1
>> 58215 dup2(2, 2)                        = 2
>
> Surely the call to disable_address_randomization () must have returned
> true, but can you verify that, either via GDB or (shudder) by
> inserting print statements?

(I sorted out glibc source code and debug symbols so they'll be accurate
now).  Yes, disable_address_randomization returns true:

[...]
(gdb) finish
Run till exit from #0  0x0000000010136d9c in disable_address_randomization () at sysdep.c:165
0x0000000010016c94 in main (argc=<optimized out>, argv=0x7fffd4430178) at emacs.c:710
710       if (disable_aslr && disable_address_randomization ()
Value returned is $1 = true
[...]

> Also, the call from 'main' to getenv ("EMACS_HEAP_EXEC") must have
> returned NULL. Can you also verify this?

(gdb) c
Continuing.

Breakpoint 4, 0x00007fff9dc1ef98 in __GI_getenv (name=0x10274ce8 "EMACS_HEAP_EXEC") at getenv.c:34
34      {
(gdb) finish
Run till exit from #0  0x00007fff9dc1ef98 in __GI_getenv (name=0x10274ce8 "EMACS_HEAP_EXEC") at getenv.c:34
0x0000000010017870 in main (argc=<optimized out>, argv=0x7ffff4883248) at emacs.c:711
711           && !getenv ("EMACS_HEAP_EXEC"))
Value returned is $2 = 0x7ffff488fe49 "true"

Actually, EMACS_HEAP_EXEC is set to "true"!  If I unset it, then the
bootstrap works both with and without the "Fix bootstrap infloop in
GNU/Linux alpha" patch applied.

I'm building Emacs inside Emacs via M-x shell.  "EMACS_HEAP_EXEC=true"
is in process-environment.  Given that I'm also running EXWM, no matter
what build shell I start up, even an xterm, EMACS_HEAP_EXEC is set to
"true" in the environment.

Ah, by running the "outer" Emacs via a serial console (i.e., not from
within Emacs, and starting with EMACS_HEAP_EXEC unset in the
environment), I think I see what happened.  Because of the ifdef just
above the randomization disablement code:

# ifdef __PPC64__
  bool disable_aslr = true;
# else
  bool disable_aslr = dumping;
# endif

randomization is unconditionally disabled on PPC64, and so
EMACS_HEAP_EXEC is unconditionally set to true in the outer build
Emacs's initial-environment, from where every subprocess inherits it,
including the temacs being built.  With "Fix bootstrap infloop in
GNU/Linux alpha" applied, building Emacs within Emacs on PPC64 will no
longer work: getenv ("EMACS_HEAP_EXEC") is already non-NULL in temacs,
so the re-exec will be skipped during bootstrap.
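
To make the failure mode concrete, here is a small model of the
condition seen at emacs.c:710-711 in the GDB session above.  The
function should_reexec and its parameter names are hypothetical; only
the shape of the test (disable_aslr && disable_address_randomization ()
&& !getenv ("EMACS_HEAP_EXEC")) comes from the source lines GDB printed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pure model of the startup decision.
   disable_aslr       - true always on PPC64, otherwise only when dumping.
   aslr_now_disabled  - stands for disable_address_randomization ().
   heap_exec_env      - stands for getenv ("EMACS_HEAP_EXEC").  */
static bool
should_reexec (bool disable_aslr, bool aslr_now_disabled,
	       const char *heap_exec_env)
{
  return disable_aslr && aslr_now_disabled && heap_exec_env == NULL;
}
```

With a clean environment, should_reexec (true, true, NULL) is true and
the re-exec happens.  With the variable inherited from an outer Emacs,
should_reexec (true, true, "true") is false, which matches what I see
when building from M-x shell.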

Could you try building Emacs within Emacs on one of those CentOS
machines to confirm?

Thanks,
Thomas




