emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: MPS: Win64 testers?


From: Pip Cet
Subject: Re: MPS: Win64 testers?
Date: Tue, 03 Sep 2024 10:37:40 +0000

"Kien Nguyen" <kien.n.quang@gmail.com> writes:

> I got another two crashes related to mps.
> The backtrace is attached.  The emacs source code is here[1].
> Hope this helps.

Thank you, it does.  I found the bug, but fixing it is going to require
changes to your MPS patch series.

The disassembled code that's causing the problem reads:

   0x000000040013c950 <+1216>:  mov    (%r12),%rdx
   0x000000040013c954 <+1220>:  movq   %rax,%xmm1
   0x000000040013c959 <+1225>:  mov    0x278(%rdx),%rax
   0x000000040013c960 <+1232>:  sub    $0x1,%rbx
   0x000000040013c964 <+1236>:  mov    $0x3,%edx
   0x000000040013c969 <+1241>:  movq   (%rdi,%rbx,8),%xmm6
   0x000000040013c96e <+1246>:  mov    $0x18,%ecx
   0x000000040013c973 <+1251>:  mov    0x20(%rax),%r8
   0x000000040013c977 <+1255>:  punpcklqdq %xmm1,%xmm6
   0x000000040013c97b <+1259>:  call   0x4002061e0 <alloc_impl.lto_priv.0>
   0x000000040013c980 <+1264>:  movups %xmm6,0x8(%rax)
   0x000000040013c984 <+1268>:  add    $0x3,%rax
   0x000000040013c988 <+1272>:  test   %rbx,%rbx
   0x000000040013c98b <+1275>:  jne    0x40013c950 <Fmatch_data+1216>

During the call to <alloc_impl.lto_priv.0> at +1259, the future value of
the cons cell lives only in XMM registers (%xmm1 and %xmm6); while it
was returned from the previous iteration in %rax, that register is
overwritten by the mov at +1225.

This isn't a problem for the first iteration, when the cdr is Qnil,
which is safe to use, but after the first generation, this is the only
place that holds a reference to the list we're building.

alloc_impl can call GC, which will then fail to find a reference to the
list that's being built and collect it, but return to +1264 where the
reference to the freed cons cell is written back to memory.

IOW, we need to make sure that the callee-saved %xmm registers are
properly spilled to the stack and marked conservatively.

However, this patch:

https://github.com/kiennq/emacs-build/blob/main/patches/mps/0004-Fix-register-scanning-on-FreeBSD-and-Linux.patch

replaces setjmp() by a series of assembler statements:

#define STACK_CONTEXT_SAVE(sc)                                  \
  BEGIN                                                         \
    Word *_save = (sc)->calleeSave;                             \
    __asm__ volatile ("mov %%rbp, %0" : "=m" (_save[0]));       \
    __asm__ volatile ("mov %%rbx, %0" : "=m" (_save[1]));       \
    __asm__ volatile ("mov %%r12, %0" : "=m" (_save[2]));       \
    __asm__ volatile ("mov %%r13, %0" : "=m" (_save[3]));       \
    __asm__ volatile ("mov %%r14, %0" : "=m" (_save[4]));       \
    __asm__ volatile ("mov %%r15, %0" : "=m" (_save[5]));       \
  END

This assumes the SysV ABI is in use, which doesn't have callee-saved XMM
registers.  However, GCC (correctly) uses the Microsoft x64 ABI, which
also requires XMM registers %xmm6-%xmm15, as well as general registers
%rdi and %rsi, to be callee-saved
(https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention).

So this patch doesn't work for Windows 64 builds. I'm not entirely sure
that simply removing this patch will make things work, though, because I
don't know how Microsoft's UCRT implements setjmp.  MPS relies on it to
store the value of the callee-saved registers without any further
mangling, on the stack, in aligned words.

A safe (but very slightly slower) solution would be to simply store the
registers twice, once using setjmp() and once using the assembler
statements.  Would that be acceptable for your build?

Pip




reply via email to

[Prev in Thread] Current Thread [Next in Thread]