bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#28211: Stack marking issue in multi-threaded code


From: Mark H Weaver
Subject: bug#28211: Stack marking issue in multi-threaded code
Date: Fri, 29 Jun 2018 12:54:47 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Hi Ludovic,

address@hidden (Ludovic Courtès) writes:

> I have a fix for some (most?) of the crashes we were seeing while
> running multi-threaded code such as (guix build compile), and,
> presumably, the grafting code mentioned at the beginning of this bug
> report, although I haven’t checked yet.
>
> So, ‘scm_i_vm_mark_stack’ marks the stack precisely, but contrary to
> what I suspected, precise marking is not at fault.
>
> Instead, the problem has to do with the fact that some VM instructions
> change the frame pointer (vp->fp) before they have set up the dynamic
> link for that new frame.
>
> As a consequence, if a stop-the-world GC is triggered after vp->fp has
> been changed and before its dynamic link has been set, the stack-walking
> loop in ‘scm_i_vm_mark_stack’ could stop very early, leaving a lot of
> objects unmarked.
>
> The patch below fixes the problem for me.  \o/

That's great news!  Thanks for investigating.

> I’m thinking we could perhaps add a compiler barrier before ‘vp->fp = new_fp’
> statements, but in practice it’s not necessary here (x86_64, gcc 7).
>
> Thoughts?

Indeed, we need something to ensure that the compiler won't reorder
these writes.  My recommendation would be to use the 'volatile'
qualifier in the declarations of both the 'fp' field of 'struct scm_vm'
and the stack element modified by SCM_FRAME_SET_DYNAMIC_LINK.

Maybe something like this:

--8<---------------cut here---------------start------------->8---
diff --git a/libguile/frames.h b/libguile/frames.h
index ef2db3df5..8554f886b 100644
--- a/libguile/frames.h
+++ b/libguile/frames.h
@@ -89,6 +89,7 @@
 union scm_vm_stack_element
 {
   scm_t_uintptr as_uint;
+  volatile scm_t_uintptr as_volatile_uint;
   scm_t_uint32 *as_ip;
   SCM as_scm;
   double as_f64;
@@ -104,7 +105,7 @@ union scm_vm_stack_element
 #define SCM_FRAME_RETURN_ADDRESS(fp)    ((fp)[0].as_ip)
 #define SCM_FRAME_SET_RETURN_ADDRESS(fp, ra) ((fp)[0].as_ip = (ra))
 #define SCM_FRAME_DYNAMIC_LINK(fp)      ((fp) + (fp)[1].as_uint)
-#define SCM_FRAME_SET_DYNAMIC_LINK(fp, dl) ((fp)[1].as_uint = ((dl) - (fp)))
+#define SCM_FRAME_SET_DYNAMIC_LINK(fp, dl) ((fp)[1].as_volatile_uint = ((dl) - 
(fp)))
 #define SCM_FRAME_SLOT(fp,i)            ((fp) - (i) - 1)
 #define SCM_FRAME_LOCAL(fp,i)           (SCM_FRAME_SLOT (fp, i)->as_scm)
 #define SCM_FRAME_NUM_LOCALS(fp, sp)    ((fp) - (sp))
diff --git a/libguile/vm.h b/libguile/vm.h
index a1cac391f..0a6179c19 100644
--- a/libguile/vm.h
+++ b/libguile/vm.h
@@ -38,7 +38,7 @@ enum {
 struct scm_vm {
   scm_t_uint32 *ip;            /* instruction pointer */
   union scm_vm_stack_element *sp; /* stack pointer */
-  union scm_vm_stack_element *fp; /* frame pointer */
+  union scm_vm_stack_element *volatile fp; /* frame pointer */
   union scm_vm_stack_element *stack_limit; /* stack limit address */
   int trace_level;              /* traces enabled if trace_level > 0 */
   union scm_vm_stack_element *sp_min_since_gc; /* deepest sp since last gc */
--8<---------------cut here---------------end--------------->8---

> I’d like to push this real soon.  I’ll also do more testing on real
> workloads from Guix, and then I’d like to release 2.2.4, hopefully
> within a few days.

Sounds good to me.

     Thanks,
       Mark





reply via email to

[Prev in Thread] Current Thread [Next in Thread]