
bug#28211: Stack marking issue in multi-threaded code


From: Mark H Weaver
Subject: bug#28211: Stack marking issue in multi-threaded code
Date: Sat, 30 Jun 2018 17:49:03 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Hi Ludovic,

address@hidden (Ludovic Courtès) writes:

> Mark H Weaver <address@hidden> skribis:
>
>> address@hidden (Ludovic Courtès) writes:
>
> [...]
>
>>>> Indeed, we need something to ensure that the compiler won't reorder
>>>> these writes.  My recommendation would be to use the 'volatile'
>>>> qualifier in the declarations of both the 'fp' field of 'struct scm_vm'
>>>> and the stack element modified by SCM_FRAME_SET_DYNAMIC_LINK.
>>>>
>>>> Maybe something like this:
>>>>
>>>> diff --git a/libguile/frames.h b/libguile/frames.h
>>>> index ef2db3df5..8554f886b 100644
>>>> --- a/libguile/frames.h
>>>> +++ b/libguile/frames.h
>>>> @@ -89,6 +89,7 @@
>>>>  union scm_vm_stack_element
>>>>  {
>>>>    scm_t_uintptr as_uint;
>>>> +  volatile scm_t_uintptr as_volatile_uint;
>>>
>>> [...]
>>>
>>>> +#define SCM_FRAME_SET_DYNAMIC_LINK(fp, dl) ((fp)[1].as_volatile_uint = ((dl) - (fp)))
>>>
>>> I’m not sure this is exactly what we need.
>>>
>>> What I had in mind, to make sure the vp->fp assignment really happens
>>> after the SCM_FRAME_SET_DYNAMIC_LINK, was to do something like this:
>>>
>>>   SCM_FRAME_SET_RETURN_ADDRESS (new_fp, …);
>>>   SCM_FRAME_SET_DYNAMIC_LINK (new_fp, …);
>>>
>>>   asm volatile ("" : : : "memory");
>>>
>>>   vp->fp = new_fp;
>>>
>>> I think that more accurately reflects what we want, WDYT?
>>>
>>> It’s GCC-specific though (I don’t think Clang implements ‘asm’ in a
>>> compatible way, does it?) and I suppose we’d have to ignore the non-GCC
>>> case.
>>
>> Right, the problem is that the "asm volatile" solution is not portable.
>>
>> To my mind, it's fine to optionally use GCC extensions for improved
>> performance, but I think we should ensure that Guile works reliably when
>> compiled with any conforming C compiler.
>
> I agree, of course (that said, most compilers implement most GCC
> extensions nowadays, but ‘asm’ is probably an exception).
>
>> What problem do you see with my proposed portable solution?  If I
>> understand C99 section 5.1.2.3 paragraph 2 correctly, compilers are not
>> permitted to reorder accesses to volatile objects as long as there is a
>> sequence point between them.
>
> My understanding is that what you propose is (almost*) equivalent to the
> asm trick, provided SCM_FRAME_SET_DYNAMIC_LINK is the last statement
> before the vp->fp assignment.  So we’d have to make sure we keep things
> in this order, right?

I don't think it matters.  This would only ensure the relative ordering
of the SCM_FRAME_SET_DYNAMIC_LINK and the vp->fp assignment.  IIUC, the
non-volatile memory accesses could be reordered relative to either of
them.

The "asm volatile" memory barrier would be stronger, guaranteeing that
_all_ accesses before the barrier would be performed before _all_
accesses after the barrier, within that thread.
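
To make the difference concrete, here is a minimal stand-alone sketch;
the names ('stack_elt', 'vm', 'unrelated', and the two functions) are
illustrative stand-ins, not Guile's actual declarations:

  #include <stdint.h>

  /* Shapes mirroring the proposal above, for illustration only.  */
  union stack_elt { uintptr_t as_uint; volatile uintptr_t as_volatile_uint; };
  struct vm { union stack_elt *volatile fp; };

  static uintptr_t unrelated;   /* some ordinary, non-volatile object */

  void
  push_frame_volatile (struct vm *vp, union stack_elt *new_fp, uintptr_t dl)
  {
    new_fp[1].as_volatile_uint = dl;  /* volatile store #1 */
    unrelated = 42;                   /* may move past either volatile store */
    vp->fp = new_fp;                  /* volatile store #2; must follow #1 */
  }

  void
  push_frame_barrier (struct vm *vp, union stack_elt *new_fp, uintptr_t dl)
  {
    new_fp[1].as_uint = dl;
    unrelated = 42;
    asm volatile ("" : : : "memory"); /* GCC compiler barrier */
    vp->fp = new_fp;                  /* nothing above may be sunk below this */
  }

In the first function, C99 only guarantees that the two volatile stores
stay in order; the store to 'unrelated' can be hoisted or sunk past
either of them.  In the second, the barrier keeps every prior access,
volatile or not, before the final assignment -- at the compiler level
only; neither variant says anything about hardware reordering.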

> [*] The ‘volatile’ qualifier on the field does more than just an
>     instruction reordering barrier: AIUI, it forces the compiler to emit
>     a load and store instruction right at the assignment point, which is
>     a stricter constraint than what we need, I think.

I'm not sure what "right at the assignment point" means exactly, given
that the compiler is free to reorder things quite a bit.

In addition to the C99 standard which I cited in my previous email, you
might also find the following link informative:

  https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html

In particular:

  [...]  The minimum requirement is that at a sequence point all
  previous accesses to volatile objects have stabilized and no
  subsequent accesses have occurred.  Thus an implementation is free to
  reorder and combine volatile accesses that occur between sequence
  points, but cannot do so for accesses across a sequence point.  The
  use of volatile does not allow you to violate the restriction on
  updating objects multiple times between two sequence points.
  
  Accesses to non-volatile objects are not ordered with respect to
  volatile accesses.  You cannot use a volatile object as a memory
  barrier to order a sequence of writes to non-volatile memory.  For
  instance:
  
    int *ptr = something;
    volatile int vobj;
    *ptr = something;
    vobj = 1;
  
  Unless *ptr and vobj can be aliased, it is not guaranteed that the
  write to *ptr occurs by the time the update of vobj happens.  If you
  need this guarantee, you must use a stronger memory barrier [...]

>> I should say that I'm not confident that _either_ of these proposed
>> solutions will adequately address all of the possible problems that
>> could occur when GC is performed on VM threads stopped at arbitrary
>> points in their execution.
>
> Yeah, as discussed on IRC with Andy, we’d be better off if we were sure
> that each stack is marked by the thread it belongs to, in terms of data
> locality, and thus also in terms of being sure that vp->fp is up-to-date
> when the marker reads it.  It’s not something we can change now, though.

I'm not sure it matters which thread the marking is done in, because when
the actual collection happens, all threads are first stopped in their
signal handlers, and presumably the appropriate memory barriers are
performed so that all threads are synchronized before the full
collection.

Now, I'm aware that libgc does incremental marking as well, but I'm
optimistically hoping that they've done their homework to ensure that if
anything might have changed between the incremental marking and the full
collection, the relevant objects will be marked again before that
collection.  However, I admit that I don't know enough about libgc's
internals to know whether this is really the case.

What do you think?

> Alternately, we could use atomics when accessing vp->fp to ensure memory
> consistency (I tried that during my debugging journey…).  It could be
> costly.

This would be far too costly to consider, in my opinion.
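
(For concreteness, I take you to mean something along the lines of the
C11 sketch below; the 'frame' and 'vm' types are made-up stand-ins, not
Guile's actual declarations.)

  #include <stdatomic.h>
  #include <stdint.h>

  /* Illustrative stand-ins, not Guile's declarations.  */
  struct frame { uintptr_t slots[2]; };
  struct vm    { _Atomic(struct frame *) fp; };

  /* VM thread: fill in the frame, then publish it with release semantics,
     so a reader that sees the new fp also sees the slot writes.  */
  void
  publish_frame (struct vm *vp, struct frame *new_fp, uintptr_t dl)
  {
    new_fp->slots[1] = dl;
    atomic_store_explicit (&vp->fp, new_fp, memory_order_release);
  }

  /* Marking thread: the acquire load pairs with the release store above.  */
  struct frame *
  current_frame (struct vm *vp)
  {
    return atomic_load_explicit (&vp->fp, memory_order_acquire);
  }

On weakly-ordered machines each such release store implies a hardware
barrier on every update of vp->fp, which is presumably where most of the
cost would come from.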

> Anyway, I don’t think we’ll have the final word on all this before
> 2.2.4.  The way I see it we should keep working on improving it, but
> there are difficult choices to make, so it will probably take a bit of
> time.

Sounds good.

    Thanks,
      Mark




