
Re: [Qemu-devel] [PATCH v1 3/5] include/qemu/atomic.h: default to __atomic functions

From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v1 3/5] include/qemu/atomic.h: default to __atomic functions
Date: Mon, 4 Apr 2016 19:03:48 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 04/04/2016 18:26, Pranith Kumar wrote:
> Hi Paolo,
> On Mon, Apr 4, 2016 at 4:14 AM, Paolo Bonzini <address@hidden> wrote:
>> The issue is that atomic_thread_fence() only affects other atomic
>> operations, while smp_rmb() and smp_wmb() affect normal loads and stores
>> as well.
> That differs from my understanding of what atomic_thread_fence()
> does. AFAIK, it should order all loads and stores, not only atomic
> operations.
> Quoting
> http://www.inf.pucrs.br/~flash/progeng2/cppreference/w/cpp/atomic/atomic_thread_fence.html:
> "Establishes memory synchronization ordering of non-atomic and relaxed
> atomic accesses"

You can find some information at
https://bugzilla.redhat.com/show_bug.cgi?id=1142857.

I've looked at private email from that time and I was pointed to this
sentence in GCC's manual, which says the opposite:

    "Note that in the C++11 memory model, fences (e.g.,
    ‘__atomic_thread_fence’) take effect in combination with other
    atomic operations on specific memory locations (e.g., atomic loads);
    operations on specific memory locations do not necessarily affect
    other operations in the same way."

The __atomic fences are essentially a way to break a non-relaxed
atomic access into a relaxed atomic access plus an
ordering-constraining fence.  According to those emails, atomic
load-acquires and store-releases can provide synchronization for plain
loads and stores (not just for relaxed atomic loads and stores); an
atomic thread fence on its own, however, cannot.

>> In the GCC implementation, atomic operations (even relaxed ones) access
>> memory as if the pointer was volatile.  By doing this, GCC can remove
>> the acquire and release fences altogether on TSO architectures.  We
>> actually observed a case where the compiler subsequently inverted the
>> order of two writes around a smp_wmb().
> That is a GCC bug in that case. GCC should remove smp_wmb() only after
> the optimization passes have run in the codegen phase. Can you please
> point me to the mailing list thread/bug where this was discovered? Or
> is it easily reproducible by reverting that patch? In that case, the
> location of the bug would be helpful for analysing this further.

It depends on the compiler you're using.  With some luck, reverting the
patch will cause accesses across smp_wmb() or smp_rmb() to get
reordered.  In our case it was in thread-pool.c; Kevin Wolf did the

>> In principle it could do the same on architectures that are sequentially
>> consistent; even if none exists in practice, keeping the barriers for
>> smp_mb() is consistent with the other barriers.
> I understand one barrier(), but having two on both sides seems
> unnecessary. I would prefer we clean up and have just one. Although, I
> think it is just an annoyance since performance wise there is no
> difference. Two consecutive barrier()'s should behave just like one.

Yes, this is okay to change.

