qemu-devel



From: Emilio G. Cota
Subject: Re: [Qemu-devel] [RFC v5 0/6] Slow-path for atomic instruction translation
Date: Thu, 1 Oct 2015 15:32:36 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Sep 30, 2015 at 06:44:32 +0200, Paolo Bonzini wrote:
> I have a doubt about your patches for ll/sc emulation, that I hope you
> can clarify.
> 
> From 10000ft, both approaches rely on checking a flag during stores.
> This is split between the TLB and the CPUState for Alvise's patches (in
> order to exploit the existing fast-path checks), and entirely in the
> radix tree for Emilio's.  However, the idea is the same.

Not quite the same idea, see below.

> Now, the patches are okay for serial emulation, but I am not sure it's
> possible to do lock-free ll/sc emulation, because there is a race.
> 
> If we check the flag before the store, the race is as follows:
> 
>    CPU0        CPU1
>    -------------------------------------------------------
>    check flag
>                load locked:
>                   set flag
>                   load value (normal load on CPU)
>    store
>                store conditional (normal store on CPU)
> 
> where the sc doesn't fail.  For completeness, if we check it afterwards
> (which would be possible with Emilio's approach, though not for the
> TLB-based one):
> 
>    CPU0        CPU1
>    ------------------------------------------------------
>                load locked
>                   set bit
>                   load value (normal load on CPU)
>    store
>                store conditional (normal store on CPU)
>    check flag
> 
> and again the sc doesn't fail.

(snip)
> Tell me I'm wrong. :)

The difference between Alvise's implementation and what I submitted is
that in the radix tree, a cache line that has *ever* had an atomic op
performed on it is marked as "slow path" for the rest of the execution,
meaning that *all* subsequent stores to that cache line must take the
cache line entry's lock.

This does not close the race completely (e.g. a store and an atomic op
could execute concurrently on a line that has not yet had an atomic op
performed on it), but it significantly narrows the window. My guess is
that it would be very hard for practical programs to trigger.

                E.


