qemu-devel



From: Emilio G. Cota
Subject: Re: [Qemu-devel] [RFC v5 0/6] Slow-path for atomic instruction translation
Date: Thu, 1 Oct 2015 15:32:36 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Sep 30, 2015 at 06:44:32 +0200, Paolo Bonzini wrote:
> I have a doubt about your patches for ll/sc emulation, that I hope you
> can clarify.
> 
> From 10000ft, both approaches rely on checking a flag during stores.
> This is split between the TLB and the CPUState for Alvise's patches (in
> order to exploit the existing fast-path checks), and entirely in the
> radix tree for Emilio's.  However, the idea is the same.

Not quite the same idea, see below.

> Now, the patches are okay for serial emulation, but I am not sure it's
> possible to do lock-free ll/sc emulation, because there is a race.
> 
> If we check the flag before the store, the race is as follows:
> 
>    CPU0        CPU1
>    -------------------------------------------------------
>    check flag
>                load locked:
>                   set flag
>                   load value (normal load on CPU)
>    store
>                store conditional (normal store on CPU)
> 
> where the sc doesn't fail.  For completeness, if we check it afterwards
> (which would be possible with Emilio's approach, though not for the
> TLB-based one):
> 
>    CPU0        CPU1
>    ------------------------------------------------------
>                load locked
>                   set bit
>                   load value (normal load on CPU)
>    store
>                store conditional (normal store on CPU)
>    check flag
> 
> and again the sc doesn't fail.

(snip)
> Tell me I'm wrong. :)

The difference between Alvise's implementation and what I submitted is
that in the radix tree, a cache line that has *ever* had an atomic op
performed on it is marked as "slow path" for the rest of the execution,
meaning that *all* subsequent stores to that cache line must take the
cache line entry's lock.

This does not close the race completely (e.g. a store and an atomic op
could execute concurrently on a line that has not yet had an atomic op
performed on it), but it significantly narrows the window. My guess is
that it would be very hard for practical programs to trigger.

                E.


