
From: Sergey Fedorov
Subject: Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory barrier
Date: Mon, 6 Jun 2016 18:44:38 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

On 03/06/16 21:27, Pranith Kumar wrote:
> On Thu, Jun 2, 2016 at 5:18 PM, Richard Henderson <address@hidden> wrote:
>> Hum.  That does seem helpful-ish.  But I'm not certain how helpful it is to
>> complicate the helper functions even further.
>>
>> What if we have tcg_canonicalize_memop (or some such) split off the barriers
>> into separate opcodes.  E.g.
>>
>> MO_BAR_LD_B = 32        // prevent earlier loads from crossing current op
>> MO_BAR_ST_B = 64        // prevent earlier stores from crossing current op
>> MO_BAR_LD_A = 128       // prevent later loads from crossing current op
>> MO_BAR_ST_A = 256       // prevent later stores from crossing current op
>> MO_BAR_LDST_B = MO_BAR_LD_B | MO_BAR_ST_B
>> MO_BAR_LDST_A = MO_BAR_LD_A | MO_BAR_ST_A
>> MO_BAR_MASK = MO_BAR_LDST_B | MO_BAR_LDST_A
>>
>> // Match Sparc MEMBAR as the most flexible host.
>> TCG_BAR_LD_LD = 1       // #LoadLoad barrier
>> TCG_BAR_ST_LD = 2       // #StoreLoad barrier
>> TCG_BAR_LD_ST = 4       // #LoadStore barrier
>> TCG_BAR_ST_ST = 8       // #StoreStore barrier
>> TCG_BAR_SYNC  = 64      // SEQ_CST barrier
> I really like this format. I would also like to add to the frontend:
>
> MO_BAR_ACQUIRE
> MO_BAR_RELEASE
>
> and the following to the backend:
>
> TCG_BAR_ACQUIRE
> TCG_BAR_RELEASE
>
> since these are one-way barriers and the previous barrier types do not
> cover them.

Actually, the acquire barrier is a combined load-load + load-store
barrier, and the release barrier is a combination of load-store +
store-store barriers, so both can be expressed with the barrier types
already listed above.
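
To make the composition concrete, here is a minimal sketch in C. The four
basic bit values are the ones from the quoted proposal; defining
TCG_BAR_ACQUIRE and TCG_BAR_RELEASE as plain ORs of them is only an
illustration of the point above, not code from the patch series.

```c
#include <assert.h>   /* static_assert (C11) */

/* Basic barrier bits as listed in the quoted proposal (Sparc MEMBAR style). */
enum {
    TCG_BAR_LD_LD = 1,  /* #LoadLoad:   earlier loads  before later loads  */
    TCG_BAR_ST_LD = 2,  /* #StoreLoad:  earlier stores before later loads  */
    TCG_BAR_LD_ST = 4,  /* #LoadStore:  earlier loads  before later stores */
    TCG_BAR_ST_ST = 8,  /* #StoreStore: earlier stores before later stores */

    /* Assumption for illustration: one-way barriers as compositions. */
    TCG_BAR_ACQUIRE = TCG_BAR_LD_LD | TCG_BAR_LD_ST,
    TCG_BAR_RELEASE = TCG_BAR_LD_ST | TCG_BAR_ST_ST
};

/* Neither composition needs the store-load ordering. */
static_assert((TCG_BAR_ACQUIRE & TCG_BAR_ST_LD) == 0,
              "acquire does not include #StoreLoad");
static_assert((TCG_BAR_RELEASE & TCG_BAR_ST_LD) == 0,
              "release does not include #StoreLoad");
```

With these definitions a backend that only understands the four basic
bits needs no extra cases for acquire and release.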

Kind regards,
Sergey

>
>> where
>>
>>   tcg_gen_qemu_ld_i32(x, y, i, m | MO_BAR_LD_B | MO_BAR_ST_A)
>>
>> emits
>>
>>   mb            TCG_BAR_LD_LD
>>   qemu_ld_i32   x, y, i, m
>>   mb            TCG_BAR_LD_ST
>>
>> We can then add an optimization pass which folds barriers with no memory
>> operations in between, so that duplicates are eliminated.
>>
> Yes, folding/eliding these barriers in an optimization pass sounds
> like a good idea.
>
> Thanks,
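
The splitting and folding discussed in the quoted text could look roughly
like the following toy model. The MO_BAR_* and TCG_BAR_* values are taken
from the proposal; everything prefixed with toy_ or Toy (the op list, the
emit helper, the fold pass) is hypothetical and only stands in for the
real TCG op stream, which is structured differently.

```c
#include <assert.h>
#include <stddef.h>

enum { MO_BAR_LD_B = 32, MO_BAR_ST_B = 64, MO_BAR_LD_A = 128, MO_BAR_ST_A = 256 };
enum { TCG_BAR_LD_LD = 1, TCG_BAR_ST_LD = 2, TCG_BAR_LD_ST = 4, TCG_BAR_ST_ST = 8 };

/* Hypothetical flat op list, not the real TCG representation. */
typedef enum { OP_MB, OP_LD, OP_ST } ToyOpKind;
typedef struct { ToyOpKind kind; int bar; } ToyOp;

static size_t toy_push(ToyOp *out, size_t n, ToyOpKind kind, int bar)
{
    out[n].kind = kind;
    out[n].bar = bar;
    return n + 1;
}

/* Barrier to place before the access: orders earlier ops against it. */
static int toy_bar_before(int memop, int is_store)
{
    int bar = 0;
    if (memop & MO_BAR_LD_B) bar |= is_store ? TCG_BAR_LD_ST : TCG_BAR_LD_LD;
    if (memop & MO_BAR_ST_B) bar |= is_store ? TCG_BAR_ST_ST : TCG_BAR_ST_LD;
    return bar;
}

/* Barrier to place after the access: orders it against later ops. */
static int toy_bar_after(int memop, int is_store)
{
    int bar = 0;
    if (memop & MO_BAR_LD_A) bar |= is_store ? TCG_BAR_ST_LD : TCG_BAR_LD_LD;
    if (memop & MO_BAR_ST_A) bar |= is_store ? TCG_BAR_ST_ST : TCG_BAR_LD_ST;
    return bar;
}

/* Split one flagged access into [mb] access [mb]; returns the new length. */
static size_t toy_emit_memop(ToyOp *out, size_t n, int memop, int is_store)
{
    int before = toy_bar_before(memop, is_store);
    int after = toy_bar_after(memop, is_store);

    if (before) n = toy_push(out, n, OP_MB, before);
    n = toy_push(out, n, is_store ? OP_ST : OP_LD, 0);
    if (after) n = toy_push(out, n, OP_MB, after);
    return n;
}

/* Fold barriers with no memory op in between by ORing their bits. */
static size_t toy_fold_barriers(ToyOp *ops, size_t n)
{
    size_t w = 0;

    assert(ops != NULL || n == 0);
    for (size_t r = 0; r < n; r++) {
        if (ops[r].kind == OP_MB && w > 0 && ops[w - 1].kind == OP_MB) {
            ops[w - 1].bar |= ops[r].bar;   /* merge into previous barrier */
        } else {
            ops[w++] = ops[r];
        }
    }
    return w;
}
```

For a load flagged MO_BAR_LD_B | MO_BAR_ST_A this produces the
mb / qemu_ld / mb sequence from the quoted example. Folding by OR keeps
the merged barrier exactly as strong as the pair it replaces, since both
sat at the same point in the stream.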



