
Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory bar


From: Richard Henderson
Subject: Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory barrier
Date: Thu, 2 Jun 2016 14:18:04 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0

On 06/02/2016 01:38 PM, Sergey Fedorov wrote:
> On 02/06/16 23:36, Richard Henderson wrote:
>> On 06/02/2016 09:30 AM, Sergey Fedorov wrote:
>>> I think we need to extend TCG load/store instruction attributes to
>>> provide information about guest ordering requirements and leave this TCG
>>> operation only for explicit barrier instruction translation.
>>
>> I do not agree.  I think separate barriers are much cleaner and easier
>> to manage and reason with.
>
> How are we going to emulate strongly-ordered guests on weakly-ordered
> hosts then? I think if every load/store operation must specify which
> ordering it implies then this task would be quite simple.

Hum. That does seem helpful-ish. But I'm not certain how helpful it is to complicate the helper functions even further.

What if we have tcg_canonicalize_memop (or some such) split off the barriers into separate opcodes?  E.g.

MO_BAR_LD_B = 32        // prevent earlier loads from crossing current op
MO_BAR_ST_B = 64        // prevent earlier stores from crossing current op
MO_BAR_LD_A = 128       // prevent later loads from crossing current op
MO_BAR_ST_A = 256       // prevent later stores from crossing current op
MO_BAR_LDST_B = MO_BAR_LD_B | MO_BAR_ST_B
MO_BAR_LDST_A = MO_BAR_LD_A | MO_BAR_ST_A
MO_BAR_MASK = MO_BAR_LDST_B | MO_BAR_LDST_A

// Match Sparc MEMBAR as the most flexible host.
TCG_BAR_LD_LD = 1       // #LoadLoad barrier
TCG_BAR_ST_LD = 2       // #StoreLoad barrier
TCG_BAR_LD_ST = 4       // #LoadStore barrier
TCG_BAR_ST_ST = 8       // #StoreStore barrier
TCG_BAR_SYNC  = 64      // SEQ_CST barrier

where

  tcg_gen_qemu_ld_i32(x, y, i, m | MO_BAR_LD_B | MO_BAR_ST_A)

emits

  mb            TCG_BAR_LD_LD
  qemu_ld_i32   x, y, i, m
  mb            TCG_BAR_LD_ST
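
A rough sketch of how the split might look, in standalone C. The flag values are the ones listed above; SplitBarriers and split_memop_barriers are hypothetical names, not existing TCG code. Only the load case for MO_BAR_LD_B and MO_BAR_ST_A is spelled out in the example above; the other mappings (and the whole store case) are my assumption, derived from the flag comments.

```c
#include <stdbool.h>

/* Per-access ordering flags, values as proposed above. */
enum {
    MO_BAR_LD_B = 32,   /* earlier loads may not cross the current op */
    MO_BAR_ST_B = 64,   /* earlier stores may not cross the current op */
    MO_BAR_LD_A = 128,  /* later loads may not cross the current op */
    MO_BAR_ST_A = 256,  /* later stores may not cross the current op */
};

/* Barrier kinds, values as proposed above (Sparc MEMBAR style). */
enum {
    TCG_BAR_LD_LD = 1,
    TCG_BAR_ST_LD = 2,
    TCG_BAR_LD_ST = 4,
    TCG_BAR_ST_ST = 8,
};

/* Hypothetical result: barrier masks to emit around the access. */
typedef struct {
    int before;  /* mask for the mb emitted before the access, 0 = none */
    int after;   /* mask for the mb emitted after the access, 0 = none */
} SplitBarriers;

/*
 * Translate the MO_BAR_* flags of one access into explicit barrier masks.
 * The barrier kind depends on whether the access itself is a load or a
 * store (assumed mapping, see the lead-in).
 */
static SplitBarriers split_memop_barriers(int memop, bool is_store)
{
    SplitBarriers s = { 0, 0 };

    if (memop & MO_BAR_LD_B) {
        s.before |= is_store ? TCG_BAR_LD_ST : TCG_BAR_LD_LD;
    }
    if (memop & MO_BAR_ST_B) {
        s.before |= is_store ? TCG_BAR_ST_ST : TCG_BAR_ST_LD;
    }
    if (memop & MO_BAR_LD_A) {
        s.after |= is_store ? TCG_BAR_ST_LD : TCG_BAR_LD_LD;
    }
    if (memop & MO_BAR_ST_A) {
        s.after |= is_store ? TCG_BAR_ST_ST : TCG_BAR_LD_ST;
    }
    return s;
}
```

For the load above, split_memop_barriers(MO_BAR_LD_B | MO_BAR_ST_A, false) gives before = TCG_BAR_LD_LD and after = TCG_BAR_LD_ST, matching the emitted sequence.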

We can then add an optimization pass which folds barriers with no memory operations in between, so that duplicates are eliminated.
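
Such a pass could look roughly like the sketch below. It works on a toy op array rather than the real TCG op list; Op, OpKind and fold_barriers are hypothetical names for illustration. It assumes that ORing two barrier masks gives a barrier at least as strong as either one, and that a barrier may be merged across non-memory ops.

```c
#include <stddef.h>

/* Toy op stream; a stand-in for the real TCG op list. */
typedef enum { OP_MB, OP_QEMU_LD, OP_QEMU_ST, OP_OTHER } OpKind;

typedef struct {
    OpKind kind;
    int arg;    /* barrier mask for OP_MB, unused for other kinds */
} Op;

/*
 * Fold every barrier into the previous barrier when no guest memory
 * access lies between them.  The array is compacted in place and the
 * new number of ops is returned.
 */
static size_t fold_barriers(Op *ops, size_t n)
{
    size_t out = 0;        /* next write position */
    size_t pending = 0;    /* index of the last barrier kept */
    int have_pending = 0;  /* nonzero while that barrier can absorb more */

    for (size_t i = 0; i < n; i++) {
        Op op = ops[i];

        if (op.kind == OP_MB) {
            if (have_pending) {
                /* No memory op since the last barrier: merge and drop. */
                ops[pending].arg |= op.arg;
                continue;
            }
            pending = out;
            have_pending = 1;
        } else if (op.kind == OP_QEMU_LD || op.kind == OP_QEMU_ST) {
            /* A memory access separates barriers; stop merging. */
            have_pending = 0;
        }
        ops[out++] = op;
    }
    return out;
}
```

With this, a load with an "after" barrier followed directly by a load with a "before" barrier ends up with a single mb between the two accesses.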


r~


