Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory bar

From: Sergey Fedorov
Subject: Re: [Qemu-devel] [RFC v2 PATCH 01/13] Introduce TCGOpcode for memory barrier
Date: Fri, 3 Jun 2016 19:06:42 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

On 03/06/16 18:45, Richard Henderson wrote:
> On 06/03/2016 08:16 AM, Sergey Fedorov wrote:
>> On 03/06/16 04:08, Richard Henderson wrote:
>> So your suggestion is to generate different TCG opcode sequences
>> depending on the underlying target architecture? And you are against
>> forwarding this task further, to the backend code?
> Yes, I would prefer to have, in the opcode stream, a separate opcode
> for barriers.  This aids both the common case, where most of our hosts
> require separate barriers, as well as simplicity.
> I am not opposed to letting the translators describe the memory model
> with barrier data along with memory operations, but I'd really prefer
> that those be split apart during initial opcode generation.
>>>> So I would just focus on translating only explicit memory barrier
>>>> operations for now.
>>> Then why did you bring it up?
>> I'm not sure I got the question right. I suggested to avoid using this
>> TCG operation to emulate guest's memory ordering requirements for
>> loads/stores that can be supplied with memory ordering requirement
>> information which each backend can decide how to translate together with
>> the load/store (possibly just ignoring it, as is the case for
>> strongly-ordered hosts). I think we just need to translate explicit
>> memory barrier instructions.
>> For example, emulating ARM guest on x86 host requires ARM dmb
>> instruction to be translated to x86 mfence instruction to prevent
>> store-after-load reordering. At the same time, we don't have to generate
>> anything special for loads/stores since x86 is a strongly-ordered
>> architecture.
> Ah, so you'd prefer that we not think about optimizing barriers at the
> moment. Fine, but I'd prefer to think about *how* they might be
> optimized now, so that we *can* later.

Not exactly. We need a TCG operation covering the various types of
explicit barriers so that guest explicit barrier instructions can be
translated. I like your idea of following Sparc's approach, where the
membar instruction carries attributes that the backend can use to
generate optimal instructions. I think we also need to associate memory
ordering attributes with load/store TCG operations. I'm not sure how
best to handle the implicit memory ordering requirements of
loads/stores, but that is probably outside the scope of this series.
I'd propagate this attribute all the way to the backend and let it
decide what kind of instructions to generate. I'd prefer to see only
explicit barrier operations in the TCG opcode stream.
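
To make the idea concrete, here is a minimal sketch in C of a barrier
operation that carries Sparc-membar-style ordering attributes and is
lowered differently per host. Every name in it (the BAR_* bits, the
HOST_* results and both lower_for_*_host functions) is hypothetical and
invented for illustration; none of it is existing TCG API. It also
assumes ordinary cacheable memory on the strongly-ordered host.

```c
/* Hypothetical ordering bits, modelled on the Sparc membar mask idea:
 * each bit names one "earlier access -> later access" pair that must
 * not be reordered across the barrier. */
enum bar_mask {
    BAR_LD_LD = 1 << 0,   /* earlier load  vs. later load  */
    BAR_ST_LD = 1 << 1,   /* earlier store vs. later load  */
    BAR_LD_ST = 1 << 2,   /* earlier load  vs. later store */
    BAR_ST_ST = 1 << 3,   /* earlier store vs. later store */
    BAR_ALL   = BAR_LD_LD | BAR_ST_LD | BAR_LD_ST | BAR_ST_ST
};

/* What a (hypothetical) backend would emit for one barrier op. */
enum host_insn {
    HOST_NOTHING,         /* host ordering already guarantees the request */
    HOST_FULL_FENCE       /* e.g. mfence on an x86 host */
};

/* Strongly-ordered host (x86-like): only a later load passing an
 * earlier store can be observed, so only BAR_ST_LD needs a fence. */
enum host_insn lower_for_strong_host(unsigned mask)
{
    return (mask & BAR_ST_LD) ? HOST_FULL_FENCE : HOST_NOTHING;
}

/* Weakly-ordered host: conservatively emit a full fence for any
 * requested ordering. A real backend could pick cheaper instructions
 * for specific masks; this sketch does not attempt that. */
enum host_insn lower_for_weak_host(unsigned mask)
{
    return mask ? HOST_FULL_FENCE : HOST_NOTHING;
}
```

With these definitions, a guest full barrier such as ARM dmb would be
translated to a single barrier op with BAR_ALL, which the x86-like
lowering turns into a fence, while a store-store-only barrier would
cost nothing on that host.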

Kind regards,
