Re: [Qemu-devel] [PATCH] tcg-i386: Use MOVBE if available

qemu-devel

From:	Richard Henderson
Subject:	Re: [Qemu-devel] [PATCH] tcg-i386: Use MOVBE if available
Date:	Sun, 22 Dec 2013 08:38:40 -0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 12/22/2013 04:24 AM, Aurelien Jarno wrote:
> On Sat, Dec 21, 2013 at 03:08:21PM +0100, Paolo Bonzini wrote:
>> Il 21/12/2013 00:00, Richard Henderson ha scritto:
>>> +        if (real_bswap && have_movbe) {
>>> +            tcg_out_modrm_offset(s, OPC_MOVBE_GyMy + P_DATA16 + seg,
>>> +                                 datalo, base, ofs);
>>> +            tcg_out_ext16u(s, datalo, datalo);
>>
>> Do partial register stalls still exist on Atom and Haswell?  I don't
>> remember exactly what you had to do to prevent them, but IIRC you first
>> moved zero to the register and then overwrote the low 16 bits.
> 
> Note that for unsigned 16-bit load you can do either movzw + bswap or 
> movbe + movzw.

From the July 2013 Intel Opt Ref Manual,

"Delay of partial register stall is small in ... Intel Core and NetBurst
microarchitectures".  And for Atom "partial register access does not cause
additional delay".

While I agree with Paolo that xor + movbe is probably technically the best, one
has to check for output register overlap and have a fallback.  Thus I think we
can just discard that idea.
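
To make the overlap problem concrete, here is a rough C model of it.  This is
only a sketch and not QEMU code: regs[], mem[], movbe16() and the two ld16u_*
helpers are hypothetical names invented for illustration.  It assumes a 16-bit
movbe writes only the low 16 bits of its destination, and that the address is
formed from a base register plus an offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not QEMU code: regs[] stands in for host registers,
   mem[] for the memory addressed by regs[base] + ofs. */
static uint32_t regs[4];
static uint8_t mem[16] = { 0x12, 0x34 };

/* 16-bit movbe: byte-swapped load that writes the low 16 bits only. */
static void movbe16(int dst, int base, int ofs)
{
    uint32_t addr = regs[base] + (uint32_t)ofs;
    assert(addr + 1 < sizeof(mem));
    uint16_t v = (uint16_t)((mem[addr] << 8) | mem[addr + 1]);
    regs[dst] = (regs[dst] & 0xffff0000u) | v;
}

/* xor + movbe: only usable when dst does not alias the address register. */
static int ld16u_xor_movbe(int dst, int base, int ofs)
{
    if (dst == base) {
        return -1;          /* the xor would destroy the address */
    }
    regs[dst] = 0;          /* xor dst, dst */
    movbe16(dst, base, ofs);
    return 0;
}

/* movbe + movzw: works even when dst == base, so no overlap check. */
static void ld16u_movbe_movzw(int dst, int base, int ofs)
{
    movbe16(dst, base, ofs);
    regs[dst] &= 0xffffu;   /* movzwl dst16, dst32 */
}
```

In this model ld16u_xor_movbe(2, 2, 0) has to refuse and the caller falls back
to ld16u_movbe_movzw(), which is the extra check and fallback path mentioned
above.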

As for movzw + bswap, that forces a partial register stall on subsequent 32-bit
access to the value, while movbe + movzw does not.  In the latter case we refer
to the unmerged portion of the register in the movzw.
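
As a sanity check on the two orderings, a small C model follows.  It is a
sketch under stated assumptions, not the TCG backend: write16() and both
ld16u_* functions are hypothetical, the 16-bit "bswap" is modeled as a 16-bit
rotate by 8, and a 16-bit register write is assumed to leave bits 31..16
untouched.  It only shows that both sequences produce the same zero-extended
value and which of them ends on a partial write; it says nothing about timing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 16-bit register write keeps bits 31..16 as they were. */
static uint32_t write16(uint32_t reg, uint16_t v)
{
    return (reg & 0xffff0000u) | v;
}

/* movzw + bswap: full 32-bit write first, 16-bit (partial) write last. */
static uint32_t ld16u_movzw_bswap(const uint8_t *p, uint32_t stale)
{
    uint32_t r = stale;
    r = (uint32_t)p[0] | ((uint32_t)p[1] << 8);         /* movzwl (mem), r32 */
    uint16_t lo = (uint16_t)r;
    r = write16(r, (uint16_t)((lo << 8) | (lo >> 8)));  /* 16-bit rotate by 8 */
    return r;   /* a later 32-bit read follows a partial write */
}

/* movbe + movzw: 16-bit (partial) write first, full 32-bit write last. */
static uint32_t ld16u_movbe_movzw(const uint8_t *p, uint32_t stale)
{
    uint32_t r = stale;
    r = write16(r, (uint16_t)((p[0] << 8) | p[1]));     /* movbe (mem), r16 */
    r = r & 0xffffu;    /* movzwl r16, r32: reads only the 16 bits just written */
    return r;   /* a later 32-bit read follows a full write */
}
```

For the bytes 0x12 0x34 both functions return 0x1234 regardless of what was in
the register beforehand.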

But the optimization note suggests that it shouldn't matter much either way.

r~
