
Re: [Qemu-devel] [PATCH 15/15] tcg: use ext op for deposit


From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH 15/15] tcg: use ext op for deposit
Date: Sun, 10 Apr 2011 22:08:12 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

On Sun, Apr 10, 2011 at 09:25:33PM +0200, Alexander Graf wrote:
> 
> On 10.04.2011, at 21:23, Aurelien Jarno wrote:
> 
> > On Tue, Apr 05, 2011 at 09:55:09AM +0200, Alexander Graf wrote:
> >> 
> >> On 05.04.2011, at 06:54, Aurelien Jarno wrote:
> >> 
> >>> On Mon, Apr 04, 2011 at 04:32:24PM +0200, Alexander Graf wrote:
> >>>> With the s390x target we use the deposit instruction to store 32bit values
> >>>> into 64bit registers without clobbering the upper 32 bits.
> >>>> 
> >>>> This specific operation can be optimized slightly by using the ext operation
> >>>> instead of an explicit and in the deposit instruction. This patch adds that
> >>>> special case to the generic deposit implementation.
> >>>> 
> >>>> Signed-off-by: Alexander Graf <address@hidden>
> >>>> ---
> >>>> tcg/tcg-op.h |    6 +++++-
> >>>> 1 files changed, 5 insertions(+), 1 deletions(-)
> >>> 
> >>> Have you really measured a difference here? This should already be
> >>> handled, at least on x86, by this code:
> >>> 
> >>>       if (TCG_TARGET_REG_BITS == 64) {
> >>>           if (val == 0xffffffffu) {
> >>>               tcg_out_ext32u(s, r0, r0);
> >>>               return;
> >>>           }
> >>>           if (val == (uint32_t)val) {
> >>>               /* AND with no high bits set can use a 32-bit operation.  */
> >>>               rexw = 0;
> >>>           }
> >>>       }
> >> 
> >> I've certainly looked at the -d op logs and seen that instead of creating 
> >> a const tcg variable plus an AND there was now an extu opcode issued, yes. 
> >> No idea why the case up there didn't trigger.
> >> 
> > 
> > > The question is what -d out_asm shows. The output should be the same
> > > in the end, as the code I pasted above is from tcg/i386/tcg-target.c.
> 
> Yes. I was trying to optimize for maximum op length. TCG defines a maximum 
> number of tcg ops to be issued by each target instruction. Since s390 is very 
> CISCy, there are instructions that translate into lots of microops, but are 
> still faster than a C call (register save/restore mostly).
> 
> Without this patch, there are some places where we hit that number :).

Is it on 32-bit or 64-bit? If we reach this number, it's probably
better to either implement this instruction with a helper, or maybe
increase the maximum number of ops. What is this instruction?
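
For reference, here is a minimal standalone sketch of what the special case
under discussion amounts to. It is plain C, not TCG code, and the function
names (deposit64_generic, deposit64_low32_ext, check_low32_equivalence) are
made up for illustration. It only shows that, for a deposit at offset 0 with
length 32, masking the value with an AND against 0xffffffff gives the same
result as a 32-bit zero-extension, which is why the two forms are expected to
end up as the same host code.

```c
#include <assert.h>
#include <stdint.h>

/* Generic deposit: replace 'len' bits of 'dst' starting at bit 'ofs'
 * with the low 'len' bits of 'val'. The value is masked with an AND. */
static uint64_t deposit64_generic(uint64_t dst, uint64_t val,
                                  unsigned ofs, unsigned len)
{
    uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);

    return (dst & ~(mask << ofs)) | ((val & mask) << ofs);
}

/* Special case ofs == 0, len == 32: the AND with 0xffffffff on the
 * value is replaced by a 32-bit zero-extension (the "ext" form). */
static uint64_t deposit64_low32_ext(uint64_t dst, uint64_t val)
{
    uint64_t low = (uint32_t)val;   /* zero-extend the low 32 bits */

    return (dst & 0xffffffff00000000ULL) | low;
}

/* Both forms must agree for any input pair. */
static void check_low32_equivalence(uint64_t dst, uint64_t val)
{
    assert(deposit64_generic(dst, val, 0, 32) ==
           deposit64_low32_ext(dst, val));
}
```

The sketch says nothing about the op count, which is the actual concern
here; it only illustrates why the generated code should not differ.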

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
address@hidden                 http://www.aurel32.net
