qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 33/33] tcg-ppc64: Handle deposit of zero


From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH v4 33/33] tcg-ppc64: Handle deposit of zero
Date: Mon, 15 Apr 2013 10:14:03 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Apr 04, 2013 at 05:56:26PM -0500, Richard Henderson wrote:
> The TCG optimizer does great work when inserting constants, being able
> to fold the open-coded deposit expansion to just an AND or an OR.  Avoid
> a bit the regression caused by having the deposit opcode by expanding
> deposit of zero as an AND.
> 
> Signed-off-by: Richard Henderson <address@hidden>
> ---
>  tcg/ppc64/tcg-target.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c
> index 9583cf9..772f4ac 100644
> --- a/tcg/ppc64/tcg-target.c
> +++ b/tcg/ppc64/tcg-target.c
> @@ -1928,12 +1928,22 @@ static void tcg_out_op (TCGContext *s, TCGOpcode opc, 
> const TCGArg *args,
>          break;
>  
>      case INDEX_op_deposit_i32:
> -        tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3],
> -                    32 - args[3] - args[4], 31 - args[3]);
> +        if (const_args[2]) {
> +            uint32_t mask = ((2u << (args[4] - 1)) - 1) << args[3];
> +            tcg_out_andi32(s, args[0], args[0], ~mask);
> +        } else {
> +            tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3],
> +                        32 - args[3] - args[4], 31 - args[3]);
> +        }
>          break;
>      case INDEX_op_deposit_i64:
> -        tcg_out_rld(s, RLDIMI, args[0], args[2], args[3],
> -                    64 - args[3] - args[4]);
> +        if (const_args[2]) {
> +            uint64_t mask = ((2ull << (args[4] - 1)) - 1) << args[3];
> +            tcg_out_andi64(s, args[0], args[0], ~mask);
> +        } else {
> +            tcg_out_rld(s, RLDIMI, args[0], args[2], args[3],
> +                        64 - args[3] - args[4]);
> +        }
>          break;
>  
>      case INDEX_op_movcond_i32:
> @@ -2136,8 +2146,8 @@ static const TCGTargetOpDef ppc_op_defs[] = {
>      { INDEX_op_bswap32_i64, { "r", "r" } },
>      { INDEX_op_bswap64_i64, { "r", "r" } },
>  
> -    { INDEX_op_deposit_i32, { "r", "0", "r" } },
> -    { INDEX_op_deposit_i64, { "r", "0", "r" } },
> +    { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
> +    { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
>  
>      { INDEX_op_add2_i64, { "r", "r", "r", "rI", "r", "rZM" } },
>      { INDEX_op_sub2_i64, { "r", "r", "rI", "r", "rZM", "r" } },

I first thought this should go into the middle end, but OTOH it will
de-optimize some TCG targets which have a zero register like MIPS.

In the long term I think we should allow the middle end to query the
backend for constraints, and take the right decision (which would also
improve constant propagation on some hosts). In the meantime it looks
like the right thing to do, so:

Reviewed-by: Aurelien Jarno <address@hidden>

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
address@hidden                 http://www.aurel32.net



reply via email to

[Prev in Thread] Current Thread [Next in Thread]