From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH 2/7] tcg: Optimize add2 + sub2
Date: Fri, 28 Sep 2012 01:20:15 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Sep 27, 2012 at 10:19:52AM -0700, Richard Henderson wrote:
> We can't do complete constant folding because we lack "mov2",
> or the ability to insert opcodes in the stream. But we can
> at least canonicalize add2 operand ordering and simplify
> add2 to add when the lowpart adds a constant 0.
>
> Signed-off-by: Richard Henderson <address@hidden>
> ---
> tcg/optimize.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 55f2a24..004c336 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -470,6 +470,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>              if (swap_commutative(args[0], &args[4], &args[3])) {
>                  args[5] = tcg_invert_cond(args[5]);
>              }
> +            break;
> +        case INDEX_op_add2_i32:
> +            swap_commutative(args[0], &args[2], &args[4]);
> +            swap_commutative(args[1], &args[3], &args[5]);
> +            break;
>          default:
>              break;
>          }
> @@ -522,6 +527,32 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>                  continue;
>              }
>              break;
> +        case INDEX_op_add2_i32:
> +        case INDEX_op_sub2_i32:
> +            /* Simplify op rl, rh, al, ah, 0, bh => op rh, ah, bh.
> +               The zero implies there will be no carry into the high part.
> +               But only when rl == al, since we can't insert the extra move
> +               that would be required.  */
> +            if (temps[args[4]].state == TCG_TEMP_CONST
> +                && temps[args[4]].val == 0
> +                && temps_are_copies(args[0], args[2])) {
> +                if (temps[args[5]].state == TCG_TEMP_CONST
> +                    && temps[args[5]].val == 0
> +                    && temps_are_copies(args[1], args[3])) {
> +                    gen_opc_buf[op_index] = INDEX_op_nop;
> +                } else {
> +                    gen_opc_buf[op_index] = (op == INDEX_op_add2_i32
> +                                             ? INDEX_op_add_i32
> +                                             : INDEX_op_sub_i32);
> +                    args[0] = args[1];
> +                    args[1] = args[3];
> +                    args[2] = args[5];
> +                    gen_args += 3;
> +                }
> +                args += 6;
> +                continue;
> +            }
> +            break;
>          default:
>              break;
>          }
> --
> 1.7.11.4
>
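For reference, the identity behind the simplification can be checked in
isolation. The following is only a sketch in plain C, not TCG code: the
helpers add2_u32, sub2_u32 and lowpart_zero_identity_holds are hypothetical
names made up for illustration, and they assume add2/sub2 compute a 64-bit
add/subtract on (high:low) pairs of 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of add2: (rh:rl) = (ah:al) + (bh:bl). */
static void add2_u32(uint32_t *rl, uint32_t *rh,
                     uint32_t al, uint32_t ah, uint32_t bl, uint32_t bh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;
    uint64_t r = a + b;                 /* wraps modulo 2^64 */
    *rl = (uint32_t)r;
    *rh = (uint32_t)(r >> 32);
}

/* Hypothetical model of sub2: (rh:rl) = (ah:al) - (bh:bl). */
static void sub2_u32(uint32_t *rl, uint32_t *rh,
                     uint32_t al, uint32_t ah, uint32_t bl, uint32_t bh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;
    uint64_t r = a - b;                 /* wraps modulo 2^64 */
    *rl = (uint32_t)r;
    *rh = (uint32_t)(r >> 32);
}

/* Returns 1 if, with bl == 0, the low part is unchanged and the high
   part equals a plain 32-bit add (resp. sub) of ah and bh.  */
static int lowpart_zero_identity_holds(uint32_t al, uint32_t ah, uint32_t bh)
{
    uint32_t rl, rh;

    add2_u32(&rl, &rh, al, ah, 0, bh);
    if (rl != al || rh != (uint32_t)(ah + bh)) {
        return 0;
    }
    sub2_u32(&rl, &rh, al, ah, 0, bh);
    if (rl != al || rh != (uint32_t)(ah - bh)) {
        return 0;
    }
    return 1;
}

/* Walks a few edge values, including ones that wrap the high part. */
static void check_edge_cases(void)
{
    static const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    unsigned i, j, k;

    for (i = 0; i < 5; i++) {
        for (j = 0; j < 5; j++) {
            for (k = 0; k < 5; k++) {
                assert(lowpart_zero_identity_holds(v[i], v[j], v[k]));
            }
        }
    }
}
```

Because the low result equals al, nothing needs to be written to rl when
rl and al are already copies of each other, which is the case the patch
restricts itself to.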
I understand that the rl == al restriction exists because we can't
easily insert an instruction, but does this case really occur often?
Every optimization has a CPU cost of its own, so if the pattern rarely
matches, checking for it might cost more than it saves.
--
Aurelien Jarno GPG: 1024D/F1BCDB73
address@hidden http://www.aurel32.net