[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [PATCH v3 09/10] target/i386: optimize indirect branches
From: |
Emilio G. Cota |
Subject: |
[Qemu-devel] [PATCH v3 09/10] target/i386: optimize indirect branches |
Date: |
Wed, 26 Apr 2017 02:23:22 -0400 |
Speed up indirect branches by jumping to the target if it is valid.
Softmmu measurements (see later commit for user-mode numbers):
Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.
- SPECint06 (test set), x86_64-softmmu (Ubuntu 16.04 guest).
Host: Intel i7-4790K @ 4.00GHz
2.4x
+-+--------------------------------------------------------------------------------------------------------------+-+
|
|
| cross
|
2.2x
+cross+jr..........................................................................+++...........................+-+
|
| |
|
+++ | |
2x
+-+..............................................................................|..|............................+-+
|
| | |
|
| | |
1.8x
+-+..............................................................................|####...........................+-+
|
|# |# |
|
**** |# |
1.6x
+-+............................................................................*.|*.|#...........................+-+
|
* |* |# |
|
* |* |# |
1.4x
+-+.......................................................................+++..*.|*.|#...........................+-+
| ++++++
#### * |*++# +++ |
| +++ | |
#++# *++* # +++ | |
1.2x
+-+......................###.....####....+++............|..|...........****..#.*..*..#....####...|.###.....####..+-+
| +++ **** # **** # #### ***###
*++* # * * # #++# ****|# +++#++# |
| ****### +++ *++* # *++* # ++# # #### *|* |# +++ *
* # * * # *** # *| *|# **** # |
1x
+-++-*++*++#++***###++*++*+#++*+-*++#+****++#++***++#+-*+*++#-+****##++*++*-+#+*++*-+#++*+*++#++*-+*+#++*++*++#-++-+
| * * # * * # * * # * * # * * # * * # *|* |# *++* # *
* # * * # * * # * * # * * # |
| * * # * * # * * # * * # * * # * * # *+*++# * * # *
* # * * # * * # * * # * * # |
0.8x
+-+--****###--***###--****##--****###-****###--***###--***###--****##--****###-****###--***###--****##--****###--+-+
astar bzip2 gcc gobmk h264ref hmmlibquantum mcf
omnetpperlbench sjengxalancbmk hmean
png: http://imgur.com/DU36YFU
NB. 'cross' represents the previous commit.
Signed-off-by: Emilio G. Cota <address@hidden>
---
target/i386/translate.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/target/i386/translate.c b/target/i386/translate.c
index a065271..3bd80c0 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4988,7 +4988,7 @@ static target_ulong disas_insn(CPUX86State *env,
DisasContext *s,
gen_push_v(s, cpu_T1);
gen_op_jmp_v(cpu_T0);
gen_bnd_jmp(s);
- gen_eob(s);
+ gen_jr(s, cpu_T0);
break;
case 3: /* lcall Ev */
gen_op_ld_v(s, ot, cpu_T1, cpu_A0);
@@ -5006,7 +5006,8 @@ static target_ulong disas_insn(CPUX86State *env,
DisasContext *s,
tcg_const_i32(dflag - 1),
tcg_const_i32(s->pc - s->cs_base));
}
- gen_eob(s);
+ tcg_gen_ld_tl(cpu_tmp4, cpu_env, offsetof(CPUX86State, eip));
+ gen_jr(s, cpu_tmp4);
break;
case 4: /* jmp Ev */
if (dflag == MO_16) {
@@ -5014,7 +5015,7 @@ static target_ulong disas_insn(CPUX86State *env,
DisasContext *s,
}
gen_op_jmp_v(cpu_T0);
gen_bnd_jmp(s);
- gen_eob(s);
+ gen_jr(s, cpu_T0);
break;
case 5: /* ljmp Ev */
gen_op_ld_v(s, ot, cpu_T1, cpu_A0);
@@ -5029,7 +5030,8 @@ static target_ulong disas_insn(CPUX86State *env,
DisasContext *s,
gen_op_movl_seg_T0_vm(R_CS);
gen_op_jmp_v(cpu_T1);
}
- gen_eob(s);
+ tcg_gen_ld_tl(cpu_tmp4, cpu_env, offsetof(CPUX86State, eip));
+ gen_jr(s, cpu_tmp4);
break;
case 6: /* push Ev */
gen_push_v(s, cpu_T0);
@@ -6409,7 +6411,7 @@ static target_ulong disas_insn(CPUX86State *env,
DisasContext *s,
/* Note that gen_pop_T0 uses a zero-extending load. */
gen_op_jmp_v(cpu_T0);
gen_bnd_jmp(s);
- gen_eob(s);
+ gen_jr(s, cpu_T0);
break;
case 0xc3: /* ret */
ot = gen_pop_T0(s);
@@ -6417,7 +6419,7 @@ static target_ulong disas_insn(CPUX86State *env,
DisasContext *s,
/* Note that gen_pop_T0 uses a zero-extending load. */
gen_op_jmp_v(cpu_T0);
gen_bnd_jmp(s);
- gen_eob(s);
+ gen_jr(s, cpu_T0);
break;
case 0xca: /* lret im */
val = cpu_ldsw_code(env, s->pc);
--
2.7.4
- Re: [Qemu-devel] [PATCH v3 02/10] tcg: introduce goto_ptr opcode, (continued)
- [Qemu-devel] [PATCH v3 05/10] target/arm: optimize cross-page direct jumps in softmmu, Emilio G. Cota, 2017/04/26
- [Qemu-devel] [PATCH v3 08/10] target/i386: optimize cross-page direct jumps in softmmu, Emilio G. Cota, 2017/04/26
- [Qemu-devel] [PATCH v3 04/10] tcg/i386: implement goto_ptr op, Emilio G. Cota, 2017/04/26
- [Qemu-devel] [PATCH v3 07/10] target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr, Emilio G. Cota, 2017/04/26
- [Qemu-devel] [PATCH v3 09/10] target/i386: optimize indirect branches,
Emilio G. Cota <=
- [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Emilio G. Cota, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Paolo Bonzini, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Richard Henderson, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Emilio G. Cota, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Richard Henderson, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Emilio G. Cota, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Emilio G. Cota, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Emilio G. Cota, 2017/04/26
- Re: [Qemu-devel] [PATCH v3 01/10] tcg-runtime: add lookup_tb_ptr helper, Emilio G. Cota, 2017/04/26