From: Richard Henderson
Subject: [Qemu-devel] [PATCH 14/17] tcg/i386: use softmmu fast path for unaligned accesses
Date: Mon, 17 Aug 2015 12:38:37 -0700
From: Aurelien Jarno <address@hidden>
Softmmu unaligned loads/stores currently go through the slow path
for two reasons:
- to support unaligned accesses on hosts with strict alignment
- to correctly handle accesses that cross pages
x86 is only concerned by the second reason. Unaligned accesses are
avoided by compilers, but are not uncommon. We therefore would like
to see them go through the fast path when they don't cross pages.
For that we can use the fact that two adjacent TLB entries can't
contain the same page. Therefore accessing the TLB entry corresponding
to the first byte, but comparing its content to the page address of
the last byte, ensures that we don't cross pages. We can do this check
without adding more instructions to the TLB code (though the sequence
grows by one byte) by using the LEA instruction to combine the
existing move with the size addition.
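
As a rough illustration of the unaligned case, here is a minimal C
sketch; TARGET_PAGE_BITS/TARGET_PAGE_MASK use illustrative values
(4 KiB pages) and stays_in_one_page is a hypothetical helper, not a
QEMU function:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-ins for the guest page geometry: 4 KiB pages. */
#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_MASK (~(((uint64_t)1 << TARGET_PAGE_BITS) - 1))

/* Compute the page address of the last byte of the access
   (addr + s_mask) and compare it with the page address of the first
   byte, which is what the TLB entry was looked up with.  They differ
   exactly when the access crosses a page boundary, in which case the
   TLB compare fails and we fall back to the slow path. */
static bool stays_in_one_page(uint64_t addr, int size_bytes)
{
    uint64_t s_mask = size_bytes - 1;   /* e.g. 3 for a 4-byte access */
    return (addr & TARGET_PAGE_MASK) == ((addr + s_mask) & TARGET_PAGE_MASK);
}

int main(void)
{
    printf("%d\n", stays_in_one_page(0x1ffc, 4)); /* 1: bytes 0x1ffc..0x1fff */
    printf("%d\n", stays_in_one_page(0x1ffd, 4)); /* 0: crosses into 0x2000 */
    return 0;
}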
On an x86-64 host, this gives a 3% boot time improvement for a powerpc
guest and 4% for an x86-64 guest.
[rth: Tidied calculation of the offset mask]
Signed-off-by: Aurelien Jarno <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
---
tcg/i386/tcg-target.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
index 7648f7e..ff55499 100644
--- a/tcg/i386/tcg-target.c
+++ b/tcg/i386/tcg-target.c
@@ -1172,7 +1172,7 @@ static void * const qemu_st_helpers[16] = {
    First argument register is clobbered.  */
 
 static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
-                                    int mem_index, TCGMemOp s_bits,
+                                    int mem_index, TCGMemOp opc,
                                     tcg_insn_unit **label_ptr, int which)
 {
     const TCGReg r0 = TCG_REG_L0;
@@ -1180,6 +1180,8 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     TCGType ttype = TCG_TYPE_I32;
     TCGType htype = TCG_TYPE_I32;
     int trexw = 0, hrexw = 0;
+    int s_mask = (1 << (opc & MO_SIZE)) - 1;
+    bool aligned = (opc & MO_AMASK) == MO_ALIGN || s_mask == 0;
 
     if (TCG_TARGET_REG_BITS == 64) {
         if (TARGET_LONG_BITS == 64) {
@@ -1193,13 +1195,19 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     }
 
     tcg_out_mov(s, htype, r0, addrlo);
-    tcg_out_mov(s, ttype, r1, addrlo);
+    if (aligned) {
+        tcg_out_mov(s, ttype, r1, addrlo);
+    } else {
+        /* For unaligned access check that we don't cross pages using
+           the page address of the last byte.  */
+        tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask);
+    }
 
     tcg_out_shifti(s, SHIFT_SHR + hrexw, r0,
                    TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
     tgen_arithi(s, ARITH_AND + trexw, r1,
-                TARGET_PAGE_MASK | ((1 << s_bits) - 1), 0);
+                TARGET_PAGE_MASK | (aligned ? s_mask : 0), 0);
     tgen_arithi(s, ARITH_AND + hrexw, r0,
                 (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS, 0);
 
@@ -1545,7 +1553,6 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     TCGMemOp opc;
 #if defined(CONFIG_SOFTMMU)
     int mem_index;
-    TCGMemOp s_bits;
     tcg_insn_unit *label_ptr[2];
 #endif
 
@@ -1558,9 +1565,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 
 #if defined(CONFIG_SOFTMMU)
     mem_index = get_mmuidx(oi);
-    s_bits = opc & MO_SIZE;
 
-    tcg_out_tlb_load(s, addrlo, addrhi, mem_index, s_bits,
+    tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
                      label_ptr, offsetof(CPUTLBEntry, addr_read));
 
     /* TLB Hit.  */
@@ -1687,7 +1693,6 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     TCGMemOp opc;
 #if defined(CONFIG_SOFTMMU)
     int mem_index;
-    TCGMemOp s_bits;
     tcg_insn_unit *label_ptr[2];
 #endif
 
@@ -1700,9 +1705,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 
 #if defined(CONFIG_SOFTMMU)
     mem_index = get_mmuidx(oi);
-    s_bits = opc & MO_SIZE;
 
-    tcg_out_tlb_load(s, addrlo, addrhi, mem_index, s_bits,
+    tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
                      label_ptr, offsetof(CPUTLBEntry, addr_write));
 
     /* TLB Hit.  */
--
2.4.3
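
The single AND that tcg_out_tlb_load emits for the compare value
covers both policies with one mask. A hedged C sketch of that
computation (tlb_compare_value is an illustrative helper, not a QEMU
function; page constants are illustrative as in the earlier sketch):

#include <stdbool.h>
#include <stdint.h>

#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_MASK (~(((uint64_t)1 << TARGET_PAGE_BITS) - 1))

/* Value compared against the page-aligned tag stored in the TLB entry:
   - alignment-required ops keep the low size bits in the mask, so a
     misaligned address leaves stray low bits set and the compare fails;
   - unaligned-permitted ops start from addr + s_mask (the LEA in the
     patch), so the compare fails exactly when the access crosses a
     page boundary. */
static uint64_t tlb_compare_value(uint64_t addr, int size_bytes, bool aligned)
{
    uint64_t s_mask = size_bytes - 1;
    uint64_t base = aligned ? addr : addr + s_mask;
    return base & (TARGET_PAGE_MASK | (aligned ? s_mask : 0));
}

A fast-path hit then corresponds to tlb_compare_value(addr, size,
aligned) matching the page address the TLB entry was filled with,
i.e. addr & TARGET_PAGE_MASK for the first byte of the access.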