[Qemu-devel] [PATCH] Huge TLB performance improvement
From: Thiemo Seufer
Subject: [Qemu-devel] [PATCH] Huge TLB performance improvement
Date: Mon, 6 Mar 2006 14:59:29 +0000
User-agent: Mutt/1.5.11+cvs20060126
Hello All,
this patch vastly improves TLB performance on MIPS, and probably also
on other architectures. I measured a Linux boot-shutdown cycle,
including userland init.
With minimal jump cache invalidation:

real    11m43.429s
user    9m51.975s
sys     0m1.375s

  %   cumulative   self              self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
64.19   1476.81   1476.81  20551904     0.00     0.00  tlb_flush_page
 6.72   1631.36    154.55    184346     0.00     0.00  cpu_mips_exec
 4.35   1731.46    100.10   3550500     0.00     0.00  dyngen_code
 3.66   1815.77     84.31  90897893     0.00     0.00  decode_opc
 2.89   1882.21     66.44  11170487     0.00     0.00  gen_intermediate_code_internal
 1.72   1921.80     39.59  29919267     0.00     0.00  map_address
 1.52   1956.66     34.86   7619987     0.00     0.00  tb_find_pc
 0.96   1978.85     22.19  26361969     0.00     0.00  tlb_set_page_exec
 0.96   2000.84     21.99                              __ldl_mmu
 0.90   2021.59     20.75  27279747     0.00     0.00  gen_arith_imm
With global jump cache kill:

real    6m19.811s
user    4m23.650s
sys     0m0.617s

  %   cumulative   self              self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
21.67    188.78    188.78    146571     0.00     0.00  cpu_mips_exec
11.37    287.88     99.10   3393051     0.00     0.00  dyngen_code
 9.59    371.45     83.57  89839869     0.00     0.00  decode_opc
 7.68    438.33     66.88  10989930     0.00     0.00  gen_intermediate_code_internal
 4.24    475.26     36.93  30124659     0.00     0.00  map_address
 3.80    508.33     33.07   7596879     0.00     0.00  tb_find_pc
 2.74    532.22     23.89  27781692     0.00     0.00  tlb_set_page_exec
 2.62    555.02     22.80  39891573     0.00     0.00  cpu_mips_handle_mmu_fault
 2.55    577.25     22.23                              __ldl_mmu
 2.30    597.26     20.01  26968709     0.00     0.00  gen_arith_imm
Thiemo
Index: qemu-work/exec.c
===================================================================
--- qemu-work.orig/exec.c 2006-03-06 01:30:09.000000000 +0000
+++ qemu-work/exec.c 2006-03-06 01:30:28.000000000 +0000
@@ -1247,7 +1247,6 @@
 void tlb_flush_page(CPUState *env, target_ulong addr)
 {
     int i;
-    TranslationBlock *tb;
 
 #if defined(DEBUG_TLB)
     printf("tlb_flush_page: " TARGET_FMT_lx "\n", addr);
@@ -1261,14 +1260,10 @@
     tlb_flush_entry(&env->tlb_table[0][i], addr);
     tlb_flush_entry(&env->tlb_table[1][i], addr);
 
-    for(i = 0; i < TB_JMP_CACHE_SIZE; i++) {
-        tb = env->tb_jmp_cache[i];
-        if (tb &&
-            ((tb->pc & TARGET_PAGE_MASK) == addr ||
-             ((tb->pc + tb->size - 1) & TARGET_PAGE_MASK) == addr)) {
-            env->tb_jmp_cache[i] = NULL;
-        }
-    }
+    /* We throw away the jump cache altogether. This is cheaper than
+       trying to be smart by invalidating only the entries in the
+       affected address range. */
+    memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
 
 #if !defined(CONFIG_SOFTMMU)
     if (addr < MMAP_AREA_END)