[Qemu-devel] [PATCH 0/4] target/arm: Reduce overhead of cpu_get_tb_cpu_s

From: Richard Henderson
Date: Wed, 13 Feb 2019 20:06:48 -0800

We've talked about this before, caching state to reduce the
amount of computation that happens looking up each TB.

I know that Peter has been concerned that we would not be able to
reliably maintain all of the places that need to be updated to
keep this up-to-date.

Well, modulo dirty tricks within linux-user, it appears that
rebuilding the cached state at exception delivery and return,
plus after every TB-ending write to a system register, is
sufficient.
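
For anyone who wants the shape of the idea without reading the
diffs, here is a rough standalone sketch. It is a toy model and
not the code from the series: ToyCPU, the toy_* functions and the
flag layout are all made up for illustration. It only shows the
pattern of recomputing a cached word at the few places where its
inputs change, then checking it in the lookup path.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the CPU state; all names here are hypothetical. */
typedef struct ToyCPU {
    uint32_t el;      /* current exception level */
    uint32_t sctlr;   /* a system register that feeds the flags */
    uint32_t hflags;  /* cached flags derived from the fields above */
} ToyCPU;

/* The derivation that would otherwise run on every TB lookup. */
static uint32_t toy_compute_hflags(const ToyCPU *cpu)
{
    return (cpu->el << 4) | (cpu->sctlr & 0xf);
}

/* Refresh the cache; call wherever an input may have changed. */
static void toy_rebuild_hflags(ToyCPU *cpu)
{
    cpu->hflags = toy_compute_hflags(cpu);
}

/* Exception delivery or return: the EL changes, so rebuild. */
static void toy_exception_entry(ToyCPU *cpu, uint32_t target_el)
{
    cpu->el = target_el;
    toy_rebuild_hflags(cpu);
}

/* TB-ending system register write: rebuild after the store. */
static void toy_write_sctlr(ToyCPU *cpu, uint32_t value)
{
    cpu->sctlr = value;
    toy_rebuild_hflags(cpu);
}

/*
 * Lookup path: return the cached word.  The assert recomputes the
 * value to catch a missed rebuild point; once the rebuild points
 * are trusted it can be dropped and only the cached read remains.
 */
static uint32_t toy_get_tb_flags(const ToyCPU *cpu)
{
    assert(cpu->hflags == toy_compute_hflags(cpu));
    return cpu->hflags;
}
```

If a rebuild call were missing from toy_write_sctlr, the assert
would fire on the next lookup, which is the point of having a
checking step before relying on the cached value.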

There seems to be a noticeable improvement, although a wall-time
measurement is harder to come by -- all of my system-level
measurements include user input, and my user-level measurements
seem to be too small to matter.


Richard Henderson (4):
  target/arm: Split out recompute_hflags et al
  target/arm: Rebuild hflags at el changes and MSR writes
  target/arm: Assert hflags is correct in cpu_get_tb_cpu_state
  target/arm: Rely on hflags correct in cpu_get_tb_cpu_state

 target/arm/cpu.h           |  22 ++-
 target/arm/helper.h        |   3 +
 target/arm/internals.h     |   4 +
 linux-user/syscall.c       |   1 +
 target/arm/cpu.c           |   1 +
 target/arm/helper-a64.c    |   3 +
 target/arm/helper.c        | 267 ++++++++++++++++++++++---------------
 target/arm/machine.c       |   1 +
 target/arm/op_helper.c     |   1 +
 target/arm/translate-a64.c |   6 +-
 target/arm/translate.c     |  14 +-
 11 files changed, 204 insertions(+), 119 deletions(-)


