[Qemu-devel] [RFC v2 00/34] Multi Architecture System Emulation

From: Peter Crosthwaite
Subject: [Qemu-devel] [RFC v2 00/34] Multi Architecture System Emulation
Date: Sat, 30 May 2015 23:11:33 -0700

** Note: Very different to V1 **

Hi All,

This is target-multi, a system-mode build that can support multiple
CPU architectures.

Two architectures are initially converted: Microblaze and ARM. Step
by step conversion is done for each. A Microblaze is added to the
Xilinx Zynq platform as a test case. This will be elaborated more in
future spins. This use case is valid, as Microblazes can be added (any
number of them!) in Zynq FPGA programmable logic configuration.

The general approach (radically different to the approach in the V1 RFC) is to
build and prelink an object (arch-obj.o) per-arch containing:

1: target-foo/*
2: All uses of env internals and CPU_GET_ENV
    * cputlb, translate-all, cpu-exec
    * TCG backend

This means cputlb and friends are compiled multiple times, once for each arch.
The symbols for each of these pre-links are then localised to avoid link time
name collisions. This is based on Paolo's suggestion to templatify cputlb and
friends. Just the net of what to multi-compile is widened to include the TCG
stuff as well now.
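
As a rough illustration of the "same code, built once per arch" idea above,
here is a minimal single-file sketch. It is not code from the series: the
series compiles the real files several times and localises the symbols at
prelink, whereas this sketch approximates that with a macro so it fits in one
translation unit. The names arch32, arch64, DEFINE_ARCH_TLB, TLB_SIZE and
PAGE_BITS are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for the sketch. */
#define TLB_SIZE  8
#define PAGE_BITS 12

/*
 * Stand-in for a multi-compiled file: the same body is instantiated once
 * per arch, each time with that arch's own target_ulong width.
 */
#define DEFINE_ARCH_TLB(arch, target_ulong)                               \
    typedef struct {                                                      \
        target_ulong tag[TLB_SIZE]; /* page-aligned virtual address */    \
        int valid[TLB_SIZE];                                              \
    } arch##_tlb;                                                         \
                                                                          \
    static target_ulong arch##_page(target_ulong vaddr)                   \
    {                                                                     \
        target_ulong mask = ((target_ulong)1 << PAGE_BITS) - 1;           \
        return vaddr & (target_ulong)~mask;                               \
    }                                                                     \
                                                                          \
    static void arch##_tlb_fill(arch##_tlb *tlb, target_ulong vaddr)      \
    {                                                                     \
        size_t idx;                                                       \
        assert(tlb != NULL);                                              \
        idx = (size_t)((vaddr >> PAGE_BITS) % TLB_SIZE);                  \
        tlb->tag[idx] = arch##_page(vaddr);                               \
        tlb->valid[idx] = 1;                                              \
    }                                                                     \
                                                                          \
    static int arch##_tlb_hit(const arch##_tlb *tlb, target_ulong vaddr)  \
    {                                                                     \
        size_t idx;                                                       \
        assert(tlb != NULL);                                              \
        idx = (size_t)((vaddr >> PAGE_BITS) % TLB_SIZE);                  \
        return tlb->valid[idx] && tlb->tag[idx] == arch##_page(vaddr);    \
    }

/* Two "compiles" of the same code with different address widths. */
DEFINE_ARCH_TLB(arch32, uint32_t)
DEFINE_ARCH_TLB(arch64, uint64_t)
```

A caller would fill and query each instance independently, e.g.
arch32_tlb_fill(&t32, addr) followed by arch32_tlb_hit(&t32, addr). Note the
macro still needs the arch prefix pasted into each name; symbol localisation
at prelink is what lets the real approach avoid that.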

Despite being some "major surgery", this approach actually solves many of the
big problems raised in V1. Big problems solved:

1: With the multi-compile TCG backends there is now a separate tcg_ctx for
each architecture. This solves the issue PMM raised WRT false positives on TB
hashing, as archs no longer share translation context.

2: There is no longer a need to reorder the CPU_COMMON within the ENV or the ENV
within the CPU. This was flagged as a performance issue by multiple people in
V1. All users of the env internals as well as ENV_GET_CPU are now in
multi-compile code, and so multi-arch does not need to define a generic ENV nor
does it need to define the problematic ENV_GET_CPU.

3: With the prelink symbol localisation, link time namespace collision of
helpers from multiple arches is no longer an issue. No need to bloat all the
function names with arch specific prefixes.

4: The architecture specifics used/defined by cpu-defs can now vary from arch to
arch (incl. target_ulong) greatly reducing conversion effort needed. The list
of restrictions for multi-arch capability is much reduced since V1. No
target_long issues anymore.
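
To make point 1 concrete, here is a minimal sketch of why per-arch translation
contexts avoid false positives on TB hashing. It is an illustration only, not
QEMU code: DemoTranslationCtx stands in for tcg_ctx, and DemoTB, demo_tb_hash,
demo_tb_insert and demo_tb_find are hypothetical names, as is the
direct-mapped table layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TB_HASH_SIZE 16 /* hypothetical, tiny on purpose */

/* Hypothetical stand-in for a translation block. */
typedef struct DemoTB {
    uint64_t pc;
    int in_use;
} DemoTB;

/* Hypothetical stand-in for tcg_ctx: one instance exists per arch. */
typedef struct DemoTranslationCtx {
    DemoTB table[TB_HASH_SIZE];
} DemoTranslationCtx;

static size_t demo_tb_hash(uint64_t pc)
{
    return (size_t)((pc >> 2) % TB_HASH_SIZE);
}

/* Record a translated block for pc in this context only. */
static void demo_tb_insert(DemoTranslationCtx *ctx, uint64_t pc)
{
    DemoTB *slot;

    assert(ctx != NULL);
    slot = &ctx->table[demo_tb_hash(pc)];
    slot->pc = pc;
    slot->in_use = 1;
}

/* Look up pc; only blocks inserted into the same context can match. */
static const DemoTB *demo_tb_find(const DemoTranslationCtx *ctx, uint64_t pc)
{
    const DemoTB *slot;

    assert(ctx != NULL);
    slot = &ctx->table[demo_tb_hash(pc)];
    return (slot->in_use && slot->pc == pc) ? slot : NULL;
}
```

With one context per arch, a block translated for one arch at some pc is
never returned when a CPU of another arch looks up the same pc in its own
context. With a single shared table, the lookup key would need to carry the
arch as well.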

The approach trades in the big problems of last series for a number of smaller
problems. Some I have decided not to tackle until I have some list uptime. Check
the patches marked HACK, which have commit messages detailing individual
problems (e.g. what do we do about TCG profiling with multiple tcg_ctx?).

include/exec/*.h and some of the common code need some refactoring to set up
this single vs multi compile split. Mostly code movements.

The interface between the multi compile and single compiled files needs to be
virtualised using QOM cpu functions. But this is now a very low footprint
change as most of the virtualised hooks are now in multi-compiled code (they
only exist as text once). There are more new hooks than before, but the per
target change pattern is reduced.
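
The virtualised interface described above can be pictured roughly as follows.
This is a sketch under assumptions, not the hooks the series adds: DemoCPU,
DemoCPUClass, the get_pc/set_pc pair and demo_cpu_skip are hypothetical names
chosen for illustration. The point is only that single-compiled code goes
through function pointers and never sees an arch's env layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct DemoCPU DemoCPU;

/* Hypothetical per-class hook table, in the spirit of a QOM CPU class. */
typedef struct DemoCPUClass {
    const char *arch_name;
    uint64_t (*get_pc)(const DemoCPU *cpu);
    void (*set_pc)(DemoCPU *cpu, uint64_t pc);
} DemoCPUClass;

struct DemoCPU {
    const DemoCPUClass *cc;
    void *env; /* opaque to single-compiled code */
};

/* "Multi-compiled" side, arch with a 32-bit program counter. */
typedef struct { uint32_t pc; } DemoEnv32;

static uint64_t demo32_get_pc(const DemoCPU *cpu)
{
    return ((const DemoEnv32 *)cpu->env)->pc;
}

static void demo32_set_pc(DemoCPU *cpu, uint64_t pc)
{
    ((DemoEnv32 *)cpu->env)->pc = (uint32_t)pc; /* wraps at 32 bits */
}

static const DemoCPUClass demo32_class = {
    "demo32", demo32_get_pc, demo32_set_pc
};

/* "Multi-compiled" side, arch with a 64-bit program counter. */
typedef struct { uint64_t pc; } DemoEnv64;

static uint64_t demo64_get_pc(const DemoCPU *cpu)
{
    return ((const DemoEnv64 *)cpu->env)->pc;
}

static void demo64_set_pc(DemoCPU *cpu, uint64_t pc)
{
    ((DemoEnv64 *)cpu->env)->pc = pc;
}

static const DemoCPUClass demo64_class = {
    "demo64", demo64_get_pc, demo64_set_pc
};

/* "Single-compiled" side: works for any arch, only via the hooks. */
static void demo_cpu_skip(DemoCPU *cpu, uint64_t nbytes)
{
    assert(cpu != NULL && cpu->cc != NULL);
    cpu->cc->set_pc(cpu, cpu->cc->get_pc(cpu) + nbytes);
}
```

Here demo_cpu_skip() is the only function that would live in single-compiled
code; the per-arch accessors are tiny, which matches the claim that the
per-target change pattern stays small even as the number of hooks grows.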

There are a lot more core code changes and fewer target-foo changes this time.
Full conversion is looking more feasible for one QEMU that can do everything.

For the implementation of the series, the trickiest part is (still) cpu.h
inclusion management. There is now more than one cpu.h, and different
parts of the tree need a different include scheme. target-multi defines
its own cpu.h which is the bare minimum defs as needed by core code only.
The target-foo/cpu.h's are mostly the same but refactored to avoid collisions
with other cpu.h's. The inclusion scheme goes something like
this (for the multi-arch build):

*: Core code includes only target-multi/cpu.h
*: target-foo/ implementation code includes target-foo/cpu.h locally
*: System level code (e.g. mach models) can use multiple target-foo/cpu.h's

The hardest unanswered Q is (still) what to do about bootloading. Currently
each arch has its own architecture-specific bootloading which may assume a
single architecture. I have applied some hacks to at least get this
RFC testable using a -kernel -firmware split, but going forward being
able to associate an elf/image with a cpu explicitly needs to be

No support for KVM. I'm not sure if a mix of TCG and KVM is supported even for
a single arch? (which would be a prerequisite to MA KVM).

Depends (not heavily) on some already on list patches:

memory_mapping: Use qemu_common.h include
configure: Unify arm and aarch64 disas configury
Makefile.target: set master BUILD_DIR
cpus: Change exec_init arg to cpu, not env
cpus: Change tcg_cpu_exec arg to cpu, not env
gdbserver: _fork: Change fn to accept cpu instead of env
translate-all: Change tb_flush env argument to cpu
microblaze: s3adsp: Instantiate CPU using QOM
disas: cris: QOMify target specific disas setup
disas: cris: Fix 0 buffer length case
disas: microblaze: QOMify target specific disas setup
disas: arm: QOMify target specific disas setup
disas: arm-a64: Make printfer and stream variable
disas: QOMify target specific setup
disas: Add print_insn to disassemble info
disas: Remove uses of CPU env
monitor: Split mon_get_cpu fn to remove ENV_GET_CPU
device-tree: Make a common-obj

These deps do not really inhibit at least a high level review of this series.


Changed since v1:
Near total rewrite.

Peter Crosthwaite (34):
  cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg
  cpu-exec: Purge all uses of CPU_GET_ENV
  Makefile.target: Introduce arch-obj
  cpu-exec: Migrate some generic fns to cpus.c
  translate: Listify tcg_exec_init
  cpu-common: Define tb_page_addr_t for everyone
  exec-all: Move cpu_can_do_io to qom/cpu.h
  translate-all: Move tcg_handle_interrupt to -common
  include/exec: Move standard exceptions to cpu-all.h
  include/exec: Split target_long def to new header
  include/exec: Move cputlb exec.c defs out
  include/exec: Move tb hash functions out
  cpu-defs: Move out TB_JMP defines
  cpu-defs: Allow multiple inclusions
  HACK: monitor: Comment out TCG profile ops
  HACK: Disable list_cpus
  HACK: globalise TCG page size variables
  HACK: monitor: uninclude cpu_ldst
  HACK: disas: Defeature print_target_address
  HACK: exec: comment out use of cpu_get_tb_cpu_from_state
  core: virtualise CPU interfaces completely
  microblaze: enable multi-arch
  arm: cpu: static inline cpu_arm_init
  target-arm: Split cp helper API to new C file
  arm: enable multi-arch
  core: Introduce multi-arch build
  hw: arm: Explicitly include cpu.h for consumers
  arm: Remove ELF_MACHINE from cpu.h
  hw: mb: Explicitly include cpu.h for consumers
  mb: Remove ELF_MACHINE from cpu.h
  arm: boot: Don't assume all CPUs are ARM
  arm: xilinx_zynq: Add a Microblaze
  HACK: mb: boot: Assume using -firmware for mb software
  HACK: mb: boot: Disable dtb load in multi-arch

 Makefile.objs                     |   1 +
 Makefile.target                   |  34 +++-
 arch_init.c                       |   4 +-
 configure                         |  39 ++++-
 cpu-exec.c                        | 101 ++++--------
 cpus.c                            |  54 ++++++-
 cputlb.c                          |  40 +++--
 default-configs/multi-softmmu.mak |   2 +
 disas.c                           |  12 +-
 exec.c                            |  40 +++--
 gdbstub.c                         |   2 +-
 hw/arm/armv7m.c                   |   2 +-
 hw/arm/boot.c                     |   8 +-
 hw/arm/strongarm.h                |   2 +
 hw/arm/xilinx_zynq.c              |  15 ++
 hw/microblaze/boot.c              |  12 +-
 hw/microblaze/boot.h              |   2 +
 include/exec/cpu-all.h            |   6 +
 include/exec/cpu-common.h         |   4 +
 include/exec/cpu-defs.h           |  50 ++----
 include/exec/cputlb.h             |  16 --
 include/exec/exec-all.h           |  73 ++-------
 include/exec/target-long.h        |  52 ++++++
 include/exec/tb-hash.h            |  51 ++++++
 include/hw/arm/arm.h              |   3 +
 include/hw/arm/digic.h            |   2 +
 include/hw/arm/exynos4210.h       |   2 +
 include/hw/arm/omap.h             |   2 +
 include/hw/arm/pxa.h              |   2 +
 include/qemu-common.h             |   5 +
 include/qom/cpu.h                 |  84 ++++++++++
 include/sysemu/arch_init.h        |   1 +
 linux-user/elfload.c              |   3 +
 monitor.c                         |   5 +-
 qom/cpu.c                         |   1 +
 stubs/Makefile.objs               |   1 +
 stubs/cpu-qom.c                   |  76 +++++++++
 target-arm/Makefile.objs          |  24 +--
 target-arm/cpu-qom.h              |   2 +
 target-arm/cpu.c                  |   1 +
 target-arm/cpu.h                  |  70 +++++++-
 target-arm/helper.c               | 331 --------------------------------------
 target-arm/hw/Makefile.objs       |   1 +
 target-arm/hw/cp.c                | 330 +++++++++++++++++++++++++++++++++++++
 target-microblaze/Makefile.objs   |   6 +-
 target-microblaze/cpu-qom.h       |   2 +
 target-microblaze/cpu.c           |   1 +
 target-microblaze/cpu.h           |  44 ++++-
 target-multi/cpu.h                |  16 ++
 target-multi/helper.h             |   1 +
 tcg/tcg.h                         |   7 +-
 tcg/tci/tcg-target.h              |   3 +-
 tci.c                             |   2 +-
 translate-all.c                   |  45 +-----
 translate-all.h                   |   2 -
 translate-common.c                |  89 ++++++++++
 56 files changed, 1131 insertions(+), 655 deletions(-)
 create mode 100644 default-configs/multi-softmmu.mak
 create mode 100644 include/exec/target-long.h
 create mode 100644 include/exec/tb-hash.h
 create mode 100644 stubs/cpu-qom.c
 create mode 100644 target-arm/hw/Makefile.objs
 create mode 100644 target-arm/hw/cp.c
 create mode 100644 target-multi/cpu.h
 create mode 100644 target-multi/helper.h
 create mode 100644 translate-common.c


