[PATCH v2 00/93] TCI fixes and cleanups

From: Richard Henderson
Subject: [PATCH v2 00/93] TCI fixes and cleanups
Date: Wed, 3 Feb 2021 15:43:36 -1000

Almost 7 years ago I detailed 5 major problems in tci[1], of
which three still remain:

  * Unaligned accesses to the bytecode stream, which means
    that we immediately SIGBUS on any host requiring alignment.
  * Non-portable calls to helper functions.
  * Full of useless ifdefs and TODOs.

To my mind, this means the code is unmaintained, despite what it
says in MAINTAINERS.  Thus tci *should* be simply removed.
However, every time removal is suggested, someone comes out of the
woodwork and says we should keep it, because it's useful for $FOO.

Anyway, if we're not going to remove tci, then we have to fix it.
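
To illustrate the first problem in the list above: a load through a
cast pointer at an arbitrary byte offset is what faults on hosts that
require alignment.  The following is a minimal sketch of the usual
portable alternative, not code taken from tcg/tci.c.  The helper names
read_u32_unaligned and write_u32_unaligned are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not from tcg/tci.c: fetch a 32-bit value from
 * an arbitrary byte offset in a bytecode buffer.
 *
 * Dereferencing (const uint32_t *)p directly is undefined behaviour
 * when p is misaligned, and can raise SIGBUS on strict-alignment
 * hosts.  memcpy has no alignment requirement.
 */
static uint32_t read_u32_unaligned(const uint8_t *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Store counterpart, again hypothetical, used to build test data. */
static void write_u32_unaligned(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof(v));
}
```

Storing a value at buf + 1 and reading it back from the same offset
round-trips on any host.  Keeping every bytecode unit naturally
aligned avoids the problem altogether.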

Previously, I rewrote tci all in one lump, which was obviously a
mistake, since it meant that the patch was never going to get
reviewed.  This time I've done the rewrite in tiny pieces.

Previously, I invented a moderately complex encoding scheme that
allowed any operand to encode a register or an int32_t immediate.
This time the encoding is quite simple: with only 4 exceptions,
all operands are encoded into 4-bit slots (registers & conditions).
I rely on the new TEMP_CONST optimization to do a decent job of
loading an immediate value into a register once (via movi), and
reusing that across the TB.
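
As a rough sketch of what 4-bit operand slots in a 32-bit unit can
look like: the layout below (opcode in bits 0-7, followed by three
register slots) is an assumption made for illustration, and the names
pack_rrr and slot4 are hypothetical.  The patches themselves define
the real layout.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout, for illustration only:
 *   bits  0..7   opcode
 *   bits  8..11  register operand 0
 *   bits 12..15  register operand 1
 *   bits 16..19  register operand 2
 * With at most 16 registers, each register number fits in 4 bits.
 */
static uint32_t pack_rrr(uint8_t op, unsigned r0, unsigned r1, unsigned r2)
{
    assert(r0 < 16 && r1 < 16 && r2 < 16);
    return (uint32_t)op
         | (uint32_t)r0 << 8
         | (uint32_t)r1 << 12
         | (uint32_t)r2 << 16;
}

/* Extract the 4-bit slot that starts at bit position pos. */
static unsigned slot4(uint32_t insn, unsigned pos)
{
    assert(pos <= 28);
    return (insn >> pos) & 0xf;
}
```

With this assumed layout, a whole three-operand instruction occupies
one aligned uint32_t, so the interpreter never needs a sub-word or
unaligned fetch to decode it.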

There is a disassembler built into tcg/tci.c, replacing the stub
in disas/tci.c, which reuses the same decoding helpers that are
used by the interpreter.  So finally -d out_asm is useful.
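
The idea of sharing decoding helpers between the interpreter and the
disassembler can be sketched as follows.  This is not the code in
tcg/tci.c: the opcodes, the ArgsRRR struct, the bit layout, and the
function names are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { OP_ADD = 1, OP_SUB = 2 };    /* hypothetical opcodes */

typedef struct {
    unsigned op, r0, r1, r2;
} ArgsRRR;

/* Single decoder (assumed layout: 8-bit opcode, three 4-bit slots). */
static ArgsRRR decode_rrr(uint32_t insn)
{
    ArgsRRR a = {
        .op = insn & 0xff,
        .r0 = (insn >> 8) & 0xf,
        .r1 = (insn >> 12) & 0xf,
        .r2 = (insn >> 16) & 0xf,
    };
    return a;
}

/* Interpreter: decode, then execute. */
static void interp_one(uint64_t regs[16], uint32_t insn)
{
    ArgsRRR a = decode_rrr(insn);

    switch (a.op) {
    case OP_ADD:
        regs[a.r0] = regs[a.r1] + regs[a.r2];
        break;
    case OP_SUB:
        regs[a.r0] = regs[a.r1] - regs[a.r2];
        break;
    default:
        assert(!"unknown opcode");
    }
}

/* Disassembler: same decoder, so it cannot disagree with interp_one. */
static int disas_one(char *buf, size_t len, uint32_t insn)
{
    ArgsRRR a = decode_rrr(insn);
    const char *name = a.op == OP_ADD ? "add"
                     : a.op == OP_SUB ? "sub" : "???";

    return snprintf(buf, len, "%s r%u, r%u, r%u", name, a.r0, a.r1, a.r2);
}
```

Because both paths call decode_rrr, a change to the encoding only has
to be made in one place, and the disassembly always reflects what the
interpreter would execute.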

This is good enough to pass make check check-tcg with all of the
docker cross-compilers enabled.  I can boot linux with aarch64,
alpha, and s390x guests.


Based-on: 20210203021550.375058-1-richard.henderson@linaro.org
("[PULL 00/24] tcg patch queue")

[1] https://lists.gnu.org/archive/html/qemu-devel/2014-05/msg02594.html

Richard Henderson (91):
  gdbstub: Fix handle_query_xfer_auxv
  tcg: Split out tcg_raise_tb_overflow
  configure: Fix --enable-tcg-interpreter
  tcg: Manage splitwx in tc_ptr_to_region_tree by hand
  tcg/tci: Make tci_tb_ptr thread-local
  tcg/tci: Inline tci_write_reg32s into the only caller
  tcg/tci: Inline tci_write_reg8 into its callers
  tcg/tci: Inline tci_write_reg16 into the only caller
  tcg/tci: Inline tci_write_reg32 into all callers
  tcg/tci: Inline tci_write_reg64 into 64-bit callers
  tcg/tci: Merge INDEX_op_ld8u_{i32,i64}
  tcg/tci: Merge INDEX_op_ld8s_{i32,i64}
  tcg/tci: Merge INDEX_op_ld16u_{i32,i64}
  tcg/tci: Merge INDEX_op_ld16s_{i32,i64}
  tcg/tci: Merge INDEX_op_{ld_i32,ld32u_i64}
  tcg/tci: Merge INDEX_op_st8_{i32,i64}
  tcg/tci: Merge INDEX_op_st16_{i32,i64}
  tcg/tci: Move stack bounds check to compile-time
  tcg/tci: Merge INDEX_op_{st_i32,st32_i64}
  tcg/tci: Use g_assert_not_reached
  tcg/tci: Remove dead code for TCG_TARGET_HAS_div2_*
  tcg/tci: Implement 64-bit division
  tcg/tci: Remove TODO as unused
  tcg/tci: Restrict TCG_TARGET_NB_REGS to 16
  tcg/tci: Fix TCG_REG_R4 misusage
  tcg/tci: Use bool in tcg_out_ri*
  tcg/tci: Remove TCG_CONST
  tcg/tci: Merge identical cases in generation
  tcg/tci: Remove tci_read_r8
  tcg/tci: Remove tci_read_r8s
  tcg/tci: Remove tci_read_r16
  tcg/tci: Remove tci_read_r16s
  tcg/tci: Remove tci_read_r32s
  tcg/tci: Reduce use of tci_read_r64
  tcg/tci: Merge basic arithmetic operations
  tcg/tci: Merge extension operations
  tcg/tci: Remove ifdefs for TCG_TARGET_HAS_ext32[us]_i64
  tcg/tci: Merge bswap operations
  tcg/tci: Merge mov, not and neg operations
  tcg/tci: Rename tci_read_r to tci_read_rval
  tcg/tci: Split out tci_args_rrs
  tcg/tci: Split out tci_args_rr
  tcg/tci: Split out tci_args_rrr
  tcg/tci: Split out tci_args_rrrc
  tcg/tci: Split out tci_args_l
  tcg/tci: Split out tci_args_rrrrrc
  tcg/tci: Split out tci_args_rrcl and tci_args_rrrrcl
  tcg/tci: Split out tci_args_ri and tci_args_rI
  tcg/tci: Reuse tci_args_l for calls.
  tcg/tci: Reuse tci_args_l for exit_tb
  tcg/tci: Reuse tci_args_l for goto_tb
  tcg/tci: Split out tci_args_rrrrrr
  tcg/tci: Split out tci_args_rrrr
  tcg/tci: Clean up deposit operations
  tcg/tci: Reduce qemu_ld/st TCGMemOpIdx operand to 32-bits
  tcg/tci: Split out tci_args_{rrm,rrrm,rrrrm}
  tcg/tci: Hoist op_size checking into tci_args_*
  tcg/tci: Remove tci_disas
  tcg/tci: Implement the disassembler properly
  tcg: Build ffi data structures for helpers
  tcg/tci: Use ffi for calls
  tcg/tci: Improve tcg_target_call_clobber_regs
  tcg/tci: Move call-return regs to end of tcg_target_reg_alloc_order
  tcg/tci: Push opcode emit into each case
  tcg/tci: Split out tcg_out_op_rrs
  tcg/tci: Split out tcg_out_op_l
  tcg/tci: Split out tcg_out_op_p
  tcg/tci: Split out tcg_out_op_rr
  tcg/tci: Split out tcg_out_op_rrr
  tcg/tci: Split out tcg_out_op_rrrc
  tcg/tci: Split out tcg_out_op_rrrrrc
  tcg/tci: Split out tcg_out_op_rrrbb
  tcg/tci: Split out tcg_out_op_rrcl
  tcg/tci: Split out tcg_out_op_rrrrrr
  tcg/tci: Split out tcg_out_op_rrrr
  tcg/tci: Split out tcg_out_op_rrrrcl
  tcg/tci: Split out tcg_out_op_{rrm,rrrm,rrrrm}
  tcg/tci: Split out tcg_out_op_v
  tcg/tci: Split out tcg_out_op_np
  tcg/tci: Split out tcg_out_op_r[iI]
  tcg/tci: Reserve r13 for a temporary
  tcg/tci: Emit setcond before brcond
  tcg/tci: Remove tci_write_reg
  tcg/tci: Change encoding to uint32_t units
  tcg/tci: Implement goto_ptr
  tcg/tci: Implement movcond
  tcg/tci: Implement andc, orc, eqv, nand, nor
  tcg/tci: Implement extract, sextract
  tcg/tci: Implement clz, ctz, ctpop
  tcg/tci: Implement mulu2, muls2
  tcg/tci: Implement add2, sub2

Stefan Weil (2):
  tcg/tci: Implement INDEX_op_ld16s_i32
  tcg/tci: Implement INDEX_op_ld8s_i64

 configure                              |    5 +-
 meson.build                            |    9 +-
 include/exec/exec-all.h                |    2 +-
 include/exec/helper-ffi.h              |  115 ++
 include/exec/helper-tcg.h              |   24 +-
 include/tcg/tcg-opc.h                  |    6 +-
 include/tcg/tcg.h                      |    1 +
 target/hppa/helper.h                   |    2 +
 target/i386/ops_sse_header.h           |    6 +
 target/m68k/helper.h                   |    1 +
 target/ppc/helper.h                    |    3 +
 tcg/tci/tcg-target-con-set.h           |    8 +-
 tcg/tci/tcg-target.h                   |  118 +-
 disas/tci.c                            |   61 -
 gdbstub.c                              |   17 +-
 tcg/tcg-common.c                       |    4 -
 tcg/tcg.c                              |  117 +-
 tcg/tci.c                              | 1695 +++++++++++++-----------
 tcg/tci/tcg-target.c.inc               |  989 +++++++-------
 tcg/tci/README                         |   20 +-
 tests/docker/dockerfiles/fedora.docker |    1 +
 21 files changed, 1734 insertions(+), 1470 deletions(-)
 create mode 100644 include/exec/helper-ffi.h
 delete mode 100644 disas/tci.c


