Re: [Qemu-devel] [PATCH] aarch64: use TSX for ldrex/strex
From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH] aarch64: use TSX for ldrex/strex
Date: Wed, 17 Aug 2016 14:18:21 -0400
User-agent: Mutt/1.5.23 (2014-03-12)
On Wed, Aug 17, 2016 at 13:58:00 -0400, Emilio G. Cota wrote:
> due to my glaring lack of TCG competence.
A related note that might be of interest.
I benchmarked an alternative implementation that *does* instrument
stores. I wrapped every tcg_gen_qemu_st_i64 with a pre/post pair of
helpers. (Those are enough, right? tcg_gen_st_i64 emits stores to host
memory, which I presume are not "explicit" guest stores and therefore
would not go through the soft TLB.)
These helpers first check a bitmap indexed by a masked subset of the bits of
the access's physical address. If the bit is set, they then look up the full
physaddr in a QHT. If an entry exists, they lock/unlock the entry's spinlock
around the store, so that no race is possible with an ongoing atomic (atomics
always take their corresponding lock). The overhead over the cmpxchg approach
is not too bad, but most of it comes from the helpers--see these numbers for
SPEC:
(NB. the "QEMU" baseline does *not* include QHT for tb_htable and therefore
takes tb_lock around tb_find_fast, which is why it's so slow.)
http://imgur.com/a/SoSHQ
"QHT only" means a QHT lookup is performed on every guest store. The win
from checking the bitmap before hitting the QHT is quite large. I wonder
if things could be sped up further by performing the bitmap check in
TCG code instead of in a helper. Would that be worth exploring? If so, any
help on that would be appreciated (for an i386 host at least)--I tried, but
I'm way out of my element.
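For reference, the pre/post helper scheme described above can be sketched
roughly as follows. This is a minimal standalone illustration, not the
code that was benchmarked: the fixed-size table stands in for QHT, and all
names and sizes (store_pre, store_post, atomic_pre, FILTER_BITS,
TABLE_SLOTS, the 8-byte granule) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_BITS   4096u  /* hypothetical bitmap size, power of two */
#define TABLE_SLOTS   256u   /* hypothetical table size, power of two */
#define GRANULE_SHIFT 3      /* assumption: track 8-byte granules */

typedef struct {
    _Atomic uint64_t key;    /* granule number + 1; 0 means "empty slot" */
    atomic_bool locked;      /* per-entry spinlock */
} TrackedEntry;

static atomic_uchar filter[FILTER_BITS / 8];  /* cheap first-level check */
static TrackedEntry table[TABLE_SLOTS];       /* stand-in for the QHT */

static inline bool filter_test(uint64_t g)
{
    uint32_t bit = (uint32_t)(g & (FILTER_BITS - 1)); /* masked subset */
    return (atomic_load(&filter[bit / 8]) & (1u << (bit % 8))) != 0;
}

static inline void filter_set(uint64_t g)
{
    uint32_t bit = (uint32_t)(g & (FILTER_BITS - 1));
    atomic_fetch_or(&filter[bit / 8], (unsigned char)(1u << (bit % 8)));
}

/* Linear-probing lookup on the full granule number; optionally inserts. */
static TrackedEntry *table_lookup(uint64_t g, bool insert)
{
    uint64_t key = g + 1;
    for (uint32_t i = 0; i < TABLE_SLOTS; i++) {
        TrackedEntry *e = &table[(g + i) & (TABLE_SLOTS - 1)];
        uint64_t cur = atomic_load(&e->key);
        if (cur == key) {
            return e;
        }
        if (cur == 0) {
            if (!insert) {
                return NULL;
            }
            uint64_t expected = 0;
            if (atomic_compare_exchange_strong(&e->key, &expected, key) ||
                expected == key) {
                return e;
            }
            /* Lost the slot to a different key: keep probing. */
        }
    }
    return NULL;
}

static inline void entry_lock(TrackedEntry *e)
{
    assert(e != NULL);
    while (atomic_exchange_explicit(&e->locked, true, memory_order_acquire)) {
        /* spin */
    }
}

static inline void entry_unlock(TrackedEntry *e)
{
    assert(e != NULL);
    atomic_store_explicit(&e->locked, false, memory_order_release);
}

/* Emulated atomics always take the lock for their address. */
static TrackedEntry *atomic_pre(uint64_t paddr)
{
    uint64_t g = paddr >> GRANULE_SHIFT;
    TrackedEntry *e = table_lookup(g, true);
    assert(e != NULL);       /* a full table is out of scope for this sketch */
    filter_set(g);
    entry_lock(e);
    return e;
}

/* Called before a guest store: bitmap first, table only if the bit is set. */
static TrackedEntry *store_pre(uint64_t paddr)
{
    uint64_t g = paddr >> GRANULE_SHIFT;
    if (!filter_test(g)) {
        return NULL;         /* fast path: no atomic ever touched this bit */
    }
    TrackedEntry *e = table_lookup(g, false);
    if (e) {
        entry_lock(e);
    }
    return e;
}

/* Called after the guest store; the same unlock serves atomics. */
static void store_post(TrackedEntry *e)
{
    if (e) {
        entry_unlock(e);
    }
}
```

A store would be bracketed as e = store_pre(paddr); <store>; store_post(e).
The sketch never removes entries, and it does not deal with a store that
passed the bitmap check just before the first atomic to that address set
the bit; both would need handling in a real implementation.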
E.