Re: [Qemu-devel] [PATCH v2 4/4] cputlb: read CPUTLBEntry.addr_write atomically
From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH v2 4/4] cputlb: read CPUTLBEntry.addr_write atomically
Date: Thu, 4 Oct 2018 00:01:47 -0400
User-agent: Mutt/1.9.4 (2018-02-28)

On Wed, Oct 03, 2018 at 16:04:54 -0400, Emilio G. Cota wrote:
> Updates can come from other threads, so readers that do not
> take tlb_lock must use atomic_read to avoid undefined
> behaviour (UB).
>
> This and the previous commit result in a small performance decrease,
> but this is a fair price for removing UB.
(snip)
> That is, a ~2% slowdown for the aarch64 bootup+shutdown test.
I've run more tests. This slowdown is much more pronounced on
memory-heavy workloads. These are the numbers for SPEC06int:

[Figure: ASCII bar chart, "Speedup over master". Y-axis: speedup, from 0.8 to 1.05. Series: tlb-lock-noatomic and +atomic. X-axis: the SPEC06int benchmarks and their geomean.]

That is, a 5% average slowdown, with a max slowdown of ~14% for
mcf :-(

I'll profile tomorrow and see where the slowdown comes from.
If the lock is the issue, we might be better off shifting
all the work to the cross-vCPU call (e.g. doing a round of
synchronous cross-vCPU calls via run_on_cpu), if the assumption
that those calls are very rare is correct.

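
For readers unfamiliar with the pattern described in the quoted commit message (writers serialized by a lock, readers that skip the lock and use an atomic read instead), here is a minimal sketch in C11. It is not the code from the patch: DemoTLB, DemoTLBEntry, DEMO_TLB_SIZE and the demo_* functions are hypothetical names, a pthread mutex stands in for tlb_lock, and relaxed C11 loads and stores stand in for QEMU's atomic_read and atomic_set.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_TLB_SIZE 8  /* hypothetical size, for illustration only */

/* Hypothetical stand-in for a TLB entry; not QEMU's CPUTLBEntry layout. */
typedef struct {
    _Atomic uintptr_t addr_write;
} DemoTLBEntry;

typedef struct {
    pthread_mutex_t lock;                 /* plays the role of tlb_lock */
    DemoTLBEntry entries[DEMO_TLB_SIZE];
} DemoTLB;

/* Initialize the lock and mark every entry as invalid (all bits set). */
static void demo_tlb_init(DemoTLB *tlb)
{
    pthread_mutex_init(&tlb->lock, NULL);
    for (int i = 0; i < DEMO_TLB_SIZE; i++) {
        atomic_init(&tlb->entries[i].addr_write, (uintptr_t)-1);
    }
}

/*
 * Writer: may be called from any thread, so updates are serialized
 * with the lock. The store is still atomic because lockless readers
 * can observe the field at any time.
 */
static void demo_tlb_set_addr_write(DemoTLB *tlb, int idx, uintptr_t addr)
{
    pthread_mutex_lock(&tlb->lock);
    atomic_store_explicit(&tlb->entries[idx].addr_write, addr,
                          memory_order_relaxed);
    pthread_mutex_unlock(&tlb->lock);
}

/*
 * Reader: does not take the lock. A relaxed atomic load is enough to
 * avoid a data race (and hence UB) with a concurrent writer; it does
 * not order this access against any other memory access.
 */
static uintptr_t demo_tlb_read_addr_write(DemoTLB *tlb, int idx)
{
    return atomic_load_explicit(&tlb->entries[idx].addr_write,
                                memory_order_relaxed);
}
```

A caller would run demo_tlb_init once, then call demo_tlb_read_addr_write on the lookup path and demo_tlb_set_addr_write for updates. The sketch only illustrates where the atomic access and the lock sit; it says nothing about where the measured slowdown comes from.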
Emilio