Groundwork for supporting multiple TCG contexts.

While at it, allocate temps_used directly as a bitmap of the
required size, instead of as a TCG_MAX_TEMPS-bit bitmap via
TCGTempSet.
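
For illustration only, a minimal sketch of a bitmap sized to the number
of temps actually in use rather than to a compile-time maximum. The
helper names (bitmap_words, temp_bitmap_new, temp_bitmap_set,
temp_bitmap_test) are hypothetical, and the sketch uses plain calloc()
instead of QEMU's own bitmap and allocation helpers, so it is not the
code in this patch:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; not QEMU's bitmap API. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long words needed to hold nbits bits. */
static size_t bitmap_words(size_t nbits)
{
    return (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Allocate a zeroed bitmap with room for nbits bits (rounded up to words). */
static unsigned long *temp_bitmap_new(size_t nbits)
{
    assert(nbits > 0);
    return calloc(bitmap_words(nbits), sizeof(unsigned long));
}

/* Mark one bit as set; the caller guarantees bit < nbits. */
static void temp_bitmap_set(unsigned long *map, size_t bit)
{
    map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Return true if the given bit is set; the caller guarantees bit < nbits. */
static bool temp_bitmap_test(const unsigned long *map, size_t bit)
{
    return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}
```

A caller would size the map from the current number of temps, e.g.
temp_bitmap_new(nb_temps), and free() it when done. The trade-off is
an allocation per use in exchange for not carrying a maximum-sized
bitmap around.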

Performance-wise we lose roughly 2-3% in a translation-heavy workload
such as booting and shutting down debian-arm:

 Performance counter stats for 'taskset -c 0 arm-softmmu/qemu-system-arm \
     -machine type=virt -nographic -smp 1 -m 4096 \
     -netdev user,id=unet,hostfwd=tcp::2222-:22 \
     -device virtio-net-device,netdev=unet \
     -drive file=die-on-boot.qcow2,id=myblock,index=0,if=none \
     -device virtio-blk-device,drive=myblock \
     -kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \
     -name arm,debug-threads=on -smp 1' (10 runs):

Before:
      19489.126318 task-clock              # 0.960 CPUs utilized          ( +- 0.96% )
            23,697 context-switches        # 0.001 M/sec                  ( +- 0.51% )
                 1 CPU-migrations          # 0.000 M/sec
            19,953 page-faults             # 0.001 M/sec                  ( +- 0.40% )
    56,214,402,410 cycles                  # 2.884 GHz                    ( +- 0.95% ) [83.34%]
    25,516,669,513 stalled-cycles-frontend # 45.39% frontend cycles idle  ( +- 0.69% ) [83.33%]
    17,266,165,747 stalled-cycles-backend  # 30.71% backend cycles idle   ( +- 0.59% ) [66.66%]
    79,007,843,327 instructions            # 1.41 insns per cycle
                                           # 0.32 stalled cycles per insn ( +- 1.19% ) [83.34%]
    13,136,600,416 branches                # 674.048 M/sec                ( +- 1.29% ) [83.34%]
       274,715,270 branch-misses           # 2.09% of all branches        ( +- 0.79% ) [83.33%]

      20.300335944 seconds time elapsed                                   ( +- 0.55% )

After:
      19917.737030 task-clock              # 0.955 CPUs utilized          ( +- 0.74% )
            23,973 context-switches        # 0.001 M/sec                  ( +- 0.37% )
                 1 CPU-migrations          # 0.000 M/sec
            19,824 page-faults             # 0.001 M/sec                  ( +- 0.38% )
    57,380,269,537 cycles                  # 2.881 GHz                    ( +- 0.70% ) [83.34%]
    26,462,452,508 stalled-cycles-frontend # 46.12% frontend cycles idle  ( +- 0.65% ) [83.34%]
    17,970,546,047 stalled-cycles-backend  # 31.32% backend cycles idle   ( +- 0.64% ) [66.67%]
    79,527,238,334 instructions            # 1.39 insns per cycle
                                           # 0.33 stalled cycles per insn ( +- 0.79% ) [83.33%]
    13,272,362,192 branches                # 666.359 M/sec                ( +- 0.83% ) [83.34%]
       278,357,773 branch-misses           # 2.10% of all branches        ( +- 0.65% ) [83.33%]

      20.850558455 seconds time elapsed                                   ( +- 0.55% )

That is, a 2.7% slowdown in elapsed time.
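
For reference, the slowdown is the relative increase of the "After"
value over the "Before" value. A small sketch of that arithmetic
(slowdown_percent is a hypothetical helper, not part of this patch):

```c
#include <assert.h>

/*
 * Relative slowdown of `after` versus `before`, in percent.
 * Hypothetical helper for illustration only.
 */
static double slowdown_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

Fed with the perf numbers above, this gives about 2.7% for elapsed
time (20.300335944 s vs 20.850558455 s), about 2.2% for task-clock
and about 2.1% for cycles.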