We’ve noticed inconsistent behavior when running a large number of Aspeed
AST2600 executions. It seems to be tied to a race condition in the SMP boot
when executing under QEMU TCG, and we were wondering what a good mitigation
strategy might be.
The problem first shows up as part of SMP boot. On a run that’s likely to later
run into issues, we’ll see something like:
```
[ 0.008350] smp: Bringing up secondary CPUs ...
[ 1.168584] CPU1: failed to come online
[ 1.187277] smp: Brought up 1 node, 1 CPU
```
Compare this with a run that is more likely to succeed:
```
[ 0.080313] smp: Bringing up secondary CPUs ...
[ 0.093166] smp: Brought up 1 node, 2 CPUs
[ 0.093345] SMP: Total of 2 processors activated (4800.00 BogoMIPS).
```
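When sweeping over many runs, the two outcomes above can be told apart mechanically. The following is a minimal sketch, assuming the console output of each run has been captured as a string; the function name `smp_boot_ok` and the `expected_cpus` parameter are hypothetical, and the patterns are based only on the log lines quoted above:

```python
import re

def smp_boot_ok(log: str, expected_cpus: int = 2) -> bool:
    """Return True if the captured boot log shows all expected CPUs online."""
    # The failing runs print an explicit message for the secondary core.
    if "failed to come online" in log:
        return False
    # Matches both "1 node, 1 CPU" and "1 node, 2 CPUs".
    m = re.search(r"smp: Brought up \d+ nodes?, (\d+) CPUs?", log)
    return m is not None and int(m.group(1)) >= expected_cpus

bad_log = """[ 0.008350] smp: Bringing up secondary CPUs ...
[ 1.168584] CPU1: failed to come online
[ 1.187277] smp: Brought up 1 node, 1 CPU
"""
good_log = """[ 0.080313] smp: Bringing up secondary CPUs ...
[ 0.093166] smp: Brought up 1 node, 2 CPUs
"""
print(smp_boot_ok(bad_log), smp_boot_ok(good_log))
```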
It’s somewhat reliably reproducible by running the ast2600-evb machine with an
OpenBMC image, using ‘-icount auto’ to slow execution and make the race
condition more frequent (it also happens without this option, but it is easier
to debug when we can reproduce it):
```
./aarch64-softmmu/qemu-system-aarch64 -machine ast2600-evb -nographic \
    -drive file=~/bmc-bin/image-obmc-ast2600,if=mtd,bus=0,unit=0,snapshot=on \
    -nic user -icount auto
```
Our current hypothesis is that the problem comes up in the platform’s U-Boot.
As part of the boot, the secondary core waits for the SMP mailbox to get a
magic number written by the primary core:
https://github.com/AspeedTech-BMC/u-boot/blob/aspeed-master-v2019.04/arch/arm/mach-aspeed/ast2600/platform.S#L168
However, this memory address is cleared on boot:
https://github.com/AspeedTech-BMC/u-boot/blob/aspeed-master-v2019.04/arch/arm/mach-aspeed/ast2600/platform.S#L146
The race condition occurs if the primary core runs far ahead of the secondary
core: if the primary core gets to the point where it signals the secondary
core’s mailbox before the secondary core gets past the point where it does the
initial reset and starts waiting, the reset will clear the signal, and then the
secondary core will never get past the point where it’s looping in
`poll_smp_mbox_ready`.
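The ordering dependence described above can be illustrated with a toy model. This is only a sketch of the hypothesis, not the actual U-Boot code: the `MAGIC` value, the function name, and the bounded poll loop are assumptions made for illustration (the real secondary core polls forever).

```python
MAGIC = 0xBABECAFE  # placeholder magic number, assumed for illustration

def secondary_comes_online(primary_signals_first: bool, max_polls: int = 1000) -> bool:
    """Model the mailbox handshake for one of the two possible orderings."""
    mailbox = {"ready": 0xDEAD}  # arbitrary contents before boot

    def primary_signal():
        # Primary core writes the magic number to release the secondary.
        mailbox["ready"] = MAGIC

    def secondary_reset():
        # Secondary core clears the mailbox early in its boot path.
        mailbox["ready"] = 0

    if primary_signals_first:
        # Primary ran far ahead: its signal is wiped by the later reset.
        primary_signal()
        secondary_reset()
    else:
        # Expected order: reset first, then the signal arrives.
        secondary_reset()
        primary_signal()

    # Stand-in for poll_smp_mbox_ready, bounded so the model terminates.
    for _ in range(max_polls):
        if mailbox["ready"] == MAGIC:
            return True
    return False

print(secondary_comes_online(primary_signals_first=False))
print(secondary_comes_online(primary_signals_first=True))
```

In this model the secondary core only comes online when its reset happens before the primary core's write, which matches the behavior we see in the failing runs.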
We’ve observed this race happening by dumping all SCU reads and writes, and
validated that this is the problem by using a modified `platform.S` that
doesn’t clear the `SCU_SMP_READY` mailbox on reset. However, we would rather
not have to use a modified version of SMP boot just for QEMU TCG execution.