Hi,

An interesting bug was reported on #qemu today. It was bisected to
8d04fb55 (drop global lock for TCG) and only occurred when QEMU was run
with taskset -c 0. Originally the fingers were pointed at mttcg, but it
occurs in both single- and multi-threaded modes.

I think the problem is that qemu_system_reset_request() is racy when
resetting a running CPU. AFAICT:
- Guest resets board, writing to some hw address (e.g. arm_sysctl_write)
- This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
- We exit iowrite and drop the BQL
- vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
- We start writing new values to CPU env while still in TCG code
- CHAOS!

The general solution for this is to ensure these sorts of tasks are done
with safe work in the CPU's context, when we know nothing else is
running.

It seems this is probably best done by modifying
qemu_system_reset_request to queue work up on current_cpu and execute it
as safe work. I don't think the vl.c thread should ever be messing about
with calling cpu_reset directly.
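
To make the idea concrete, here is a toy, self-contained C model of the
two approaches. It is not QEMU code: ToyCPU and every toy_* name are made
up for illustration, the "queue" is a single slot, and everything runs in
one thread. A real patch would presumably use QEMU's existing safe-work
mechanism (async_safe_run_on_cpu, as far as I can tell) rather than
anything like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are QEMU's real types or functions. */
typedef struct ToyCPU ToyCPU;
typedef void (*toy_work_fn)(ToyCPU *cpu);

struct ToyCPU {
    int pc;              /* stand-in for the CPU env */
    bool in_tcg;         /* true while "translated code" is executing */
    int unsafe_writes;   /* env writes that happened while in_tcg was set */
    toy_work_fn pending; /* single-slot stand-in for a safe-work queue */
};

/* Stand-in for cpu_reset(): overwrites env, noting whether that was safe. */
static void toy_cpu_reset(ToyCPU *cpu)
{
    if (cpu->in_tcg) {
        cpu->unsafe_writes++;
    }
    cpu->pc = 0;
}

/* Racy variant: the reset touches env immediately, mid-execution. */
static void toy_reset_request_direct(ToyCPU *cpu)
{
    toy_cpu_reset(cpu);
}

/* Proposed variant: only queue the reset; the CPU loop runs it later. */
static void toy_reset_request_queued(ToyCPU *cpu)
{
    cpu->pending = toy_cpu_reset;
}

/*
 * One iteration of a toy CPU loop: execute a block (during which the guest
 * may hit the reset register via io_hook), then drain queued work at the
 * safe point, once nothing is executing.
 */
static void toy_cpu_exec_block(ToyCPU *cpu, toy_work_fn io_hook)
{
    cpu->in_tcg = true;
    cpu->pc += 4;
    if (io_hook) {
        io_hook(cpu);
    }
    cpu->in_tcg = false;

    if (cpu->pending) {
        toy_work_fn fn = cpu->pending;
        cpu->pending = NULL;
        assert(!cpu->in_tcg); /* safe work must never run inside a block */
        fn(cpu);
    }
}
```

Passing toy_reset_request_direct as the io_hook bumps unsafe_writes,
because env is rewritten while the block is still executing. Passing
toy_reset_request_queued leaves unsafe_writes at zero, and the reset
still takes effect once the block has finished.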