[Qemu-devel] Torn read/write possible on aarch64/x86-64 MTTCG?
From: Andrew Baumann
Subject: [Qemu-devel] Torn read/write possible on aarch64/x86-64 MTTCG?
Date: Mon, 24 Jul 2017 19:05:33 +0000
Hi all,
I'm trying to track down what appears to be a translation bug in either the
aarch64 target or x86_64 TCG (in multithreaded mode). The symptoms are entirely
consistent with a torn read/write -- that is, a 64-bit load or store that was
translated to two 32-bit loads and stores -- but that's obviously not what
happens in the common path through the translation for this code, so I'm
wondering: are there any cases in which qemu will split a 64-bit memory access
into two 32-bit accesses?
The code:
Guest CPU A writes a 64-bit value to an aligned memory location that was
previously 0, using a regular store; e.g.:
f9000034 str x20,[x1]
Guest CPU B (which is busy-waiting) reads a value from the same location:
f9400280 ldr x0,[x20]
The symptom:
CPU B loads a value that is neither NULL nor the value written. Instead, x0
gets only the low 32 bits of the value written (high bits are all zero). By the
time this value is dereferenced (a few instructions later) and the exception
handlers run, the memory location from which it was loaded has the correct
64-bit value with a non-zero upper half.
Obviously on real ARM hardware, memory barriers are critical, and indeed the code has
such barriers in it, but I'm assuming that any possible mistranslation of the
barriers is irrelevant because for a 64-bit load and a 64-bit store you should
get all or nothing. Other clues that may be relevant: the code is _near_ a
LDREX/STREX pair (the busy-waiting is used to resolve a race when updating
another variable), and the busy-wait loop has a yield instruction in it
(although those appear to be no-ops with MTTCG).
The bug repros more easily with more guest VCPUs, and more load on the host
(i.e. more context switching to expose the race). It doesn't repro for the
single-threaded TCG. Unfortunately it's hard to get detailed trace information,
because the bug only repros roughly once in 40 attempts, and it's a long
way into the guest OS boot before it arises.
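A standalone stress test might make the symptom easier to chase than a full
guest boot. The following is only a sketch under stated assumptions, not the
guest code: the names is_torn, run_torn_check, and shared_word are
hypothetical, and the value pattern (identical upper and lower 32-bit halves)
is chosen purely so that a torn load is recognizable. The relaxed C11 atomics
are expected, but not guaranteed, to compile to plain 64-bit loads and stores
like the str/ldr pair above; that should be checked in the disassembly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed test pattern: every stored value has identical upper and lower
 * 32-bit halves, so a mismatch means the halves came from different
 * stores. The initial value 0 also satisfies the pattern. */
int is_torn(uint64_t v)
{
    return (uint32_t)(v >> 32) != (uint32_t)v;
}

static _Atomic uint64_t shared_word;   /* naturally aligned 64-bit location */
static atomic_int writer_done;

/* Plays the role of "CPU A": repeatedly stores full 64-bit values. */
static void *writer(void *arg)
{
    uint32_t n = (uint32_t)(uintptr_t)arg;
    for (uint32_t i = 1; i <= n; i++) {
        uint64_t v = ((uint64_t)i << 32) | i;
        atomic_store_explicit(&shared_word, v, memory_order_relaxed);
    }
    atomic_store_explicit(&writer_done, 1, memory_order_release);
    return NULL;
}

/* Plays the role of "CPU B": busy-waits, loading the location and counting
 * any value whose halves disagree. Returns the number of torn loads seen. */
long run_torn_check(uint32_t iterations)
{
    pthread_t tid;
    long torn = 0;

    atomic_store(&shared_word, 0);
    atomic_store(&writer_done, 0);

    int rc = pthread_create(&tid, NULL, writer,
                            (void *)(uintptr_t)iterations);
    assert(rc == 0);
    (void)rc;

    while (!atomic_load_explicit(&writer_done, memory_order_acquire)) {
        uint64_t v = atomic_load_explicit(&shared_word, memory_order_relaxed);
        if (is_torn(v))
            torn++;
    }

    pthread_join(tid, NULL);
    return torn;
}
```

Built with something like `cc -O2 -pthread` and called from a small main() as
run_torn_check(1000000), this could be run natively, under single-threaded
TCG, and under MTTCG. A non-zero count in only one of those configurations
would point at the translation rather than at the guest code.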
I'm not yet 100% convinced this is a qemu bug -- the obvious path through the
translator for those instructions does 64-bit memory accesses on the host --
but at the same time, it has never been seen outside qemu, and after staring
long and hard at the guest code, we're pretty sure it's correct. It's also
extremely unlikely to be a wild write, given that it occurs on a wide variety
of guest call-stacks, and the memory is later inconsistent with what was loaded.
Any clues or debugging suggestions appreciated!
Thanks,
Andrew