
Re: [PATCH] tcg: fix guest memory ordering enforcement


From: Richard Henderson
Subject: Re: [PATCH] tcg: fix guest memory ordering enforcement
Date: Sat, 30 Apr 2022 15:03:36 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

On 4/28/22 04:32, Redha Gouicem wrote:
This commit allows memory ordering enforcement to be performed more
precisely. The previous scheme, with fences always inserted before the
memory access, made it impossible to correctly enforce the x86 model on
weakly ordered architectures such as arm. With this change, the memory
models of guests can be defined more precisely, with a fence before and a
fence after the access. This allows for a precise mapping of the ordering,
one that relies less on what kinds of fences the host architecture provides.

Signed-off-by: Redha Gouicem <redha.gouicem@gmail.com>
---
  target/alpha/cpu.h  |  4 ++++
  target/arm/cpu.h    |  4 ++++
  target/avr/cpu.h    |  4 ++++
  target/hppa/cpu.h   |  4 ++++
  target/i386/cpu.h   |  4 ++++
  target/mips/cpu.h   |  4 ++++
  target/ppc/cpu.h    |  4 ++++
  target/riscv/cpu.h  |  4 ++++
  target/s390x/cpu.h  |  4 ++++
  target/xtensa/cpu.h |  4 ++++
  tcg/tcg-op.c        | 19 ++++++++++++-------
  11 files changed, 52 insertions(+), 7 deletions(-)
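
As I read the commit message, the proposed scheme amounts to emitting a
guest-model fence on both sides of each access, roughly like the sketch
below (illustrative only, not the patch as posted; the BEF/AFT macros are
the ones the patch adds):

  /* Sketch of a "fence before / fence after" guest load. */
  if (TCG_GUEST_MO_BEF_LD & ~TCG_TARGET_DEFAULT_MO) {
      tcg_gen_mb(TCG_GUEST_MO_BEF_LD | TCG_BAR_SC);
  }
  /* ... emit the load itself ... */
  if (TCG_GUEST_MO_AFT_LD & ~TCG_TARGET_DEFAULT_MO) {
      tcg_gen_mb(TCG_GUEST_MO_AFT_LD | TCG_BAR_SC);
  }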

First of all, patches that don't compile aren't particularly helpful. You've missed updating 10 of the 20 targets.

I do not believe your assertion that the current scheme "makes it impossible to correctly enforce the memory model". Please provide your rationale.

  /* Alpha processors have a weak memory model */
  #define TCG_GUEST_DEFAULT_MO      (0)
+#define TCG_GUEST_MO_BEF_LD       (0)
+#define TCG_GUEST_MO_AFT_LD       (0)
+#define TCG_GUEST_MO_BEF_ST       (0)
+#define TCG_GUEST_MO_AFT_ST       (0)

There's no new information here.

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9661f9fbd1..c6a7052d58 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -29,6 +29,10 @@
  /* The x86 has a strong memory model with some store-after-load re-ordering */
  #define TCG_GUEST_DEFAULT_MO      (TCG_MO_ALL & ~TCG_MO_ST_LD)
+#define TCG_GUEST_MO_BEF_LD       (0)
+#define TCG_GUEST_MO_AFT_LD       (TCG_MO_LD_ST | TCG_MO_LD_LD)
+#define TCG_GUEST_MO_BEF_ST       (TCG_MO_ST_ST)
+#define TCG_GUEST_MO_AFT_ST       (0)

Or even here -- you're making the executive decision that barriers go after loads and before stores. Note that

  TCG_GUEST_MO_AFT_LD == (TCG_GUEST_DEFAULT_MO & (TCG_MO_LD_ST | TCG_MO_LD_LD))
  TCG_GUEST_MO_BEF_ST == (TCG_GUEST_DEFAULT_MO & TCG_MO_ST_ST)
  TCG_GUEST_MO_AFT_ST == (TCG_GUEST_DEFAULT_MO & TCG_MO_ST_LD)

You could just as easily make this selection in tcg-op.c alone.
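
Something along these lines would do, as a minimal sketch that uses only
the existing TCG_GUEST_DEFAULT_MO (helper names made up for illustration):

  /* Hypothetical helpers local to tcg/tcg-op.c, deriving the per-access
     masks from TCG_GUEST_DEFAULT_MO via the identities above. */
  static TCGBar guest_mo_after_ld(void)
  {
      return TCG_GUEST_DEFAULT_MO & (TCG_MO_LD_LD | TCG_MO_LD_ST);
  }

  static TCGBar guest_mo_before_st(void)
  {
      return TCG_GUEST_DEFAULT_MO & TCG_MO_ST_ST;
  }

  static TCGBar guest_mo_after_st(void)
  {
      return TCG_GUEST_DEFAULT_MO & TCG_MO_ST_LD;
  }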

I'm not sure why putting a ST_ST barrier before a store is better or worse than putting a ST_ST barrier after the store. I could imagine that putting wmb barriers before the operation schedules better with the surrounding code, but I'd want that decision explicitly stated.
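
For concreteness, the two placements being weighed, applied uniformly to
every store (so the ST_ST edge between any two consecutive stores is
covered either way):

  /* placement A: wmb ahead of each store */
  tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);
  /* ... emit the store ... */

  /* placement B: wmb behind each store */
  /* ... emit the store ... */
  tcg_gen_mb(TCG_MO_ST_ST | TCG_BAR_SC);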

There *are* still missing barriers with the cpu_ld*_data* and cpu_st*_data* functions, as used by out-of-line helpers. Certainly I expect those to have less impact than the generated code, but they are still required.
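
As a rough sketch of what that would mean (the wrapper below is my
assumption of the shape, not something the tree has today), each such
helper needs host barriers bracketing the actual access:

  /* Hypothetical wrapper, illustration only: cpu_ldl_data() is the real
     helper; the smp_mb() placement is the assumption. */
  static uint32_t cpu_ldl_data_ordered(CPUArchState *env, abi_ptr addr)
  {
      uint32_t ret;

      smp_mb();   /* whatever the guest model requires before a load */
      ret = cpu_ldl_data(env, addr);
      smp_mb();   /* whatever the guest model requires after a load */
      return ret;
  }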


r~


