Re: [PATCH v10 1/8] memory: prevent dma-reentracy issues
From: Igor Mammedov
Date: Tue, 27 Aug 2024 17:49:04 +0200
On Wed, 21 Aug 2024 15:25:18 +0200
Igor Mammedov <imammedo@redhat.com> wrote:
> On Thu, 27 Apr 2023 17:10:06 -0400
> Alexander Bulekov <alxndr@bu.edu> wrote:
>
> > Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
> > This flag is set/checked prior to calling a device's MemoryRegion
> > handlers, and set when device code initiates DMA. The purpose of this
> > flag is to prevent two types of DMA-based reentrancy issues:
> >
> > 1.) mmio -> dma -> mmio case
> > 2.) bh -> dma write -> mmio case
>
> Alexander, with 9.0
> we are getting
>
> warning: Blocked re-entrant IO on MemoryRegion: acpi-cpu-hotplug at addr: 0x0
>
> during CPU hot-unplug. To my knowledge there shouldn't be any DMA involved
> there.
> The only access should be either from the guest kernel or from firmware
> (the latter runs in SMM mode).
>
> The question is how this could happen on an MMIO access, which should be
> guarded by the BQL?
For posterity, a reproducer (RHEL9.4 Haswell host + upstream QEMU/edk2,
https://issues.redhat.com/browse/RHEL-56154):
./qemu-system-x86_64 --enable-kvm -smp 2,maxcpus=24,cores=12,threads=1,dies=1,sockets=2 \
-cpu host \
-blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/tmp/qemu_build/pc-bios/edk2-x86_64-secure-code.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
-blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/tmp/edk_VARS.raw", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
-M q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-m 4G \
-monitor stdio \
rhel9.5.raw
Once the guest has booted:
(qemu) device_add host-x86_64-cpu,id=vcpu1,socket-id=1,core-id=10,die-id=0,thread-id=0
(qemu) device_add host-x86_64-cpu,id=vcpu2,socket-id=1,core-id=11,die-id=0,thread-id=0
(qemu) device_del vcpu1
(qemu) qemu-system-x86_64: warning: Blocked re-entrant IO on MemoryRegion: acpi-cpu-hotplug at addr: 0x0
> And where to start digging to find out if it's a genuine issue,
> or whether it's safe to use the big hammer and disable the reentrancy guard?
I'm hesitant to use the big hammer so far (though it would make the warning
go away).
What happens is that cpu_remove_sync() temporarily releases the BQL,
which lets a conflicting access happen.
But I think the unexpected access shouldn't be there in the first place,
so the guard looks pretty legit at this point.
Let's see what Gerd finds out from the edk2 point of view.
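
To make the mechanism concrete, here is a minimal standalone sketch of how the
guard reacts when a second access reaches the same device before the first one
has returned. It is not QEMU code: the types are cut-down stand-ins, the names
guarded_access(), hotplug_write(), run_demo() and the blocked/handled counters
are made up for illustration, and the window in which the BQL is dropped is
modelled by a single-threaded nested call rather than by a second vCPU thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Cut-down stand-ins for the QEMU types touched by the patch (assumption). */
typedef struct {
    bool engaged_in_io;
} MemReentrancyGuard;

typedef struct {
    MemReentrancyGuard mem_reentrancy_guard;
} DeviceState;

typedef struct MemoryRegion MemoryRegion;
typedef void (*write_fn)(MemoryRegion *mr, unsigned addr);

struct MemoryRegion {
    const char *name;
    DeviceState *dev;
    bool disable_reentrancy_guard;   /* the "big hammer" opt-out */
    write_fn write;
    int blocked;                     /* hypothetical counters, demo only */
    int handled;
};

typedef enum { MEMTX_OK = 0, MEMTX_ACCESS_ERROR = 1 } MemTxResult;

/* Simplified model of the check added to access_with_adjusted_size(). */
MemTxResult guarded_access(MemoryRegion *mr, unsigned addr)
{
    bool guarded = mr->dev && !mr->disable_reentrancy_guard;

    assert(mr->write != NULL);
    if (guarded) {
        if (mr->dev->mem_reentrancy_guard.engaged_in_io) {
            fprintf(stderr, "Blocked re-entrant IO on MemoryRegion: "
                    "%s at addr: 0x%X\n", mr->name, addr);
            mr->blocked++;
            return MEMTX_ACCESS_ERROR;
        }
        mr->dev->mem_reentrancy_guard.engaged_in_io = true;
    }

    mr->write(mr, addr);
    mr->handled++;

    if (guarded) {
        mr->dev->mem_reentrancy_guard.engaged_in_io = false;
    }
    return MEMTX_OK;
}

static int depth;

/*
 * Handler that models the problematic window: while the first access is
 * still in flight, a second access to the same region arrives (once).
 */
static void hotplug_write(MemoryRegion *mr, unsigned addr)
{
    if (depth == 0) {
        depth++;
        guarded_access(mr, addr);    /* the conflicting access */
        depth--;
    }
}

/* Issue one outer write and report how many accesses were blocked. */
int run_demo(bool disable_guard)
{
    DeviceState dev = { { false } };
    MemoryRegion mr = { "acpi-cpu-hotplug", &dev, disable_guard,
                        hotplug_write, 0, 0 };

    guarded_access(&mr, 0x0);
    return mr.blocked;
}
```

Calling run_demo(false) from a main() of your own blocks the nested access and
prints the warning; run_demo(true) lets both accesses through, which is why the
opt-out would silence the message without explaining where the second access
comes from.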
>
>
> > These issues have led to problems such as stack-exhaustion and
> > use-after-frees.
> >
> > Summary of the problem from Peter Maydell:
> > https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
> >
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
> > Resolves: CVE-2023-0330
> >
> > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > Reviewed-by: Thomas Huth <thuth@redhat.com>
> > ---
> > include/exec/memory.h | 5 +++++
> > include/hw/qdev-core.h | 7 +++++++
> > softmmu/memory.c | 16 ++++++++++++++++
> > 3 files changed, 28 insertions(+)
> >
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 15ade918ba..e45ce6061f 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -767,6 +767,8 @@ struct MemoryRegion {
> > bool is_iommu;
> > RAMBlock *ram_block;
> > Object *owner;
> > + /* owner as TYPE_DEVICE. Used for re-entrancy checks in MR access hotpath */
> > + DeviceState *dev;
> >
> > const MemoryRegionOps *ops;
> > void *opaque;
> > @@ -791,6 +793,9 @@ struct MemoryRegion {
> > unsigned ioeventfd_nb;
> > MemoryRegionIoeventfd *ioeventfds;
> > RamDiscardManager *rdm; /* Only for RAM */
> > +
> > + /* For devices designed to perform re-entrant IO into their own IO MRs */
> > + bool disable_reentrancy_guard;
> > };
> >
> > struct IOMMUMemoryRegion {
> > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> > index bd50ad5ee1..7623703943 100644
> > --- a/include/hw/qdev-core.h
> > +++ b/include/hw/qdev-core.h
> > @@ -162,6 +162,10 @@ struct NamedClockList {
> > QLIST_ENTRY(NamedClockList) node;
> > };
> >
> > +typedef struct {
> > + bool engaged_in_io;
> > +} MemReentrancyGuard;
> > +
> > /**
> > * DeviceState:
> > * @realized: Indicates whether the device has been fully constructed.
> > @@ -194,6 +198,9 @@ struct DeviceState {
> > int alias_required_for_version;
> > ResettableState reset;
> > GSList *unplug_blockers;
> > +
> > + /* Is the device currently in mmio/pio/dma? Used to prevent re-entrancy */
> > + MemReentrancyGuard mem_reentrancy_guard;
> > };
> >
> > struct DeviceListener {
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index b1a6cae6f5..fe23f0e5ce 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -542,6 +542,18 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
> > access_size_max = 4;
> > }
> >
> > + /* Do not allow more than one simultaneous access to a device's IO Regions */
> > + if (mr->dev && !mr->disable_reentrancy_guard &&
> > + !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
> > + if (mr->dev->mem_reentrancy_guard.engaged_in_io) {
> > + warn_report("Blocked re-entrant IO on "
> > + "MemoryRegion: %s at addr: 0x%" HWADDR_PRIX,
> > + memory_region_name(mr), addr);
> > + return MEMTX_ACCESS_ERROR;
> > + }
> > + mr->dev->mem_reentrancy_guard.engaged_in_io = true;
> > + }
> > +
> > /* FIXME: support unaligned access? */
> > access_size = MAX(MIN(size, access_size_max), access_size_min);
> > access_mask = MAKE_64BIT_MASK(0, access_size * 8);
> > @@ -556,6 +568,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
> > access_mask, attrs);
> > }
> > }
> > + if (mr->dev) {
> > + mr->dev->mem_reentrancy_guard.engaged_in_io = false;
> > + }
> > return r;
> > }
> >
> > @@ -1170,6 +1185,7 @@ static void memory_region_do_init(MemoryRegion *mr,
> > }
> > mr->name = g_strdup(name);
> > mr->owner = owner;
> > + mr->dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
> > mr->ram_block = NULL;
> >
> > if (name) {
>