[PULL 36/52] virtio-mem: Support "prealloc=on" option
From: Michael S. Tsirkin
Subject: [PULL 36/52] virtio-mem: Support "prealloc=on" option
Date: Thu, 6 Jan 2022 08:17:59 -0500
From: David Hildenbrand <david@redhat.com>
For scarce memory resources, such as hugetlb, we want to be able to
preallocate such memory resources in order to not crash later on access. On
simple user errors we could otherwise easily run out of memory resources
and crash the VM -- pretty much undesired.
For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.
Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and to fail plugging memory gracefully and warn the user in
case preallocation fails.
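The ordering described here (preallocate first, then notify, and undo on
any failure) can be sketched in isolation. The following is a minimal,
self-contained illustration and not QEMU code: PlugOps, plug_block() and
the fake_* helpers are hypothetical names introduced only for this sketch,
standing in for preallocation, plug notification and range discard.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callbacks standing in for the real backend operations. */
typedef struct PlugOps {
    int (*prealloc)(uint64_t offset, uint64_t size);    /* 0 on success */
    int (*notify_plug)(uint64_t offset, uint64_t size); /* 0 on success */
    void (*discard)(uint64_t offset, uint64_t size);
} PlugOps;

/*
 * Plug one range: optionally preallocate, then notify listeners. If either
 * step fails, discard the range again (it may have been partially
 * populated) and report -EBUSY so the request fails gracefully.
 */
static int plug_block(const PlugOps *ops, bool prealloc,
                      uint64_t offset, uint64_t size)
{
    static bool warned; /* warn only once to avoid flooding the log */
    int ret = 0;

    if (prealloc && ops->prealloc(offset, size)) {
        if (!warned) {
            fprintf(stderr, "warning: preallocation failed, not plugging\n");
            warned = true;
        }
        ret = -EBUSY;
    }
    if (!ret) {
        ret = ops->notify_plug(offset, size);
    }
    if (ret) {
        ops->discard(offset, size);
        return -EBUSY;
    }
    return 0;
}

/* Fake operations, for demonstration only. */
static int discard_calls;

static int fake_prealloc_ok(uint64_t offset, uint64_t size)
{
    (void)offset; (void)size;
    return 0;
}

static int fake_prealloc_fail(uint64_t offset, uint64_t size)
{
    (void)offset; (void)size;
    return -ENOMEM; /* e.g. no free huge pages left */
}

static int fake_notify_ok(uint64_t offset, uint64_t size)
{
    (void)offset; (void)size;
    return 0;
}

static void fake_discard(uint64_t offset, uint64_t size)
{
    (void)offset; (void)size;
    discard_calls++;
}
```

With fake_prealloc_fail plugged in, plug_block() returns -EBUSY and
discards the range exactly once; with prealloc disabled, the same setup
succeeds because the preallocation step is skipped entirely.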
A common use case for hugetlb will be using "reserve=off,prealloc=off" for
the memory backend and "prealloc=on" for the virtio-mem device. This
way, no huge pages will be reserved for the process, but we can recover
if there are no actual huge pages when plugging memory. Libvirt is
already prepared for this.
Note that preallocation cannot protect from the OOM killer -- which
holds true for any kind of preallocation in QEMU. It's primarily useful
only for scarce memory resources such as hugetlb, or shared file-backed
memory. It's of little use for ordinary anonymous memory that can be
swapped, KSM merged, ... but we won't forbid it.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
include/hw/virtio/virtio-mem.h | 4 ++++
hw/virtio/virtio-mem.c | 39 ++++++++++++++++++++++++++++++----
2 files changed, 39 insertions(+), 4 deletions(-)
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index a5dd6a493b..0ac7bcb3b6 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -30,6 +30,7 @@ OBJECT_DECLARE_TYPE(VirtIOMEM, VirtIOMEMClass,
 #define VIRTIO_MEM_REQUESTED_SIZE_PROP "requested-size"
 #define VIRTIO_MEM_BLOCK_SIZE_PROP "block-size"
 #define VIRTIO_MEM_ADDR_PROP "memaddr"
+#define VIRTIO_MEM_PREALLOC_PROP "prealloc"
 struct VirtIOMEM {
     VirtIODevice parent_obj;
@@ -62,6 +63,9 @@ struct VirtIOMEM {
     /* block size and alignment */
     uint64_t block_size;
+    /* whether to prealloc memory when plugging new blocks */
+    bool prealloc;
+
     /* notifiers to notify when "size" changes */
     NotifierList size_change_notifiers;
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 341c3fa2c1..ab975ff566 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -429,10 +429,40 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem,
uint64_t start_gpa,
             return -EBUSY;
         }
         virtio_mem_notify_unplug(vmem, offset, size);
-    } else if (virtio_mem_notify_plug(vmem, offset, size)) {
-        /* Could be a mapping attempt resulted in memory getting populated. */
-        ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size);
-        return -EBUSY;
+    } else {
+        int ret = 0;
+
+        if (vmem->prealloc) {
+            void *area = memory_region_get_ram_ptr(&vmem->memdev->mr) + offset;
+            int fd = memory_region_get_fd(&vmem->memdev->mr);
+            Error *local_err = NULL;
+
+            os_mem_prealloc(fd, area, size, 1, &local_err);
+            if (local_err) {
+                static bool warned;
+
+                /*
+                 * Warn only once, we don't want to fill the log with these
+                 * warnings.
+                 */
+                if (!warned) {
+                    warn_report_err(local_err);
+                    warned = true;
+                } else {
+                    error_free(local_err);
+                }
+                ret = -EBUSY;
+            }
+        }
+        if (!ret) {
+            ret = virtio_mem_notify_plug(vmem, offset, size);
+        }
+
+        if (ret) {
+            /* Could be preallocation or a notifier populated memory. */
+            ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size);
+            return -EBUSY;
+        }
     }
     virtio_mem_set_bitmap(vmem, start_gpa, size, plug);
     return 0;
@@ -1108,6 +1138,7 @@ static void virtio_mem_instance_init(Object *obj)
 static Property virtio_mem_properties[] = {
     DEFINE_PROP_UINT64(VIRTIO_MEM_ADDR_PROP, VirtIOMEM, addr, 0),
     DEFINE_PROP_UINT32(VIRTIO_MEM_NODE_PROP, VirtIOMEM, node, 0),
+    DEFINE_PROP_BOOL(VIRTIO_MEM_PREALLOC_PROP, VirtIOMEM, prealloc, false),
     DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev,
                      TYPE_MEMORY_BACKEND, HostMemoryBackend *),
     DEFINE_PROP_END_OF_LIST(),
--
MST