[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v1 2/3] virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIB
From: |
Michal Prívozník |
Subject: |
Re: [PATCH v1 2/3] virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE |
Date: |
Tue, 30 Nov 2021 16:34:12 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.3.1 |
On 11/30/21 10:28, David Hildenbrand wrote:
> With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading
> unplugged memory is not supported. We have to fail feature negotiation
> in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.
>
> First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle
> memory backends (or architectures) without support for the shared zeropage
> in the hypervisor cleanly. Without the shared zeropage, even reading an
> unpopulated virtual memory location can populate real memory and
> consequently consume memory in the hypervisor. We have a guaranteed shared
> zeropage only on MAP_PRIVATE anonymous memory.
>
> Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default
> long-term as even populating the shared zeropage can be problematic: for
> example, without THP support (possible) or without support for the shared
> huge zeropage with THP (unlikely), the PTE page tables to hold the shared
> zeropage entries can consume quite some memory that cannot be reclaimed
> easily.
>
> Third, there are other optimizations+features (e.g., protection of
> unplugged memory, reducing the total memory slot size and bitmap sizes)
> that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.
>
> We really only support x86 targets with virtio-mem for now (and
> Linux similarly only support x86), but that might change soon, so prepare
> for different targets already.
>
> Add a new "unplugged-inaccessible" tristate property for x86 targets:
> - "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy
> guests working.
> - "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests
> from using the device.
> - "auto" selects the default based on support for the shared zeropage.
>
> Warn in case the property is set to "off" and we don't have support for the
> shared zeropage.
>
> For existing compat machines, the property will default to "off", to
> not change the behavior but eventually warn about a problematic setup.
> Short-term, we'll set the property default to "auto" for new QEMU machines.
> Mid-term, we'll set the property default to "on" for new QEMU machines.
> Long-term, we'll deprecate the parameter and disallow legacy
> guests completely.
>
> The property has to match on the migration source and destination. "auto"
> will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long
> as the qemu command line (esp. memdev) match -- so "auto" is good enough
> for migration purposes and the parameter doesn't have to be migrated
> explicitly.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> hw/virtio/virtio-mem.c | 63 ++++++++++++++++++++++++++++++++++
> include/hw/virtio/virtio-mem.h | 8 +++++
> 2 files changed, 71 insertions(+)
>
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index d5a578142b..1e57156e81 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -32,6 +32,14 @@
> #include CONFIG_DEVICES
> #include "trace.h"
>
> +/*
> + * We only had legacy x86 guests that did not support
> + * VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE. Other targets don't have legacy
> guests.
> + */
> +#if defined(TARGET_X86_64) || defined(TARGET_I386)
> +#define VIRTIO_MEM_HAS_LEGACY_GUESTS
> +#endif
> +
> /*
> * Let's not allow blocks smaller than 1 MiB, for example, to keep the
> tracking
> * bitmap small.
> @@ -110,6 +118,19 @@ static uint64_t virtio_mem_default_block_size(RAMBlock
> *rb)
> return MAX(page_size, VIRTIO_MEM_MIN_BLOCK_SIZE);
> }
>
> +#if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
> +static bool virtio_mem_has_shared_zeropage(RAMBlock *rb)
> +{
> + /*
> + * We only have a guaranteed shared zeropage on ordinary MAP_PRIVATE
> + * anonymous RAM. In any other case, reading unplugged *can* populate a
> + * fresh page, consuming actual memory.
> + */
> + return !qemu_ram_is_shared(rb) && rb->fd < 0 &&
> + qemu_ram_pagesize(rb) == qemu_real_host_page_size;
> +}
> +#endif /* VIRTIO_MEM_HAS_LEGACY_GUESTS */
> +
> /*
> * Size the usable region bigger than the requested size if possible. Esp.
> * Linux guests will only add (aligned) memory blocks in case they fully
> @@ -653,15 +674,29 @@ static uint64_t virtio_mem_get_features(VirtIODevice
> *vdev, uint64_t features,
> Error **errp)
> {
> MachineState *ms = MACHINE(qdev_get_machine());
> + VirtIOMEM *vmem = VIRTIO_MEM(vdev);
>
> if (ms->numa_state) {
> #if defined(CONFIG_ACPI)
> virtio_add_feature(&features, VIRTIO_MEM_F_ACPI_PXM);
> #endif
> }
> + assert(vmem->unplugged_inaccessible != ON_OFF_AUTO_AUTO);
> + if (vmem->unplugged_inaccessible == ON_OFF_AUTO_ON) {
> + virtio_add_feature(&features, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE);
> + }
> return features;
> }
>
> +static int virtio_mem_validate_features(VirtIODevice *vdev)
> +{
> + if (virtio_host_has_feature(vdev, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE) &&
> + !virtio_vdev_has_feature(vdev, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE))
> {
> + return -EFAULT;
> + }
> + return 0;
> +}
> +
> static void virtio_mem_system_reset(void *opaque)
> {
> VirtIOMEM *vmem = VIRTIO_MEM(opaque);
> @@ -716,6 +751,29 @@ static void virtio_mem_device_realize(DeviceState *dev,
> Error **errp)
> rb = vmem->memdev->mr.ram_block;
> page_size = qemu_ram_pagesize(rb);
>
> +#if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
> + switch (vmem->unplugged_inaccessible) {
> + case ON_OFF_AUTO_AUTO:
> + if (virtio_mem_has_shared_zeropage(rb)) {
> + vmem->unplugged_inaccessible = ON_OFF_AUTO_OFF;
> + } else {
> + vmem->unplugged_inaccessible = ON_OFF_AUTO_ON;
> + }
> + break;
> + case ON_OFF_AUTO_OFF:
> + if (!virtio_mem_has_shared_zeropage(rb)) {
> + warn_report("'%s' property set to 'off' with a memdev that does"
> + " not support the shared zeropage.",
> + VIRTIO_MEM_UNPLUGGED_INACCESSIBLE_PROP);
> + }
> + break;
> + default:
> + break;
> + }
> +#else /* VIRTIO_MEM_HAS_LEGACY_GUESTS */
> + vmem->unplugged_inaccessible = ON_OFF_AUTO_ON;
> +#endif /* VIRTIO_MEM_HAS_LEGACY_GUESTS */
> +
> /*
> * If the block size wasn't configured by the user, use a sane default.
> This
> * allows using hugetlbfs backends of any page size without manual
> @@ -1109,6 +1167,10 @@ static Property virtio_mem_properties[] = {
> DEFINE_PROP_UINT32(VIRTIO_MEM_NODE_PROP, VirtIOMEM, node, 0),
> DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev,
> TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> +#if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
> + DEFINE_PROP_ON_OFF_AUTO(VIRTIO_MEM_UNPLUGGED_INACCESSIBLE_PROP,
> VirtIOMEM,
> + unplugged_inaccessible, ON_OFF_AUTO_OFF),
In theory, libvirt likes properties to be arch agnostic, but that boat
sailed away already. And honestly, the default is reasonable so I don't
think we will need to query for this capability.
> +#endif
> DEFINE_PROP_END_OF_LIST(),
> };
>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal