From: Dr. David Alan Gilbert
Subject: Re: [PATCH v1 00/13] Ram blocks with resizable anonymous allocations under POSIX
Date: Fri, 7 Feb 2020 15:28:47 +0000
User-agent: Mutt/1.13.3 (2020-01-12)

* David Hildenbrand (address@hidden) wrote:
>
>
> > On 06.02.2020 at 21:11, Dr. David Alan Gilbert <address@hidden> wrote:
> >
> > * David Hildenbrand (address@hidden) wrote:
> >> We already allow resizable ram blocks for anonymous memory, however, they
> >> are not actually resized. All memory is mmaped() R/W, including the memory
> >> exceeding the used_length, up to the max_length.
> >>
> >> When resizing, effectively only the boundary is moved. Implement actually
> >> resizable anonymous allocations and make use of them in resizable ram
> >> blocks when possible. Memory exceeding the used_length will be
> >> inaccessible. Especially ram block notifiers require care.
> >>
> >> Having actually resizable anonymous allocations (via mmap-hackery) allows
> >> to reserve a big region in virtual address space and grow the
> >> accessible/usable part on demand. Even if "/proc/sys/vm/overcommit_memory"
> >> is set to "never" under Linux, huge reservations will succeed. If there is
> >> not enough memory when resizing (to populate parts of the reserved region),
> >> trying to resize will fail. Only the actually used size is reserved in the
> >> OS.
> >>
> >> E.g., virtio-mem [1] wants to reserve big resizable memory regions and
> >> grow the usable part on demand. I think this change is worth sending out
> >> individually. Accompanied by a bunch of minor fixes and cleanups.
> >>
> >> [1] https://lore.kernel.org/kvm/address@hidden/
> >
> > There's a few bits I've not understood from skimming the patches:
> >
>
> Thanks for having a look!
>
> > a) Am I correct in thinking you PROT_NONE the extra space so you can
> > grow/shrink it?
>
> Yes!
>
> > b) What does kvm see - does it have a slot for the whole space or for
> > only the used space?
>
> Only the used space. Resizing triggers a resize of the memory region. That
> triggers memory notifiers, which remove the old kvm memslot and re-add the
> new kvm memslot. (That's existing handling, so nothing new).
>
> So KVM will not see PROT_NONE when creating a slot.
OK, that's easy then.
> > I ask because we found with virtiofs/DAX experiments that on Power,
> > kvm gets upset if you give it a mapping with PROT_NONE.
> > (That may be less of an issue if you change the mapping after the
> > slot is created).
>
> That should work as expected. Resizing *while* kvm is running is tricky, but
> that's not part of this series and a different story :) Right now, resizing
> is only valid on reboot/incoming migration.
Hmm, 'when' during an incoming migration? I ask because of userfaultfd
setup for postcopy. Also note those things can combine - i.e. a reboot
that happens during a migration (we've already got a pile of related
bugs).
> >
> > c) It's interesting this is keyed off the RAMBlock notifiers - do
> > memory_listener's on the address space the block is mapped into get
> > triggered? I'm wondering how vhost (and vhost-user) in particular
> > see this.
>
> Yes, memory listeners get triggered. Old region is removed, new one is added.
> Nothing changed on that front.
>
> The issue with ram block notifiers is that they did not do a "remove old, add
> new" on resizes. They only added the full ram block. Bad. E.g., vfio wants to
> pin all memory - which would fail on PROT_NONE.
>
> E.g., for HAX, there is no kernel ioctl to remove a ram block ... for SEV
> there is, but I am not sure about the implications when converting back and
> forth between encrypted/unencrypted. So SEV and HAX require legacy handling.
I guess for a memory listener it just sees a new layout after the commit
and then can figure out what changed.
Dave
> Cheers!
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
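
For readers following point (a) above, the reserve-then-grow idea can be
sketched roughly as below. This is only an illustration under assumptions and
not the code from the series: region_reserve() and region_resize() are
hypothetical names, sizes are assumed to be page-aligned, and the use of
MAP_NORESERVE on the reservation is my assumption about how one would avoid
charging the unused part.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve max_size bytes of address space, all of it
 * inaccessible (PROT_NONE). Returns NULL on failure. */
static void *region_reserve(size_t max_size)
{
    void *p = mmap(NULL, max_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: move the accessible prefix of the reservation from
 * old_size to new_size. Returns 0 on success, -1 on failure. */
static int region_resize(void *base, size_t old_size, size_t new_size,
                         size_t max_size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p;

    /* Sketch assumption: callers pass page-aligned sizes. */
    assert(old_size % page == 0 && new_size % page == 0);
    if (new_size > max_size) {
        return -1;
    }
    if (new_size > old_size) {
        /* Grow: map R/W anonymous memory over the reserved tail. Under
         * strict overcommit this is the step that can fail. */
        p = mmap((char *)base + old_size, new_size - old_size,
                 PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED) {
            return -1;
        }
    } else if (new_size < old_size) {
        /* Shrink: map PROT_NONE over the tail, which drops its pages and
         * makes it inaccessible again. */
        p = mmap((char *)base + new_size, old_size - new_size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
                 -1, 0);
        if (p == MAP_FAILED) {
            return -1;
        }
    }
    return 0;
}
```

A caller would reserve once with region_reserve(max) and then call
region_resize(base, used, new_used, max) on each resize. The prefix below
min(used, new_used) is never remapped, so its contents survive; anything
beyond the used size faults on access, which is why users such as vfio that
want to touch or pin the whole block need the "remove old, add new" handling
discussed above.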