qemu-devel


From: Laszlo Ersek
Subject: Re: [PATCH RFC] memory: pause all vCPUs for the duration of memory transactions
Date: Wed, 4 Nov 2020 19:09:02 +0100

On 11/03/20 17:37, Peter Xu wrote:
> On Tue, Nov 03, 2020 at 02:07:09PM +0100, Vitaly Kuznetsov wrote:
>> In case it is a normal access from the guest, yes, but AFAIR here
>> guest's CR3 is pointing to non-existent memory and when KVM detects that
>> it injects #PF by itself without a loop through userspace.
> 
> I see, thanks Vitaly.  I think this answers my earlier confusion about why we
> can't just bounce all of these to QEMU, since I thought QEMU should try to
> take the BQL if it's an MMIO access: probably there are quite a lot of
> references to guest memslots in the kernel that cannot naturally be treated as
> guest MMIO accesses (especially for nested, maybe?), so it may be very hard to
> cover all of them.  Paolo has mentioned quite a few times that he'd prefer a
> kernel solution for this; I feel like I understand the reason better now.
> 
> Have any of us tried to collect the requirements for this new kernel interface
> (if it is to be proposed)?  I'm thinking about what it would have to look like
> to solve all the pains we have right now.
> 
> Firstly, I think we'd likely want the capability to handle "holes" in
> memslots: either to punch a hole, which IIUC is the problem behind this patch,
> or the reverse operation, which is to fill up a hole that we've just punched.
> The other major user could be virtio-mem, which would like to extend or shrink
> an existing memslot.  However, IIUC that's also doable with the "hole" idea, in
> that we can create the memslot with the maximum supported size, then "punch a
> hole" at the end of the memslot, just as if it had shrunk.  To extend, we
> shrink the hole rather than grow the memslot.
> 
> Then there's the other case where we want to keep the dirty bitmap when
> punching a hole in existing RAM.  With the "hole" idea in the kernel, that
> seems easy too: when we punch the hole, we drop the dirty bitmap only for the
> range covered by the hole.  Then we won't lose the bitmap for the rest of the
> range, where the RAM still makes sense, since the memslot will use the same
> bitmap before and after punching the hole.
> 
> So now a simple idea comes to mind (I think we could come up with better, or
> more complicated, ideas, e.g., making the kvm memslot a tree? But I'll start
> with the simple one): maybe we just need to teach each kvm memslot to take
> "holes" within itself.  By default, there are no holes in memslots configured
> with KVM_SET_USER_MEMORY_REGION; then we can configure holes for each memslot
> using a new flag (say, KVM_MEM_SET_HOLE) of the same ioctl (after
> LOG_DIRTY_PAGES and READ_ONLY).  Initially we may restrict how many holes are
> allowed, so the holes can also be an array.
> 
> Thoughts?
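
(To make sure I'm reading the proposal right, here is a rough userspace
sketch of a memslot that carries a bounded array of holes. Everything in
it is made up for illustration: the struct and function names, the
MAX_HOLES limit, and the layout are my assumptions, not existing KVM
code or a proposed uAPI.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HOLES 8 /* hypothetical upper bound on holes per slot */

/* A hole, expressed as an offset/length relative to the slot base. */
struct slot_hole {
    uint64_t off;
    uint64_t len;
};

/* Hypothetical memslot with an embedded, fixed-size hole array. */
struct slot_with_holes {
    uint64_t base_gpa;
    uint64_t size;
    size_t nr_holes;
    struct slot_hole holes[MAX_HOLES];
};

/* Record a hole; fails if it is empty, out of range, or the array is full. */
static bool slot_punch_hole(struct slot_with_holes *s,
                            uint64_t off, uint64_t len)
{
    if (len == 0 || off >= s->size || len > s->size - off) {
        return false;
    }
    if (s->nr_holes == MAX_HOLES) {
        return false;
    }
    s->holes[s->nr_holes].off = off;
    s->holes[s->nr_holes].len = len;
    s->nr_holes++;
    return true;
}

/* True if gpa lies inside the slot and is not covered by any hole. */
static bool slot_gpa_is_backed(const struct slot_with_holes *s, uint64_t gpa)
{
    uint64_t off;
    size_t i;

    assert(s->nr_holes <= MAX_HOLES);
    if (gpa < s->base_gpa || gpa - s->base_gpa >= s->size) {
        return false;
    }
    off = gpa - s->base_gpa;
    for (i = 0; i < s->nr_holes; i++) {
        const struct slot_hole *h = &s->holes[i];

        if (off >= h->off && off - h->off < h->len) {
            return false;
        }
    }
    return true;
}
```

(The lookup is linear in the number of holes, which is fine as long as
the array stays small. The sketch does not merge overlapping holes or
implement filling a hole back in.)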

My only one (and completely unwashed / uneducated) thought is that this
resembles the fact (?) that VMAs are represented as rbtrees. So maybe
don't turn a single KVM memslot into a tree, but represent the full set
of KVM memslots as an rbtree?

My understanding is that "interval tree" is one of the most efficient
data structures for tracking a set of (discontiguous) memory regions,
and that an rbtree can be generalized into an interval tree. I'm super
rusty on the theory (after having contributed a genuine rbtree impl to
edk2 in 2014, sic transit gloria mundi :/), but I think that's what the
VMA stuff in the kernel does, effectively.

Perhaps it could apply to KVM memslots too.
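
For what it's worth, the textbook way to get an interval tree out of a
binary search tree is to key the nodes on the range start and have each
node cache the largest range end found in its subtree. A minimal sketch
follows, assuming inclusive [start, last] ranges. The names are made up
by me, and I've left out the red-black rebalancing entirely, so this is
a plain BST and not what the kernel actually does:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One tracked range, inclusive on both ends. */
struct itnode {
    uint64_t start, last;
    uint64_t max_last;            /* largest 'last' in this subtree */
    struct itnode *left, *right;
};

/*
 * Insert [start, last], keyed on 'start', and keep max_last up to date
 * on the way back up. No rebalancing: a real implementation would
 * also have to fix up max_last across rotations.
 */
static struct itnode *itree_insert(struct itnode *root,
                                   uint64_t start, uint64_t last)
{
    assert(start <= last);
    if (!root) {
        struct itnode *n = calloc(1, sizeof(*n));

        if (!n) {
            abort();
        }
        n->start = start;
        n->last = last;
        n->max_last = last;
        return n;
    }
    if (start < root->start) {
        root->left = itree_insert(root->left, start, last);
    } else {
        root->right = itree_insert(root->right, start, last);
    }
    if (last > root->max_last) {
        root->max_last = last;
    }
    return root;
}

/* Return some node overlapping [start, last], or NULL if none does. */
static const struct itnode *itree_find(const struct itnode *n,
                                       uint64_t start, uint64_t last)
{
    while (n) {
        if (n->start <= last && start <= n->last) {
            return n;
        }
        /* The left subtree can only help if it reaches up to 'start'. */
        if (n->left && n->left->max_last >= start) {
            n = n->left;
        } else {
            n = n->right;
        }
    }
    return NULL;
}
```

With balancing added, a lookup like "which memslot, if any, covers this
guest address" would take logarithmic time in the number of slots; the
cached max_last is what lets the search skip a whole subtree at once.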

Sorry if I'm making no sense, of course. (I'm going out on a limb with
posting this email, but whatever...)

Laszlo



