qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v2 PATCH 01/13] mm/shmem: Introduce F_SEAL_GUEST


From: Paolo Bonzini
Subject: Re: [RFC v2 PATCH 01/13] mm/shmem: Introduce F_SEAL_GUEST
Date: Tue, 23 Nov 2021 09:54:52 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

On 11/19/21 14:47, Chao Peng wrote:
+static void guest_invalidate_page(struct inode *inode,
+                                 struct page *page, pgoff_t start, pgoff_t end)
+{
+       struct shmem_inode_info *info = SHMEM_I(inode);
+
+       if (!info->guest_ops || !info->guest_ops->invalidate_page_range)
+               return;
+
+       start = max(start, page->index);
+       end = min(end, page->index + thp_nr_pages(page)) - 1;
+
+       info->guest_ops->invalidate_page_range(inode, info->guest_owner,
+                                              start, end);
+}

The lack of protection makes the API quite awkward to use;
the usual way to do this is with refcount_inc_not_zero (aka kvm_get_kvm_safe).

Can you use the shmem_inode_info spinlock to protect against this? If register/unregister take the spinlock, the invalidate and fallocate can take a reference under the same spinlock, like this:

        if (!info->guest_ops)
                return;

        spin_lock(&info->lock);
        ops = info->guest_ops;
        if (!ops) {
                spin_unlock(&info->lock);
                return;
        }

        /* Calls kvm_get_kvm_safe.  */
        r = ops->get_guest_owner(info->guest_owner);
        spin_unlock(&info->lock);
        if (r < 0)
                return;

        start = max(start, page->index);
        end = min(end, page->index + thp_nr_pages(page)) - 1;

        ops->invalidate_page_range(inode, info->guest_owner,
                                               start, end);
        ops->put_guest_owner(info->guest_owner);

Considering that you have to take a mutex anyway in patch 13, and that the critical section here is very small, the extra indirect calls are cheaper than walking the vm_list; and it makes the API clearer.

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]