[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCHv2 10/12] tap: add vhost/vhostfd options

From: Michael S. Tsirkin
Subject: [Qemu-devel] Re: [PATCHv2 10/12] tap: add vhost/vhostfd options
Date: Sun, 28 Feb 2010 23:01:28 +0200
User-agent: Mutt/1.5.19 (2009-01-05)

On Sun, Feb 28, 2010 at 02:57:56PM -0600, Anthony Liguori wrote:
> On 02/28/2010 11:19 AM, Michael S. Tsirkin wrote:
>>> Both have  security implications so I think it's important that they
>>> be addressed.   Otherwise, I'm pretty happy with how things are.
>> Care suggesting some solutions?
> The obvious thing to do would be to use the memory notifier in vhost to  
> keep track of whenever something remaps the ring's memory region and if  
> that happens, issue an ioctl to vhost to change the location of the  
> ring.

It would be easy to do, but what I wondered about, is what happens in the
guest meanwhile. Which ring address has the correct descriptors: the old
one?  The new one? Both?  This question leads me to the belief that well-behaved
guest will never encounter this.

>  Also, you would need to merge the vhost slot management code with  
> the KVM slot management code.
> I'm sympathetic to your arguments though.  As qemu is today, the above  
> is definitely the right thing to do.  But ram is always ram and ram  
> always has a fixed (albeit non-linear) mapping within a guest.  We can  
> probably be smarter in qemu.
> There are areas of MMIO/ROM address space that *sometimes* end up  
> behaving like ram, but that's a different case.  The one other case to  
> consider is ram hot add/remove in which case, ram may be removed or  
> added (but it's location will never change during its lifetime).
> Here's what I'll propose, and I'd really like to hear what Paul think  
> about it before we start down this path.
> I think we should add a new API that's:
> void cpu_ram_add(target_phys_addr_t start, ram_addr_t size);
> This API would do two things.  It would call qemu_ram_alloc() and  
> cpu_register_physical_memory() just as code does today.  It would also  
> add this region into a new table.
> There would be:
> void *cpu_ram_map(target_phys_addr_t start, ram_addr_t *size);
> void cpu_ram_unmap(void *mem);
> These calls would use this new table to lookup ram addresses.  These  
> mappings are valid as long as the guest is executed.  Within the table,  
> each region would have a reference count.  When it comes time to do hot  
> add/remove, we would wait to remove a region until the reference count  
> went to zero to avoid unmapping during DMA.
> cpu_ram_add() never gets called with overlapping regions.  We'll modify  
> cpu_register_physical_memory() to ensure that a ram mapping is never  
> changed after initial registration.
> vhost no longer needs to bother keeping the dynamic table up to date so  
> it removes all of the slot management code from vhost.  KVM still needs  
> the code to handle rom/ram mappings but we can take care of that next.   
> virtio-net's userspace code can do the same thing as vhost and only map  
> the ring once which should be a big performance improvement.
> It also introduces a place to do madvise() reset registrations.
> This is definitely appropriate for target-i386.  I suspect it is for  
> other architectures too.
> Regards,
> Anthony Liguori
>>>>> Furthermore, vhost reduces a virtual machine's security.  It offers an
>>>>> impressive performance boost (particularly when dealing with 10gbit+
>>>>> networking) but for a user that doesn't have such strong networking
>>>>> performance requirements, I think it's reasonable for them to not want
>>>>> to make a security trade off.
>>>> It's hard for me to see how it reduces VM security. If it does, it's
>>>> not by design and will be fixed.
>>> If you have a bug in vhost-net (would never happen of course) then it's
>>> a host-kernel exploit whereas if we have a bug in virtio-net userspace,
>>> it's a local user exploit.  We have a pretty robust architecture to deal
>>> with local user exploits (qemu can run unprivilieged, SELinux enforces
>>> mandatory access control) but a host-kernel can not be protected against.
>>> I'm not saying that we should never put things in the kernel, but
>>> there's definitely a security vs. performance trade off here.
>>> Regards,
>>> Anthony Liguori
>> Not sure I get the argument completely. Any kernel service with a bug
>> might be exploited for priveledge escalation. Yes, more kernel code
>> gives you more attack surface, but given we use rich interfaces such as
>> ones exposed by kvm, I am not sure by how much.
>> Also note that vhost net does not take qemu out of the equation for
>> everything, just for datapath operations.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]