Re: [PATCH v5 03/18] pci: isolated address space for PCI bus


From: Alex Williamson
Subject: Re: [PATCH v5 03/18] pci: isolated address space for PCI bus
Date: Thu, 27 Jan 2022 14:22:53 -0700

On Thu, 27 Jan 2022 08:30:13 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Wed, Jan 26, 2022 at 04:13:33PM -0500, Michael S. Tsirkin wrote:
> > On Wed, Jan 26, 2022 at 08:07:36PM +0000, Dr. David Alan Gilbert wrote:  
> > > * Stefan Hajnoczi (stefanha@redhat.com) wrote:  
> > > > On Wed, Jan 26, 2022 at 05:27:32AM +0000, Jag Raman wrote:  
> > > > > 
> > > > >   
> > > > > > On Jan 25, 2022, at 1:38 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > > > > > 
> > > > > > * Jag Raman (jag.raman@oracle.com) wrote:  
> > > > > >> 
> > > > > >>   
> > > > > >>> On Jan 19, 2022, at 7:12 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >>> 
> > > > > >>> On Wed, Jan 19, 2022 at 04:41:52PM -0500, Jagannathan Raman wrote:
> > > > > >>>> Allow PCI buses to be part of isolated CPU address spaces. This
> > > > > >>>> has a niche usage.
> > > > > >>>> 
> > > > > >>>> TYPE_REMOTE_MACHINE allows multiple VMs to house their PCI devices
> > > > > >>>> in the same machine/server. This would cause address space
> > > > > >>>> collisions and would also be a security vulnerability. Having
> > > > > >>>> separate address spaces for each PCI bus would solve this problem.
> > > > > >>> 
> > > > > >>> Fascinating, but I am not sure I understand. Any examples?
> > > > > >> 
> > > > > >> Hi Michael!
> > > > > >> 
> > > > > >> multiprocess QEMU and vfio-user implement a client-server model to
> > > > > >> allow out-of-process emulation of devices. The client QEMU, which
> > > > > >> makes ioctls to the kernel and runs VCPUs, could attach devices
> > > > > >> running in a server QEMU. The server QEMU needs access to parts of
> > > > > >> the client’s RAM to perform DMA.
> > > > > > 
> > > > > > Do you ever have the opposite problem? i.e. when an emulated PCI
> > > > > > device
> > > > > 
> > > > > That’s an interesting question.
> > > > > 
> > > > > > exposes a chunk of RAM-like space (frame buffer, or maybe a mapped
> > > > > > file) that the client can see. What happens if two emulated devices
> > > > > > need to access each other’s emulated address space?
> > > > > 
> > > > > In this case, the kernel driver would map the destination’s chunk of
> > > > > internal RAM into the DMA space of the source device. Then the source
> > > > > device could write to that mapped address range, and the IOMMU should
> > > > > direct those writes to the destination device.
> > > > > 
> > > > > I would like to take a closer look at how the IOMMU implementation
> > > > > could achieve this, and get back to you. I think the IOMMU would
> > > > > handle this. Could you please point me to the IOMMU implementation
> > > > > you have in mind?
> > > > 
> > > > I don't know if the current vfio-user client/server patches already
> > > > implement device-to-device DMA, but the functionality is supported by
> > > > the vfio-user protocol.
> > > > 
> > > > Basically: if the DMA region lookup inside the vfio-user server fails,
> > > > fall back to VFIO_USER_DMA_READ/WRITE messages instead.
> > > > https://github.com/nutanix/libvfio-user/blob/master/docs/vfio-user.rst#vfio-user-dma-read
> > > > 
> > > > Here is the flow:
> > > > 1. The vfio-user server with device A sends a DMA read to QEMU.
> > > > 2. QEMU finds the MemoryRegion associated with the DMA address and sees
> > > >    it's a device.
> > > >    a. If it's emulated inside the QEMU process then the normal
> > > >       device emulation code kicks in.
> > > >    b. If it's another vfio-user PCI device then the vfio-user PCI proxy
> > > >       device forwards the DMA to the second vfio-user server's device B.
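
For illustration, here is a minimal sketch of the server-side fallback described
above, assuming hypothetical helper names (dma_region_find(), send_dma_read_msg())
rather than the real libvfio-user API:

/* Sketch of the fallback: satisfy a device's DMA read from locally
 * mapped client RAM when possible, otherwise send a VFIO_USER_DMA_READ
 * message to the client.  dma_region_find() and send_dma_read_msg()
 * are hypothetical helpers, not the libvfio-user API.
 */
#include <stdint.h>
#include <string.h>

struct dma_region {
    uint64_t iova;      /* start of the region in device (IOVA) space */
    uint64_t size;
    void *vaddr;        /* client RAM mmap'd into the server */
};

/* Lookup over the regions the client announced with VFIO_USER_DMA_MAP. */
struct dma_region *dma_region_find(uint64_t iova, uint64_t len);

/* Issue VFIO_USER_DMA_READ over the socket and wait for the reply. */
int send_dma_read_msg(uint64_t iova, void *buf, uint64_t len);

int dma_read(uint64_t iova, void *buf, uint64_t len)
{
    struct dma_region *r = dma_region_find(iova, len);

    if (r) {
        /* Fast path: the address falls inside mapped client RAM. */
        memcpy(buf, (char *)r->vaddr + (iova - r->iova), len);
        return 0;
    }

    /* Slow path: not RAM (e.g. another device), so ask the client to
     * perform the access on our behalf. */
    return send_dma_read_msg(iova, buf, len);
}

Whether that read then hits case 2a or 2b above is entirely up to the client.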
> > > 
> > > I'm starting to be curious if there's a way to persuade the guest kernel
> > > to do it for us; in general, is there a way to tell PCI devices that
> > > they can only DMA to the host and not to other PCI devices?
> > 
> > 
> > But of course - this is how e.g. VFIO protects host PCI devices from
> > each other when one of them is passed through to a VM.  
> 
> Michael: Are you saying just turn on vIOMMU? :)
> 
> Devices in different VFIO groups have their own IOMMU context, so their
> IOVA space is isolated. Just don't map other devices into the IOVA space
> and those other devices will be inaccessible.

Devices in different VFIO *containers* have their own IOMMU context.
Based on the group attachment to a container, groups can either have
shared or isolated IOVA space.  That determination is made by looking
at the address space of the bus, which is governed by the presence of a
vIOMMU.
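
For reference, DMA mappings in the type1 VFIO UAPI are made against the
container, which is what gives each container its own IOVA space. A minimal
sketch of that flow follows; the group number, IOVA, and buffer size are
placeholders, and error checking is omitted:

/* Minimal VFIO type1 sketch.  All DMA mappings are made against the
 * *container*, so every group attached to it shares one IOVA space,
 * and devices in other containers see none of it.
 */
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

int main(void)
{
    int container = open("/dev/vfio/vfio", O_RDWR);
    int group = open("/dev/vfio/26", O_RDWR);            /* placeholder group */

    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);  /* attach group */
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);  /* pick IOMMU model */

    void *buf = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = (uintptr_t)buf,
        .iova  = 0x100000,                                /* placeholder IOVA */
        .size  = 1 << 20,
    };
    ioctl(container, VFIO_IOMMU_MAP_DMA, &map);

    /* Only the IOVAs mapped into this container are reachable by DMA
     * from its devices; nothing else is. */
    return 0;
}

Devices whose groups are attached to a different container see a completely
separate set of mappings, which is the isolation being discussed here.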

If the goal here is to restrict DMA between devices, i.e. peer-to-peer
(p2p), why are we trying to re-invent what an IOMMU already does?  In
fact, it seems like an IOMMU does this better in providing an IOVA
address space per BDF.  Is the dynamic mapping overhead too much?  What
physical hardware properties or specifications could we leverage to
restrict p2p mappings to a device?  Should it be governed by machine
type to provide consistency between devices?  Should each "isolated"
bus be in a separate root complex?  Thanks,

Alex



