qemu-devel

Re: [Qemu-devel] [RFC 5/5] vifo: introduce new VFIO ioctl VFIO_DEVICE_PC


From: Alex Williamson
Subject: Re: [Qemu-devel] [RFC 5/5] vifo: introduce new VFIO ioctl VFIO_DEVICE_PCI_GET_DIRTY_BITMAP
Date: Fri, 30 Jun 2017 10:59:52 -0600

On Fri, 30 Jun 2017 05:14:40 +0000
"Tian, Kevin" <address@hidden> wrote:

> > From: Alex Williamson [mailto:address@hidden
> > Sent: Friday, June 30, 2017 4:57 AM
> > 
> > On Thu, 29 Jun 2017 00:10:59 +0000
> > "Tian, Kevin" <address@hidden> wrote:
> >   
> > > > From: Alex Williamson [mailto:address@hidden
> > > > Sent: Thursday, June 29, 2017 12:00 AM
> > > > Thanks Kevin.  So it's not really a dirty bitmap, it's just a
> > > > bitmap of pages that the device has access to and may have dirtied.
> > > > Don't we have this more generally in the vfio type1 IOMMU backend?  For
> > > > a mediated device, we know all the pages that the vendor driver has
> > > > asked to be pinned.  Should we perhaps make this interface on the vfio
> > > > container rather than the device?  Any mediated device can provide this
> > > > level of detail without specific vendor support.  If we had DMA page
> > > > faulting, this would be the natural place to put it as well, so maybe
> > > > we should design the interface there to support everything similarly.
> > > > Thanks,
> > > >  
> > >
> > > That's a nice idea. Just two comments:
> > >
> > > 1) If some mediated device has its own way to construct a true dirty
> > > bitmap (not through DMA page faulting), the interface should be
> > > designed to allow that flexibility. Maybe an optional callback: if
> > > none is registered, use the common type1 IOMMU logic; otherwise
> > > prefer the vendor-specific callback.
> > 
> > I'm not sure what that looks like, but I agree with the idea.  Could
> > the pages that type1 knows about ever be anything other than a
> > superset of the dirty pages?  Perhaps a device ioctl to flush unused
> > mappings would be sufficient.  
> 
> Sorry, I didn't quite get your idea here. My understanding is that
> type1 is OK as an alternative in case a mediated device has no way
> to track dirtied pages (as for Intel GPU), so we can use type1 pinned
> pages as an indirect way to indicate dirtied pages. But if a mediated
> device has its own way (e.g. a device private MMU) to track dirty
> pages, then we should allow that device to provide the dirty bitmap
> instead of using type1.

My thought was that our current mdev iommu interface allows the vendor
driver to pin specific pages.  In order for the mdev device to dirty a
page, that page needs to be pinned.  Therefore the set of pages pinned
in type1 is a superset of all pages that can potentially be dirtied by
the device.  In the worst case, this devolves to all pages mapped
through the iommu, as with direct assigned devices.  My assertion is
therefore that a device specific dirty page bitmap can only be a subset
of the type1 pinned pages.  So if the mdev vendor driver can flush any
stale pinnings, then the type1 view of pinned pages should match the
device's view of the current working set.  Then we wouldn't need a
device specific dirty bitmap, we'd only need a mechanism to trigger a
flush of stale mappings on the device.
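
To make the superset argument concrete, here is a toy model in Python.
It is only a sketch, not vfio code: the class and method names
(Type1Container, MdevVendorDriver, flush_stale_pinnings and so on) are
hypothetical, pages are plain integers, and it assumes a single mdev
device per container.

```python
# Toy model only: page frame numbers are plain ints and every name
# below is hypothetical, not part of the vfio or mdev interfaces.

class Type1Container:
    """Stands in for the type1 backend's record of pinned pages."""

    def __init__(self):
        self.pinned = set()

    def pin(self, pfns):
        self.pinned.update(pfns)

    def unpin(self, pfns):
        self.pinned.difference_update(pfns)

    def maybe_dirty(self):
        # Any pinned page may have been written by the device.
        return set(self.pinned)


class MdevVendorDriver:
    """Stands in for a vendor driver that pins pages before DMA."""

    def __init__(self, container):
        self.container = container
        self.working_set = set()

    def map_for_dma(self, pfns):
        # The device can only touch pages that were pinned first.
        self.working_set.update(pfns)
        self.container.pin(pfns)

    def retire(self, pfns):
        # The device stops using these pages, but the pinning is left
        # in place, so it becomes stale.
        self.working_set.difference_update(pfns)

    def flush_stale_pinnings(self):
        # Drop every pinning that is no longer in the working set.
        stale = self.container.pinned - self.working_set
        self.container.unpin(stale)
        return stale


container = Type1Container()
driver = MdevVendorDriver(container)
driver.map_for_dma({1, 2, 3, 4})
driver.retire({3, 4})

# Before the flush, type1 over-reports: a strict superset.
assert driver.working_set < container.maybe_dirty()

driver.flush_stale_pinnings()

# After the flush, the two views agree.
assert container.maybe_dirty() == driver.working_set
```

In this model the container never needs a device specific bitmap; the
only device-side operation is the flush.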

Otherwise I'm not sure how we cleanly create an interface where the
dirty bitmap can either come from the device or the container... but
I'd welcome suggestions.  Thanks,

Alex
 
> > > 2) If there could be multiple mediated devices from different vendors
> > > in the same container while not all mediated devices support live
> > > migration, would a container-level interface impose some limitation?
> > 
> > Dirty page logging is only one small part of migration; each
> > migrate-able device would still need to provide a device-level
> > interface to save/restore state.  The migration would fail when we get
> > to the device(s) that don't provide that.  Thanks,
> >   
> 
> Agree here. Yulei, can you investigate this direction and report back
> whether it's feasible or whether anything was overlooked?
> 
> Thanks
> Kevin



