
From: Jike Song
Subject: Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Tue, 26 Jan 2016 15:41:07 +0800
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8

On 01/26/2016 05:30 AM, Alex Williamson wrote:
> [cc +Neo @Nvidia]
> 
> Hi Jike,
> 
> On Mon, 2016-01-25 at 19:34 +0800, Jike Song wrote:
>> On 01/20/2016 05:05 PM, Tian, Kevin wrote:
>>> I would expect we can spell out next level tasks toward above
>>> direction, upon which Alex can easily judge whether there are
>>> some common VFIO framework changes that he can help :-)
>>
>> Hi Alex,
>>
>> Here is a draft task list after a short discussion w/ Kevin,
>> would you please have a look?
>>
>>      Bus Driver
>>
>>              { in i915/vgt/xxx.c }
>>
>>              - define a subset of vfio_pci interfaces
>>              - selective pass-through (say aperture)
>>              - trap MMIO: interface w/ QEMU
> 
> What's included in the subset?  Certainly the bus reset ioctls really
> don't apply, but you'll need to support the full device interface,
> right?  That includes the region info ioctl and access through the vfio
> device file descriptor as well as the interrupt info and setup ioctls.
> 

[All the interfaces I had in mind are via ioctl :)  Other stuff like the
device file descriptor we will of course keep.]

The list of ioctl commands provided by vfio_pci:

        - VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
        - VFIO_DEVICE_PCI_HOT_RESET

As you said, the above two don't apply. But as for this one:

        - VFIO_DEVICE_RESET

In my opinion it should be kept, no matter what the bus driver ends up
providing.

        - VFIO_PCI_ROM_REGION_INDEX
        - VFIO_PCI_VGA_REGION_INDEX

I suppose the above two don't apply either (strictly speaking they are
region indexes rather than ioctl commands)? For a vgpu we don't provide
a ROM BAR or a VGA region.

        - VFIO_DEVICE_GET_INFO
        - VFIO_DEVICE_GET_REGION_INFO
        - VFIO_DEVICE_GET_IRQ_INFO
        - VFIO_DEVICE_SET_IRQS

The above four are needed, of course.
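
Roughly, the bus driver's ioctl entry would then dispatch only this
subset, something like the below (a sketch only; all the vgpu_* names
are made up for illustration, not actual code):

#include <linux/vfio.h>

/* Sketch: the ioctl callback of a vgpu vfio bus driver, keeping only
 * the subset of vfio_pci commands discussed above. */
static long vgpu_vfio_ioctl(void *device_data, unsigned int cmd,
                            unsigned long arg)
{
        struct vgpu_device *vgpu = device_data;    /* placeholder type */

        switch (cmd) {
        case VFIO_DEVICE_GET_INFO:
                return vgpu_get_device_info(vgpu, arg);
        case VFIO_DEVICE_GET_REGION_INFO:
                return vgpu_get_region_info(vgpu, arg);
        case VFIO_DEVICE_GET_IRQ_INFO:
                return vgpu_get_irq_info(vgpu, arg);
        case VFIO_DEVICE_SET_IRQS:
                return vgpu_set_irqs(vgpu, arg);
        case VFIO_DEVICE_RESET:
                return vgpu_reset(vgpu);           /* kept, as argued above */
        default:
                /* no PCI hot reset ioctls: they don't apply to a vgpu */
                return -ENOTTY;
        }
}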

We will need to extend:

        - VFIO_DEVICE_GET_REGION_INFO


a) adding a flag, say DONT_MAP. For example, the MMIO of a vgpu should
be trapped instead of being mmap-ed (see the sketch below).

b) adding other information. For example, for the OpRegion, QEMU needs
to do more than mmap a region; it has to:

        - allocate a region
        - copy contents from somewhere in host to that region
        - mmap it to guest


I remember you already have a prototype for this?
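
To make a) concrete, something like the below is what I mean. The flag
name and bit value are invented here purely for illustration, and so is
vgpu_region_size(); whether QEMU should key off a new flag, or simply
off the absence of VFIO_REGION_INFO_FLAG_MMAP, is an open question:

#include <linux/vfio.h>

/* Sketch only: a hypothetical DONT_MAP flag for
 * VFIO_DEVICE_GET_REGION_INFO.  Name and bit are made up. */
#define VFIO_REGION_INFO_FLAG_DONT_MAP  (1 << 3)   /* trap, don't mmap */

static int vgpu_fill_region_info(struct vgpu_device *vgpu,
                                 struct vfio_region_info *info)
{
        info->flags = VFIO_REGION_INFO_FLAG_READ |
                      VFIO_REGION_INFO_FLAG_WRITE;

        /* vgpu MMIO must be trapped, so don't advertise mmap */
        if (info->index == VFIO_PCI_BAR0_REGION_INDEX)
                info->flags |= VFIO_REGION_INFO_FLAG_DONT_MAP;

        info->size = vgpu_region_size(vgpu, info->index);  /* made up */
        info->offset = (u64)info->index << 40;  /* vfio_pci-style offset */
        return 0;
}

(Arguably just leaving VFIO_REGION_INFO_FLAG_MMAP clear already makes
QEMU fall back to trapped read/write accesses; an explicit flag would
only make the intent unambiguous.)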


>>      IOMMU
>>
>>              { in a new vfio_xxx.c }
>>
>>              - allocate: struct device & IOMMU group
> 
> It seems like the vgpu instance management would do this.
>

Yes, it can be removed from here.

>>              - map/unmap functions for vgpu
>>              - rb-tree to maintain iova/hpa mappings
> 
> Yep, pretty much what type1 does now, but without mapping through the
> IOMMU API.  Essentially just a database of the current userspace
> mappings that can be accessed for page pinning and IOVA->HPA
> translation.
> 

Yes.
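
i.e. something along these lines, modeled on type1's vfio_find_dma
(struct and function names are invented for the sketch):

#include <linux/rbtree.h>
#include <linux/types.h>

/* Sketch: per-vgpu database of userspace mappings, keyed by iova.
 * Mirrors what type1 keeps, minus the IOMMU API mapping. */
struct vgpu_dma {
        struct rb_node  node;
        dma_addr_t      iova;
        unsigned long   vaddr;  /* userspace VA to pin pages from */
        size_t          size;
};

static struct vgpu_dma *vgpu_find_dma(struct rb_root *root,
                                      dma_addr_t iova, size_t size)
{
        struct rb_node *n = root->rb_node;

        while (n) {
                struct vgpu_dma *dma = rb_entry(n, struct vgpu_dma, node);

                if (iova + size <= dma->iova)
                        n = n->rb_left;
                else if (iova >= dma->iova + dma->size)
                        n = n->rb_right;
                else
                        return dma;     /* ranges overlap: found it */
        }
        return NULL;
}

Page pinning (get_user_pages() on dma->vaddr) then yields the host
physical address for a translated IOVA.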

>>              - interacts with kvmgt.c
>>
>>
>>      vgpu instance management
>>
>>              { in i915 }
>>
>>              - path, create/destroy
>>
> 
> Yes, and since you're creating and destroying the vgpu here, this is
> where I'd expect a struct device to be created and added to an IOMMU
> group.  The lifecycle management should really include links between
> the vGPU and physical GPU, which would be much, much easier to do with
> struct devices create here rather than at the point where we start
> doing vfio "stuff".
> 

Yes, just like SR-IOV does.
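
For the create path I'd expect a skeleton like this (all names are
placeholders, just to show the struct device + IOMMU group lifecycle
you describe):

#include <linux/device.h>
#include <linux/iommu.h>

/* Sketch: create a vgpu as a real struct device, parented to the
 * physical GPU, and give it its own IOMMU group. */
static int vgpu_create(struct device *parent, struct vgpu_device *vgpu)
{
        struct iommu_group *group;
        int ret;

        device_initialize(&vgpu->dev);
        vgpu->dev.parent = parent;      /* link vGPU to physical GPU */
        dev_set_name(&vgpu->dev, "vgpu%d", vgpu->id);

        ret = device_add(&vgpu->dev);
        if (ret)
                goto err_put;

        group = iommu_group_alloc();
        if (IS_ERR(group)) {
                ret = PTR_ERR(group);
                goto err_del;
        }

        ret = iommu_group_add_device(group, &vgpu->dev);
        iommu_group_put(group);         /* group keeps its own reference */
        if (ret)
                goto err_del;
        return 0;

err_del:
        device_del(&vgpu->dev);
err_put:
        put_device(&vgpu->dev);
        return ret;
}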


> Nvidia has also been looking at this and has some ideas how we might
> standardize on some of the interfaces and create a vgpu framework to
> help share code between vendors and hopefully make a more consistent
> userspace interface for libvirt as well.  I'll let Neo provide some
> details.  Thanks,

Good to know. Then we can possibly cooperate on some common parts,
e.g. the instance management :)

> 
> Alex
> 

--
Thanks,
Jike


