qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.repla


From: Liu, Yi L
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.replay() callback
Date: Sat, 1 Apr 2017 06:39:06 +0000

> -----Original Message-----
> From: Jason Wang [mailto:address@hidden
> Sent: Saturday, April 1, 2017 1:01 PM
> To: Liu, Yi L <address@hidden>; 'Peter Xu' <address@hidden>
> Cc: Lan, Tianyu <address@hidden>; Tian, Kevin <address@hidden>;
> 'address@hidden' <address@hidden>; 'address@hidden'
> <address@hidden>; 'address@hidden' <address@hidden>; 'qemu-
> address@hidden' <address@hidden>; 'address@hidden'
> <address@hidden>; 'David Gibson' <address@hidden>
> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> MemoryRegionIOMMUOps.replay() callback
> 
> 
> 
> On 2017年03月31日 15:30, Liu, Yi L wrote:
> >> -----Original Message-----
> >> From: Jason Wang [mailto:address@hidden
> >> Sent: Friday, March 31, 2017 3:17 PM
> >> To: Liu, Yi L <address@hidden>; 'Peter Xu' <address@hidden>
> >> Cc: Lan, Tianyu <address@hidden>; Tian, Kevin
> >> <address@hidden>; 'address@hidden' <address@hidden>;
> 'address@hidden'
> >> <address@hidden>; 'address@hidden' <address@hidden>;
> >> 'qemu- address@hidden' <address@hidden>;
> 'address@hidden'
> >> <address@hidden>; 'David Gibson'
> >> <address@hidden>
> >> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> >> MemoryRegionIOMMUOps.replay() callback
> >>
> >>
> >>
> >> On 2017年03月31日 13:34, Liu, Yi L wrote:
> >>>> -----Original Message-----
> >>>> From: Jason Wang [mailto:address@hidden
> >>>> Sent: Thursday, March 30, 2017 7:58 PM
> >>>> To: Liu, Yi L <address@hidden>; 'Peter Xu' <address@hidden>
> >>>> Cc: 'address@hidden' <address@hidden>; Lan,
> >>>> Tianyu <address@hidden>; Tian, Kevin <address@hidden>;
> >> 'address@hidden'
> >>>> <address@hidden>; 'address@hidden'
> >>>> <address@hidden>; 'address@hidden' <address@hidden>;
> 'David Gibson'
> >>>> <address@hidden>; 'address@hidden' <qemu-
> >>>> address@hidden>
> >>>> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>> MemoryRegionIOMMUOps.replay() callback
> >>>>
> >>>>
> >>>>
> >>>> On 2017年03月30日 19:06, Liu, Yi L wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: Liu, Yi L
> >>>>>> Sent: Monday, March 27, 2017 5:22 PM
> >>>>>> To: Peter Xu <address@hidden>
> >>>>>> Cc: address@hidden; Lan, Tianyu
> >>>>>> <address@hidden>; Tian, Kevin <address@hidden>;
> >>>>>> address@hidden; address@hidden; address@hidden;
> >>>>>> address@hidden; David Gibson <address@hidden>;
> >>>>>> address@hidden
> >>>>>> Subject: RE: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>>>> MemoryRegionIOMMUOps.replay() callback
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Peter Xu [mailto:address@hidden
> >>>>>>> Sent: Monday, March 27, 2017 5:12 PM
> >>>>>>> To: Liu, Yi L <address@hidden>
> >>>>>>> Cc: address@hidden; Lan, Tianyu
> >>>>>>> <address@hidden>; Tian, Kevin <address@hidden>;
> >>>>>>> address@hidden; address@hidden; address@hidden;
> >>>>>>> address@hidden; David Gibson <address@hidden>;
> >>>>>>> address@hidden
> >>>>>>> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>>>>> MemoryRegionIOMMUOps.replay() callback
> >>>>>>>
> >>>>>>> On Mon, Mar 27, 2017 at 08:35:05AM +0000, Liu, Yi L wrote:
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Qemu-devel
> >>>>>>>>> [mailto:address@hidden On
> >>>>>>>>> Behalf Of Peter Xu
> >>>>>>>>> Sent: Tuesday, February 7, 2017 4:28 PM
> >>>>>>>>> To: address@hidden
> >>>>>>>>> Cc: Lan, Tianyu <address@hidden>; Tian, Kevin
> >>>>>>>>> <address@hidden>; address@hidden;
> >>>>>>>>> address@hidden; address@hidden;
> >>>>>>>>> address@hidden; address@hidden;
> >>>>>>>>> address@hidden; David Gibson <address@hidden>
> >>>>>>>>> Subject: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>>>>>>> MemoryRegionIOMMUOps.replay() callback
> >>>>>>>>>
> >>>>>>>>> Originally we have one memory_region_iommu_replay() function,
> >>>>>>>>> which is the default behavior to replay the translations of
> >>>>>>>>> the whole IOMMU region. However, on some platform like x86, we
> >>>>>>>>> may want our own
> >>>>>>> replay logic for IOMMU regions.
> >>>>>>>>> This patch add one more hook for IOMMUOps for the callback,
> >>>>>>>>> and it'll override the default if set.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Peter Xu <address@hidden>
> >>>>>>>>> ---
> >>>>>>>>>     include/exec/memory.h | 2 ++
> >>>>>>>>>     memory.c              | 6 ++++++
> >>>>>>>>>     2 files changed, 8 insertions(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/include/exec/memory.h b/include/exec/memory.h
> >>>>>>>>> index
> >>>>>>>>> 0767888..30b2a74 100644
> >>>>>>>>> --- a/include/exec/memory.h
> >>>>>>>>> +++ b/include/exec/memory.h
> >>>>>>>>> @@ -191,6 +191,8 @@ struct MemoryRegionIOMMUOps {
> >>>>>>>>>         void (*notify_flag_changed)(MemoryRegion *iommu,
> >>>>>>>>>                                     IOMMUNotifierFlag old_flags,
> >>>>>>>>>                                     IOMMUNotifierFlag
> >>>>>>>>> new_flags);
> >>>>>>>>> +    /* Set this up to provide customized IOMMU replay function */
> >>>>>>>>> +    void (*replay)(MemoryRegion *iommu, IOMMUNotifier
> >>>>>>>>> + *notifier);
> >>>>>>>>>     };
> >>>>>>>>>
> >>>>>>>>>     typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> >>>>>>>>> diff --git a/memory.c b/memory.c index 7a4f2f9..9c253cc 100644
> >>>>>>>>> --- a/memory.c
> >>>>>>>>> +++ b/memory.c
> >>>>>>>>> @@ -1630,6 +1630,12 @@ void
> >>>>>>>>> memory_region_iommu_replay(MemoryRegion
> >>>>>>>>> *mr, IOMMUNotifier *n,
> >>>>>>>>>         hwaddr addr, granularity;
> >>>>>>>>>         IOMMUTLBEntry iotlb;
> >>>>>>>>> +    /* If the IOMMU has its own replay callback, override */
> >>>>>>>>> +    if (mr->iommu_ops->replay) {
> >>>>>>>>> +        mr->iommu_ops->replay(mr, n);
> >>>>>>>>> +        return;
> >>>>>>>>> +    }
> >>>>>>>> Hi Alex, Peter,
> >>>>>>>>
> >>>>>>>> Will all the other vendors(e.g. PPC, s390, ARM) add their own
> >>>>>>>> replay callback as well? I guess it depends on whether the
> >>>>>>>> original replay algorithm work well for them? Do you have such
> knowledge?
> >>>>>>> I guess so. At least for VT-d we had this callback since the
> >>>>>>> default replay mechanism did not work well on x86 due to its
> >>>>>>> extremely large memory region size. Thanks,
> >>>>>> thx. that would make sense.
> >>>>> Peter,
> >>>>>
> >>>>> Just come to mind that there may be a corner case here.
> >>>>>
> >>>>> Intel VT-d actually has a "pt" mode which allows device use
> >>>>> physical address even when VT-d is enabled. In kernel, there is a
> >> iommu_identity_mapping.
> >>>>> If a device is in this map, then it would use "pt" mode. So that
> >>>>> IOMMU driver would not build second-level page table for it.
> >>>> Yes, but qemu does not support ECAP_PT now, so guest will still
> >>>> have a page table in this case.
> >>> That's true. Without ECAP_PT, IOMMU driver would create a 1:1 map.
> >>> So this solution can work well even a device is in identify_map.
> >>>
> >>>>> Back to the virtual IOVA implementation, if an assigned device is
> >>>>> in the iommu_identity_mapping(e.g. VGA controller), it uses GPA
> >>>>> directly to do
> >> DMA.
> >>>>> So it demands a GPA->HPA mapping in host. However, the
> >>>>> iommu->ops.replay is not able to build it when guest SL page table is 
> >>>>> empty.
> >>>>>
> >>>>> So I think building an entire guest PA->HPA mapping before guest
> >>>>> kernel boot would be recommended. Any thoughts?
> >>>> We plan to add PT in 2.10, a possible rough idea is disabled iommu
> >>>> dmar region and use another region without iommu_ops. Then
> >>>> vfio_listener_region_add() will just do the correct mappings.
> >>> Good to know it. Actually, I also need to expose ECAP_PT for vSVM.
> >>> So just comes to realize that the current replay solution may not
> >>> work well when I
> >> expose ECAP_PT to guest.
> >>> I also have a rough idea here. The current listener in container
> >>> listens to address space named with devfn if virtual VTd is added.
> >>> How about adding one more listener to listen memory address space.
> >>> So that the
> >> listener can build entire guest PA->HPA mapping.
> >>
> >> This is only needed for PT. So looks like current code is sufficient to do 
> >> this I think.
> >> See the else part of if (memory_region_is_iommu()) of 
> >> vfio_listener_region_add().
> > Jason, when the listener listen to device address space, the "else
> > part" may not work even we set the mr->iommu_ops = NULL. The mr would
> > be a non-ram region when the time region_add is called since it is actually 
> > listen to
> changes from device address space.
> >
> > Regards,
> > Yi L
> >
> 
> See Peter's patch ("intel_iommu: allow dynamic switch of IOMMU region").
> It has
> 
> +        memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s),
> +                                 "vtd_sys_alias", get_system_memory(),
> +                                 0,
> memory_region_size(get_system_memory()));
> 
> We can enable sys_alias in when PT is used which should work I think.

Great. I think it works. Thx.

Regards,
Yi L

reply via email to

[Prev in Thread] Current Thread [Next in Thread]