[Qemu-devel] [PATCH RFC v4 00/20] VT-d: vfio enablement and misc enhances
From: Peter Xu
Date: Fri, 20 Jan 2017 21:08:36 +0800
This is v4 of the vt-d vfio enablement series.
Sorry that v4 grew to 20 patches. Some newly added patches (which
are quite necessary):
[01/20] vfio: trace map/unmap for notify as well
[02/20] vfio: introduce vfio_get_vaddr()
[03/20] vfio: allow to notify unmap for very large region
Patches from RFC series:
"[PATCH RFC 0/3] vfio: allow to notify unmap for very big region"
Which is required by patch [19/20].
[11/20] memory: provide IOMMU_NOTIFIER_FOREACH macro
A helper only.
[19/20] intel_iommu: unmap existing pages before replay
This solves Alex's concern that there might be existing mappings
in the previous domain when replay happens.
[20/20] intel_iommu: replay even with DSI/GLOBAL inv desc
This solves Jason/Kevin's concern by handling DSI/GLOBAL
invalidations as well.
Each individual patch carries a more detailed explanation of its own.
Please refer to each of them.
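To make the idea behind [20/20] a bit more concrete, here is a toy
sketch in C. It is not QEMU code: every name in it (toy_inv_type,
toy_device, toy_handle_inv) is made up for illustration, and
incrementing replay_count only stands in for the real unmap + replay
work. It only shows the selection rule: a global invalidation touches
every device, a domain-selective one (DSI) touches only the devices
attached to that domain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical invalidation granularities, named after the VT-d terms. */
typedef enum {
    TOY_INV_GLOBAL,   /* affects every domain */
    TOY_INV_DOMAIN    /* domain-selective (DSI): one domain id */
} toy_inv_type;

/* Hypothetical per-device state: which guest domain it is attached to. */
typedef struct {
    uint16_t domain_id;
    unsigned replay_count;   /* how many times this device was replayed */
} toy_device;

/* Does this invalidation affect the given device? */
bool toy_inv_hits(toy_inv_type type, uint16_t domain_id,
                  const toy_device *dev)
{
    switch (type) {
    case TOY_INV_GLOBAL:
        return true;
    case TOY_INV_DOMAIN:
        return dev->domain_id == domain_id;
    }
    return false;
}

/* Replay every affected device; returns how many were replayed. */
size_t toy_handle_inv(toy_inv_type type, uint16_t domain_id,
                      toy_device *devs, size_t n)
{
    size_t hit = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (toy_inv_hits(type, domain_id, &devs[i])) {
            devs[i].replay_count++;   /* stand-in for unmap + replay */
            hit++;
        }
    }
    return hit;
}
```

With three devices in domains 1, 2 and 1, a DSI for domain 1 would
replay two of them, while a global invalidation would replay all three.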
I kept patches 19 and 20 as separate work rather than squashing them
into patch 18, for easier modification and review. I prefer to keep
them separate so we can look at each problem on its own; after all,
patch 18 alone works in most use cases. Please let me know if we want
to squash them in some way. I can respin when necessary.
Besides the big things, there are lots of tiny tweaks as well. Here's
the changelog.
v4:
- convert all error_report()s into traces (in the two patches that did
that)
- rebased to Jason's DMAR series (master + one more patch:
"[PATCH V4 net-next] vhost_net: device IOTLB support")
- let vhost use the new api iommu_notifier_init() so it won't break
vhost dmar [Jason]
- touch commit message of the patch:
"intel_iommu: provide its own replay() callback"
old replay is not a dead loop, but it will just consume lots of time
[Jason]
- add comment for patch:
"intel_iommu: do replay when context invalidate"
telling why replay won't be a problem even without CM=1 [Jason]
- remove a useless comment line [Jason]
- remove dmar_enabled parameter for vtd_switch_address_space() and
vtd_switch_address_space_all() [Mst, Jason]
- merged the vfio patches in, to support unmap of big ranges at the
beginning ("[PATCH RFC 0/3] vfio: allow to notify unmap for very big
region")
- using caching_mode instead of cache_mode_enabled, and "caching-mode"
instead of "cache-mode" [Kevin]
- when receiving a context entry invalidation, we unmap the entire
region first, then replay [Alex]
- fix commit message for patch:
"intel_iommu: simplify irq region translation" [Kevin]
- handle domain/global invalidation, and notify where proper [Jason,
Kevin]
v3:
- fix style error reported by patchew
- fix comment in domain switch patch: use "IOMMU address space" rather
than "IOMMU region" [Kevin]
- add ack-by for Paolo in patch:
"memory: add section range info for IOMMU notifier"
(this was collected separately, outside this thread)
- remove 3 patches which are merged already (from Jason)
- rebase to master b6c0897
v2:
- change comment for "end" parameter in vtd_page_walk() [Tianyu]
- change comment for "a iova" to "an iova" [Yi]
- fix fault printed val for GPA address in vtd_page_walk_level (debug
only)
- rebased to master (rather than Aviv's v6 series) and merged Aviv's
series v6: picked patch 1 (as patch 1 in this series), dropped patch
2, re-wrote patch 3 (as patch 17 of this series).
- picked up two more bugfix patches from Jason's DMAR series
- picked up the following patch as well:
"[PATCH v3] intel_iommu: allow dynamic switch of IOMMU region"
This RFC series is a rework of Aviv B.D.'s vfio enablement series
with vt-d:
https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html
Aviv has done a great job there, and what is still missing there is
mostly the following:
(1) VFIO gets duplicated IOTLB notifications due to the split VT-d
IOMMU memory region.
(2) VT-d still doesn't provide a correct replay() mechanism (e.g.,
when the IOMMU domain switches, things will break).
This series should solve the above two issues.
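As an illustration of issue (2), below is a toy model in C of what
"unmap first, then replay" means when a device moves to another domain.
It is only a sketch under heavy assumptions, not the code in this
series: toy_shadow stands in for whatever the host side (e.g. vfio)
currently has mapped, toy_guest_table for the page table of the domain
being replayed, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGES 16   /* toy IOVA space: 16 pages */

/* Hypothetical shadow state: what the host currently has mapped. */
typedef struct {
    bool     present[TOY_PAGES];
    uint64_t gpa[TOY_PAGES];
} toy_shadow;

/* Hypothetical guest view: page table of the domain being replayed. */
typedef struct {
    bool     present[TOY_PAGES];
    uint64_t gpa[TOY_PAGES];
} toy_guest_table;

/* Drop every existing mapping, whichever domain it came from. */
void toy_unmap_all(toy_shadow *s)
{
    memset(s, 0, sizeof(*s));
}

/* Walk the table and map each present page; returns the page count. */
size_t toy_replay(toy_shadow *s, const toy_guest_table *t)
{
    size_t mapped = 0;

    for (size_t i = 0; i < TOY_PAGES; i++) {
        if (t->present[i]) {
            s->present[i] = true;
            s->gpa[i] = t->gpa[i];
            mapped++;
        }
    }
    return mapped;
}

/* Context entry invalidation: unmap first, then replay. */
size_t toy_context_invalidate(toy_shadow *s, const toy_guest_table *t)
{
    assert(s != NULL && t != NULL);
    toy_unmap_all(s);
    return toy_replay(s, t);
}
```

Without the toy_unmap_all() step, pages mapped for the previous domain
would stay in the shadow state after the switch, which is the stale
mapping problem a replay-only approach would leave behind.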
Online repo:
https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v4
I would be glad to hear about any review comments for above patches.
=========
Test Done
=========
Build test passed for x86_64/arm/ppc64.
Lightly tested on x86_64 by assigning two PCI devices to a single VM
and booting the VM using:
bin=x86_64-softmmu/qemu-system-x86_64
$bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \
-device intel-iommu,intremap=on,eim=off,caching-mode=on \
-netdev user,id=net0,hostfwd=tcp::5555-:22 \
-device virtio-net-pci,netdev=net0 \
-device vfio-pci,host=03:00.0 \
-device vfio-pci,host=02:00.0 \
-trace events=".trace.vfio" \
/var/lib/libvirt/images/vm1.qcow2
pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio
vtd_page_walk*
vtd_replay*
vtd_inv_desc*
Then, in the guest, run the following tool:
https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c
With parameter:
./vfio-bind-group 00:03.0 00:04.0
Checking the host-side trace log, I can see pages being replayed and
mapped in the 00:04.0 device address space, like:
...
vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001
vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000
vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000
vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000
vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000
vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3
...
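The iova range sizes in the trace above follow from the page table
geometry: 4K pages (mask 0xfff) and 9 bits of index per level, so a
whole table at level N spans 1 << (12 + 9 * N) bytes. The small C
sketch below reproduces that arithmetic. The helper names are
hypothetical and are not the ones used in intel_iommu.c.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12   /* 4K pages, matching mask 0xfff in the trace */
#define TOY_LEVEL_BITS  9    /* 512 entries per table */

/* Bytes covered by ONE entry of a table at the given level (1 = leaf). */
uint64_t toy_entry_span(unsigned level)
{
    assert(level >= 1 && level <= 4);
    return 1ULL << (TOY_PAGE_SHIFT + (level - 1) * TOY_LEVEL_BITS);
}

/* Bytes covered by a WHOLE table at the given level. */
uint64_t toy_table_span(unsigned level)
{
    return toy_entry_span(level) << TOY_LEVEL_BITS;
}

/* Index into the table at `level` for a given iova. */
unsigned toy_level_index(uint64_t iova, unsigned level)
{
    return (unsigned)((iova / toy_entry_span(level)) &
                      ((1u << TOY_LEVEL_BITS) - 1));
}
```

For example, toy_table_span(3) is 0x8000000000, toy_table_span(2) is
0x40000000 and toy_table_span(1) is 0x200000, matching the three
vtd_page_walk_level lines, and iova 0xa000 lands at index 10 of the
level-1 table.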
=========
Todo List
=========
- error reporting for the assigned devices (as Tianyu has mentioned)
- per-domain address-space: a better solution in the future may be to
maintain one address space per IOMMU domain in the guest (so
multiple devices can share the same address space if they share
the same IOMMU domain in the guest), rather than one address space
per device (which is the current implementation of vt-d). However
that's a step further than this series, so let's first see whether
we can provide a workable version of device assignment with vt-d
protection.
- more to come...
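For the per-domain address-space item, one possible shape of the
bookkeeping is sketched below in C. This is purely an illustration of
the idea and not a proposal for the real data structures: toy_as,
toy_as_pool and the fixed-size pool are all assumptions made up for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DOMAINS 8

/* Hypothetical address space, shared by all devices in one domain. */
typedef struct {
    uint16_t domain_id;
    unsigned users;      /* number of devices attached */
} toy_as;

typedef struct {
    toy_as slots[TOY_MAX_DOMAINS];
} toy_as_pool;

/*
 * Return the address space for domain_id, creating it on first use.
 * Returns NULL when the pool is full.
 */
toy_as *toy_as_get(toy_as_pool *pool, uint16_t domain_id)
{
    toy_as *free_slot = NULL;

    assert(pool != NULL);
    for (size_t i = 0; i < TOY_MAX_DOMAINS; i++) {
        toy_as *as = &pool->slots[i];

        if (as->users > 0 && as->domain_id == domain_id) {
            as->users++;             /* another device joins the domain */
            return as;
        }
        if (as->users == 0 && free_slot == NULL) {
            free_slot = as;          /* remember the first unused slot */
        }
    }
    if (free_slot != NULL) {
        free_slot->domain_id = domain_id;
        free_slot->users = 1;
    }
    return free_slot;
}

/* Detach one device; the slot becomes reusable when users drops to 0. */
void toy_as_put(toy_as *as)
{
    assert(as != NULL && as->users > 0);
    as->users--;
}
```

Two devices calling toy_as_get() with the same domain id get the same
toy_as back, so a mapping change would only need to be applied once
per domain instead of once per device.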
Thanks,
Aviv Ben-David (1):
IOMMU: add option to enable VTD_CAP_CM to vIOMMU capility exposoed to
guest
Peter Xu (19):
vfio: trace map/unmap for notify as well
vfio: introduce vfio_get_vaddr()
vfio: allow to notify unmap for very large region
intel_iommu: simplify irq region translation
intel_iommu: renaming gpa to iova where proper
intel_iommu: fix trace for inv desc handling
intel_iommu: fix trace for addr translation
intel_iommu: vtd_slpt_level_shift check level
memory: add section range info for IOMMU notifier
memory: provide IOMMU_NOTIFIER_FOREACH macro
memory: provide iommu_replay_all()
memory: introduce memory_region_notify_one()
memory: add MemoryRegionIOMMUOps.replay() callback
intel_iommu: provide its own replay() callback
intel_iommu: do replay when context invalidate
intel_iommu: allow dynamic switch of IOMMU region
intel_iommu: enable vfio devices
intel_iommu: unmap existing pages before replay
intel_iommu: replay even with DSI/GLOBAL inv desc
hw/i386/intel_iommu.c | 674 +++++++++++++++++++++++++++++++----------
hw/i386/intel_iommu_internal.h | 2 +
hw/i386/trace-events | 30 ++
hw/vfio/common.c | 68 +++--
hw/vfio/trace-events | 2 +-
hw/virtio/vhost.c | 4 +-
include/exec/memory.h | 49 ++-
include/hw/i386/intel_iommu.h | 12 +
memory.c | 47 ++-
9 files changed, 696 insertions(+), 192 deletions(-)
--
2.7.4