From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [RFC PATCH qemu v2 1/5] vfio: Switch from TARGET_PAGE_MASK to qemu_real_host_page_mask
Date: Wed, 15 Jul 2015 10:49:19 +1000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

On 07/15/2015 08:28 AM, Alex Williamson wrote:
On Tue, 2015-07-14 at 16:58 +1000, Alexey Kardashevskiy wrote:
On 07/14/2015 05:13 AM, Alex Williamson wrote:
On Tue, 2015-07-14 at 00:56 +1000, Alexey Kardashevskiy wrote:
These commits started switching from TARGET_PAGE_MASK (hardcoded as 4K) to
the real host page size:
4e51361d7 "cpu-all: complete "real" host page size API" and
f7ceed190 "vfio: cpu: Use "real" page size API"

This patch finishes the transition by:
- %s/TARGET_PAGE_MASK/qemu_real_host_page_mask/
- %s/TARGET_PAGE_ALIGN/REAL_HOST_PAGE_ALIGN/
- removing the bitfield lengths for offsets in VFIOQuirk::data, as
qemu_real_host_page_mask is a runtime value rather than a compile-time
constant and so cannot size a bitfield (a rough sketch of the substitution
follows below)
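
For illustration only, here is a minimal sketch of the kind of substitution
involved - it is not the literal hunk from hw/vfio/common.c, and the variable
names (iova, end, section) are just the usual memory-listener ones:

/* Before: clamp the section to 4K target pages (TARGET_PAGE_* is a
 * compile-time constant for the target architecture). */
iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
end = (section->offset_within_address_space + int128_get64(section->size))
      & TARGET_PAGE_MASK;

/* After: clamp it to real host pages instead (REAL_HOST_PAGE_ALIGN and
 * qemu_real_host_page_mask come from 4e51361d7 and are runtime values,
 * e.g. 64K on a POWER8 host). */
iova = REAL_HOST_PAGE_ALIGN(section->offset_within_address_space);
end = (section->offset_within_address_space + int128_get64(section->size))
      & qemu_real_host_page_mask;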

Can we assume that none of the changes to quirks have actually been
tested?

No, why? :) I tried it on one of the NVIDIAs I've got here -
VGA compatible controller: NVIDIA Corporation GM107GL [Quadro K2200] (rev a2)
The driver was from NVIDIA (not nouveau) and the test was "acos" (some
basic CUDA test).

That's only one of a handful or more quirks.  The VGA related quirks are
all for backdoors in the bootstrap process and MMIO access to config
space, things that I would not expect to see on power.  So power
probably isn't a useful host to test these things.

I don't really support them being bundled in here since they
really aren't related to what you're doing.

This makes sense, I'll move them to a separate patch and add a note about
how it helps on a 64K-pages host.

For DMA we generally want
to be host IOMMU page aligned,

Do all known IOMMUs use a constant page size? The IOMMU memory region does
not have an IOMMU page size/mask; I wanted to add it there but I am not
sure whether it is generic enough.

AMD supports nearly any power-of-2 size (>=4k), Intel supports 4k +
optionally 2M and 1G.  The vfio type1 iommu driver looks for physically
contiguous ranges to give to the hardware iommu driver, which makes use
of whatever optimal page size it can.  Therefore, we really don't care
what the hardware page size is beyond the assumption that it supports
4k.  When hugepages are used by the VM, we expect the iommu will
automatically make use of them if supported.  A non-VM vfio userspace
driver might care a little bit more about the supported page sizes, I
imagine.
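
(Purely to illustrate that last point: a userspace vfio driver can query the
IOVA page sizes supported by the type1 backend with VFIO_IOMMU_GET_INFO. A
minimal, untested sketch, assuming container_fd is an already-configured
/dev/vfio/vfio container:)

#include <sys/ioctl.h>
#include <linux/vfio.h>
#include <stdio.h>

static void print_iova_pgsizes(int container_fd)
{
    struct vfio_iommu_type1_info info = { .argsz = sizeof(info) };

    /* Each set bit in iova_pgsizes is a supported IOVA page size
     * (4K, 2M, 1G, ...). */
    if (!ioctl(container_fd, VFIO_IOMMU_GET_INFO, &info) &&
        (info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
        printf("supported IOVA page sizes: 0x%llx\n",
               (unsigned long long)info.iova_pgsizes);
    }
}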

Oh. Cooler than us (p8).

which we can generally assume is the same
as host page aligned,

They are almost never the same on sPAPR for 32bit windows...

but quirks are simply I/O regions, so I think they
ought to continue to be target page aligned.

Without s/TARGET_PAGE_MASK/qemu_real_host_page_mask/,
&vfio_nvidia_88000_quirk fails and exits in kvm_set_phys_mem() as the size
of the section is 0x88000. It still works with x-mmap=false (or TCG, I
suppose) though.
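
(The arithmetic behind that failure, as a standalone illustration rather
than QEMU code: 0x88000 is a multiple of the 4K target page but not of a
64K host page, so the section boundary lands in the middle of a host page:)

#include <stdio.h>

int main(void)
{
    unsigned long sz = 0x88000;  /* section size/offset left by the quirk */

    printf("0x%lx %% 0x1000  = 0x%lx\n", sz, sz % 0x1000UL);   /* 0      -> 4K aligned      */
    printf("0x%lx %% 0x10000 = 0x%lx\n", sz, sz % 0x10000UL);  /* 0x8000 -> not 64K aligned */
    return 0;
}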

Think about what this is doing to the guest.  There's a 4k window
(because PCIe extended config space is 4k) at offset 0x88000 of the MMIO
BAR that allows access to PCI config space of the device.  With 4k pages
this all aligns quite nicely and only config space accesses are trapped.
With 64k pages, you're trapping everything from 0x80000 to 0x8ffff.  We
have no idea what else might live in that space and what kind of
performance impact it'll cause to the operation of the device.  If you

If the alternative is not working at all, then working slower (potentially) is ok.


don't know that you need it and can't meet the mapping criteria, don't
enable it.  BTW, this quirk is for GeForce, not Quadro.  The region
seems not to be used by Quadro drivers and we can't programmatically tell
the difference between Quadro and GeForce hardware, so we leave it
enabled for either on x86.
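
(To make the granularity point concrete, a hedged sketch - not the actual
hw/vfio/pci.c code, and quirk, quirk_ops and vdev are illustrative names -
of how such a quirk is overlaid on the mmap'd BAR; whatever size this window
gets rounded up to is exactly the range the guest loses direct mmap access to:)

/* An I/O subregion overlaid at offset 0x88000 of the mmap'd BAR 0 forces
 * guest accesses in that range to trap into QEMU.  With 4K pages only the
 * 4K config-space mirror traps; padded to a 64K host page it would cover
 * 0x80000..0x8ffff. */
memory_region_init_io(&quirk->mem, OBJECT(vdev), &quirk_ops, quirk,
                      "vfio-nvidia-88000-quirk", TARGET_PAGE_SIZE /* 4K */);
memory_region_add_subregion_overlap(&vdev->bars[0].region.mem, 0x88000,
                                    &quirk->mem, 1);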

What kind of driver exploits this? If it is Windows-only, then I can safely drop all of these hacks.

Or is it the adapter's BIOS? Or the host BIOS?


Maybe these really should be real_host_page_size, but I can't really
picture how an underlying 64k page size host gets imposed on a guest
that's only aware of a 4k page size.  For instance, what prevents a 4k
guest from mapping PCI BARs with only 4k alignment?  Multiple BARs for
different devices could then fit within a single 64k host page.  MMU
mappings would seem
to have similar issues.  These quirks need to be on the granularity we
take a target page fault, which I imagine is the same as the
real_host_page_size.  I'd expect that unless you're going to support
consumer graphics or crappy realtek NICs, none of these quirks are
relevant to you.

This keeps using TARGET_PAGE_MASK for IOMMU regions though, as that is
the minimum page size which IOMMU regions may be using, and at the moment
memory regions do not carry the actual page size.

Signed-off-by: Alexey Kardashevskiy <address@hidden>
---

In reality DMA windows are always a lot bigger than a single 4K page
and aligned to 32/64MB; maybe we should just use qemu_real_host_page_mask here?

I don't understand what this is asking either.  While the bulk of memory
is going to be mapped in larger chunks, we do occasionally see 4k
mappings on x86, particularly in some of the legacy low memory areas.


The question was not about individual mappings - these are handled by an
IOMMU memory region notifier; here we are dealing with DMA windows, which
are always megabytes in size, but nothing really prevents a guest from
requesting a 4K _window_ (with a single TCE entry). Whether we want to
support such small windows or not - that was the question.

But you're using the same memory listener that does deal with 4k
mappings.  What optimization would you hope to achieve by assuming only
larger mappings that compensates for the lack of generality?  Thanks,

I gave up here already - I reworked this part in v3 :)

Thanks for review and education.


--
Alexey


