
Re: [Qemu-devel] A question about PCI device address spaces


From: Marcel Apfelbaum
Subject: Re: [Qemu-devel] A question about PCI device address spaces
Date: Mon, 26 Dec 2016 13:01:34 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1

On 12/22/2016 11:42 AM, Peter Xu wrote:
> Hello,


Hi Peter,

> Since this is a general topic, I picked it out from the VT-d
> discussion and put it here, just to make it clearer.
>
> The issue is: have we exposed too much address space to
> emulated PCI devices?
>
> Currently, for each PCI device, we have PCIDevice::bus_master_as as
> the device-visible address space, which is derived from
> pci_device_iommu_address_space():

> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> {
>     PCIBus *bus = PCI_BUS(dev->bus);
>     PCIBus *iommu_bus = bus;
>
>     /* Walk up the bus hierarchy until a bus with an IOMMU hook is found. */
>     while (iommu_bus && !iommu_bus->iommu_fn && iommu_bus->parent_dev) {
>         iommu_bus = PCI_BUS(iommu_bus->parent_dev->bus);
>     }
>     if (iommu_bus && iommu_bus->iommu_fn) {
>         return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, dev->devfn);
>     }
>     /* No IOMMU on the path: fall back to the full system memory space. */
>     return &address_space_memory;
> }

> By default (for the no-IOMMU case) it points to the system memory space,
> which includes MMIO, and that looks wrong - a PCI device should not be
> able to write to MMIO regions.


Why? As far as I know, a PCI device can start a read/write transaction
to virtually any address; it doesn't matter if it 'lands' in RAM or in an
MMIO region mapped by another device. But I might be wrong, I need to read
the spec again...
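
In QEMU terms you can see this in the code: a device model's DMA goes
through its PCIDevice::bus_master_as, and the access is simply dispatched
to whatever MemoryRegion is mapped at the target address, RAM and MMIO
alike. A tiny illustration, assuming a device model holding a PCIDevice
*pdev and some guest address addr (both hypothetical names):

    uint32_t val = cpu_to_le32(0x1234);

    /* This goes through pdev's bus_master_as; if 'addr' falls inside
     * another device's BAR, that device's MMIO write handler is invoked
     * instead of a RAM write. */
    pci_dma_write(pdev, addr, &val, sizeof(val));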

The PCI transaction will eventually reach the Root Complex/PCI host bridge,
where an IOMMU or some other hardware entity can sanitize/translate it, but
that is out of the scope of the device itself.

The Root Complex will 'translate' the transaction into a memory read/write
on behalf of the device and pass it to the memory controller.
If the transaction targets another device, I am not sure whether the
Root Complex will re-route it by itself or pass it to the memory controller.
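
For reference, the hook point for such a translation entity in QEMU is
pci_setup_iommu(): whatever AddressSpace the registered callback returns
is what pci_device_iommu_address_space() above hands back. A minimal,
untested sketch (MyPlatformState and my_iommu_fn are made-up names):

    /* Hypothetical per-platform state owning a translated AddressSpace. */
    typedef struct MyPlatformState {
        AddressSpace device_as;
    } MyPlatformState;

    static AddressSpace *my_iommu_fn(PCIBus *bus, void *opaque, int devfn)
    {
        MyPlatformState *s = opaque;

        /* Every device on this bus gets the same translated space here;
         * a real IOMMU would typically key off devfn. */
        return &s->device_as;
    }

    /* Somewhere in machine/host-bridge init: */
    pci_setup_iommu(bus, my_iommu_fn, s);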


> As an example, if we dump a PCI device's address space in detail on an
> x86_64 system, we can see the following (this is the address space of a
> virtio-net-pci device on a Q35 machine with 6G of memory):

>     0000000000000000-000000000009ffff (prio 0, RW): pc.ram
>     00000000000a0000-00000000000affff (prio 1, RW): vga.vram
>     00000000000b0000-00000000000bffff (prio 1, RW): vga-lowmem
>     00000000000c0000-00000000000c9fff (prio 0, RW): pc.ram
>     00000000000ca000-00000000000ccfff (prio 0, RW): pc.ram
>     00000000000cd000-00000000000ebfff (prio 0, RW): pc.ram
>     00000000000ec000-00000000000effff (prio 0, RW): pc.ram
>     00000000000f0000-00000000000fffff (prio 0, RW): pc.ram
>     0000000000100000-000000007fffffff (prio 0, RW): pc.ram
>     00000000b0000000-00000000bfffffff (prio 0, RW): pcie-mmcfg-mmio
>     00000000fd000000-00000000fdffffff (prio 1, RW): vga.vram
>     00000000fe000000-00000000fe000fff (prio 0, RW): virtio-pci-common
>     00000000fe001000-00000000fe001fff (prio 0, RW): virtio-pci-isr
>     00000000fe002000-00000000fe002fff (prio 0, RW): virtio-pci-device
>     00000000fe003000-00000000fe003fff (prio 0, RW): virtio-pci-notify
>     00000000febd0400-00000000febd041f (prio 0, RW): vga ioports remapped
>     00000000febd0500-00000000febd0515 (prio 0, RW): bochs dispi interface
>     00000000febd0600-00000000febd0607 (prio 0, RW): qemu extended regs
>     00000000febd1000-00000000febd102f (prio 0, RW): msix-table
>     00000000febd1800-00000000febd1807 (prio 0, RW): msix-pba
>     00000000febd2000-00000000febd2fff (prio 1, RW): ahci
>     00000000fec00000-00000000fec00fff (prio 0, RW): kvm-ioapic
>     00000000fed00000-00000000fed003ff (prio 0, RW): hpet
>     00000000fed1c000-00000000fed1ffff (prio 1, RW): lpc-rcrb-mmio
>     00000000fee00000-00000000feefffff (prio 4096, RW): kvm-apic-msi
>     00000000fffc0000-00000000ffffffff (prio 0, R-): pc.bios
>     0000000100000000-00000001ffffffff (prio 0, RW): pc.ram

> So, are the "pc.ram" regions here the only ones that we should expose
> to PCI devices? (the exposed space should contain all of them, including
> the low-mem ones and the >=4G one)


As I previously said, it does not have to be RAM only, but let's also
wait for Michael's opinion.
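
That said, if someone wanted to prototype the RAM-only idea, one way would
be to build a root MemoryRegion that aliases only the RAM chunks and hand
out an AddressSpace based on it. An untested sketch (the function name and
the below/above-4G split parameters are assumptions, not existing QEMU
code):

    /* Hypothetical: build an AddressSpace that exposes only guest RAM. */
    static AddressSpace *make_ram_only_as(MemoryRegion *pc_ram,
                                          uint64_t below_4g,
                                          uint64_t above_4g)
    {
        MemoryRegion *root = g_new0(MemoryRegion, 1);
        MemoryRegion *low = g_new0(MemoryRegion, 1);
        MemoryRegion *high = g_new0(MemoryRegion, 1);
        AddressSpace *as = g_new0(AddressSpace, 1);

        memory_region_init(root, NULL, "pci-ram-only", UINT64_MAX);

        /* Alias low RAM at its guest-physical location. */
        memory_region_init_alias(low, NULL, "ram-below-4g",
                                 pc_ram, 0, below_4g);
        memory_region_add_subregion(root, 0, low);

        /* Alias the remaining RAM above the 4G boundary. */
        memory_region_init_alias(high, NULL, "ram-above-4g",
                                 pc_ram, below_4g, above_4g);
        memory_region_add_subregion(root, 0x100000000ULL, high);

        address_space_init(as, root, "pci-ram-only");
        return as;
    }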

> And, should this rule work for all platforms?

The PCI rules should be generic across all platforms, but I don't know
the other platforms well.

Thanks,
Marcel

> Or say, would it be a problem if I directly changed address_space_memory
> in pci_device_iommu_address_space() into something else that contains
> only RAM? (of course this won't affect any platform that has an IOMMU,
> i.e., a customized PCIBus::iommu_fn)

> (btw, I'd appreciate it if anyone has a quick answer on why we have lots
>  of contiguous "pc.ram" regions in the low 2G range - from can_merge() I
>  guess they have different dirty_log_mask, romd_mode, etc., but I'd still
>  like to know why they differ. Anyway, this is totally an "optional
>  question", just to satisfy my own curiosity :)

> Thanks,
>
> -- peterx




