From: Robert Ou
Subject: [Qemu-devel] vfio-pci: Report on a hack to successfully pass through a boot GPU
Date: Tue, 12 Jul 2016 02:30:44 -0700

I would like to report on a hack that I created to successfully use
vfio-pci to pass through a boot GPU. The short summary is that the
BOOTFB framebuffer memory region seems to cause a "BAR <n>: can't
reserve [mem <...>]" error, and this can be hackily worked around by
calling __release_region on the BOOTFB framebuffer. I was told by
someone on IRC to send this hack to this list.

My system setup is as follows: I have a Xeon E5-2630 v4 on an ASRock
X99 Extreme6 motherboard. The GPU I am attempting to pass through is
an NVIDIA GTX 1080 plugged into the slot closest to the CPU. There is
a second GPU, an AMD R5 240 OEM (Oland), being used as the "initial"
GPU for Linux. "Initial" in this case means that the text consoles and
the X graphical login appear on the monitor connected to this GPU;
after logging in, additional commands are run to either run a VM or
start a new X server using the NVIDIA GPU. Each GPU has its own
monitor cable connected to it - there is no attempt to forward the
output from one GPU to another. Linux is booted using UEFI, not BIOS
boot. The CSM is disabled. The UEFI splash and the GRUB bootloader
display using the NVIDIA GPU. There does not appear to be an option to
change the boot GPU. However, Linux is configured to display its
output on the AMD GPU by a) only describing the AMD GPU in xorg.conf
and b) passing "video=simplefb:off" on the command line as well as
putting radeon in the initrd so that it can load before the nvidia
driver does. I am running Debian sid with kernel 4.6.

I activate the vfio-pci drivers manually by writing to
/sys/bus/pci/drivers/vfio-pci/new_id and then unbinding the existing
driver and binding vfio-pci. This actually works most of the time
(more on this later). When I initially (without my hack) try to launch
a qemu-kvm guest (using virt-manager; guest OS is Windows 10; guest is
booting via OVMF; guest is using i440fx), the host kernel log gets
flooded with an error "vfio-pci 0000:04:00.0: BAR 1: can't reserve
[mem 0xc0000000-0xcfffffff 64bit pref]". Examining /proc/iomem shows
the memory region vfio-pci is trying to claim overlaps with a memory
region named BOOTFB which is apparently the UEFI framebuffer (despite
the fact that simplefb is disabled, apparently this memory region is
still created). As a really terrible hack, I wrote a kernel module
that calls "__release_region(&iomem_resource, <start of bootfb>, <size
of bootfb>)". This fixed the issue for me, and I was successfully able
to pass through the boot GPU to the guest.
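
The overlap described above can be checked mechanically. The following
is a minimal sketch, assuming bash; ranges_overlap is a hypothetical
helper name, not part of any tool, and the addresses in the usage note
are the ones from the /proc/iomem listing in this report.

```shell
#!/bin/bash
# ranges_overlap START1 END1 START2 END2   (hypothetical helper)
# Exit status 0 when the two inclusive address ranges share at least
# one address, non-zero otherwise. Arguments may be decimal or
# 0x-prefixed hex; bash arithmetic expansion converts them.
ranges_overlap() {
    [ $(( $1 )) -le $(( $4 )) ] && [ $(( $3 )) -le $(( $2 )) ]
}
```

For example, "ranges_overlap 0xc0000000 0xcfffffff 0xc0000000 0xc086ffff"
(BAR 1 against BOOTFB) succeeds, while the same check for the
d0000000-d1ffffff region fails, which matches only BAR 1 being reported
in the error.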

The source code of this hacky kernel module is below. It is used by
running "insmod forcefully-remove-bootfb.ko bootfb_start=<addr>
bootfb_end=<addr>" using the addresses found in /proc/iomem
(bootfb_end is the last address of the region, inclusive). The module
is then immediately unloaded with rmmod. (The kernel module can't find
BOOTFB by itself because I did not figure out how to traverse
iomem_resource from a kernel module. The resource_lock lock doesn't
seem to be accessible from modules.)
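
Since the module cannot locate BOOTFB itself, the lookup can be done in
userspace instead. This is a minimal sketch, assuming bash and awk;
find_bootfb is a hypothetical helper name, and it only handles the
first BOOTFB entry in the listing. It should be run as root, as in the
listings in this report, because unprivileged reads of /proc/iomem may
show zeroed addresses.

```shell
#!/bin/bash
# find_bootfb [FILE]   (hypothetical helper)
# Prints "bootfb_start=0x... bootfb_end=0x..." for the first BOOTFB
# entry in an iomem-style listing (default: /proc/iomem).
# Returns non-zero when the listing has no BOOTFB entry.
find_bootfb() {
    awk '$NF == "BOOTFB" {
             # $1 looks like "c0000000-c086ffff"
             split($1, range, "-")
             printf "bootfb_start=0x%s bootfb_end=0x%s\n", range[1], range[2]
             found = 1
             exit
         }
         END { exit !found }' "${1:-/proc/iomem}"
}
```

In a root shell the output can be passed straight to the module, for
example "insmod forcefully-remove-bootfb.ko $(find_bootfb)".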

Regarding activating the vfio-pci drivers, I actually do not have the
nvidia/snd_hda_intel drivers blacklisted. I allow them to load
normally on boot and unbind them when I run a VM. I also attempt to
rebind the normal drivers after shutting down the VM. The idea is that
I can either run a Windows VM using the NVIDIA GPU, or I can start a
second X server using the NVIDIA GPU and a separate xorg.nv.conf, and
I can switch between these two modes without rebooting the host
(restarting (the second) X is still required). Most of the time, this
actually works correctly. Occasionally, however, the kernel will
encounter a general protection fault, but that issue is unrelated to
the hack I am describing.

A dump of various pieces of information follows (this probably isn't
directly useful and is for reference only):

$ lspci -nn
<snip>
00:1b.0 Audio device [0403]: Intel Corporation C610/X99 series chipset HD Audio Controller [8086:8d20] (rev 05)
<snip>
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
04:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f0] (rev a1)
<snip>
08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611]
08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
<snip>

$ uname -a
Linux <hostname> 4.6.0-1-amd64 #1 SMP Debian 4.6.2-2 (2016-06-25) x86_64 GNU/Linux

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.6.0-1-amd64 root=UUID=<snip> ro rootflags=subvol=@ cgroup_enable=memory intremap=no_x2apic_optout intel_iommu=on video=simplefb:off quiet

# cat /proc/iomem    # before hack
<snip>
60000000-6fffffff : PCI MMCONFIG 0000 [bus 00-ff]
  60000000-6fffffff : reserved
70000000-fbffbfff : PCI Bus 0000:00
  c0000000-d1ffffff : PCI Bus 0000:04
    c0000000-cfffffff : 0000:04:00.0
      c0000000-c086ffff : BOOTFB
    d0000000-d1ffffff : 0000:04:00.0
<snip>

# cat /proc/iomem    # after hack
<snip>
60000000-6fffffff : PCI MMCONFIG 0000 [bus 00-ff]
  60000000-6fffffff : reserved
70000000-fbffbfff : PCI Bus 0000:00
  c0000000-d1ffffff : PCI Bus 0000:04
    c0000000-cfffffff : 0000:04:00.0
    d0000000-d1ffffff : 0000:04:00.0
<snip>

---------- full commands to prep for running VM ----------
sudo insmod forcefully-remove-bootfb.ko bootfb_start=0xc0000000 bootfb_end=0xc086ffff
sudo rmmod forcefully_remove_bootfb
echo "8086 8d20" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id # Intel HD Audio, unrelated to this hack
echo "10de 1b80" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
echo "10de 10f0" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
echo "0000:00:1b.0" | sudo tee /sys/bus/pci/devices/0000\:00\:1b.0/driver/unbind # Intel HD Audio, unrelated to this hack
echo "0000:04:00.0" | sudo tee /sys/bus/pci/devices/0000\:04\:00.0/driver/unbind
echo "0000:04:00.1" | sudo tee /sys/bus/pci/devices/0000\:04\:00.1/driver/unbind
echo "0000:00:1b.0" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:04:00.0" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:04:00.1" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
# Can run virt-manager and launch VM now
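
The unbind/bind pairs above follow the same pattern for every device,
so they could be wrapped in one function. This is a minimal sketch,
assuming bash and a root shell (it writes to sysfs directly instead of
going through sudo tee). rebind_pci_device is a hypothetical helper
name, and SYSFS_ROOT is a hypothetical override that exists only so the
function can be tried against a scratch directory. It does not write
new_id, which still has to be done once per device ID as shown above.

```shell
#!/bin/bash
# rebind_pci_device BDF DRIVER   (hypothetical helper)
# Unbinds the PCI device BDF (e.g. 0000:04:00.0) from its current
# driver, if it has one, then binds it to DRIVER (e.g. vfio-pci).
# SYSFS_ROOT is a hypothetical override for dry runs; default is /sys.
rebind_pci_device() {
    local bdf=$1 driver=$2
    local sysfs=${SYSFS_ROOT:-/sys}
    local dev="$sysfs/bus/pci/devices/$bdf"

    # The "driver" entry only exists while a driver owns the device.
    if [ -e "$dev/driver/unbind" ]; then
        printf '%s\n' "$bdf" > "$dev/driver/unbind" || return 1
    fi
    printf '%s\n' "$bdf" > "$sysfs/bus/pci/drivers/$driver/bind"
}
```

With this, the GPU steps above become "rebind_pci_device 0000:04:00.0
vfio-pci" and "rebind_pci_device 0000:04:00.1 vfio-pci".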

---------- full commands to switch back to Linux ----------
echo "0000:00:1b.0" | sudo tee /sys/bus/pci/devices/0000\:00\:1b.0/driver/unbind
echo "0000:04:00.0" | sudo tee /sys/bus/pci/devices/0000\:04\:00.0/driver/unbind
echo "0000:04:00.1" | sudo tee /sys/bus/pci/devices/0000\:04\:00.1/driver/unbind
echo "0000:00:1b.0" | sudo tee /sys/bus/pci/drivers/snd_hda_intel/bind
echo "0000:04:00.0" | sudo tee /sys/bus/pci/drivers/nvidia/bind
echo "0000:04:00.1" | sudo tee /sys/bus/pci/drivers/snd_hda_intel/bind
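
After switching in either direction it can be useful to confirm which
driver actually owns a device. This is a minimal sketch, assuming bash
and that sysfs exposes the bound driver as a "driver" symlink in the
device directory; pci_driver_of is a hypothetical helper name and
SYSFS_ROOT is a hypothetical override for testing against a scratch
directory.

```shell
#!/bin/bash
# pci_driver_of BDF   (hypothetical helper)
# Prints the name of the driver currently bound to the PCI device BDF.
# Returns non-zero when no driver is bound.
# SYSFS_ROOT is a hypothetical override for dry runs; default is /sys.
pci_driver_of() {
    local link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    [ -L "$link" ] || return 1
    # The symlink target ends in the driver's directory name.
    basename "$(readlink "$link")"
}
```

For example, "pci_driver_of 0000:04:00.0" can be run before launching
the VM and again after switching back to Linux to see whether the
rebind took effect.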

---------- forcefully-remove-bootfb.c ----------
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

#include <linux/ioport.h>   /* declares iomem_resource and __release_region() */

static resource_size_t bootfb_start = 0;
static resource_size_t bootfb_end = 0;

static int __init remover_module_init(void)
{
    printk(KERN_INFO "forcefully-remove-bootfb loaded\n");

    if (sizeof(resource_size_t) != 8) {
        // lol
        printk(KERN_ERR "Herp derp what is a programming?\n");
    } else {
        printk(KERN_INFO "forcefully-remove-bootfb 0x%llx-0x%llx\n",
            bootfb_start, bootfb_end);
        if (bootfb_start == 0 && bootfb_end == 0) {
            printk(KERN_ERR "forcefully-remove-bootfb needs addresses!\n");
        } else {
            // Do the actual removal here
            __release_region(&iomem_resource,
                bootfb_start, bootfb_end - bootfb_start + 1);
        }
    }
    return 0;
}

static void __exit remover_module_exit(void)
{
    printk(KERN_INFO "forcefully-remove-bootfb unloaded\n");
}

module_init(remover_module_init);
module_exit(remover_module_exit);

module_param(bootfb_start, ullong, 0000);
module_param(bootfb_end, ullong, 0000);

MODULE_LICENSE("Dual BSD/GPL");
MODULE_AUTHOR("Robert Ou <address@hidden>");
MODULE_DESCRIPTION("Forcefully removes BOOTFB I/O resource");
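
---------- note on the region length ----------
The module takes an inclusive end address but __release_region takes a
length, hence the "bootfb_end - bootfb_start + 1" in the call. The
arithmetic can be checked from a shell; this is a minimal sketch,
assuming bash, with region_size as a hypothetical helper name.

```shell
#!/bin/bash
# region_size START END   (hypothetical helper)
# Prints, in hex, the length in bytes of the inclusive range
# START..END. Arguments may be decimal or 0x-prefixed hex.
region_size() {
    printf '0x%x\n' $(( $2 - $1 + 1 ))
}
```

For the BOOTFB range in this report, "region_size 0xc0000000
0xc086ffff" gives 0x870000, i.e. 8847360 bytes.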


