qemu-devel


From: Laszlo Ersek
Subject: [Qemu-devel] how Windows treats BARs of driver-less devices when other devices are hotplugged
Date: Thu, 25 Feb 2016 13:44:54 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

Hi,

On 02/25/16 12:57, Michael S. Tsirkin wrote:
> ----- Forwarded message from Igor Mammedov <address@hidden> -----
> 
> Date: Thu, 11 Feb 2016 16:16:05 +0100
> From: Igor Mammedov <address@hidden>
> To: "Michael S. Tsirkin" <address@hidden>
> To: address@hidden
> Subject: on pci rebalancing
> Message-ID: <address@hidden>
> In-Reply-To: <address@hidden>
> 
>>>>> For PCI rebalance to work on Windows, one has to provide working PCI 
>>>>> driver
>>>>> otherwise OS will ignore it when rebalancing happens and
>>>>> might map something else over ignored BAR.    
>>>>
>>>> Does it disable the BAR then? Or just move it elsewhere?  
>>> it doesn't, it just blindly ignores BARs existence and maps BAR of
>>> another device with driver over it.  
>>
>> Interesting. On classical PCI this is a forbidden configuration.
>> Maybe we do something that confuses windows?
>> Could you tell me how to reproduce this behaviour?
> #cat > t << EOF
> pci_update_mappings_del
> pci_update_mappings_add
> EOF
> 
> #./x86_64-softmmu/qemu-system-x86_64 -snapshot -enable-kvm -snapshot \
>  -monitor unix:/tmp/m,server,nowait -device pci-bridge,chassis_nr=1 \
>  -boot menu=on -m 4G -trace events=t ws2012r2x64dc.img \
>  -device ivshmem,id=foo,size=2M,shm,bus=pci.1,addr=01
> 
> wait till OS boots, note BARs programmed for ivshmem
>  in my case it was
>    01:01.0 0,0xfe800000+0x100
> then execute script and watch pci_update_mappings* trace events
> 
> # for i in $(seq 3 18); do printf -- "device_add e1000,bus=pci.1,addr=%x\n" $i | nc -U /tmp/m; sleep 5; done;
> 
> hotplugging e1000,bus=pci.1,addr=12 triggers rebalancing where
> Windows unmaps all BARs of nics on bridge but doesn't touch ivshmem
> and then programs new BARs, where:
>   pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000
> creates overlapping BAR with ivshmem 
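
For reference, the overlap Igor reports can be checked with plain shell arithmetic. This is only a sketch: the helper name "overlaps" is made up for illustration, and the addresses and sizes are the ones from the trace output quoted above.

```shell
# overlaps BASE_A SIZE_A BASE_B SIZE_B
# Succeeds if the half-open ranges [BASE_A, BASE_A+SIZE_A) and
# [BASE_B, BASE_B+SIZE_B) intersect. (Hypothetical helper name.)
overlaps() {
  [ $(( $1 < $3 + $4 && $3 < $1 + $2 )) -eq 1 ]
}

# ivshmem BAR0 (01:01.0) versus e1000 BAR0 (01:11.0), values from the trace
if overlaps 0xfe800000 0x100 0xfe800000 0x20000; then
  echo overlap
else
  echo disjoint
fi
```

With the values above this prints "overlap": the 0x100-byte ivshmem BAR lies entirely inside the 0x20000-byte e1000 BAR.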

Michael informed me of this on IRC (and forwarded this email to me). I'm 
starting a new thread with my response. (I also rewrote the subject line.)

So, to summarize what I said on IRC first. The situation where firmware 
recognizes and enables a PCI device, hands control to the OS, and then the OS 
lacks a driver for the PCI device, is completely normal and expected. For UEFI 
specifically, I can name a general argument and a specific argument.

The general argument is that actions that need to be taken in 
ExitBootServices() callbacks do not include clearing IO or MMIO decode bits in 
PCI device command registers. Command register manipulation happens when a PCI 
device driver (that conforms to the UEFI driver model) *binds* or *unbinds* a 
device. And unbinding a device is not possible in the ExitBootServices() 
callback, minimally because such callbacks are forbidden from modifying the 
memory map -- but unbinding would release allocated memory.

So what we use such callbacks for is aborting in-flight, outstanding DMA-like 
transfers. Re-setting virtio devices is also an example (think outstanding 
receive requests for virtio-net).

Now let's move on to the specific argument I mentioned above. The Graphics 
Output Protocol (GOP) is a UEFI abstraction that was designed specifically for 
the case where the operating system does not (yet) have a display driver 
installed, but the user obviously has to use the display somehow. The GOP is 
most frequently provided on top of an EFI_PCI_IO_PROTOCOL instance; that is, 
the "GOP driver" is a UEFI driver that drives a PCI device.

Now, the GOP is supposed to communicate the pixel format and the frame buffer 
base address for the currently active graphics mode to the software that 
consumes the GOP. This includes UEFI applications of course (think a boot 
loader putting up a splash screen or an animation), but importantly, the 
runtime OS is *also* supposed to inherit these characteristics from boot 
services time. The OS can then use simple unaccelerated MMIO writes to display 
things on the screen, until the user installs an accelerated driver.

(Concrete example: this is why you can see *anything at all* on the screen, 
when you run e.g. Windows Server 2012 R2 on top of OVMF and a QXL display, 
before installing the QXL WDDM driver in the guest.)

Clearly, the frame buffer base address communicated through the GOP points into 
one of the MMIO BARs of the PCI device. If, at ExitBootServices(), MMIO 
decoding were disabled for the PCI device that underlies the GOP, that would 
*completely* defeat the GOP design. The OS's attempt to poke at those MMIO 
addresses would be futile -- and in fact the OS has no idea what PCI device (if 
any) the framebuffer is supposed to be related to. This is the jurisdiction of 
the OS-level display driver -- if one exists and is installed.

So, this is a Windows bug in my opinion. Even without an OS-level driver, a 
PCI device is fully expected to keep decoding its resources if the firmware 
brought it up.
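
For reference, the decode bits in question are bit 0 (I/O space enable) and bit 1 (memory space enable) of the PCI command register, at config space offset 0x04. Below is a minimal sketch that decodes a command register value; the function name "decode_cmd" and the sample value 0x0103 are made up for illustration. How you obtain the value is up to you (in a Linux guest, "setpci -s 01:01.0 COMMAND" should print it in hex, without the 0x prefix).

```shell
# decode_cmd VALUE
# Print whether the I/O and memory decode bits are set in a PCI command
# register value. VALUE needs a 0x prefix if it is hexadecimal.
# (Hypothetical helper, not part of any tool.)
decode_cmd() {
  cmd=$(( $1 ))
  if [ $(( cmd & 0x1 )) -ne 0 ]; then io=on; else io=off; fi    # bit 0: I/O space
  if [ $(( cmd & 0x2 )) -ne 0 ]; then mem=on; else mem=off; fi  # bit 1: memory space
  echo "io=$io mem=$mem"
}

decode_cmd 0x0103
```

For the sample value this prints "io=on mem=on", which is what I would expect to see for a device that the firmware has enabled, driver or no driver in the OS.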

--*--

Okay, so Michael asked me to try to reproduce the above with OVMF, and see what 
happens. Unfortunately I'm not really knowledgeable about ivshmem, hotplug, et 
cetera. Let me instead tell Igor about using OVMF.

(1) Please follow the instructions on Gerd's page 
<https://www.kraxel.org/repos/>, and install the "edk2.git-ovmf-x64" package.

(2) Create a separate directory for testing. In this directory, run the 
following command:

  cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd myvars.fd

Also create a disk image for your new guest, etc.

(3) Use the following command line snippet to work with OVMF:

     qemu-system-x86_64 \
       -machine accel=kvm \
       -smp cpus=2 \
       -m 2048 \
       \
       -debugcon file:ovmf.debug.log \
       -global isa-debugcon.iobase=0x402 \
       \
       -device qxl-vga \
       \
       -drive if=pflash,format=raw,unit=0,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
       -drive if=pflash,format=raw,unit=1,file=myvars.fd \
       \
       [your options here]

You can of course customize the # of VCPUs, memory size, disks, CD-ROMs, 
network, and so on.

Recommended: when you use the -device option to add the disk and the CD-ROM(s) 
to install the OS (and driver(s)) from, be sure to use the "bootindex" 
property. OVMF will adhere to the boot order. It is recommended to set 
bootindex=0 for your main disk, bootindex=1 for your OS installer CD-ROM, and 
*no* bootindex for your virtio-win driver disk. This way at first boot (with no 
OS installed) OVMF will boot the installer CD-ROM. Further boots (with the same 
command line) will boot the installed OS.
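
The bootindex recommendation could look roughly like the sketch below. The file names, the drive IDs and the choice of virtio-blk-pci / ide-cd are assumptions for illustration only; substitute whatever devices and images you actually use. The helper "boot_opts" is likewise made up; it just prints the options one per line.

```shell
# Hypothetical image names; adjust to your setup.
DISK=guest.img
INSTALLER=installer.iso
DRIVERS=virtio-win.iso

# Print the disk/CD-ROM options, one per line.
boot_opts() {
  printf '%s\n' \
    -drive  "if=none,id=disk0,format=raw,file=$DISK" \
    -device "virtio-blk-pci,drive=disk0,bootindex=0" \
    -drive  "if=none,id=cd0,media=cdrom,readonly=on,file=$INSTALLER" \
    -device "ide-cd,drive=cd0,bootindex=1" \
    -drive  "if=none,id=cd1,media=cdrom,readonly=on,file=$DRIVERS" \
    -device "ide-cd,drive=cd1"
}

boot_opts
```

The main disk gets bootindex=0, the installer CD-ROM gets bootindex=1, and the driver CD-ROM gets none. If the file names contain no whitespace, the output can be spliced into the command line from step (3) as $(boot_opts) in place of "[your options here]".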

Caveat: I never used the -snapshot option with OVMF virtual machines; it might 
or might not work.

Caveat #2: I tested simple PCI hotplug and hot-unplug with Windows running 
on OVMF many months ago, but I can't tell offhand whether it works right now.

Thanks
Laszlo


