Re: aarch64 efi boot failures with qemu 6.0+

qemu-devel

From:	Guenter Roeck
Subject:	Re: aarch64 efi boot failures with qemu 6.0+
Date:	Tue, 27 Jul 2021 04:18:30 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 7/27/21 2:30 AM, Michael S. Tsirkin wrote:

On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:

On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <linux@roeck-us.net> wrote:


On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:

On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:

(cc Bjorn)

On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:


On 7/26/21 12:56 AM, Guenter Roeck wrote:

On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:

On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:

Hi all,

starting with qemu v6.0, some of my aarch64 efi boot tests no longer
work. Analysis shows that PCI devices with IO ports do not instantiate
in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
(at least) ne2k_pci, tulip, dc390, and am53c974. The problem only
affects aarch64, not x86/x86_64.

I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
keep firmware resource map"). Since this commit, PCI device BAR
allocation has changed. Taking tulip as example, the kernel reports
the following PCI bar assignments when running qemu v5.2.

[    3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
[    3.922207] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
[    3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]


IIUC, these lines are read back from the BARs

[    3.927111] pci 0000:00:01.0: BAR 0: assigned [io  0x1000-0x107f]
[    3.927455] pci 0000:00:01.0: BAR 1: assigned [mem 0x10000000-0x1000007f]


... and this is the assignment created by the kernel.

With qemu v6.0, the assignment is reported as follows.

[    3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
[    3.923278] pci 0000:00:01.0: reg 0x10: [io  0x0000-0x007f]
[    3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]


The problem here is that Linux, for legacy reasons, does not support
I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
rejected.

This might make sense on x86, where legacy I/O ports may exist, but on
other architectures, this makes no sense.
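
To make the rejection described above concrete, here is a minimal sketch of the kind of check involved. It is not the kernel's code: the names SKETCH_MIN_IO and sketch_io_bar_acceptable are hypothetical, and the boundary is an assumption taken from this thread (the v5.2 log shows the kernel itself assigning io 0x1000-0x107f, so 0x1000 is treated as acceptable and anything below it as rejected).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold mirroring the 0x1000 boundary discussed in the thread. */
#define SKETCH_MIN_IO 0x1000u

/*
 * Hypothetical helper: returns true if a firmware-assigned I/O BAR window
 * starting at 'start' with 'size' bytes would be kept under this sketch's
 * policy, false if it would be rejected for sitting in the low port range.
 */
static bool sketch_io_bar_acceptable(uint32_t start, uint32_t size)
{
    if (size == 0) {
        return false;            /* an empty window is never a usable BAR */
    }
    return start >= SKETCH_MIN_IO;
}
```

Under these assumptions, the EFI-created window [io 0x0000-0x007f] fails the check, while the kernel-created [io 0x1000-0x107f] passes.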



Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
that trip up existing guests, right?


I think it is difficult to draw a line. Sure, maybe EFI should not create
such mappings, but then maybe qemu should not suddenly start to enforce
those mappings for existing guests either.


EFI creates the mappings primarily for itself, and up until DSM #5
started to be enforced, all PCI resource allocations that existed at
boot were ignored by Linux and recreated from scratch.

Also, the commit in question looks dubious to me. I don't think it is
likely that Linux would fail to create a resource tree. What does
happen is that BARs get moved around, which may cause trouble in some
cases: for instance, we had to add special code to the EFI framebuffer
driver to cope with framebuffer BARs being relocated.

For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
qemu. That solves my immediate problem, giving us time to find a solution
that is acceptable for everyone. After all, it doesn't look like anyone
else has noticed the problem, so there is no real urgency.


I would argue that it is better to revert that commit. DSM #5 has a
long history of debate and misinterpretation, and while I think we
ended up with something sane, I don't think we should be using it in
this particular case.


I think revert might make sense, however:

0: No (The operating system shall not ignore the PCI configuration that
firmware has done at boot time. However, the operating system is free to
configure the devices in this hierarchy that have not been configured by
the firmware. There may be a reduced level of hot plug capability support
in this hierarchy due to resource constraints. This situation is the same
as the legacy situation where this _DSM is not provided.)

^^^^ doesn't this imply that reporting a 0, as we currently do,
      should be mostly a NOP?


1: Yes (The operating system may ignore the PCI configuration that the
firmware has done at boot time, and reconfigure/rebalance the resources in
the hierarchy.)
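
As a rough illustration of the two return values quoted above, the following sketch models how a guest might turn the DSM #5 result into a preserve-or-reassign decision. The enum and function names are hypothetical, not taken from Linux or QEMU. The "absent" case is an assumption based on the description earlier in this thread that Linux used to ignore boot-time allocations and recreate them from scratch before DSM #5 was honoured.

```c
#include <stdbool.h>

/* Hypothetical encoding of what the guest learned from _DSM function #5. */
enum sketch_dsm5 {
    SKETCH_DSM5_ABSENT       = -1, /* firmware does not provide the _DSM */
    SKETCH_DSM5_PRESERVE     = 0,  /* "No": OS shall not ignore firmware config */
    SKETCH_DSM5_MAY_REASSIGN = 1   /* "Yes": OS may reconfigure/rebalance */
};

/*
 * Hypothetical policy: returns true if the guest keeps the BAR assignments
 * made by firmware, false if it is free to reassign them.
 */
static bool sketch_must_preserve(enum sketch_dsm5 value)
{
    switch (value) {
    case SKETCH_DSM5_PRESERVE:
        return true;
    case SKETCH_DSM5_MAY_REASSIGN:
    case SKETCH_DSM5_ABSENT:   /* assumed: pre-DSM behaviour, reassign */
    default:
        return false;
    }
}
```

In this model, returning 1 and not providing the _DSM at all lead to the same outcome, which is why either a revert or a change to return 1 would restore the earlier BAR assignment behaviour.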


So I am debating with myself whether this should be a plain revert or
return 1 here:
      /*
       * 0 - The operating system must not ignore the PCI configuration that
       *     firmware has done at boot time.
       */
-    aml_append(ifctx1, aml_return(aml_int(0)));
+    aml_append(ifctx1, aml_return(aml_int(1)));
      aml_append(ifctx, ifctx1);
      aml_append(method, ifctx);



Guenter, what happens if we return 1? Do things work well?

Yes.

Guenter
