Re: Problems with irq mapping in qemu v5.2


From: BALATON Zoltan
Subject: Re: Problems with irq mapping in qemu v5.2
Date: Thu, 24 Dec 2020 00:56:15 +0100 (CET)

On Wed, 23 Dec 2020, Guenter Roeck wrote:
On 12/23/20 12:20 PM, BALATON Zoltan wrote:
On Wed, 23 Dec 2020, Guenter Roeck wrote:
On 12/23/20 8:09 AM, Mark Cave-Ayland wrote:
On 23/12/2020 15:21, Philippe Mathieu-Daudé wrote:
FWIW bisecting Fuloong2E starts failing here:

4ea98d317eb442c738f898f16cfdd47a18b7ca49 is the first bad commit
commit 4ea98d317eb442c738f898f16cfdd47a18b7ca49
Author: BALATON Zoltan <balaton@eik.bme.hu>
Date:   Fri Jan 25 14:52:12 2019 -0500

     ide/via: Implement and use native PCI IDE mode

     This device only implemented ISA compatibility mode and native PCI IDE
     mode was missing but no clients actually need ISA mode but to the
     contrary, they usually want to switch to and use device in native
     PCI IDE mode. Therefore implement native PCI mode and switch default
     to that.

     Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
     Message-id: c323f08c59b9931310c5d92503d370f77ce3a557.1548160772.git.balaton@eik.bme.hu
     Signed-off-by: John Snow <jsnow@redhat.com>

  hw/ide/via.c | 52 ++++++++++++++++++++++++++++++++++++++--------------
  1 file changed, 38 insertions(+), 14 deletions(-)

I think the original version of the patch broke fuloong2e, however that should 
have been fixed by my patchset here: 
https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg03936.html. It might 
be that there are multiple regressions located during a full bisect :/


Not really. The following patch on top of qemu 5.2 results in the ide drive
being detected and working.

diff --git a/hw/ide/via.c b/hw/ide/via.c
index be09912b33..1bfdc422ee 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -186,11 +186,14 @@ static void via_ide_realize(PCIDevice *dev, Error **errp)
    pci_register_bar(dev, 3, PCI_BASE_ADDRESS_SPACE_IO, &d->cmd_bar[1]);

    bmdma_setup_bar(d);
+#if 0
    pci_register_bar(dev, 4, PCI_BASE_ADDRESS_SPACE_IO, &d->bmdma_bar);
+#endif

    qdev_init_gpio_in(ds, via_ide_set_irq, 2);
    for (i = 0; i < 2; i++) {
        ide_bus_new(&d->bus[i], sizeof(d->bus[i]), ds, i, 2);
+        ide_init_ioport(&d->bus[i], NULL, i ? 0x170 : 0x1f0, i ? 0x376 : 0x3f6);
        ide_init2(&d->bus[i], qdev_get_gpio_in(ds, i));

        bmdma_init(&d->bus[i], &d->bmdma[i], d);

With the added ide_init_ioport(), the drive is detected. With the #if 0,

This breaks MorphOS on pegasos2 so it's not acceptable for me as a fix. 
(Actually this just reverts my commit in a cryptic way.)

it actually starts working. So there are two problems: 1) The qemu ide
subsystem isn't informed about the io addresses, and 2) bmdma isn't working.

The problem rather seems to be that whatever you're trying to run can only
handle legacy mode and does not correctly detect or work with the native mode
of this IDE controller. The real chip can switch between these modes and starts
in legacy mode, but most OSes with a better driver will switch to native mode
during boot (in some cases the firmware will have switched already). We can't
easily emulate that in QEMU because of how the IDE emulation is implemented: we
either set up legacy ioports or use PCI BMDMA, and I don't see a way to
deregister the legacy ports and IRQs once the config reg is flipped to native
mode. Therefore I've chosen to emulate only native mode, which is what most
guests want to use and the only mode some of them work with, and I've tested
with the previously mentioned Linux version that it still detected and worked
with the IDE ports. During testing I've found that Linux will use either native
or legacy mode if the appropriate config bits are set, but for some boards
there may be workarounds for specific quirks. One such case is pegasos2, where
IRQs are hardwired to the legacy interrupts even in native mode and we need to
follow what the hardware does, otherwise one guest or the other breaks. Maybe
there's a similar quirk for the fuloong2e?
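
A minimal sketch of the legacy/native distinction described above, assuming the
standard PCI IDE programming-interface byte layout (bit 0 = primary channel in
native mode, bit 2 = secondary channel in native mode). The names chan_setup and
ide_channel_setup are made up for illustration and are not QEMU or Linux
functions. The legacy ports and IRQs are the fixed ISA-compatible ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCI IDE programming-interface bits (config space offset 0x09). */
#define PROGIF_PRI_NATIVE 0x01  /* primary channel operates in native PCI mode */
#define PROGIF_SEC_NATIVE 0x04  /* secondary channel operates in native PCI mode */

/* Hypothetical result type: where a driver would look for one channel. */
struct chan_setup {
    bool native;   /* true: ports come from the BARs, IRQ from the PCI function */
    uint16_t cmd;  /* legacy command block base, 0 when native */
    uint16_t ctl;  /* legacy control port, 0 when native */
    int irq;       /* legacy ISA IRQ, -1 when native */
};

/* Hypothetical helper: decode one channel (0 = primary, 1 = secondary). */
static struct chan_setup ide_channel_setup(uint8_t prog_if, int chan)
{
    struct chan_setup s = { .native = false, .cmd = 0, .ctl = 0, .irq = -1 };

    assert(chan == 0 || chan == 1);
    if (prog_if & (chan ? PROGIF_SEC_NATIVE : PROGIF_PRI_NATIVE)) {
        s.native = true;  /* caller must read BAR0-3 (and BAR4 for BMDMA) */
    } else {
        s.cmd = chan ? 0x170 : 0x1f0;  /* fixed compatibility ports */
        s.ctl = chan ? 0x376 : 0x3f6;
        s.irq = chan ? 15 : 14;        /* fixed compatibility IRQs */
    }
    return s;
}
```

With a prog-if of 0x8a (both native bits clear) both channels decode to the
legacy ports and IRQs 14/15. With 0x8f both channels are native and the ports
have to be taken from the BARs instead.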

What guest OS are you running and did you confirm that it runs on the real 
machine? If you run recent Linux kernels and don't know if those still work 
with real hardware could this be a bug in the guest driver and not in QEMU? We 
know that we don't fully emulate this controller but there should be a way to 
set things up in a way that satisfies all guests and I've tried to do that when 
touching this part but possibly I did not have the right Linux version for the 
real machine as it was hard to find one distro that worked with it. Maybe 
Jiaxun has a known working Linux distro or kernel that we can use to check 
emulation with or knows more about how the VIA IDE port IRQs are wired on this 
board. (I've added Jiaxun again but the list seems to strip his address.)


I don't have a real machine, and therefore did not test it on one.

I tried with Linux mainline (v5.10-12913-g614cb5894306), v3.16.85, v4.4.248,
and v4.14.212. I can't test older versions because my cross compiler is too
new. Each of those kernel versions shows exactly the same behavior.

I think the original author of this device was Huacai, so I'm adding him to the thread too in case he remembers anything relevant. The code there now is not what he wrote, though: his version only emulated legacy mode, and after my changes (reworked by Mark) we only emulate native mode, with an optional quirk using legacy IRQs for pegasos2. But maybe he has some images that were known to work on the real machine that we could test now to see where the problem is.

On Tue, 22 Dec 2020, Guenter Roeck wrote:
qemu-system-mips64el -M fulong2e \
    -kernel vmlinux -no-reboot -m 256 -snapshot \
    -drive file=rootfs.mipsel.ext3,format=raw,if=ide \
    -vga none -nographic \
    --append "root=/dev/sda console=ttyS0" \
    -serial stdio -monitor none

This works just fine with qemu v3.1. With qemu v5.2 (after applying the
fuloong patch series), I get:

VFS: Cannot open root device "sda" or unknown-block(0,0): error -6

This used to work up to qemu v3.1. Since qemu v4.0, there has been a variety
of failures. Common denominator is that the ide drive is no longer recognized,
presumably due to related changes in the via and/or pci code between v3.1
and v4.0.

Difference in log messages:

v3.1:

pci 0000:00:05.1: [Firmware Bug]: reg 0x10: invalid BAR (can't size)
pci 0000:00:05.1: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
pci 0000:00:05.1: [Firmware Bug]: reg 0x18: invalid BAR (can't size)
pci 0000:00:05.1: reg 0x1c: [mem 0x100000370-0x10000037f 64bit]
...
pata_via 0000:00:05.1: BMDMA: BAR4 is zero, falling back to PIO
ata1: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
ata2: PATA max PIO4 cmd 0x170 ctl 0x376 irq 15
ata1.00: ATA-7: QEMU HARDDISK, 2.5+, max UDMA/100
...

This is the previous state, which only emulated legacy mode. Since none of the native mode BARs are there, Linux fails to enable native mode and falls back to legacy, so it ends up working, but probably not in the way it should work on a real machine.

----

v5.2:

pci 0000:00:05.1: reg 0x10: [io  0x0000-0x0007]
pci 0000:00:05.1: reg 0x14: [io  0x0000-0x0003]
pci 0000:00:05.1: reg 0x18: [io  0x0000-0x0007]
pci 0000:00:05.1: reg 0x1c: [io  0x0000-0x0003]
pci 0000:00:05.1: reg 0x20: [io  0x0000-0x000f]
pci 0000:00:05.1: BAR 4: assigned [io  0x4440-0x444f]
...
ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x4440 irq 14
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x4448 irq 15
[and nothing else]

Now we emulate native mode and Linux seems to program the BARs (although I'm not sure all of these should start at 0), but it then still tries to access the device in legacy mode, as shown by the ports and IRQs.

If someone has logs from the original machine it would be interesting to see how the IDE ports are detected there. I'll try with the kernel from Debian and see what that does, but maybe it tries to use legacy mode too, in which case it won't work.

With the original image I used for testing described here:
https://lists.nongnu.org/archive/html/qemu-devel/2020-03/msg04086.html
I now get:

$ qemu-system-mips64el -M fuloong2e -serial stdio -net none -vga none -kernel gentoo-loongson-2.6.22.6-20070902 -cdrom debian-8.11.0-mipsel-netinst.iso

Linux version 2.6.22.6-mipsgit-20070902-lm2e-liveusb (stuartl@zhenghe) (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.1)) #5 Fri Jan 25 11:19:12 EST 2008

[...]

via686b fix: ISA bridge
via686b fix: ISA bridge done
via686b fix: IDE
via686b fix: IDE done
PCI quirk: region eee0-eeef claimed by vt82c686 SMB
ac97 interrupt = 9
ac97 rev=80
Setting sub-vendor ID & device ID
sub vendor-device id=11001af4
pci_update_mappings: adding bar 4 to io @ 0x4040

[note the IDE fixup, which is a board-specific quirk coming from
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/mips/pci/fixup-fuloong2e.c]

serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
io_map_base of root PCI bus 0000:00 unset.  Trying to continue but you better
fix this issue or report it to linux-mips@linux-mips.org or your vendor.
scsi0 : pata_via
scsi1 : pata_via
ata1: PATA max UDMA/100 cmd 0xffffffffbfd001f0 ctl 0xffffffffbfd003f6 bmdma 0xffffffffbfd04040 irq 14
ata2: PATA max UDMA/100 cmd 0xffffffffbfd00170 ctl 0xffffffffbfd00376 bmdma 0xffffffffbfd04048 irq 15
[...]
NET: Registered protocol family 17
Freeing unused kernel memory: 8832k freed


Loading drivers...
usbcore: registered new interface driver usbfs
[...]
USB Universal Host Controller Interface driver v3.0
PCI: Enabling device 0000:00:05.2 (0000 -> 0001)
pci_update_mappings: adding bar 4 to io @ 0x4000
uhci_hcd 0000:00:05.2: UHCI Host Controller
uhci_hcd 0000:00:05.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:05.2: irq 10, io base 0x00004000
qemu-system-mips64el: ../hw/pci/pci.c:255: pci_bus_change_irq_level: Assertion `irq_num < bus->nirq' failed.
Aborted (core dumped)

I think this is the problem you originally reported, exposed by 459ca8bfa41; reverting that commit gives me:

USB Universal Host Controller Interface driver v3.0
PCI: Enabling device 0000:00:05.2 (0000 -> 0001)
uhci_hcd 0000:00:05.2: UHCI Host Controller
uhci_hcd 0000:00:05.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:05.2: irq 10, io base 0x00004000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
PCI: Enabling device 0000:00:05.3 (0000 -> 0001)
uhci_hcd 0000:00:05.3: UHCI Host Controller
uhci_hcd 0000:00:05.3: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:05.3: irq 11, io base 0x00004020
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
loop: module loaded
Registering unionfs 2.2.2 (for 2.6.22.15)
squashfs: version 3.2-r2 (2007/01/15) Phillip Lougher
Waiting 10 seconds for devices to settle.

Gentoo/MIPS 2007.1 Netboot Image
--------------------------------

Kernel 2.6.22.6-mipsgit-20070902-lm2e-liveusb compiled #5 Fri Jan 25 11:19:12 EST 2008
Running on platform lemote-fulong

but so does removing USB such as:

diff --git a/hw/mips/fuloong2e.c b/hw/mips/fuloong2e.c
index 45c596f4fe..26c3438729 100644
--- a/hw/mips/fuloong2e.c
+++ b/hw/mips/fuloong2e.c
@@ -255,10 +255,10 @@ static void vt82c686b_southbridge_init(PCIBus *pci_bus, int slot, qemu_irq intc,

     dev = pci_create_simple(pci_bus, PCI_DEVFN(slot, 1), "via-ide");
     pci_ide_create_devs(dev);
-
+#if 0
     pci_create_simple(pci_bus, PCI_DEVFN(slot, 2), "vt82c686b-usb-uhci");
     pci_create_simple(pci_bus, PCI_DEVFN(slot, 3), "vt82c686b-usb-uhci");
-
+#endif
     *i2c_bus = vt82c686b_pm_init(pci_bus, PCI_DEVFN(slot, 4), 0xeee1, NULL);

     /* Audio support */

which also avoids the problem without reverting 459ca8bfa41, so I think there are two problems:

1. IRQ mapping is somehow wrong on fuloong2e. This is evidenced by the USB part above and also by kernel panics if I remove -net none or -vga none from the command line. I think this is not related to IDE emulation, maybe only exposed by it first.

2. The Linux driver you use wants to use the legacy mode of the IDE controller, which we don't emulate. linux/arch/mips/pci/fixup-fuloong2e.c does mention legacy mode, but I think I found previously that if we hard code native mode, Linux detects it and uses it anyway. I think this worked with my original series but may have been broken during the rework. I'd have to dig up those patches and see what the difference is. Probably the hardcoding of legacy IRQs instead of allowing true native mode, and maybe the handling of the native mode bit, differ from my original patches. I don't really want to go over this again, though: we had a long discussion previously and I described everything I'd found in detail back then, but I've forgotten it by now and would have to rediscover it.
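
To illustrate problem 1: the assertion in the log above fires when the number
produced by a bus's IRQ mapping does not fit the number of IRQ lines the bus was
registered with. The following is only a toy model of that check, not QEMU code.
toy_bus, good_map, bad_map and TOY_NIRQ are hypothetical, and the sketch returns
false where pci_bus_change_irq_level() would assert. I have not verified which
mapping actually produces the out-of-range value on fuloong2e.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NIRQ 4  /* hypothetical number of IRQ lines on the toy bus */

struct toy_bus {
    int nirq;                              /* lines the bus was registered with */
    int irq_count[TOY_NIRQ];               /* per-line assertion counters */
    int (*map_irq)(int devfn, int pin);    /* device pin -> bus line */
};

/* In-range mapping: a simple slot/pin swizzle, always below TOY_NIRQ. */
static int good_map(int devfn, int pin)
{
    return ((devfn >> 3) + pin) % TOY_NIRQ;
}

/* Out-of-range mapping: adds a base offset the bus has no lines for. */
static int bad_map(int devfn, int pin)
{
    (void)devfn;
    return pin + TOY_NIRQ;
}

/* Returns false where the real code would hit `irq_num < bus->nirq`. */
static bool toy_change_irq_level(struct toy_bus *bus, int devfn, int pin,
                                 int change)
{
    int irq_num = bus->map_irq(devfn, pin);

    if (irq_num < 0 || irq_num >= bus->nirq) {
        return false;
    }
    bus->irq_count[irq_num] += change;
    return true;
}
```

In this model a device in slot 5 raising pin 0 lands on line 1 with good_map,
while bad_map yields line 4 on a 4-line bus and is rejected. That would fit
with both reverting the assert and removing the USB functions hiding the
symptom, if the underlying mapping is what is wrong.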

Neither of the above problems should be fixed by reverting my via-ide changes, which are needed for pegasos2 emulation. If Linux can't be changed to work with the native mode of the controller then maybe emulating both legacy and native mode could help, but that does not seem to be simple without changing the low level IDE emulation, which has a chance to break something else, so I did not try to do that. That's why I chose to emulate native mode only, which I did test to work with both fuloong2e and pegasos2. I did spend quite some time on this back then. I think the problem with the current version is that it now only emulates half-native mode, which is OK for pegasos2 but confuses Linux on fuloong2e. On fuloong2e, full native mode, where the IRQ is settable by register as described in the datasheet, would probably work if we fix the native mode bit so it can't be set to legacy mode, which we don't emulate. I think I did this in the original series and it worked, but it needed a bit to select between this mode and the half-native mode for pegasos2. Mark disliked that, so he twisted the patches until he could get rid of that bit but still allow the test cases to pass.
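
A rough sketch of the difference between the two modes mentioned above, under
the assumption that full native mode takes the IRQ from the standard PCI
interrupt line register (config offset 0x3c) while half-native mode keeps the
legacy IRQs 14 and 15. toy_via_ide_irq and its legacy_irq_quirk flag are
hypothetical and only model the selection; this is not the code in hw/ide/via.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_INTERRUPT_LINE 0x3c  /* standard PCI config space offset */

/*
 * Hypothetical helper: pick the IRQ for one IDE channel (0 or 1).
 * legacy_irq_quirk models the half-native case (pegasos2 style), where the
 * IRQs stay hardwired to 14/15 even though the ports come from the BARs.
 */
static int toy_via_ide_irq(const uint8_t *pci_config, int channel,
                           bool legacy_irq_quirk)
{
    assert(channel == 0 || channel == 1);
    if (legacy_irq_quirk) {
        return 14 + channel;
    }
    /* Full native mode: both channels use the IRQ programmed by the guest. */
    return pci_config[PCI_INTERRUPT_LINE];
}
```

With the interrupt line register set to 9, the quirk case still returns 14 and
15 for the two channels, while the full native case returns 9 for both.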

I know my original proposal may not match real hardware completely, but we have to decide what we want: faithful emulation of all the quirks of this chip, or getting an OS running so that user space can be tested. A lot of machines in QEMU do the latter (e.g. mac99 or e500, and maybe really all machines, as they don't aim to be simulators, just to emulate enough of the machine to get OSes running), so I think it would be OK for this case as well if fully emulating the chip is much more trouble than it's worth.

Regards,
BALATON Zoltan
