qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] pci: fix bridge IO/BASE


From: Blue Swirl
Subject: Re: [Qemu-devel] [PATCH] pci: fix bridge IO/BASE
Date: Sun, 4 Mar 2012 19:51:02 +0000

On Sun, Mar 4, 2012 at 17:35, Michael S. Tsirkin <address@hidden> wrote:
> On Sun, Mar 04, 2012 at 05:07:34PM +0000, Blue Swirl wrote:
>> On Sun, Mar 4, 2012 at 15:22, Michael S. Tsirkin <address@hidden> wrote:
>> > On Sun, Mar 04, 2012 at 02:35:28PM +0000, Blue Swirl wrote:
>> >> On Sun, Mar 4, 2012 at 14:23, Michael S. Tsirkin <address@hidden> wrote:
>> >> > On Sun, Mar 04, 2012 at 01:38:38PM +0000, Blue Swirl wrote:
>> >> >> On Sun, Mar 4, 2012 at 13:28, Michael S. Tsirkin <address@hidden> 
>> >> >> wrote:
>> >> >> > On Sun, Mar 04, 2012 at 12:37:57PM +0000, Blue Swirl wrote:
>> >> >> >> On Sun, Mar 4, 2012 at 12:21, Michael S. Tsirkin <address@hidden> 
>> >> >> >> wrote:
>> >> >> >> > On Sun, Mar 04, 2012 at 10:27:24AM +0000, Blue Swirl wrote:
>> >> >> >> >> On Sun, Mar 4, 2012 at 09:46, Michael S. Tsirkin 
>> >> >> >> >> <address@hidden> wrote:
>> >> >> >> >> > commit 5caef97a16010f818ea8b950e2ee24ba876643ad introduced
>> >> >> >> >> > a regression: we do not make IO base/limit upper 16
>> >> >> >> >> > bit registers writeable, so we should report a 16 bit
>> >> >> >> >> > IO range type, not a 32 bit one.
>> >> >> >> >> > Note that PCI_PREF_RANGE_TYPE_32 is 0x0, but 
>> >> >> >> >> > PCI_IO_RANGE_TYPE_32 is 0x1.
>> >> >> >> >> >
>> >> >> >> >> > In particular, this broke sparc64.
>> >> >> >> >> >
>> >> >> >> >> > Note: this just reverts to behaviour prior to the patch.
>> >> >> >> >> > Making PCI_IO_BASE_UPPER16 and PCI_IO_LIMIT_UPPER16
>> >> >> >> >> > registers writeable should, and seems to, work just as well, 
>> >> >> >> >> > but
>> >> >> >> >> > as no system seems to actually be interested in 32 bit IO,
>> >> >> >> >> > let's not make unnecessary changes.
>> >> >> >> >> >
>> >> >> >> >> > Reported-by: Mark Cave-Ayland <address@hidden>
>> >> >> >> >> > Signed-off-by: Michael S. Tsirkin <address@hidden>
>> >> >> >> >> >
>> >> >> >> >> > Mark, can you confirm that this fixes the bug for you?
>> >> >> >> >>
>> >> >> >> >> No, running
>> >> >> >> >> qemu-system-sparc64 -serial stdio
>> >> >> >> >> still shows black screen and the following on console:
>> >> >> >> >> OpenBIOS for Sparc64
>> >> >> >> >> Unhandled Exception 0x0000000000000032
>> >> >> >> >> PC = 0x00000000ffd19e18 NPC = 0x00000000ffd19e1c
>> >> >> >> >> Stopping execution
>> >> >> >> >
>> >> >> >> > The weird thing is the range type does not seem to be accessed
>> >> >> >> > at all. So I guessed there's some memory corruption here.
>> >> >> >> > Running valgrind shows this:
>> >> >> >> >
>> >> >> >> > --11114-- WARNING: unhandled syscall: 340
>> >> >> >> > --11114-- You may be able to write your own handler.
>> >> >> >> > --11114-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
>> >> >> >> > --11114-- Nevertheless we consider this a bug.  Please report
>> >> >> >> > --11114-- it at http://valgrind.org/support/bug_reports.html.
>> >> >> >> > ==11114== Invalid read of size 4
>> >> >> >> > ==11114==    at 0x2A68C0: pci_apb_init (apb_pci.c:350)
>> >> >> >> > ==11114==    by 0x2F7A84: sun4uv_init (sun4u.c:779)
>> >> >> >> > ==11114==    by 0x13D716: main (vl.c:3397)
>> >> >> >> > ==11114==  Address 0x156c7d30 is 0 bytes after a block of size 64
>> >> >> >> > alloc'd
>> >> >> >> > ==11114==    at 0x557DD69: malloc (vg_replace_malloc.c:236)
>> >> >> >> > ==11114==    by 0x225F56: malloc_and_trace (vl.c:2156)
>> >> >> >> > ==11114==    by 0x584AFEC: ??? (in /lib/libglib-2.0.so.0.2800.8)
>> >> >> >> > ==11114==    by 0x584B528: g_malloc0 (in 
>> >> >> >> > /lib/libglib-2.0.so.0.2800.8)
>> >> >> >> > ==11114==    by 0x19C50C: qemu_allocate_irqs (irq.c:47)
>> >> >> >> > ==11114==    by 0x2F7A4C: sun4uv_init (sun4u.c:778)
>> >> >> >> > ==11114==    by 0x13D716: main (vl.c:3397)
>> >> >> >> > ==11114==
>> >> >> >> > apb: here
>> >> >> >> > ==11114== Warning: client switching stacks?  SP change: 
>> >> >> >> > 0xfec42cbc -->
>> >> >> >> > 0x16894008
>> >> >> >> > ==11114==          to suppress, use: --max-stackframe=398791500 or
>> >> >> >> > greater
>> >> >> >> > ==11114== Warning: client switching stacks?  SP change: 
>> >> >> >> > 0x16893fa0 -->
>> >> >> >> > 0xfec42cc0
>> >> >> >> > ==11114==          to suppress, use: --max-stackframe=398791392 or
>> >> >> >> > greater
>> >> >> >> > ==11114== Warning: client switching stacks?  SP change: 
>> >> >> >> > 0xfec42fe0 -->
>> >> >> >> > 0x16893fd0
>> >> >> >> > ==11114==          to suppress, use: --max-stackframe=398790640 or
>> >> >> >> > greater
>> >> >> >> > ==11114==          further instances of this message will not be 
>> >> >> >> > shown.
>> >> >> >> > QEMU 1.0.50 monitor - type 'help' for more information
>> >> >> >> > (qemu) ==11114== Thread 2:
>> >> >> >> > ==11114== Conditional jump or move depends on uninitialised 
>> >> >> >> > value(s)
>> >> >> >> > ==11114==    at 0x2A8351: compute_all_sub (cc_helper.c:37)
>> >> >> >> > ==11114==    by 0x2A8782: helper_compute_psr (cc_helper.c:470)
>> >> >> >> > ==11114==    by 0x9AD9A19: ???
>> >> >> >> > ==11114==
>> >> >> >> > ==11114== Conditional jump or move depends on uninitialised 
>> >> >> >> > value(s)
>> >> >> >> > ==11114==    at 0x2A827C: compute_all_sub_xcc (cc_helper.c:60)
>> >> >> >> > ==11114==    by 0x2A8795: helper_compute_psr (cc_helper.c:473)
>> >> >> >> > ==11114==    by 0x9AD9A19: ???
>> >> >> >> > ==11114==
>> >> >> >> > ==11114== Conditional jump or move depends on uninitialised 
>> >> >> >> > value(s)
>> >> >> >> > ==11114==    at 0x2A8296: compute_all_sub_xcc (cc_helper.c:295)
>> >> >> >> > ==11114==    by 0x2A8795: helper_compute_psr (cc_helper.c:473)
>> >> >> >> > ==11114==    by 0x9AD9A19: ???
>> >> >> >> > ==11114==
>> >> >> >> >
>> >> >> >> > Is the above a problem?
>> >> >> >>
>> >> >> >> It looks like Sparc does not reset registers at CPU reset. Nice 
>> >> >> >> catch.
>> >> >> >
>> >> >> > Invalid read and address after block are also worrying.
>> >> >> >
>> >> >> > irqs are allocated with
>> >> >> >  #define MAX_PILS 16
>> >> >> >
>> >> >> >    irq = qemu_allocate_irqs(cpu_set_irq, env, MAX_PILS);
>> >> >> >
>> >> >> > then passed to apb:
>> >> >> >
>> >> >> >    pci_bus = pci_apb_init(APB_SPECIAL_BASE, APB_MEM_BASE, irq, 
>> >> >> > &pci_bus2,
>> >> >> >                           &pci_bus3);
>> >> >> >
>> >> >> > which does:
>> >> >> > PCIBus *pci_apb_init(target_phys_addr_t special_base,
>> >> >> >                     target_phys_addr_t mem_base,
>> >> >> >                     qemu_irq *pic, PCIBus **bus2, PCIBus **bus3)
>> >> >> >
>> >> >> > and
>> >> >> >
>> >> >> >   for (i = 0; i < 32; i++) {
>> >> >> >        sysbus_connect_irq(s, i, pic[i]);
>> >> >> >    }
>> >> >>
>> >> >> Awful. But using 32 for MAX_PILS does not help either.
>> >> >
>> >> >
>> >> > Could you please clarify what is the SABRE device?
>> >> > Is it, in fact, a bridge device? Or not?
>> >>
>> >> Yes, it's the host bridge, also known as PBM. It's documented in
>> >> UltraSPARC IIi User's Manual
>> >
>> > Btw would be nice to host the manuals at qemu.org
>> > our code points at sun.com URLs :(
>>
>> I have most if not all manuals, downloaded from sun.com, but I'm not
>> sure if they can be redistributed.
>
> Okay ...
> Let's change the link to point to some other place which has them?
>
>> > I am looking at 19.3.1 PCI Configuration Space
>> > and it appears to show that this is a regular device
>> > with a couple of custom registers at pffsets 0x40
>> > and 0x41.
>> >
>> > Why do we want to pretend it is a bridge?
>>
>> It's the host bridge and the device class is PCI_CLASS_BRIDGE_HOST.
>
> Yes. But the *header* type is 0 (NORMAL)
> while the code in pci_init_mask_bridge
> which is the only user of the is_bridge register
> initializes a type 1 (BRIDGE) header.
>
> So it just happens to do a vaguely correct thing.

Well, that is still according to device spec.

>> >
>> >> and there it says that the device is
>> >> found in the configuration space.
>> >>
>> >> The secondary bridges are Simbas and should be called APBs.
>> >
>> > As far as I can see from the code, it has header type
>> > NORMAL but sets is_bridge.
>> > This was done by this commit:
>> > 776e1bbb6cf4fe66a93c1a5dd814bbb650deca00
>>
>> IIRC otherwise some registers are not writable.
>
> Yes but which ones? I looked at the manual and
> it does not list any registers. Playing with code,
> it looks like we just need to make *some*
> BAR writeable. I tried with
> pci_set_long(d->wmask + PCI_BASE_ADDRESS_0, 0xfffffff0);
> to
> pci_set_long(d->wmask + PCI_BASE_ADDRESS_5, 0xfffffff0);
>
> and any one of these makes bios get at least to
> the prompt.

I now know the root cause of the problem. OpenBIOS programs the BARs
somewhat correctly just by accident. The initial io_base and mem_base
for BARs are not correct, but because the host bridge BARs (and also 6
of which 4 are not even BARs!) are programmed first, the bases
happened to settle to values that happen to work. The commit revealed
the problem since the settling didn't happen. The mask changes just
let the host bridge setup continue to do the magic.

By just changing OpenBIOS (see attached patch), I can get the devices
to work (assuming that VGA is a separate problem). There's no need to
change QEMU.

>> >
>> >> >
>> >> >
>> >> > --
>> >> > MST

Attachment: 0001-pci-fix-BAR-setup.patch
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]