[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH] sparc32 irq clearing (guest Solaris performance

From: Artyom Tarasenko
Subject: [Qemu-devel] Re: [PATCH] sparc32 irq clearing (guest Solaris performance+NetBSD) fix
Date: Mon, 16 Nov 2009 12:47:06 +0100

2009/11/15 Blue Swirl <address@hidden>:
> On Sun, Nov 15, 2009 at 1:15 AM, Artyom Tarasenko
> <address@hidden> wrote:
>> 2009/11/14 Blue Swirl <address@hidden>:
>>> On Sat, Nov 14, 2009 at 3:03 AM, Artyom Tarasenko
>>> <address@hidden> wrote:
>>>> According to NCR89C105 documentation
>>>> http://www.ibiblio.org/pub/historic-linux/early-ports/Sparc/NCR/NCR89C105.txt
>>>> Interrupts are cleared by disabling and then re-enabling them.
>>>> This patch implements the specified behaviour. The most visible effects:
>>> I think the current version also implements this behaviour. The
>>> difference is that now we clear on disable, with your version, the
>>> interrupts are cleared when re-enabling them.
>> Doesn't this imply that the current version does not implement this
>> ("Interrupts are cleared by disabling and then re-enabling them") behavior? 
>> ;-)
> The specification only says that the sequence disable-enable clears
> interrupts, but not which of these is true:
> - clearing happens in the moment of disabling (and interrupts after
> that are not cleared, current version)
> - clearing happens  in the moment of re-enabling (your version, sort of)
> - clearing happens in both cases (lose interrupts)

English is not my native language, but fwiw I think "and then
re-enabling" can only be the second variant. Without "then" it could
be either one or three. And if the first variant is what they really
meant, the part with "and then" is totally redundant and misleading.

> It's also interesting to think what happens between the interrupt
> controller and the devices. Clearing an interrupt at controller level
> does not clear the interrupt condition at the device. Aren't the
> interrupts level triggered on Sparc, so the interrupt is still posted?
> Is the interrupt actually masked by clearing until the level is
> deactivated?

Looks unlikely to me. In the book "Panic! Unix system crash dump
analysis" the authors write that the first thing interrupt handler has
to do is disable the interrupt, and yes wrting "unix" they mean

That's also what I observe debugging the Solaris kernel code
(Solaris kernel debugger is a really powerful tool).
Looks like interrupts can be shared between devices, so the general
handler disables the interrupt and then calls multiple device-specific
handlers sequentially and checks if any of then claims the interrupt.
If no one does it writes the message "Spurious interrupt %d\n".

> Or maybe the controller latches the interrupt so that even after the
> device releases the interrupt line, interrupt is still active towards
> the CPU. Then the clearing would make sense.

Looks very realistic to me. I think that's the way the interrupts are
handled at least under x86.

> This does not match current code.
>>> With the current version, the interrupts which arrived after getting
>>> cleared during disable but before re-enabling become visible after
>>> re-enabling. It looks like esp driver in Aurora 1.0, 2.0 and 2.1
>>> depend on this.
>> Hm. It certainly makes sense. But I don't see how does it agree with
>> the NCR89C105 docs. Can it be the documentation is not precise?
>>>> - NetBSD 1.3.3 - 1.5.3 boots successfully
>>> I didn't find those even on ftp.netbsd.org, the first I have is 1.6.
>>> Thus I could not test if any of the changes I did had any positive
>>> effect except if some test failed.
>> Have you looked in the NetBSD-archive at ftp.netbsd.org?
>> ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-archive/
> Yes, but there are no ISOs.

Don't have ISOs either. I think they did never exist for the early
versions. Miniroots can be used.
One nice [missing] feature of OpenBIOS is that it doesn't check
whether a disk label is valid.
The disk labels of NetBSD miniroots are invalid, so it's not possible
to boot them on a real hw,
or under OBP (it drove me crazy when I was fixing the scsi layer to
work with OBP because
I thought miniroots were hdd images).
But, under OpenBIOS it works like a charm:

qemu-system-sparc -nographic  -hda miniroot-133.fs -m 64

>>>> - Solaris 2.5.1 - 7 boots ~1500 times faster (~20 seconds instead of ~8 
>>>> hours)
>>>> Signed-off-by: Artyom Tarasenko <address@hidden>
>>>> ---
>>>> diff --git a/hw/slavio_intctl.c b/hw/slavio_intctl.c
>>>> index 9680392..779c661 100644
>>>> --- a/hw/slavio_intctl.c
>>>> +++ b/hw/slavio_intctl.c
>>>> @@ -177,19 +177,19 @@ static void slavio_intctlm_mem_writel(void
>>>> *opaque, target_phys_addr_t addr,
>>>>     saddr = addr >> 2;
>>>>     DPRINTF("write system reg 0x" TARGET_FMT_plx " = %x\n", addr, val);
>>>>     switch (saddr) {
>>>> -    case 2: // clear (enable)
>>>> +    case 2: // clear (enable, clear formerly disabled pending)
>>>>         // Force clear unused bits
>>>>         val &= MASTER_IRQ_MASK;
>>>> +        s->intregm_pending &= (s->intregm_disabled & val);
>>> This looks buggy, the AND operation will clear a lot of bits unrelated
>>> to val. I think you are missing a ~ here. I tried a few combinations,
>>> but none of them passed my tests.
>> My bad. It had to be s->intregm_pending &= ~(s->intregm_disabled &
>> val) . But unfortunately it doesn't work.
>>>>         s->intregm_disabled &= ~val;
>>>>         DPRINTF("Enabled master irq mask %x, curmask %x\n", val,
>>>>                 s->intregm_disabled);
>>>>         slavio_check_interrupts(s, 1);
>>>>         break;
>>>> -    case 3: // set (disable, clear pending)
>>>> +    case 3: // set (disable, do not clear pending)
>>>>         // Force clear unused bits
>>>>         val &= MASTER_IRQ_MASK;
>>>>         s->intregm_disabled |= val;
>>>> -        s->intregm_pending &= ~val;
>>> Here
>>> s->intregm_pending &= ~s->intregm_disabled;
>>> would also pass my tests. Does that change anything?
>> No, doesn't work for me either.
>> Also I put some additional printfs into sun4m.c and see that
>> interrupts are both set while being already set, and reset while not being 
>> set.
>> Looks like a bug?
>> static void cpu_set_irq(void *opaque, int irq, int level)
>> {
>>    CPUState *env = opaque;
>>    if (level) {
>>        DPRINTF("Raise CPU IRQ %d\n", irq);
>>        env->halted = 0;
>>        if(env->pil_in & (1 << irq)) {printf("already set: %d\n",irq);return;}
>>        env->pil_in |= 1 << irq;
>>        cpu_check_irqs(env);
>>    } else {
>>        DPRINTF("Lower CPU IRQ %d\n", irq);
>>        if(~(env->pil_in & (1 << irq))) {printf("already low:
>> %d\n",irq);return;}
>>        env->pil_in &= ~(1 << irq);
>>        cpu_check_irqs(env);
>>    }
>> }
> It should be harmless, though it may suck performance.

Do you mean the check against old_interrupt in the cpu_check_irqs()?
To me it looks harmless only in the case when we have screaming
interrupts of the same level. Or do you mean something else?

The problem is, with "return" after "pritnf" above at least Linux,
NetBSD and OBP
hang (or booting so slow that it looks like they hang).

It looks like something is depending on this broken behaviour.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]