qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH] sparc32 irq clearing (guest Solaris performance


From: Artyom Tarasenko
Subject: [Qemu-devel] Re: [PATCH] sparc32 irq clearing (guest Solaris performance+NetBSD) fix
Date: Mon, 16 Nov 2009 18:19:10 +0100

2009/11/16 Blue Swirl <address@hidden>:
> On Mon, Nov 16, 2009 at 1:47 PM, Artyom Tarasenko
> <address@hidden> wrote:
>> 2009/11/15 Blue Swirl <address@hidden>:
>>> On Sun, Nov 15, 2009 at 1:15 AM, Artyom Tarasenko
>>> <address@hidden> wrote:
>>>> 2009/11/14 Blue Swirl <address@hidden>:
>>>>> On Sat, Nov 14, 2009 at 3:03 AM, Artyom Tarasenko
>>>>> <address@hidden> wrote:
>>>>>> According to NCR89C105 documentation
>>>>>> http://www.ibiblio.org/pub/historic-linux/early-ports/Sparc/NCR/NCR89C105.txt
>>>>>>
>>>>>> Interrupts are cleared by disabling and then re-enabling them.
>>>>>> This patch implements the specified behaviour. The most visible effects:
>>>>>
>>>>> I think the current version also implements this behaviour. The
>>>>> difference is that now we clear on disable, with your version, the
>>>>> interrupts are cleared when re-enabling them.
>>>>
>>>> Doesn't this imply that the current version does not implement this
>>>> ("Interrupts are cleared by disabling and then re-enabling them") 
>>>> behavior? ;-)
>>>
>>> The specification only says that the sequence disable-enable clears
>>> interrupts, but not which of these is true:
>>> - clearing happens in the moment of disabling (and interrupts after
>>> that are not cleared, current version)
>>> - clearing happens  in the moment of re-enabling (your version, sort of)
>>> - clearing happens in both cases (lose interrupts)
>>
>> English is not my native language, but fwiw I think "and then
>> re-enabling" can only be the second variant. Without "then" it could
>> be either one or three. And if the first variant is what they really
>> meant, the part with "and then" is totally redundant and misleading.
>
> Still, this is user documentation, not implementation specification.
> I'm open to both versions, if they work.
>
>>> It's also interesting to think what happens between the interrupt
>>> controller and the devices. Clearing an interrupt at controller level
>>> does not clear the interrupt condition at the device. Aren't the
>>> interrupts level triggered on Sparc, so the interrupt is still posted?
>>> Is the interrupt actually masked by clearing until the level is
>>> deactivated?
>>
>> Looks unlikely to me. In the book "Panic! Unix system crash dump
>> analysis" the authors write that the first thing interrupt handler has
>> to do is disable the interrupt, and yes wrting "unix" they mean
>> "SunOS/Solaris".
>>
>> That's also what I observe debugging the Solaris kernel code
>> (Solaris kernel debugger is a really powerful tool).
>> Looks like interrupts can be shared between devices, so the general
>> handler disables the interrupt and then calls multiple device-specific
>> handlers sequentially and checks if any of then claims the interrupt.
>> If no one does it writes the message "Spurious interrupt %d\n".
>>
>>
>>> Or maybe the controller latches the interrupt so that even after the
>>> device releases the interrupt line, interrupt is still active towards
>>> the CPU. Then the clearing would make sense.
>>
>> Looks very realistic to me. I think that's the way the interrupts are
>> handled at least under x86.
>
> It's a must on x86, because the interrupts are edge triggered.

I don't know, how the real sun4m reacts in the case where irq stays
on, not being cleared.
It can not be though that it would try to process irq for every next
tick. The CPU must have some time to clear the pending irq, so it must
be edge triggered too, at least in a way.

>>> This does not match current code.
>>>
>>>>> With the current version, the interrupts which arrived after getting
>>>>> cleared during disable but before re-enabling become visible after
>>>>> re-enabling. It looks like esp driver in Aurora 1.0, 2.0 and 2.1
>>>>> depend on this.
>>>>
>>>> Hm. It certainly makes sense. But I don't see how does it agree with
>>>> the NCR89C105 docs. Can it be the documentation is not precise?
>>>>
>>>>>> - NetBSD 1.3.3 - 1.5.3 boots successfully
>>>>>
>>>>> I didn't find those even on ftp.netbsd.org, the first I have is 1.6.
>>>>> Thus I could not test if any of the changes I did had any positive
>>>>> effect except if some test failed.
>>>>
>>>> Have you looked in the NetBSD-archive at ftp.netbsd.org?
>>>> ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-archive/
>>>
>>> Yes, but there are no ISOs.
>>
>> Don't have ISOs either. I think they did never exist for the early
>> versions. Miniroots can be used.
>> One nice [missing] feature of OpenBIOS is that it doesn't check
>> whether a disk label is valid.
>> The disk labels of NetBSD miniroots are invalid, so it's not possible
>> to boot them on a real hw,
>> or under OBP (it drove me crazy when I was fixing the scsi layer to
>> work with OBP because
>> I thought miniroots were hdd images).
>> But, under OpenBIOS it works like a charm:
>>
>> qemu-system-sparc -nographic  -hda miniroot-133.fs -m 64
>
> Thanks, now I can try something.
>
>>>>>> - Solaris 2.5.1 - 7 boots ~1500 times faster (~20 seconds instead of ~8 
>>>>>> hours)
>>>>>>
>>>>>> Signed-off-by: Artyom Tarasenko <address@hidden>
>>>>>> ---
>>>>>> diff --git a/hw/slavio_intctl.c b/hw/slavio_intctl.c
>>>>>> index 9680392..779c661 100644
>>>>>> --- a/hw/slavio_intctl.c
>>>>>> +++ b/hw/slavio_intctl.c
>>>>>> @@ -177,19 +177,19 @@ static void slavio_intctlm_mem_writel(void
>>>>>> *opaque, target_phys_addr_t addr,
>>>>>>     saddr = addr >> 2;
>>>>>>     DPRINTF("write system reg 0x" TARGET_FMT_plx " = %x\n", addr, val);
>>>>>>     switch (saddr) {
>>>>>> -    case 2: // clear (enable)
>>>>>> +    case 2: // clear (enable, clear formerly disabled pending)
>>>>>>         // Force clear unused bits
>>>>>>         val &= MASTER_IRQ_MASK;
>>>>>> +        s->intregm_pending &= (s->intregm_disabled & val);
>>>>>
>>>>> This looks buggy, the AND operation will clear a lot of bits unrelated
>>>>> to val. I think you are missing a ~ here. I tried a few combinations,
>>>>> but none of them passed my tests.
>>>>
>>>> My bad. It had to be s->intregm_pending &= ~(s->intregm_disabled &
>>>> val) . But unfortunately it doesn't work.
>>>>
>>>>>>         s->intregm_disabled &= ~val;
>>>>>>         DPRINTF("Enabled master irq mask %x, curmask %x\n", val,
>>>>>>                 s->intregm_disabled);
>>>>>>         slavio_check_interrupts(s, 1);
>>>>>>         break;
>>>>>> -    case 3: // set (disable, clear pending)
>>>>>> +    case 3: // set (disable, do not clear pending)
>>>>>>         // Force clear unused bits
>>>>>>         val &= MASTER_IRQ_MASK;
>>>>>>         s->intregm_disabled |= val;
>>>>>> -        s->intregm_pending &= ~val;
>>>>>
>>>>> Here
>>>>> s->intregm_pending &= ~s->intregm_disabled;
>>>>> would also pass my tests. Does that change anything?
>>>>
>>>> No, doesn't work for me either.
>>>>
>>>> Also I put some additional printfs into sun4m.c and see that
>>>> interrupts are both set while being already set, and reset while not being 
>>>> set.
>>>> Looks like a bug?
>>>>
>>>> static void cpu_set_irq(void *opaque, int irq, int level)
>>>> {
>>>>    CPUState *env = opaque;
>>>>
>>>>    if (level) {
>>>>        DPRINTF("Raise CPU IRQ %d\n", irq);
>>>>        env->halted = 0;
>>>>        if(env->pil_in & (1 << irq)) {printf("already set: 
>>>> %d\n",irq);return;}
>>>>        env->pil_in |= 1 << irq;
>>>>        cpu_check_irqs(env);
>>>>    } else {
>>>>        DPRINTF("Lower CPU IRQ %d\n", irq);
>>>>        if(~(env->pil_in & (1 << irq))) {printf("already low:
>>>> %d\n",irq);return;}
>>>>        env->pil_in &= ~(1 << irq);
>>>>        cpu_check_irqs(env);
>>>>    }
>>>> }
>>>>
>>>
>>> It should be harmless, though it may suck performance.
>>
>> Do you mean the check against old_interrupt in the cpu_check_irqs()?
>> To me it looks harmless only in the case when we have screaming
>> interrupts of the same level. Or do you mean something else?
>
> If the old level is unchanged, there should be no action. What's the
> case where you see problems?


In the case where irq 10 and 14 are both being set it can happen that
old_interrupt==14 and another call cpu_set_irq(10)  would call
do_interrupt() ?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]