From: Jan Kiszka
Subject: Re: [Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
Date: Wed, 26 Jan 2011 14:31:37 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2011-01-26 14:15, Jan Kiszka wrote:
> On 2011-01-26 14:08, Stefan Berger wrote:
>> On 01/26/2011 07:09 AM, Jan Kiszka wrote:
>>> On 2011-01-26 13:05, Stefan Berger wrote:
>>>> On 01/26/2011 03:14 AM, Jan Kiszka wrote:
>>>>> On 2011-01-25 17:49, Stefan Berger wrote:
>>>>>> On 01/25/2011 02:26 AM, Jan Kiszka wrote:
>>>>>>> Do you see a chance to look closer at the issue yourself? E.g.
>>>>>>> instrument the kernel's irqchip models and dump their states once
>>>>>>> your
>>>>>>> guest is stuck?
>>>>>> The device runs on IRQ 3. So I applied this patch here.
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
>>>>>> index 3cece05..8f4f94c 100644
>>>>>> --- a/arch/x86/kvm/i8259.c
>>>>>> +++ b/arch/x86/kvm/i8259.c
>>>>>> @@ -106,7 +106,7 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>>>>    {
>>>>>>        int mask, ret = 1;
>>>>>>        mask = 1 << irq;
>>>>>> -    if (s->elcr & mask)    /* level triggered */
>>>>>> +    if (s->elcr & mask)    /* level triggered */ {
>>>>>>            if (level) {
>>>>>>                ret = !(s->irr & mask);
>>>>>>                s->irr |= mask;
>>>>>> @@ -115,7 +115,10 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>>>>                s->irr &= ~mask;
>>>>>>                s->last_irr &= ~mask;
>>>>>>            }
>>>>>> -    else    /* edge triggered */
>>>>>> +if (irq == 3)
>>>>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__, __LINE__, level, s->irr);
>>>>>> +        }
>>>>>> +    else    /* edge triggered */ {
>>>>>>            if (level) {
>>>>>>                if ((s->last_irr & mask) == 0) {
>>>>>>                    ret = !(s->irr & mask);
>>>>>> @@ -124,7 +127,9 @@ static inline int pic_set_irq1(struct kvm_kpic_state *s, int irq, int level)
>>>>>>                s->last_irr |= mask;
>>>>>>            } else
>>>>>>                s->last_irr &= ~mask;
>>>>>> -
>>>>>> +if (irq == 3)
>>>>>> +    printk("%s %d: level=%d, irr = %x\n", __FUNCTION__, __LINE__, level, s->irr);
>>>>>> +        }
>>>>>>        return (s->imr & mask) ? -1 : ret;
>>>>>>    }
>>>>>>
>>>>>> @@ -206,6 +211,8 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
>>>>>>
>>>>>>        pic_lock(s);
>>>>>>        if (irq >= 0 && irq < PIC_NUM_PINS) {
>>>>>> +if (irq == 3)
>>>>>> +printk("%s\n", __FUNCTION__);
>>>>>>            ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
>>>>>>            pic_update_irq(s);
>>>>>>            trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
>>>>>>
>>>>>>
>>>>>>
>>>>>> While it's still working I see this here with the levels changing
>>>>>> 0-1-0.
>>>>>> Though then it stops and levels are only at '1'.
>>>>>>
>>>>>> [ 1773.833824] kvm_pic_set_irq
>>>>>> [ 1773.833827] pic_set_irq1 131: level=0, irr = 5b
>>>>>> [ 1773.834161] kvm_pic_set_irq
>>>>>> [ 1773.834163] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.834193] kvm_pic_set_irq
>>>>>> [ 1773.834195] pic_set_irq1 131: level=0, irr = 5b
>>>>>> [ 1773.835028] kvm_pic_set_irq
>>>>>> [ 1773.835031] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.835542] kvm_pic_set_irq
>>>>>> [ 1773.835545] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1773.889892] kvm_pic_set_irq
>>>>>> [ 1773.889894] pic_set_irq1 131: level=1, irr = 5b
>>>>>> [ 1791.258793] pic_set_irq1 119: level=1, irr = d9
>>>>>> [ 1791.258824] pic_set_irq1 119: level=0, irr = d1
>>>>>> [ 1791.402476] pic_set_irq1 119: level=1, irr = d9
>>>>>> [ 1791.402534] pic_set_irq1 119: level=0, irr = d1
>>>>>> [ 1791.402538] pic_set_irq1 119: level=1, irr = d9
>>>>>> [...]
>>>>>>
>>>>>>
>>>>>> I believe the last 5 shown calls can be ignored. After that the
>>>>>> interrupts don't go through anymore.
>>>>>>
>>>>>> In the device model I see interrupts being raised and cleared. After
>>>>>> the last one was cleared in 'my' device model, interrupts are only
>>>>>> raised. It looks as if the interrupt handler in the guest Linux was
>>>>>> never run, so the IRQ is never cleared and we're stuck.
>>>>>>
>>>>> User space is responsible for both setting and clearing that line. IRQ3
>>>>> means you are using some serial device model? Then you should check
>>>>> what
>>>>> its state is.
>>>> Good hint. I moved it now to IRQ11 and it works fine now (with kvm-git)
>>>> from what I can see. There was no UART on IRQ3 before, though, but
>>>> certainly it was the wrong IRQ for it.
>>>>> Moreover, a complete picture of the kernel/user space interaction
>>>>> should be obtainable by using ftrace for capturing kvm events.
>>>>>
>>>> Should it be working on IRQ3? If so, I'd look into it when I get a
>>>> chance...
>>> I don't know your customizations, so it's hard to tell if that should
>>> work or not. IRQ3 is intended to be used by ISA devices on the PC
>>> machine. Are you adding an ISA model, or what is your use case?
>>>
>> The use case is to add a TPM device interface.
>>
>> http://xenbits.xensource.com/xen-unstable.hg?file/1e56ac73b9b9/tools/ioemu/hw/tpm_tis.c
>>
>>
>> This one typically is connected to the LPC bus.
> 
> I see. Do you also have the xen-free version of it? Maybe there are
> still issues with proper qdev integration etc.
> 

Without knowing the hardware spec or what is actually behind set_irq,
this looks at least suspicious:

[...]
if (off == TPM_REG_INT_STATUS) {
    /* clearing of interrupt flags */
    if ((val & INTERRUPTS_SUPPORTED) &&
        (s->loc[locty].ints & INTERRUPTS_SUPPORTED)) {
        s->set_irq(s->irq_opaque, s->irq, 0);
        s->irq_pending = 0;
    }
    s->loc[locty].ints &= ~(val & INTERRUPTS_SUPPORTED);
} else
[...]

The code does not check if there are ints left after masking out those
provided in val. Does that device already de-assert the line if you only
clear a single interrupt reason?
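
A minimal sketch of the alternative, assuming the line has to stay
asserted while any supported interrupt reason is still pending (the
hardware spec would have to confirm that). The struct layout, the value
of INTERRUPTS_SUPPORTED, NUM_LOCALITIES, and the names
tis_ack_interrupts, irq_line and record_irq are made up for illustration
and are not taken from tpm_tis.c:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LOCALITIES 5            /* illustrative, not from tpm_tis.c */
#define INTERRUPTS_SUPPORTED 0xfu   /* placeholder mask, not the real value */

typedef void (*set_irq_fn)(void *opaque, int irq, int level);

struct tis_loc {
    uint32_t ints;                  /* pending interrupt reasons */
};

struct tis_state {
    struct tis_loc loc[NUM_LOCALITIES];
    set_irq_fn set_irq;
    void *irq_opaque;
    int irq;
};

/* Hypothetical stand-in for the IRQ line: remembers the last level set. */
struct irq_line {
    int level;
};

static void record_irq(void *opaque, int irq, int level)
{
    struct irq_line *line = opaque;

    (void)irq;
    line->level = level;
}

/* Handle a guest write of val to the interrupt status register. */
static void tis_ack_interrupts(struct tis_state *s, int locty, uint32_t val)
{
    assert(locty >= 0 && locty < NUM_LOCALITIES);

    /* First drop the reasons the guest acknowledged... */
    s->loc[locty].ints &= ~(val & INTERRUPTS_SUPPORTED);

    /* ...then lower the line only if nothing is left pending. */
    if ((s->loc[locty].ints & INTERRUPTS_SUPPORTED) == 0) {
        s->set_irq(s->irq_opaque, s->irq, 0);
    }
}
```

With reasons 0x1 and 0x2 pending, acknowledging only 0x1 leaves the line
high in this sketch; it drops once 0x2 is acknowledged as well.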

BTW, irq_pending looks redundant, at least when using the qemu irq
subsystem.

Jan
