Re: [Qemu-devel] e1000e migration
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] e1000e migration
Date: Mon, 15 May 2017 17:14:46 +0100
User-agent: Mutt/1.8.2 (2017-04-18)

* Dmitry Fleytman (address@hidden) wrote:
> Hello Dave,
>
> It looks like we identified the problem.
>
> We are working on a fix and will send it as soon as it is ready.
Thanks!
Dave
> ~Dmitry.
>
> Sent from my iPhone
>
> > On 15 May 2017, at 12:22, Dr. David Alan Gilbert <address@hidden> wrote:
> >
> > * Dmitry Fleytman (address@hidden) wrote:
> >> Hello Dave,
> >
> > Hi Dmitry,
> > Thanks for the reply.
> >
> >> We are trying to reproduce this issue on our systems but with no luck so
> >> far…
> >
> > Note our QE hit this with both a Win8.1 and a win2012r2 guest - although
> > the 2012r2 is reported to have recovered after a few minutes.
> > 2016 apparently works OK.
> >
> >> From what you describe it looks like some bit in ICR is not being cleared
> >> by the driver.
> >> This usually means that this bit should never be set in that specific
> >> interrupt mode.
> >>
> >> Could you please check which bit is not cleared and who sets it?
> >
> > The full set of e1000e_irq_pending_interrupts after migration is:
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR:
> > 0x80100082, IMS: 0x1f00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x80100082, IMS: 0x1e00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x80100082, IMS: 0x1e00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR:
> > 0x80300082, IMS: 0x1e00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x80100082, IMS: 0x1c00004)
> > <repeated lots>
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x80300082, IMS: 0x1c00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR:
> > 0x813000c2, IMS: 0x1c00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR:
> > 0x813000c2, IMS: 0x1400004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR:
> > 0x813000c2, IMS: 0x1000004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR:
> > 0x813000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0xa00004)
> > <repeats>
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR:
> > 0x813000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0x800004)
> > <repeats>
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x800004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x813000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR:
> > 0x813000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0x4)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x811000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR:
> > 0x815000c2, IMS: 0x1a00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> > 0x815000c2, IMS: 0xa00004)
> > address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR:
> > 0x815000c2, IMS: 0x1a00004)
> >
> > and then I think we get stuck in a cycle where this last one (PENDING
> > 0x1000000) is always the one that fires, over and over. I think that's
> > the 'Other' interrupt firing, probably because of the receive overrun.
> > One thing I've not figured out is why the receive overrun happens - is
> > it because we really have a very heavy packet rate, or is it because
> > something has stopped receiving the packets?
> > The network I'm testing on does have a fair amount of broadcast
> > traffic.
> >
> > Dave
> >
> >> Regards,
> >> Dmitry
> >>
> >>> On 11 May 2017, at 15:36 PM, Dr. David Alan Gilbert <address@hidden>
> >>> wrote:
> >>>
> >>> Hi Dmitry,
> >>> Have you seen any problems with e1000e migration under windows?
> >>> I've got a repeatable case where after migration with e1000e windows
> >>> hangs/almost hangs.
> >>> I'm seeing the e1000e generate interrupts at a very high
> >>> rate (maybe ~1000 per second?) after migration.
> >>>
> >>> Some versions of qemu do it and some don't, but my attempts
> >>> at bisection lead me to code that should be irrelevant.
> >>>
> >>> Prior to migration I see:
> >>>
> >>> address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR:
> >>> 0x80100082, IMS: 0x1f00004)
> >>> address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> >>> 0x80000082, IMS: 0x1a00004)
> >>> address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> >>> 0x80000082, IMS: 0x1f00004)
> >>> address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> >>> 0x80000082, IMS: 0x1a00004)
> >>> address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR:
> >>> 0x80000082, IMS: 0x1f00004)
> >>>
> >>> where I think the ICR bits mean:
> >>> 31 - int asserted
> >>> 20 - RxQ0 - receive queue 0 interrupt
> >>> 7 - RXT0 - receiver timer interrupt
> >>> 1 - TXQE - Transmit Queue empty
> >>>
> >>> after migration it varies more, I'm seeing mostly:
> >>> address@hidden:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR:
> >>> 0x815000c2, IMS: 0x1a00004)
> >>> 31 - int asserted
> >>> 24 - 'Other'
> >>> 22 - TxQ0 interrupt
> >>> 20 - RxQ0 interrupt
> >>> 07 - RXT0 Receiver timer interrupt
> >>> 06 - RXO - Receiver overrun
> >>> 01 - TXQE - Transmit queue empty
> >>>
> >>> For reference this is https://bugzilla.redhat.com/show_bug.cgi?id=1447935
> >>>
> >>> Dave
> >>> --
> >>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> >>
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
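
For anyone following the traces quoted above, a small Python sketch for
decoding the ICR values may help. The bit names come only from the
breakdown given in this thread; any bit not named there is printed as
"bitN". The pending() helper assumes that the trace's PENDING field is
simply ICR masked by IMS. That assumption is consistent with every trace
line quoted here, but it is not confirmed against the QEMU source in this
thread. The function and table names are made up for illustration.

```python
# Bit names as listed in the thread; this is not a complete ICR map.
ICR_BITS = {
    31: "INT_ASSERTED",
    24: "OTHER",
    22: "TXQ0",
    20: "RXQ0",
    7: "RXT0",
    6: "RXO",
    1: "TXQE",
}


def decode(value):
    """Return the names of the set bits, highest bit first.

    Bits without a name in ICR_BITS are reported as 'bitN'.
    """
    return [
        ICR_BITS.get(bit, f"bit{bit}")
        for bit in range(31, -1, -1)
        if value & (1 << bit)
    ]


def pending(icr, ims):
    """Assumption: PENDING in the trace output equals ICR & IMS."""
    return icr & ims


if __name__ == "__main__":
    # Values taken from the traces quoted in the thread.
    for icr, ims in [(0x80100082, 0x1F00004), (0x815000C2, 0x1A00004)]:
        p = pending(icr, ims)
        print(f"ICR {icr:#x}: {decode(icr)}")
        print(f"  pending {p:#x}: {decode(p)}")
```

With the post-migration values (ICR 0x815000c2, IMS 0x1a00004) the only
pending bit is 24, 'Other', which matches the stuck cycle described above.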