qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] e1000e: Don't zero out buffer address in rx des


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH] e1000e: Don't zero out buffer address in rx descriptor
Date: Wed, 19 Oct 2016 09:25:10 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Am 19.10.2016 um 08:48 hat Dmitry Fleytman geschrieben:
>     Another related thing that I noticed while debugging this and turning on
>     tracing is that the interrupt throttling timers kept firing even if
>     there was no activity at all. Something might be wrong, there, too.
> 
>     Next thing I wondered why throttling was enabled at all because the spec
>     says the default is 0 (turned off). So one thing that I'm pretty sure is
>     just a misunderstanding is the following defintion:
> 
>     #define E1000E_MIN_XITR     (500) /* No more then 7813 interrupts per
>                                         second according to spec 10.2.4.2 */
> 
> 
>     As I understand it, the spec is just giving an example there and lower
>     values are valid as well. At the very least, 0 should be accepted as a
>     special case because it means "disabled" and it's specified to be the
>     default.
> 
> 
> Right, this according to the spec this value should be 0 by default and
> throttling should be disabled.
> 
> Current device implementation does not allow specification of
> throttling interval less than 500 and treats interval 0 as throttling
> enabled with interval 500.
> 
> This is done by intention because according to the spec (10.2.4.2)
> device cannot produce more than 7813 interrupts per second even when
> throttling is disabled. Therefore, even in case of interrupt storm
> (continuous interrupt re-injection by device), number of interrupts
> produced by device is limited and CPU (driver) has enough time to do
> its job and handle problematic interrupt state.

I think you're misinterpreting the spec here. This is the paragraph
we're talking about, right?

    For example, if the interval is programmed to 500 (decimal), the
    82574 guarantees the CPU is not interrupted by it for 128 µs from
    the last interrupt. The maximum observable interrupt rate from the
    82574 should never exceed 7813 interrupts/sec.

It says "for example", so this is just demonstrating how you can
calculate the effects of a specific throttling setting. It says that
_if_ you set ITR to 500, you get an interrupt at most every
500 * 256 ns = 128 µs. And 1 / 128 µs = 7821.5 Hz, so this is the
effective maximum frequency that _this specific_ ITR setting allows.

I also don't think it would make any sense for hardware to be unable to
trigger interrupts more often than that. Triggering an interrupt is not
a complex operation that involves a lot of calculation or anything.

> Opposed to this, virtual device is able to raise interrupts with rate
> limited by CPU speed only therefore driver has no chance to fix
> interrupt storm condition. 
> 
> Windows e1000e drivers rely on upper limit for number of interrupts
> per second in some cases and absence of this limit leads to infinite
> interrupt storms.
> 
> To summarise, while usage of throttling mechanisms is a little bit
> different from what specification says, effective emulated device
> behavior is totally compliant to the real device.

So Windows doesn't configure ITR (i.e. it is 0) even though it can't
handle unlimited interrupts? That would be a driver bug then, and
perhaps an important enough one to keep a workaround in our code. But
then let's be explicit that this is a workaround for a Windows bug and
not mandated by the spec.

I'm not sure in what setup you produced this error, but possibly a
reason why this doesn't happen with real hardware isn't the NIC itself
but the backend: Communication with the host can obviously be faster
than talking to a physical network (so if you were doing the latter, the
rate in the VM wouldn't be limited by the CPU, but by the physical
network).

>     As this wasn't blocking me after I had patched the above constant
>     locally to 0 so that I could see the actually meaningful trace events, I
>     didn't dig deeper, but I suspect that my local workaround for the
>     trace point spam may actually be a valid fix.
> 
> 
> This part that looks suspicious.
> 
> Even when throttling is enabled, timers should be idle
> unless there are packets on corresponding path.
> 
> In case you see times firing when device is idle - something
> might be wrong in throttling mechanisms.

I saw a lot of trace events related to the timer between the really
interesting information (that is, multiple timer invocations with
nothing else in between), but I think it may not have been always. I
didn't really look too closely at the cause of this one because it was
easy enough to patch it out of the way.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]