Re: [Qemu-devel] interrupt mitigation for e1000
From: Luigi Rizzo
Subject: Re: [Qemu-devel] interrupt mitigation for e1000
Date: Wed, 25 Jul 2012 12:54:27 +0200
User-agent: Mutt/1.4.2.3i
On Wed, Jul 25, 2012 at 12:12:55PM +0200, Paolo Bonzini wrote:
> Il 25/07/2012 11:56, Luigi Rizzo ha scritto:
> > On Wed, Jul 25, 2012 at 11:53:29AM +0300, Avi Kivity wrote:
> >> On 07/24/2012 07:58 PM, Luigi Rizzo wrote:
> >>> I noticed that the various NIC modules in qemu/kvm do not implement
> >>> interrupt mitigation, which is very beneficial as it dramatically
> >>> reduces exits from the hypervisor.
> >>>
> >>> As a proof of concept i tried to implement it for the e1000 driver
> >>> (patch below), and it brings tx performance from 9 to 56Kpps on
> >>> qemu-softmmu, and from ~20 to 140Kpps on qemu-kvm.
> >>>
> >>> I am going to measure the rx interrupt mitigation in the next couple
> >>> of days.
> >>>
> >>> Is there any interest in having this code in ?
> >>
> >> Indeed. But please drop the #ifdef MITIGATIONs.
> >
> > Thanks for the comments. The #ifdef block MITIGATION was only temporary to
> > point out the differences and run the performance comparisons.
> > Similarly, the magic thresholds below will be replaced with
> > appropriately commented #defines.
> >
> > Note:
> > On the real hardware interrupt mitigation is controlled by a total of four
> > registers (TIDV, TADV, RIDV, RADV) which control it with a granularity
> > of 1024ns , see
> >
> > http://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf
> >
> > An exact emulation of the feature is hard, because the timer resolution we
> > have is much coarser (in the ms range). So i am inclined to use a different
> > approach, similar to the one i have implemented, namely:
> > - the first few packets (whether 1 or 4 or 5 will be decided on the host)
> > report an interrupt immediately;
> > - subsequent interrupts are delayed through qemu_bh_schedule_idle()
>
> qemu_bh_schedule_idle() is really a 10ms timer.
Yes, i figured that out; this is why i said that my code was more
a "proof of concept" than an actual patch.
If you have a suggestion on how to schedule a shorter (say 1ms) timer i
am all ears. Perhaps qemu_new_timer_ns() and friends?
That said, i do not plan to implement the full set of mitigation registers
controlled by the guest, but possibly just use a parameter as in virtio-net,
where you can have 'tx=bh' or 'tx=timer' and 'x-txtimer=N', where N is the
mitigation delay in nanoseconds (nominally; in practice it is rounded to
whatever the host timer granularity is).
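
A minimal sketch of the policy described above, in plain C with no QEMU
calls. All names here (Mitigation, mit_init, mit_request, mit_timer) are
hypothetical and not part of e1000.c or any QEMU API. It assumes that the
requested delay is rounded *up* to the host timer granularity and that the
"first few packets" count is a fixed burst value; both are assumptions made
for illustration, not decisions taken in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mitigation state; not a QEMU structure. */
typedef struct {
    uint64_t delay_ns;       /* mitigation delay after rounding */
    unsigned burst;          /* interrupts delivered immediately per window */
    unsigned sent_in_window; /* immediate interrupts used in this window */
    bool     pending;        /* a delayed interrupt is waiting for the timer */
    uint64_t deadline_ns;    /* time at which the pending interrupt fires */
    unsigned delivered;      /* interrupts actually raised to the guest */
} Mitigation;

/* Round the requested delay up to a multiple of the host granularity. */
static uint64_t round_to_granularity(uint64_t req_ns, uint64_t gran_ns)
{
    assert(gran_ns > 0);
    return (req_ns + gran_ns - 1) / gran_ns * gran_ns;
}

static void mit_init(Mitigation *m, uint64_t req_ns, uint64_t gran_ns,
                     unsigned burst)
{
    m->delay_ns = round_to_granularity(req_ns, gran_ns);
    m->burst = burst;
    m->sent_in_window = 0;
    m->pending = false;
    m->deadline_ns = 0;
    m->delivered = 0;
}

/*
 * The device wants to raise an interrupt at time now_ns.
 * Returns true if it is delivered immediately, false if it is
 * deferred (or merged into an interrupt that is already deferred).
 */
static bool mit_request(Mitigation *m, uint64_t now_ns)
{
    if (m->pending) {
        return false;               /* coalesce with the scheduled one */
    }
    if (m->sent_in_window < m->burst) {
        m->sent_in_window++;
        m->delivered++;
        return true;                /* first few: report immediately */
    }
    m->pending = true;              /* subsequent ones: wait for the timer */
    m->deadline_ns = now_ns + m->delay_ns;
    return false;
}

/*
 * Timer callback. Delivers the deferred interrupt once its deadline has
 * passed and opens a new window. Returns true if an interrupt was raised.
 */
static bool mit_timer(Mitigation *m, uint64_t now_ns)
{
    if (!m->pending || now_ns < m->deadline_ns) {
        return false;
    }
    m->pending = false;
    m->sent_in_window = 0;
    m->delivered++;
    return true;
}
```

With a 1ms host granularity, a requested delay of 1.5ms becomes 2ms here.
Note that in this sketch the window is only reset when the timer fires, so
under sparse traffic one interrupt in every (burst + 1) is still delayed; a
real implementation would probably also reset the window after an idle period.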
cheers
luigi