qemu-devel

From: Jason Wang
Subject: Re: [Qemu-devel] [QUESTION] How to reduce network latency to improve netperf TCP_RR drastically?
Date: Tue, 11 Jun 2019 15:36:16 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0


On 2019/6/10 11:55 PM, Michael S. Tsirkin wrote:
> On Tue, Jun 04, 2019 at 03:10:43PM +0800, Like Xu wrote:
>> Hi Michael,
>>
>> At https://www.linux-kvm.org/page/NetworkingTodo, there is an entry for
>> network latency saying:
>>
>> ---
>> reduce networking latency:
>>   allow handling short packets from softirq or VCPU context
>>   Plan:
>>     We are going through the scheduler 3 times
>>     (could be up to 5 if softirqd is involved)
>>     Consider RX: host irq -> io thread -> VCPU thread ->
>>     guest irq -> guest thread.
>>     This adds a lot of latency.
>>     We can cut it by some 1.5x if we do a bit of work
>>     either in the VCPU or softirq context.
>>   Testing: netperf TCP RR - should be improved drastically
>>            netperf TCP STREAM guest to host - no regression
>>   Contact: MST
>> ---
>>
>> I am trying to make some contributions to improving netperf TCP_RR.
>> Could you please share more ideas, plans, or implementation details to
>> make it happen?
>>
>> Thanks,
>> Like Xu

> So some of this did happen. netif_receive_skb is now called
> directly from tun_get_user.
>
> The question is now about the rx/tun_put_user path.
>
> If the vhost thread is idle and there's a single packet
> outstanding, then maybe we can forward the packet to userspace
> directly from BH without waking up the thread.


With batch dequeue in place, it's pretty hard to determine from tun alone whether any packets are still outstanding.
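
To illustrate the point, here is a toy userspace model in Python, not kernel code. The names Ring, Consumer and outstanding_actual are hypothetical and only stand in for the tun-side queue and the vhost-side batch. It assumes the consumer pulls several packets at once into a private buffer, so the producer side sees an empty queue while packets are still undelivered.

```python
from collections import deque


class Ring:
    """Hypothetical stand-in for the queue between tun and vhost."""

    def __init__(self):
        self._q = deque()

    def produce(self, pkt):
        self._q.append(pkt)

    def consume_batch(self, limit):
        """Remove and return up to `limit` packets in one go."""
        batch = []
        while self._q and len(batch) < limit:
            batch.append(self._q.popleft())
        return batch

    def __len__(self):
        return len(self._q)


class Consumer:
    """Hypothetical stand-in for the vhost side holding a private batch."""

    def __init__(self, ring, batch_size=4):
        self.ring = ring
        self.batch_size = batch_size
        self.pending = deque()  # dequeued from the ring, not yet delivered

    def deliver_one(self):
        """Deliver one packet, refilling the private batch when it is empty."""
        if not self.pending:
            self.pending.extend(self.ring.consume_batch(self.batch_size))
        return self.pending.popleft() if self.pending else None


def outstanding_actual(ring, consumer):
    """Packets not yet delivered, counting the consumer's private batch."""
    return len(ring) + len(consumer.pending)


ring = Ring()
consumer = Consumer(ring, batch_size=4)
for pkt in ("p0", "p1", "p2"):
    ring.produce(pkt)

consumer.deliver_one()                     # pulls the whole batch, delivers p0
print(len(ring))                           # 0: the producer side sees nothing queued
print(outstanding_actual(ring, consumer))  # 2: p1 and p2 are still undelivered
```

In this model the producer would need extra state shared with the consumer to know that only a single packet is outstanding.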



> For this to work we need to map some userspace memory into the kernel
> ahead of time. For example, maybe it can happen when the
> guest adds RX buffers? Copying Jason, who's looking into
> memory mapping matters.


We would need to walk the RX queue and pin the pages, then use MMU notifiers to unpin them when necessary. We would also need to consider how to make this work with batch dequeue.
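
A rough sketch of that bookkeeping, again as a toy Python model and not kernel code. PinnedRxPages, pin_buffer, invalidate_range and can_copy_directly are hypothetical names, and a 4096-byte page size and non-zero buffer lengths are assumed. The idea is that pages backing posted RX buffers are pinned up front, a notifier-style callback drops the pins for an invalidated address range, and the direct path is used only while every page of the target buffer is still pinned.

```python
PAGE_SIZE = 4096  # assumed page size for this model


def _page_span(addr, length):
    """Page numbers touched by [addr, addr + length); length must be > 0."""
    first = addr // PAGE_SIZE
    last = (addr + length - 1) // PAGE_SIZE
    return range(first, last + 1)


class PinnedRxPages:
    """Hypothetical tracker for pages backing posted RX buffers."""

    def __init__(self):
        self.pinned = set()  # page numbers currently pinned

    def pin_buffer(self, addr, length):
        """Pin every page a newly posted RX buffer touches."""
        self.pinned.update(_page_span(addr, length))

    def invalidate_range(self, start, end):
        """Notifier-style callback: unpin pages overlapping [start, end)."""
        dropped = self.pinned.intersection(_page_span(start, end - start))
        self.pinned -= dropped
        return len(dropped)

    def can_copy_directly(self, addr, length):
        """True only if every page of the buffer is still pinned."""
        return all(p in self.pinned for p in _page_span(addr, length))


pages = PinnedRxPages()
rx_queue = [(0x10000, 1500), (0x11000, 1500), (0x12000, 1500)]
for addr, length in rx_queue:      # walk the RX queue and pin
    pages.pin_buffer(addr, length)

pages.invalidate_range(0x11000, 0x12000)       # mapping of the second buffer changes
print(pages.can_copy_directly(0x10000, 1500))  # True
print(pages.can_copy_directly(0x11000, 1500))  # False: fall back to the thread
```

The model leaves out the interaction with batch dequeue, which is the open question here.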

Thanks



