
Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net


From: Jason Wang
Subject: Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net
Date: Wed, 4 Nov 2020 10:15:05 +0800


On 2020/11/3 7:56 PM, Daniel P. Berrangé wrote:
On Tue, Nov 03, 2020 at 12:32:43PM +0200, Yuri Benditovich wrote:
On Tue, Nov 3, 2020 at 11:02 AM Jason Wang <jasowang@redhat.com> wrote:

On 2020/11/3 2:51 AM, Andrew Melnychenko wrote:
The basic idea is to use eBPF to calculate the packet hash and steer packets
in TAP.
RSS (Receive Side Scaling) is used to distribute network packets to guest
virtqueues by calculating a packet hash.
eBPF RSS allows us to use RSS with vhost TAP.

This set of patches introduces the usage of eBPF for packet steering
and RSS hash calculation:
* RSS (Receive Side Scaling) is used to distribute network packets to
guest virtqueues by calculating a packet hash
* eBPF RSS is supposed to be faster than the existing 'software'
implementation in QEMU
* Additionally, support is added for using RSS with vhost
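
To make the "packet hash to virtqueue" step concrete, here is a minimal sketch in C. It assumes the Toeplitz hash that RSS implementations commonly use and a power-of-two indirection table; the function names toeplitz_hash() and rss_pick_queue() are hypothetical and are not taken from this patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz hash over `data`. A 32-bit window slides over the key one bit
 * at a time; for every set bit in the input, the current window is XORed
 * into the result. The key must be at least data_len + 4 bytes long so
 * the window never reads past the end of the key.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                              const uint8_t *data, size_t data_len)
{
    assert(key_len >= data_len + 4);

    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    uint32_t hash = 0;

    for (size_t i = 0; i < data_len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            /* Advance the window by one key bit. */
            window <<= 1;
            if (key[i + 4] & (1u << bit)) {
                window |= 1u;
            }
        }
    }
    return hash;
}

/*
 * Map a hash to a queue index through an indirection table.
 * Assumption: table_len is a non-zero power of two, so masking works.
 */
static uint16_t rss_pick_queue(uint32_t hash, const uint16_t *table,
                               size_t table_len)
{
    assert(table_len > 0 && (table_len & (table_len - 1)) == 0);
    return table[hash & (table_len - 1)];
}
```

The input to the hash would typically be the concatenated address and port fields of the packet; which fields are hashed depends on the hash types negotiated with the guest.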

Supported kernels: 5.8+

Implementation notes:
The Linux TAP TUNSETSTEERINGEBPF ioctl is used to set the eBPF program.
eBPF support is added to QEMU directly through a system call; see
bpf(2) for details.
The eBPF program is part of QEMU and is presented as an array of BPF
instructions.
The program can be recompiled with the provided Makefile.ebpf (you need to
adjust 'linuxhdrs'), although this is not required to build QEMU with eBPF
support.
Changes were made to virtio-net and vhost; eBPF RSS is used primarily.
'Software' RSS is used in the case of hash population and as a fallback
option.
For vhost, the hash population feature is not reported to the guest.
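
As an illustration of the TUNSETSTEERINGEBPF step described above, a minimal sketch follows. It assumes the eBPF program has already been loaded through bpf(2) and that prog_fd is its file descriptor; the helper name tap_set_steering_ebpf() is hypothetical and may not match what the patches use.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/if_tun.h>

/*
 * Attach an already-loaded eBPF program (prog_fd) to a TAP device (tap_fd)
 * as its queue-steering program. The ioctl takes a pointer to the program
 * fd. Returns 0 on success or -errno on failure, so the caller can decide
 * whether to fall back to 'software' RSS.
 */
static int tap_set_steering_ebpf(int tap_fd, int prog_fd)
{
    if (ioctl(tap_fd, TUNSETSTEERINGEBPF, &prog_fd) < 0) {
        return -errno;
    }
    return 0;
}
```

Returning -errno rather than a plain failure flag lets the caller tell an older kernel that lacks the ioctl apart from, for example, a bad descriptor.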

Please also see the documentation in PATCH 6/6.

I am sending those patches as RFC to initiate the discussions and get
feedback on the following points:
* Fallback when eBPF is not supported by the kernel

Yes, and it could also be a lack of CAP_BPF.
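
One possible way to detect both cases (no kernel support, or missing privileges) is to try loading a trivial program at startup and fall back to 'software' RSS if that fails. The sketch below is only an assumption about how such a probe could look. It loads a two-instruction socket-filter program through bpf(2); the names bpf_probe_load(), rss_backend_for() and enum rss_backend are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

enum rss_backend { RSS_BACKEND_EBPF, RSS_BACKEND_SOFTWARE };

/*
 * Try to load a trivial "r0 = 0; exit" socket-filter program via bpf(2).
 * Returns a program fd (>= 0) on success or -errno on failure, e.g.
 * -ENOSYS on a kernel without bpf(2) or -EPERM when privileges are missing.
 */
static int bpf_probe_load(void)
{
    static const char license[] = "GPL";
    struct bpf_insn insns[2];
    union bpf_attr attr;

    memset(insns, 0, sizeof(insns));
    insns[0].code = BPF_ALU64 | BPF_MOV | BPF_K;   /* r0 = 0 */
    insns[0].dst_reg = BPF_REG_0;
    insns[0].imm = 0;
    insns[1].code = BPF_JMP | BPF_EXIT;            /* return r0 */

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
    attr.insns = (uint64_t)(uintptr_t)insns;
    attr.insn_cnt = 2;
    attr.license = (uint64_t)(uintptr_t)license;

    long fd = syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
    return fd < 0 ? -errno : (int)fd;
}

/* Pick the RSS backend from the result of a load attempt. */
static enum rss_backend rss_backend_for(int load_result)
{
    return load_result >= 0 ? RSS_BACKEND_EBPF : RSS_BACKEND_SOFTWARE;
}
```

A caller would close the returned fd after a successful probe. Note that a successful probe only shows that bpf(2) accepts a simple program; loading the real RSS program can still fail and needs the same fallback.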


* Live migration to the kernel that doesn't have eBPF support

Is there anything that needs special treatment here?

Possible case: rss=on, vhost=on, source system with kernel 5.8 (everything
works) -> destination system with kernel 5.6 (BPF does not work). The adapter
still functions, but steering no longer uses the proper queues.




* Integration with current QEMU build

Yes, a question here:

1) Any reason for not using libbpf? E.g. it is already shipped with some
distros.

We intentionally do not use libbpf, as it is present only on some distros.
We can switch to libbpf, but this will disable BPF if libbpf is not
installed.

If we were modifying existing functionality then introducing a dep on
libbpf would be a problem as you'd be breaking existing QEMU users
on distros without libbpf.

This is brand new functionality though, so it is fine to place a
requirement on libbpf. If distros don't ship that library and they
want BPF features in QEMU, then those distros should take responsibility
for adding libbpf to their package set.

2) It would be better if we can avoid shipping bytecodes


This creates new dependencies: llvm + clang + ...
We would prefer to ship the bytecode, with the ability to regenerate it if
the prerequisites are installed.

I've double checked with Fedora, and generating the BPF program from
source is a mandatory requirement for QEMU. Pre-generated BPF bytecode
is not permitted.

There was also a question raised about the kernel ABI compatibility
for BPF programs:

   https://lwn.net/Articles/831402/

   "The basic problem is that when BPF is compiled, it uses a set
    of kernel headers that describe various kernel data structures
    for that particular version, which may be different from those
    on the kernel where the program is run. Until relatively recently,
    that was solved by distributing the BPF as C code along with the
    Clang compiler to build the BPF on the system where it was going
    to be run."

Is this not an issue for QEMU's usage of BPF here?


That's a good point. Actually, DPDK ships RSS bytecode, but I don't know how well that works.

But as mentioned in the link, if we generate the code with BTF that would be fine.

Thanks



The dependency on llvm is unfortunate for people who build with GCC,
but at least they can opt-out via a configure switch if they really
want to. As that LWN article notes, GCC will gain BPF support


Regards,
Daniel



