diff options
author | David S. Miller <davem@davemloft.net> | 2019-11-20 18:11:23 -0800 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2019-11-20 18:11:23 -0800 |
commit | ee5a489fd9645104925e5cdf8f8e455d833730b9 (patch) | |
tree | 1e46a8c460e1d51d465fe472e42cf1c16f92f9c7 /Documentation/networking | |
parent | e2193c9334291ecdc437cdbd9fe9ac35c14fffa8 (diff) | |
parent | 196e8ca74886c433dcfc64a809707074b936aaf5 (diff) |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-11-20
The following pull-request contains BPF updates for your *net-next* tree.
We've added 81 non-merge commits during the last 17 day(s) which contain
a total of 120 files changed, 4958 insertions(+), 1081 deletions(-).
There are 3 trivial conflicts, resolve it by always taking the chunk from
196e8ca74886c433:
<<<<<<< HEAD
=======
void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
>>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5
<<<<<<< HEAD
void *bpf_map_area_alloc(u64 size, int numa_node)
=======
static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
>>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5
<<<<<<< HEAD
if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
=======
/* kmalloc()'ed memory can't be mmap()'ed */
if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
>>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5
The main changes are:
1) Addition of BPF trampoline which works as a bridge between kernel functions,
BPF programs and other BPF programs along with two new use cases: i) fentry/fexit
BPF programs for tracing with practically zero overhead to call into BPF (as
opposed to k[ret]probes) and ii) attachment of the former to networking related
programs to see input/output of networking programs (covering xdpdump use case),
from Alexei Starovoitov.
2) BPF array map mmap support and use in libbpf for global data maps; also a big
batch of libbpf improvements, among others, support for reading bitfields in a
relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko.
3) Extend s390x JIT with usage of relative long jumps and loads in order to lift
the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich.
4) Add BPF audit support and emit messages upon successful prog load and unload in
order to have a timeline of events, from Daniel Borkmann and Jiri Olsa.
5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode
(XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson.
6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API
call named bpf_get_link_xdp_info() for retrieving the full set of prog
IDs attached to XDP, from Toke Høiland-Jørgensen.
7) Add BTF support for array of int, array of struct and multidimensional arrays
and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau.
8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo.
9) Two fixes for BPF selftest to get rid of a hang in test_tc_tunnel and to avoid
xdping to be run as standalone, from Jiri Benc.
10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song.
11) Fix a memory leak in BPF fentry test run data, from Colin Ian King.
12) Various smaller misc cleanups and improvements mostly all over BPF selftests and
samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation/networking')
-rw-r--r-- | Documentation/networking/af_xdp.rst | 28 | ||||
-rw-r--r-- | Documentation/networking/filter.txt | 8 |
2 files changed, 27 insertions, 9 deletions
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst index 7a4caaaf3a17..5bc55a4e3bce 100644 --- a/Documentation/networking/af_xdp.rst +++ b/Documentation/networking/af_xdp.rst @@ -295,7 +295,7 @@ round-robin example of distributing packets is shown below: { rr = (rr + 1) & (MAX_SOCKS - 1); - return bpf_redirect_map(&xsks_map, rr, 0); + return bpf_redirect_map(&xsks_map, rr, XDP_DROP); } Note, that since there is only a single set of FILL and COMPLETION @@ -304,6 +304,12 @@ to make sure that multiple processes or threads do not use these rings concurrently. There are no synchronization primitives in the libbpf code that protects multiple users at this point in time. +Libbpf uses this mode if you create more than one socket tied to the +same umem. However, note that you need to supply the +XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the +xsk_socket__create calls and load your own XDP program as there is no +built in one in libbpf that will route the traffic for you. + XDP_USE_NEED_WAKEUP bind flag ----------------------------- @@ -355,10 +361,22 @@ to set the size of at least one of the RX and TX rings. If you set both, you will be able to both receive and send traffic from your application, but if you only want to do one of them, you can save resources by only setting up one of them. Both the FILL ring and the -COMPLETION ring are mandatory if you have a UMEM tied to your socket, -which is the normal case. But if the XDP_SHARED_UMEM flag is used, any -socket after the first one does not have a UMEM and should in that -case not have any FILL or COMPLETION rings created. +COMPLETION ring are mandatory as you need to have a UMEM tied to your +socket. But if the XDP_SHARED_UMEM flag is used, any socket after the +first one does not have a UMEM and should in that case not have any +FILL or COMPLETION rings created as the ones from the shared umem will +be used. Note, that the rings are single-producer single-consumer, so +do not try to access them from multiple processes at the same +time. See the XDP_SHARED_UMEM section. + +In libbpf, you can create Rx-only and Tx-only sockets by supplying +NULL to the rx and tx arguments, respectively, to the +xsk_socket__create function. + +If you create a Tx-only socket, we recommend that you do not put any +packets on the fill ring. If you do this, drivers might think you are +going to receive something when you in fact will not, and this can +negatively impact performance. XDP_UMEM_REG setsockopt ----------------------- diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt index 319e5e041f38..c4a328f2d57a 100644 --- a/Documentation/networking/filter.txt +++ b/Documentation/networking/filter.txt @@ -770,10 +770,10 @@ Some core changes of the new internal format: callq foo mov %rax,%r13 mov %rbx,%rdi - mov $0x2,%esi - mov $0x3,%edx - mov $0x4,%ecx - mov $0x5,%r8d + mov $0x6,%esi + mov $0x7,%edx + mov $0x8,%ecx + mov $0x9,%r8d callq bar add %r13,%rax mov -0x228(%rbp),%rbx |