diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-02-21 18:24:12 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-02-21 18:24:12 -0800 |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /tools | |
parent | 36289a03bcd3aabdf66de75cb6d1b4ee15726438 (diff) | |
parent | d1fabc68f8e0541d41657096dc713cb01775652d (diff) |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'tools')
261 files changed, 15651 insertions, 3270 deletions
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile index f610e184ce02..681fbcc5ed50 100644 --- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -53,7 +53,7 @@ $(LIBBPF_INTERNAL_HDRS): $(LIBBPF_HDRS_DIR)/%.h: $(BPF_DIR)/%.h | $(LIBBPF_HDRS_ $(LIBBPF_BOOTSTRAP): $(wildcard $(BPF_DIR)/*.[ch] $(BPF_DIR)/Makefile) | $(LIBBPF_BOOTSTRAP_OUTPUT) $(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(LIBBPF_BOOTSTRAP_OUTPUT) \ DESTDIR=$(LIBBPF_BOOTSTRAP_DESTDIR:/=) prefix= \ - ARCH= CROSS_COMPILE= CC=$(HOSTCC) LD=$(HOSTLD) AR=$(HOSTAR) $@ install_headers + ARCH= CROSS_COMPILE= CC="$(HOSTCC)" LD="$(HOSTLD)" AR="$(HOSTAR)" $@ install_headers $(LIBBPF_BOOTSTRAP_INTERNAL_HDRS): $(LIBBPF_BOOTSTRAP_HDRS_DIR)/%.h: $(BPF_DIR)/%.h | $(LIBBPF_BOOTSTRAP_HDRS_DIR) $(call QUIET_INSTALL, $@) @@ -215,7 +215,8 @@ $(OUTPUT)%.bpf.o: skeleton/%.bpf.c $(OUTPUT)vmlinux.h $(LIBBPF_BOOTSTRAP) -I$(or $(OUTPUT),.) \ -I$(srctree)/tools/include/uapi/ \ -I$(LIBBPF_BOOTSTRAP_INCLUDE) \ - -g -O2 -Wall -target bpf -c $< -o $@ + -g -O2 -Wall -fno-stack-protector \ + -target bpf -c $< -o $@ $(Q)$(LLVM_STRIP) -g $@ $(OUTPUT)%.skel.h: $(OUTPUT)%.bpf.o $(BPFTOOL_BOOTSTRAP) @@ -293,3 +294,6 @@ FORCE: .PHONY: all FORCE bootstrap clean install-bin install uninstall .PHONY: doc doc-clean doc-install doc-uninstall .DEFAULT_GOAL := all + +# Delete partially updated (corrupted) files on error +.DELETE_ON_ERROR: diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index 352290ba7b29..91fcb75babe3 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -537,7 +537,7 @@ static bool btf_is_kernel_module(__u32 btf_id) len = sizeof(btf_info); btf_info.name = ptr_to_u64(btf_name); btf_info.name_len = sizeof(btf_name); - err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len); + err = bpf_btf_get_info_by_fd(btf_fd, &btf_info, &len); close(btf_fd); if (err) { p_err("can't get BTF (ID %u) object info: %s", btf_id, strerror(errno)); @@ -606,7 +606,7 @@ static int do_dump(int argc, char **argv) if (fd < 0) return -1; - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_prog_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get prog info: %s", strerror(errno)); goto done; @@ -789,7 +789,10 @@ build_btf_type_table(struct hashmap *tab, enum bpf_obj_type type, } memset(info, 0, *len); - err = bpf_obj_get_info_by_fd(fd, info, len); + if (type == BPF_OBJ_PROG) + err = bpf_prog_get_info_by_fd(fd, info, len); + else + err = bpf_map_get_info_by_fd(fd, info, len); close(fd); if (err) { p_err("can't get %s info: %s", names[type], @@ -931,7 +934,7 @@ show_btf(int fd, struct hashmap *btf_prog_table, int err; memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_btf_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get BTF object info: %s", strerror(errno)); return -1; @@ -943,7 +946,7 @@ show_btf(int fd, struct hashmap *btf_prog_table, info.name = ptr_to_u64(name); len = sizeof(info); - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_btf_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get BTF object info: %s", strerror(errno)); return -1; diff --git a/tools/bpf/bpftool/btf_dumper.c b/tools/bpf/bpftool/btf_dumper.c index eda71fdfe95a..e7f6ec3a8f35 100644 --- a/tools/bpf/bpftool/btf_dumper.c +++ b/tools/bpf/bpftool/btf_dumper.c @@ -57,7 +57,7 @@ static int dump_prog_id_as_func_ptr(const struct btf_dumper *d, if (prog_fd < 0) goto print; - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (err) goto print; @@ -70,7 +70,7 @@ static int dump_prog_id_as_func_ptr(const struct btf_dumper *d, info.func_info_rec_size = finfo_rec_size; info.func_info = ptr_to_u64(&finfo); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (err) goto print; diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c index b46a998d8f8d..ac846b0805b4 100644 --- a/tools/bpf/bpftool/cgroup.c +++ b/tools/bpf/bpftool/cgroup.c @@ -82,7 +82,7 @@ static void guess_vmlinux_btf_id(__u32 attach_btf_obj_id) if (fd < 0) return; - err = bpf_obj_get_info_by_fd(fd, &btf_info, &btf_len); + err = bpf_btf_get_info_by_fd(fd, &btf_info, &btf_len); if (err) goto out; @@ -108,7 +108,7 @@ static int show_bpf_prog(int id, enum bpf_attach_type attach_type, if (prog_fd < 0) return -1; - if (bpf_obj_get_info_by_fd(prog_fd, &info, &info_len)) { + if (bpf_prog_get_info_by_fd(prog_fd, &info, &info_len)) { close(prog_fd); return -1; } diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index 620032042576..5a73ccf14332 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -353,7 +353,7 @@ void get_prog_full_name(const struct bpf_prog_info *prog_info, int prog_fd, info.func_info_rec_size = sizeof(finfo); info.func_info = ptr_to_u64(&finfo); - if (bpf_obj_get_info_by_fd(prog_fd, &info, &info_len)) + if (bpf_prog_get_info_by_fd(prog_fd, &info, &info_len)) goto copy_name; prog_btf = btf__load_from_kernel_by_id(info.btf_id); @@ -488,7 +488,7 @@ static int do_build_table_cb(const char *fpath, const struct stat *sb, goto out_close; memset(&pinned_info, 0, sizeof(pinned_info)); - if (bpf_obj_get_info_by_fd(fd, &pinned_info, &len)) + if (bpf_prog_get_info_by_fd(fd, &pinned_info, &len)) goto out_close; path = strdup(fpath); @@ -756,7 +756,7 @@ static int prog_fd_by_nametag(void *nametag, int **fds, bool tag) goto err_close_fds; } - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_prog_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get prog info (%u): %s", id, strerror(errno)); @@ -916,7 +916,7 @@ static int map_fd_by_name(char *name, int **fds) goto err_close_fds; } - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_map_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get map info (%u): %s", id, strerror(errno)); @@ -1026,7 +1026,8 @@ exit_free: return fd; } -int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len) +int map_parse_fd_and_info(int *argc, char ***argv, struct bpf_map_info *info, + __u32 *info_len) { int err; int fd; @@ -1035,7 +1036,7 @@ int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len) if (fd < 0) return -1; - err = bpf_obj_get_info_by_fd(fd, info, info_len); + err = bpf_map_get_info_by_fd(fd, info, info_len); if (err) { p_err("can't get map info: %s", strerror(errno)); close(fd); diff --git a/tools/bpf/bpftool/feature.c b/tools/bpf/bpftool/feature.c index 36cf0f1517c9..da16e6a27ccc 100644 --- a/tools/bpf/bpftool/feature.c +++ b/tools/bpf/bpftool/feature.c @@ -486,16 +486,16 @@ static void probe_kernel_image_config(const char *define_prefix) } } -end_parse: - if (file) - gzclose(file); - for (i = 0; i < ARRAY_SIZE(options); i++) { if (define_prefix && !options[i].macro_dump) continue; print_kernel_option(options[i].name, values[i], define_prefix); free(values[i]); } + +end_parse: + if (file) + gzclose(file); } static bool probe_bpf_syscall(const char *define_prefix) diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c index 6f4cfe01cad4..f985b79cca27 100644 --- a/tools/bpf/bpftool/link.c +++ b/tools/bpf/bpftool/link.c @@ -145,7 +145,7 @@ static int get_prog_info(int prog_id, struct bpf_prog_info *info) return prog_fd; memset(info, 0, sizeof(*info)); - err = bpf_obj_get_info_by_fd(prog_fd, info, &len); + err = bpf_prog_get_info_by_fd(prog_fd, info, &len); if (err) p_err("can't get prog info: %s", strerror(errno)); close(prog_fd); @@ -327,7 +327,7 @@ static int do_show_link(int fd) memset(&info, 0, sizeof(info)); again: - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_link_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get link info: %s", strerror(errno)); diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h index a84224b6a604..0ef373cef4c7 100644 --- a/tools/bpf/bpftool/main.h +++ b/tools/bpf/bpftool/main.h @@ -168,7 +168,8 @@ int prog_parse_fd(int *argc, char ***argv); int prog_parse_fds(int *argc, char ***argv, int **fds); int map_parse_fd(int *argc, char ***argv); int map_parse_fds(int *argc, char ***argv, int **fds); -int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len); +int map_parse_fd_and_info(int *argc, char ***argv, struct bpf_map_info *info, + __u32 *info_len); struct bpf_prog_linfo; #if defined(HAVE_LLVM_SUPPORT) || defined(HAVE_LIBBFD_SUPPORT) diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c index 88911d3aa2d9..aaeb8939e137 100644 --- a/tools/bpf/bpftool/map.c +++ b/tools/bpf/bpftool/map.c @@ -638,7 +638,7 @@ static int do_show_subset(int argc, char **argv) if (json_output && nb_fds > 1) jsonw_start_array(json_wtr); /* root array */ for (i = 0; i < nb_fds; i++) { - err = bpf_obj_get_info_by_fd(fds[i], &info, &len); + err = bpf_map_get_info_by_fd(fds[i], &info, &len); if (err) { p_err("can't get map info: %s", strerror(errno)); @@ -708,7 +708,7 @@ static int do_show(int argc, char **argv) break; } - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_map_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get map info: %s", strerror(errno)); close(fd); @@ -764,7 +764,7 @@ static int maps_have_btf(int *fds, int nb_fds) int err, i; for (i = 0; i < nb_fds; i++) { - err = bpf_obj_get_info_by_fd(fds[i], &info, &len); + err = bpf_map_get_info_by_fd(fds[i], &info, &len); if (err) { p_err("can't get map info: %s", strerror(errno)); return -1; @@ -925,7 +925,7 @@ static int do_dump(int argc, char **argv) if (wtr && nb_fds > 1) jsonw_start_array(wtr); /* root array */ for (i = 0; i < nb_fds; i++) { - if (bpf_obj_get_info_by_fd(fds[i], &info, &len)) { + if (bpf_map_get_info_by_fd(fds[i], &info, &len)) { p_err("can't get map info: %s", strerror(errno)); break; } diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index cfc9fdc1e863..afbe3ec342c8 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -198,7 +198,7 @@ static void show_prog_maps(int fd, __u32 num_maps) info.nr_map_ids = num_maps; info.map_ids = ptr_to_u64(map_ids); - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_prog_get_info_by_fd(fd, &info, &len); if (err || !info.nr_map_ids) return; @@ -231,7 +231,7 @@ static void *find_metadata(int prog_fd, struct bpf_map_info *map_info) memset(&prog_info, 0, sizeof(prog_info)); prog_info_len = sizeof(prog_info); - ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); + ret = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); if (ret) return NULL; @@ -248,7 +248,7 @@ static void *find_metadata(int prog_fd, struct bpf_map_info *map_info) prog_info.map_ids = ptr_to_u64(map_ids); prog_info_len = sizeof(prog_info); - ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); + ret = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); if (ret) goto free_map_ids; @@ -259,7 +259,7 @@ static void *find_metadata(int prog_fd, struct bpf_map_info *map_info) memset(map_info, 0, sizeof(*map_info)); map_info_len = sizeof(*map_info); - ret = bpf_obj_get_info_by_fd(map_fd, map_info, &map_info_len); + ret = bpf_map_get_info_by_fd(map_fd, map_info, &map_info_len); if (ret < 0) { close(map_fd); goto free_map_ids; @@ -580,7 +580,7 @@ static int show_prog(int fd) __u32 len = sizeof(info); int err; - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_prog_get_info_by_fd(fd, &info, &len); if (err) { p_err("can't get prog info: %s", strerror(errno)); return -1; @@ -949,7 +949,7 @@ static int do_dump(int argc, char **argv) for (i = 0; i < nb_fds; i++) { memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(fds[i], &info, &info_len); + err = bpf_prog_get_info_by_fd(fds[i], &info, &info_len); if (err) { p_err("can't get prog info: %s", strerror(errno)); break; @@ -961,7 +961,7 @@ static int do_dump(int argc, char **argv) break; } - err = bpf_obj_get_info_by_fd(fds[i], &info, &info_len); + err = bpf_prog_get_info_by_fd(fds[i], &info, &info_len); if (err) { p_err("can't get prog info: %s", strerror(errno)); break; @@ -2170,9 +2170,9 @@ static char *profile_target_name(int tgt_fd) char *name = NULL; int err; - err = bpf_obj_get_info_by_fd(tgt_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(tgt_fd, &info, &info_len); if (err) { - p_err("failed to bpf_obj_get_info_by_fd for prog FD %d", tgt_fd); + p_err("failed to get info for prog FD %d", tgt_fd); goto out; } @@ -2183,7 +2183,7 @@ static char *profile_target_name(int tgt_fd) func_info_rec_size = info.func_info_rec_size; if (info.nr_func_info == 0) { - p_err("bpf_obj_get_info_by_fd for prog FD %d found 0 func_info", tgt_fd); + p_err("found 0 func_info for prog FD %d", tgt_fd); goto out; } @@ -2192,7 +2192,7 @@ static char *profile_target_name(int tgt_fd) info.func_info_rec_size = func_info_rec_size; info.func_info = ptr_to_u64(&func_info); - err = bpf_obj_get_info_by_fd(tgt_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(tgt_fd, &info, &info_len); if (err) { p_err("failed to get func_info for prog FD %d", tgt_fd); goto out; @@ -2233,10 +2233,38 @@ static void profile_close_perf_events(struct profiler_bpf *obj) profile_perf_event_cnt = 0; } +static int profile_open_perf_event(int mid, int cpu, int map_fd) +{ + int pmu_fd; + + pmu_fd = syscall(__NR_perf_event_open, &metrics[mid].attr, + -1 /*pid*/, cpu, -1 /*group_fd*/, 0); + if (pmu_fd < 0) { + if (errno == ENODEV) { + p_info("cpu %d may be offline, skip %s profiling.", + cpu, metrics[mid].name); + profile_perf_event_cnt++; + return 0; + } + return -1; + } + + if (bpf_map_update_elem(map_fd, + &profile_perf_event_cnt, + &pmu_fd, BPF_ANY) || + ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0)) { + close(pmu_fd); + return -1; + } + + profile_perf_events[profile_perf_event_cnt++] = pmu_fd; + return 0; +} + static int profile_open_perf_events(struct profiler_bpf *obj) { unsigned int cpu, m; - int map_fd, pmu_fd; + int map_fd; profile_perf_events = calloc( sizeof(int), obj->rodata->num_cpu * obj->rodata->num_metric); @@ -2255,17 +2283,11 @@ static int profile_open_perf_events(struct profiler_bpf *obj) if (!metrics[m].selected) continue; for (cpu = 0; cpu < obj->rodata->num_cpu; cpu++) { - pmu_fd = syscall(__NR_perf_event_open, &metrics[m].attr, - -1/*pid*/, cpu, -1/*group_fd*/, 0); - if (pmu_fd < 0 || - bpf_map_update_elem(map_fd, &profile_perf_event_cnt, - &pmu_fd, BPF_ANY) || - ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0)) { + if (profile_open_perf_event(m, cpu, map_fd)) { p_err("failed to create event %s on cpu %d", metrics[m].name, cpu); return -1; } - profile_perf_events[profile_perf_event_cnt++] = pmu_fd; } } return 0; diff --git a/tools/bpf/bpftool/struct_ops.c b/tools/bpf/bpftool/struct_ops.c index 903b80ff4e9a..b389f4830e11 100644 --- a/tools/bpf/bpftool/struct_ops.c +++ b/tools/bpf/bpftool/struct_ops.c @@ -151,7 +151,7 @@ static int get_next_struct_ops_map(const char *name, int *res_fd, return -1; } - err = bpf_obj_get_info_by_fd(fd, info, &info_len); + err = bpf_map_get_info_by_fd(fd, info, &info_len); if (err) { p_err("can't get map info: %s", strerror(errno)); close(fd); @@ -262,7 +262,7 @@ static struct res do_one_id(const char *id_str, work_func func, void *data, goto done; } - if (bpf_obj_get_info_by_fd(fd, info, &info_len)) { + if (bpf_map_get_info_by_fd(fd, info, &info_len)) { p_err("can't get map info: %s", strerror(errno)); res.nr_errs++; goto done; @@ -522,7 +522,7 @@ static int do_register(int argc, char **argv) bpf_link__disconnect(link); bpf_link__destroy(link); - if (!bpf_obj_get_info_by_fd(bpf_map__fd(map), &info, + if (!bpf_map_get_info_by_fd(bpf_map__fd(map), &info, &info_len)) p_info("Registered %s %s id %u", get_kern_struct_ops_name(&info), diff --git a/tools/bpf/resolve_btfids/Build b/tools/bpf/resolve_btfids/Build index ae82da03f9bf..077de3829c72 100644 --- a/tools/bpf/resolve_btfids/Build +++ b/tools/bpf/resolve_btfids/Build @@ -1,3 +1,5 @@ +hostprogs := resolve_btfids + resolve_btfids-y += main.o resolve_btfids-y += rbtree.o resolve_btfids-y += zalloc.o @@ -7,4 +9,4 @@ resolve_btfids-y += str_error_r.o $(OUTPUT)%.o: ../../lib/%.c FORCE $(call rule_mkdir) - $(call if_changed_dep,cc_o_c) + $(call if_changed_dep,host_cc_o_c) diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile index 19a3112e271a..ac548a7baa73 100644 --- a/tools/bpf/resolve_btfids/Makefile +++ b/tools/bpf/resolve_btfids/Makefile @@ -17,15 +17,15 @@ else MAKEFLAGS=--no-print-directory endif -# always use the host compiler -AR = $(HOSTAR) -CC = $(HOSTCC) -LD = $(HOSTLD) -ARCH = $(HOSTARCH) +# Overrides for the prepare step libraries. +HOST_OVERRIDES := AR="$(HOSTAR)" CC="$(HOSTCC)" LD="$(HOSTLD)" ARCH="$(HOSTARCH)" \ + CROSS_COMPILE="" EXTRA_CFLAGS="$(HOSTCFLAGS)" + RM ?= rm +HOSTCC ?= gcc +HOSTLD ?= ld +HOSTAR ?= ar CROSS_COMPILE = -CFLAGS := $(KBUILD_HOSTCFLAGS) -LDFLAGS := $(KBUILD_HOSTLDFLAGS) OUTPUT ?= $(srctree)/tools/bpf/resolve_btfids/ @@ -35,51 +35,64 @@ SUBCMD_SRC := $(srctree)/tools/lib/subcmd/ BPFOBJ := $(OUTPUT)/libbpf/libbpf.a LIBBPF_OUT := $(abspath $(dir $(BPFOBJ)))/ SUBCMDOBJ := $(OUTPUT)/libsubcmd/libsubcmd.a +SUBCMD_OUT := $(abspath $(dir $(SUBCMDOBJ)))/ LIBBPF_DESTDIR := $(LIBBPF_OUT) LIBBPF_INCLUDE := $(LIBBPF_DESTDIR)include +SUBCMD_DESTDIR := $(SUBCMD_OUT) +SUBCMD_INCLUDE := $(SUBCMD_DESTDIR)include + BINARY := $(OUTPUT)/resolve_btfids BINARY_IN := $(BINARY)-in.o all: $(BINARY) +prepare: $(BPFOBJ) $(SUBCMDOBJ) + $(OUTPUT) $(OUTPUT)/libsubcmd $(LIBBPF_OUT): $(call msg,MKDIR,,$@) $(Q)mkdir -p $(@) $(SUBCMDOBJ): fixdep FORCE | $(OUTPUT)/libsubcmd - $(Q)$(MAKE) -C $(SUBCMD_SRC) OUTPUT=$(abspath $(dir $@))/ $(abspath $@) + $(Q)$(MAKE) -C $(SUBCMD_SRC) OUTPUT=$(SUBCMD_OUT) \ + DESTDIR=$(SUBCMD_DESTDIR) $(HOST_OVERRIDES) prefix= subdir= \ + $(abspath $@) install_headers $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(LIBBPF_OUT) $(Q)$(MAKE) $(submake_extras) -C $(LIBBPF_SRC) OUTPUT=$(LIBBPF_OUT) \ - DESTDIR=$(LIBBPF_DESTDIR) prefix= EXTRA_CFLAGS="$(CFLAGS)" \ + DESTDIR=$(LIBBPF_DESTDIR) $(HOST_OVERRIDES) prefix= subdir= \ $(abspath $@) install_headers -CFLAGS += -g \ +LIBELF_FLAGS := $(shell $(HOSTPKG_CONFIG) libelf --cflags 2>/dev/null) +LIBELF_LIBS := $(shell $(HOSTPKG_CONFIG) libelf --libs 2>/dev/null || echo -lelf) + +HOSTCFLAGS += -g \ -I$(srctree)/tools/include \ -I$(srctree)/tools/include/uapi \ -I$(LIBBPF_INCLUDE) \ - -I$(SUBCMD_SRC) + -I$(SUBCMD_INCLUDE) \ + $(LIBELF_FLAGS) -LIBS = -lelf -lz +LIBS = $(LIBELF_LIBS) -lz -export srctree OUTPUT CFLAGS Q +export srctree OUTPUT HOSTCFLAGS Q HOSTCC HOSTLD HOSTAR include $(srctree)/tools/build/Makefile.include -$(BINARY_IN): $(BPFOBJ) fixdep FORCE | $(OUTPUT) +$(BINARY_IN): fixdep FORCE prepare | $(OUTPUT) $(Q)$(MAKE) $(build)=resolve_btfids $(BINARY): $(BPFOBJ) $(SUBCMDOBJ) $(BINARY_IN) $(call msg,LINK,$@) - $(Q)$(CC) $(BINARY_IN) $(LDFLAGS) -o $@ $(BPFOBJ) $(SUBCMDOBJ) $(LIBS) + $(Q)$(HOSTCC) $(BINARY_IN) $(KBUILD_HOSTLDFLAGS) -o $@ $(BPFOBJ) $(SUBCMDOBJ) $(LIBS) clean_objects := $(wildcard $(OUTPUT)/*.o \ $(OUTPUT)/.*.o.cmd \ $(OUTPUT)/.*.o.d \ $(LIBBPF_OUT) \ $(LIBBPF_DESTDIR) \ - $(OUTPUT)/libsubcmd \ + $(SUBCMD_OUT) \ + $(SUBCMD_DESTDIR) \ $(OUTPUT)/resolve_btfids) ifneq ($(clean_objects),) @@ -96,4 +109,4 @@ tags: FORCE: -.PHONY: all FORCE clean tags +.PHONY: all FORCE clean tags prepare diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c index 80cd7843c677..77058174082d 100644 --- a/tools/bpf/resolve_btfids/main.c +++ b/tools/bpf/resolve_btfids/main.c @@ -75,7 +75,7 @@ #include <linux/err.h> #include <bpf/btf.h> #include <bpf/libbpf.h> -#include <parse-options.h> +#include <subcmd/parse-options.h> #define BTF_IDS_SECTION ".BTF_ids" #define BTF_ID "__BTF_ID__" diff --git a/tools/bpf/runqslower/Makefile b/tools/bpf/runqslower/Makefile index 8b3d87b82b7a..47acf6936516 100644 --- a/tools/bpf/runqslower/Makefile +++ b/tools/bpf/runqslower/Makefile @@ -13,6 +13,8 @@ BPF_DESTDIR := $(BPFOBJ_OUTPUT) BPF_INCLUDE := $(BPF_DESTDIR)/include INCLUDES := -I$(OUTPUT) -I$(BPF_INCLUDE) -I$(abspath ../../include/uapi) CFLAGS := -g -Wall $(CLANG_CROSS_FLAGS) +CFLAGS += $(EXTRA_CFLAGS) +LDFLAGS += $(EXTRA_LDFLAGS) # Try to detect best kernel BTF source KERNEL_REL := $(shell uname -r) diff --git a/tools/include/uapi/asm/bpf_perf_event.h b/tools/include/uapi/asm/bpf_perf_event.h index d7dfeab0d71a..ff52668abf8c 100644 --- a/tools/include/uapi/asm/bpf_perf_event.h +++ b/tools/include/uapi/asm/bpf_perf_event.h @@ -6,6 +6,8 @@ #include "../../arch/s390/include/uapi/asm/bpf_perf_event.h" #elif defined(__riscv) #include "../../arch/riscv/include/uapi/asm/bpf_perf_event.h" +#elif defined(__loongarch__) +#include "../../arch/loongarch/include/uapi/asm/bpf_perf_event.h" #else #include <uapi/asm-generic/bpf_perf_event.h> #endif diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 464ca3f01fe7..62ce1f5d1b1d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1156,6 +1156,11 @@ enum bpf_link_type { */ #define BPF_F_XDP_HAS_FRAGS (1U << 5) +/* If BPF_F_XDP_DEV_BOUND_ONLY is used in BPF_PROG_LOAD command, the loaded + * program becomes device-bound but can access XDP metadata. + */ +#define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) + /* link_create.kprobe_multi.flags used in LINK_CREATE command for * BPF_TRACE_KPROBE_MULTI attach type to create return probe. */ @@ -2001,6 +2006,9 @@ union bpf_attr { * sending the packet. This flag was added for GRE * encapsulation, but might be used with other protocols * as well in the future. + * **BPF_F_NO_TUNNEL_KEY** + * Add a flag to tunnel metadata indicating that no tunnel + * key should be set in the resulting tunnel header. * * Here is a typical usage on the transmit path: * @@ -2644,6 +2652,11 @@ union bpf_attr { * Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the * L2 type as Ethernet. * + * * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**, + * **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**: + * Indicate the new IP header version after decapsulating the outer + * IP header. Used when the inner and outer IP versions are different. + * * A call to this helper is susceptible to change the underlying * packet buffer. Therefore, at load time, all checks on pointers * previously done by the verifier are invalidated and must be @@ -2788,7 +2801,7 @@ union bpf_attr { * * long bpf_perf_prog_read_value(struct bpf_perf_event_data *ctx, struct bpf_perf_event_value *buf, u32 buf_size) * Description - * For en eBPF program attached to a perf event, retrieve the + * For an eBPF program attached to a perf event, retrieve the * value of the event counter associated to *ctx* and store it in * the structure pointed by *buf* and of size *buf_size*. Enabled * and running times are also stored in the structure (see @@ -3121,6 +3134,11 @@ union bpf_attr { * **BPF_FIB_LOOKUP_OUTPUT** * Perform lookup from an egress perspective (default is * ingress). + * **BPF_FIB_LOOKUP_SKIP_NEIGH** + * Skip the neighbour table lookup. *params*->dmac + * and *params*->smac will not be set as output. A common + * use case is to call **bpf_redirect_neigh**\ () after + * doing **bpf_fib_lookup**\ (). * * *ctx* is either **struct xdp_md** for XDP programs or * **struct sk_buff** tc cls_act programs. @@ -5764,6 +5782,7 @@ enum { BPF_F_ZERO_CSUM_TX = (1ULL << 1), BPF_F_DONT_FRAGMENT = (1ULL << 2), BPF_F_SEQ_NUMBER = (1ULL << 3), + BPF_F_NO_TUNNEL_KEY = (1ULL << 4), }; /* BPF_FUNC_skb_get_tunnel_key flags. */ @@ -5803,6 +5822,8 @@ enum { BPF_F_ADJ_ROOM_ENCAP_L4_UDP = (1ULL << 4), BPF_F_ADJ_ROOM_NO_CSUM_RESET = (1ULL << 5), BPF_F_ADJ_ROOM_ENCAP_L2_ETH = (1ULL << 6), + BPF_F_ADJ_ROOM_DECAP_L3_IPV4 = (1ULL << 7), + BPF_F_ADJ_ROOM_DECAP_L3_IPV6 = (1ULL << 8), }; enum { @@ -6734,6 +6755,7 @@ struct bpf_raw_tracepoint_args { enum { BPF_FIB_LOOKUP_DIRECT = (1U << 0), BPF_FIB_LOOKUP_OUTPUT = (1U << 1), + BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), }; enum { @@ -6901,6 +6923,17 @@ struct bpf_list_node { __u64 :64; } __attribute__((aligned(8))); +struct bpf_rb_root { + __u64 :64; + __u64 :64; +} __attribute__((aligned(8))); + +struct bpf_rb_node { + __u64 :64; + __u64 :64; + __u64 :64; +} __attribute__((aligned(8))); + struct bpf_sysctl { __u32 write; /* Sysctl is being read (= 0) or written (= 1). * Allows 1,2,4-byte read, but no write. diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h new file mode 100644 index 000000000000..9ee459872600 --- /dev/null +++ b/tools/include/uapi/linux/netdev.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* Do not edit directly, auto-generated from: */ +/* Documentation/netlink/specs/netdev.yaml */ +/* YNL-GEN uapi header */ + +#ifndef _UAPI_LINUX_NETDEV_H +#define _UAPI_LINUX_NETDEV_H + +#define NETDEV_FAMILY_NAME "netdev" +#define NETDEV_FAMILY_VERSION 1 + +/** + * enum netdev_xdp_act + * @NETDEV_XDP_ACT_BASIC: XDP feautues set supported by all drivers + * (XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX) + * @NETDEV_XDP_ACT_REDIRECT: The netdev supports XDP_REDIRECT + * @NETDEV_XDP_ACT_NDO_XMIT: This feature informs if netdev implements + * ndo_xdp_xmit callback. + * @NETDEV_XDP_ACT_XSK_ZEROCOPY: This feature informs if netdev supports AF_XDP + * in zero copy mode. + * @NETDEV_XDP_ACT_HW_OFFLOAD: This feature informs if netdev supports XDP hw + * oflloading. + * @NETDEV_XDP_ACT_RX_SG: This feature informs if netdev implements non-linear + * XDP buffer support in the driver napi callback. + * @NETDEV_XDP_ACT_NDO_XMIT_SG: This feature informs if netdev implements + * non-linear XDP buffer support in ndo_xdp_xmit callback. + */ +enum netdev_xdp_act { + NETDEV_XDP_ACT_BASIC = 1, + NETDEV_XDP_ACT_REDIRECT = 2, + NETDEV_XDP_ACT_NDO_XMIT = 4, + NETDEV_XDP_ACT_XSK_ZEROCOPY = 8, + NETDEV_XDP_ACT_HW_OFFLOAD = 16, + NETDEV_XDP_ACT_RX_SG = 32, + NETDEV_XDP_ACT_NDO_XMIT_SG = 64, +}; + +enum { + NETDEV_A_DEV_IFINDEX = 1, + NETDEV_A_DEV_PAD, + NETDEV_A_DEV_XDP_FEATURES, + + __NETDEV_A_DEV_MAX, + NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1) +}; + +enum { + NETDEV_CMD_DEV_GET = 1, + NETDEV_CMD_DEV_ADD_NTF, + NETDEV_CMD_DEV_DEL_NTF, + NETDEV_CMD_DEV_CHANGE_NTF, + + __NETDEV_CMD_MAX, + NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1) +}; + +#define NETDEV_MCGRP_MGMT "mgmt" + +#endif /* _UAPI_LINUX_NETDEV_H */ diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 9aff98f42a3d..e750b6f5fcc3 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -1044,6 +1044,26 @@ int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len) return libbpf_err_errno(err); } +int bpf_prog_get_info_by_fd(int prog_fd, struct bpf_prog_info *info, __u32 *info_len) +{ + return bpf_obj_get_info_by_fd(prog_fd, info, info_len); +} + +int bpf_map_get_info_by_fd(int map_fd, struct bpf_map_info *info, __u32 *info_len) +{ + return bpf_obj_get_info_by_fd(map_fd, info, info_len); +} + +int bpf_btf_get_info_by_fd(int btf_fd, struct bpf_btf_info *info, __u32 *info_len) +{ + return bpf_obj_get_info_by_fd(btf_fd, info, info_len); +} + +int bpf_link_get_info_by_fd(int link_fd, struct bpf_link_info *info, __u32 *info_len) +{ + return bpf_obj_get_info_by_fd(link_fd, info, info_len); +} + int bpf_raw_tracepoint_open(const char *name, int prog_fd) { const size_t attr_sz = offsetofend(union bpf_attr, raw_tracepoint); diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 7468978d3c27..9ed9bceb4111 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -386,6 +386,15 @@ LIBBPF_API int bpf_link_get_fd_by_id(__u32 id); LIBBPF_API int bpf_link_get_fd_by_id_opts(__u32 id, const struct bpf_get_fd_by_id_opts *opts); LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len); +/* Type-safe variants of bpf_obj_get_info_by_fd(). The callers still needs to + * pass info_len, which should normally be + * sizeof(struct bpf_{prog,map,btf,link}_info), in order to be compatible with + * different libbpf and kernel versions. + */ +LIBBPF_API int bpf_prog_get_info_by_fd(int prog_fd, struct bpf_prog_info *info, __u32 *info_len); +LIBBPF_API int bpf_map_get_info_by_fd(int map_fd, struct bpf_map_info *info, __u32 *info_len); +LIBBPF_API int bpf_btf_get_info_by_fd(int btf_fd, struct bpf_btf_info *info, __u32 *info_len); +LIBBPF_API int bpf_link_get_info_by_fd(int link_fd, struct bpf_link_info *info, __u32 *info_len); struct bpf_prog_query_opts { size_t sz; /* size of this struct for forward/backward compatibility */ diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h index 496e6a8ee0dc..1ac57bb7ac55 100644 --- a/tools/lib/bpf/bpf_core_read.h +++ b/tools/lib/bpf/bpf_core_read.h @@ -364,7 +364,7 @@ enum bpf_enum_value_kind { /* Non-CO-RE variant of BPF_CORE_READ_INTO() */ #define BPF_PROBE_READ_INTO(dst, src, a, ...) ({ \ - ___core_read(bpf_probe_read, bpf_probe_read, \ + ___core_read(bpf_probe_read_kernel, bpf_probe_read_kernel, \ dst, (src), a, ##__VA_ARGS__) \ }) @@ -400,7 +400,7 @@ enum bpf_enum_value_kind { /* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */ #define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({ \ - ___core_read(bpf_probe_read_str, bpf_probe_read, \ + ___core_read(bpf_probe_read_kernel_str, bpf_probe_read_kernel, \ dst, (src), a, ##__VA_ARGS__) \ }) diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index d37c4fe2849d..5ec1871acb2f 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -109,7 +109,7 @@ * This is a variable-specific variant of more global barrier(). */ #ifndef barrier_var -#define barrier_var(var) asm volatile("" : "=r"(var) : "0"(var)) +#define barrier_var(var) asm volatile("" : "+r"(var)) #endif /* diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h index 2972dc25ff72..6db88f41fa0d 100644 --- a/tools/lib/bpf/bpf_tracing.h +++ b/tools/lib/bpf/bpf_tracing.h @@ -32,6 +32,9 @@ #elif defined(__TARGET_ARCH_arc) #define bpf_target_arc #define bpf_target_defined +#elif defined(__TARGET_ARCH_loongarch) + #define bpf_target_loongarch + #define bpf_target_defined #else /* Fall back to what the compiler says */ @@ -62,6 +65,9 @@ #elif defined(__arc__) #define bpf_target_arc #define bpf_target_defined +#elif defined(__loongarch__) + #define bpf_target_loongarch + #define bpf_target_defined #endif /* no compiler target */ #endif @@ -72,6 +78,10 @@ #if defined(bpf_target_x86) +/* + * https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI + */ + #if defined(__KERNEL__) || defined(__VMLINUX_H__) #define __PT_PARM1_REG di @@ -79,25 +89,40 @@ #define __PT_PARM3_REG dx #define __PT_PARM4_REG cx #define __PT_PARM5_REG r8 +#define __PT_PARM6_REG r9 +/* + * Syscall uses r10 for PARM4. See arch/x86/entry/entry_64.S:entry_SYSCALL_64 + * comments in Linux sources. And refer to syscall(2) manpage. + */ +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG r10 +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG + #define __PT_RET_REG sp #define __PT_FP_REG bp #define __PT_RC_REG ax #define __PT_SP_REG sp #define __PT_IP_REG ip -/* syscall uses r10 for PARM4 */ -#define PT_REGS_PARM4_SYSCALL(x) ((x)->r10) -#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ(x, r10) #else #ifdef __i386__ +/* i386 kernel is built with -mregparm=3 */ #define __PT_PARM1_REG eax #define __PT_PARM2_REG edx #define __PT_PARM3_REG ecx -/* i386 kernel is built with -mregparm=3 */ -#define __PT_PARM4_REG __unsupported__ -#define __PT_PARM5_REG __unsupported__ +/* i386 syscall ABI is very different, refer to syscall(2) manpage */ +#define __PT_PARM1_SYSCALL_REG ebx +#define __PT_PARM2_SYSCALL_REG ecx +#define __PT_PARM3_SYSCALL_REG edx +#define __PT_PARM4_SYSCALL_REG esi +#define __PT_PARM5_SYSCALL_REG edi +#define __PT_PARM6_SYSCALL_REG ebp + #define __PT_RET_REG esp #define __PT_FP_REG ebp #define __PT_RC_REG eax @@ -111,14 +136,20 @@ #define __PT_PARM3_REG rdx #define __PT_PARM4_REG rcx #define __PT_PARM5_REG r8 +#define __PT_PARM6_REG r9 + +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG r10 +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG + #define __PT_RET_REG rsp #define __PT_FP_REG rbp #define __PT_RC_REG rax #define __PT_SP_REG rsp #define __PT_IP_REG rip -/* syscall uses r10 for PARM4 */ -#define PT_REGS_PARM4_SYSCALL(x) ((x)->r10) -#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ(x, r10) #endif /* __i386__ */ @@ -126,6 +157,10 @@ #elif defined(bpf_target_s390) +/* + * https://github.com/IBM/s390x-abi/releases/download/v1.6/lzsabi_s390x.pdf + */ + struct pt_regs___s390 { unsigned long orig_gpr2; }; @@ -137,21 +172,41 @@ struct pt_regs___s390 { #define __PT_PARM3_REG gprs[4] #define __PT_PARM4_REG gprs[5] #define __PT_PARM5_REG gprs[6] -#define __PT_RET_REG grps[14] + +#define __PT_PARM1_SYSCALL_REG orig_gpr2 +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG gprs[7] +#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x) +#define PT_REGS_PARM1_CORE_SYSCALL(x) \ + BPF_CORE_READ((const struct pt_regs___s390 *)(x), __PT_PARM1_SYSCALL_REG) + +#define __PT_RET_REG gprs[14] #define __PT_FP_REG gprs[11] /* Works only with CONFIG_FRAME_POINTER */ #define __PT_RC_REG gprs[2] #define __PT_SP_REG gprs[15] #define __PT_IP_REG psw.addr -#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x) -#define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ((const struct pt_regs___s390 *)(x), orig_gpr2) #elif defined(bpf_target_arm) +/* + * https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#machine-registers + */ + #define __PT_PARM1_REG uregs[0] #define __PT_PARM2_REG uregs[1] #define __PT_PARM3_REG uregs[2] #define __PT_PARM4_REG uregs[3] -#define __PT_PARM5_REG uregs[4] + +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM6_SYSCALL_REG uregs[5] +#define __PT_PARM7_SYSCALL_REG uregs[6] + #define __PT_RET_REG uregs[14] #define __PT_FP_REG uregs[11] /* Works only with CONFIG_FRAME_POINTER */ #define __PT_RC_REG uregs[0] @@ -160,6 +215,10 @@ struct pt_regs___s390 { #elif defined(bpf_target_arm64) +/* + * https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#machine-registers + */ + struct pt_regs___arm64 { unsigned long orig_x0; }; @@ -171,21 +230,49 @@ struct pt_regs___arm64 { #define __PT_PARM3_REG regs[2] #define __PT_PARM4_REG regs[3] #define __PT_PARM5_REG regs[4] +#define __PT_PARM6_REG regs[5] +#define __PT_PARM7_REG regs[6] +#define __PT_PARM8_REG regs[7] + +#define __PT_PARM1_SYSCALL_REG orig_x0 +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG +#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x) +#define PT_REGS_PARM1_CORE_SYSCALL(x) \ + BPF_CORE_READ((const struct pt_regs___arm64 *)(x), __PT_PARM1_SYSCALL_REG) + #define __PT_RET_REG regs[30] #define __PT_FP_REG regs[29] /* Works only with CONFIG_FRAME_POINTER */ #define __PT_RC_REG regs[0] #define __PT_SP_REG sp #define __PT_IP_REG pc -#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x) -#define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ((const struct pt_regs___arm64 *)(x), orig_x0) #elif defined(bpf_target_mips) +/* + * N64 ABI is assumed right now. + * https://en.wikipedia.org/wiki/MIPS_architecture#Calling_conventions + */ + #define __PT_PARM1_REG regs[4] #define __PT_PARM2_REG regs[5] #define __PT_PARM3_REG regs[6] #define __PT_PARM4_REG regs[7] #define __PT_PARM5_REG regs[8] +#define __PT_PARM6_REG regs[9] +#define __PT_PARM7_REG regs[10] +#define __PT_PARM8_REG regs[11] + +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG /* only N32/N64 */ +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG /* only N32/N64 */ + #define __PT_RET_REG regs[31] #define __PT_FP_REG regs[30] /* Works only with CONFIG_FRAME_POINTER */ #define __PT_RC_REG regs[2] @@ -194,26 +281,58 @@ struct pt_regs___arm64 { #elif defined(bpf_target_powerpc) +/* + * http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf (page 3-14, + * section "Function Calling Sequence") + */ + #define __PT_PARM1_REG gpr[3] #define __PT_PARM2_REG gpr[4] #define __PT_PARM3_REG gpr[5] #define __PT_PARM4_REG gpr[6] #define __PT_PARM5_REG gpr[7] +#define __PT_PARM6_REG gpr[8] +#define __PT_PARM7_REG gpr[9] +#define __PT_PARM8_REG gpr[10] + +/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER. */ +#define PT_REGS_SYSCALL_REGS(ctx) ctx +#define __PT_PARM1_SYSCALL_REG orig_gpr3 +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG +#if !defined(__arch64__) +#define __PT_PARM7_SYSCALL_REG __PT_PARM7_REG /* only powerpc (not powerpc64) */ +#endif + #define __PT_RET_REG regs[31] #define __PT_FP_REG __unsupported__ #define __PT_RC_REG gpr[3] #define __PT_SP_REG sp #define __PT_IP_REG nip -/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER. */ -#define PT_REGS_SYSCALL_REGS(ctx) ctx #elif defined(bpf_target_sparc) +/* + * https://en.wikipedia.org/wiki/Calling_convention#SPARC + */ + #define __PT_PARM1_REG u_regs[UREG_I0] #define __PT_PARM2_REG u_regs[UREG_I1] #define __PT_PARM3_REG u_regs[UREG_I2] #define __PT_PARM4_REG u_regs[UREG_I3] #define __PT_PARM5_REG u_regs[UREG_I4] +#define __PT_PARM6_REG u_regs[UREG_I5] + +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG + #define __PT_RET_REG u_regs[UREG_I7] #define __PT_FP_REG __unsupported__ #define __PT_RC_REG u_regs[UREG_I0] @@ -227,22 +346,42 @@ struct pt_regs___arm64 { #elif defined(bpf_target_riscv) +/* + * https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#risc-v-calling-conventions + */ + #define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x)) #define __PT_PARM1_REG a0 #define __PT_PARM2_REG a1 #define __PT_PARM3_REG a2 #define __PT_PARM4_REG a3 #define __PT_PARM5_REG a4 +#define __PT_PARM6_REG a5 +#define __PT_PARM7_REG a6 +#define __PT_PARM8_REG a7 + +/* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */ +#define PT_REGS_SYSCALL_REGS(ctx) ctx +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG + #define __PT_RET_REG ra #define __PT_FP_REG s0 #define __PT_RC_REG a0 #define __PT_SP_REG sp #define __PT_IP_REG pc -/* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */ -#define PT_REGS_SYSCALL_REGS(ctx) ctx #elif defined(bpf_target_arc) +/* + * Section "Function Calling Sequence" (page 24): + * https://raw.githubusercontent.com/wiki/foss-for-synopsys-dwc-arc-processors/toolchain/files/ARCv2_ABI.pdf + */ + /* arc provides struct user_pt_regs instead of struct pt_regs to userspace */ #define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x)) #define __PT_PARM1_REG scratch.r0 @@ -250,13 +389,55 @@ struct pt_regs___arm64 { #define __PT_PARM3_REG scratch.r2 #define __PT_PARM4_REG scratch.r3 #define __PT_PARM5_REG scratch.r4 +#define __PT_PARM6_REG scratch.r5 +#define __PT_PARM7_REG scratch.r6 +#define __PT_PARM8_REG scratch.r7 + +/* arc does not select ARCH_HAS_SYSCALL_WRAPPER. */ +#define PT_REGS_SYSCALL_REGS(ctx) ctx +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG + #define __PT_RET_REG scratch.blink -#define __PT_FP_REG __unsupported__ +#define __PT_FP_REG scratch.fp #define __PT_RC_REG scratch.r0 #define __PT_SP_REG scratch.sp #define __PT_IP_REG scratch.ret -/* arc does not select ARCH_HAS_SYSCALL_WRAPPER. */ + +#elif defined(bpf_target_loongarch) + +/* + * https://docs.kernel.org/loongarch/introduction.html + * https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html + */ + +#define __PT_PARM1_REG regs[4] +#define __PT_PARM2_REG regs[5] +#define __PT_PARM3_REG regs[6] +#define __PT_PARM4_REG regs[7] +#define __PT_PARM5_REG regs[8] +#define __PT_PARM6_REG regs[9] +#define __PT_PARM7_REG regs[10] +#define __PT_PARM8_REG regs[11] + +/* loongarch does not select ARCH_HAS_SYSCALL_WRAPPER. */ #define PT_REGS_SYSCALL_REGS(ctx) ctx +#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG +#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG +#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG +#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG +#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG +#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG + +#define __PT_RET_REG regs[1] +#define __PT_FP_REG regs[22] +#define __PT_RC_REG regs[4] +#define __PT_SP_REG regs[3] +#define __PT_IP_REG csr_era #endif @@ -264,16 +445,49 @@ struct pt_regs___arm64 { struct pt_regs; -/* allow some architecutres to override `struct pt_regs` */ +/* allow some architectures to override `struct pt_regs` */ #ifndef __PT_REGS_CAST #define __PT_REGS_CAST(x) (x) #endif +/* + * Different architectures support different number of arguments passed + * through registers. i386 supports just 3, some arches support up to 8. + */ +#ifndef __PT_PARM4_REG +#define __PT_PARM4_REG __unsupported__ +#endif +#ifndef __PT_PARM5_REG +#define __PT_PARM5_REG __unsupported__ +#endif +#ifndef __PT_PARM6_REG +#define __PT_PARM6_REG __unsupported__ +#endif +#ifndef __PT_PARM7_REG +#define __PT_PARM7_REG __unsupported__ +#endif +#ifndef __PT_PARM8_REG +#define __PT_PARM8_REG __unsupported__ +#endif +/* + * Similarly, syscall-specific conventions might differ between function call + * conventions within each architecutre. All supported architectures pass + * either 6 or 7 syscall arguments in registers. + * + * See syscall(2) manpage for succinct table with information on each arch. + */ +#ifndef __PT_PARM7_SYSCALL_REG +#define __PT_PARM7_SYSCALL_REG __unsupported__ +#endif + #define PT_REGS_PARM1(x) (__PT_REGS_CAST(x)->__PT_PARM1_REG) #define PT_REGS_PARM2(x) (__PT_REGS_CAST(x)->__PT_PARM2_REG) #define PT_REGS_PARM3(x) (__PT_REGS_CAST(x)->__PT_PARM3_REG) #define PT_REGS_PARM4(x) (__PT_REGS_CAST(x)->__PT_PARM4_REG) #define PT_REGS_PARM5(x) (__PT_REGS_CAST(x)->__PT_PARM5_REG) +#define PT_REGS_PARM6(x) (__PT_REGS_CAST(x)->__PT_PARM6_REG) +#define PT_REGS_PARM7(x) (__PT_REGS_CAST(x)->__PT_PARM7_REG) +#define PT_REGS_PARM8(x) (__PT_REGS_CAST(x)->__PT_PARM8_REG) #define PT_REGS_RET(x) (__PT_REGS_CAST(x)->__PT_RET_REG) #define PT_REGS_FP(x) (__PT_REGS_CAST(x)->__PT_FP_REG) #define PT_REGS_RC(x) (__PT_REGS_CAST(x)->__PT_RC_REG) @@ -285,6 +499,9 @@ struct pt_regs; #define PT_REGS_PARM3_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM3_REG) #define PT_REGS_PARM4_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM4_REG) #define PT_REGS_PARM5_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM5_REG) +#define PT_REGS_PARM6_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM6_REG) +#define PT_REGS_PARM7_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM7_REG) +#define PT_REGS_PARM8_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM8_REG) #define PT_REGS_RET_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RET_REG) #define PT_REGS_FP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_FP_REG) #define PT_REGS_RC_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RC_REG) @@ -311,24 +528,33 @@ struct pt_regs; #endif #ifndef PT_REGS_PARM1_SYSCALL -#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1(x) +#define PT_REGS_PARM1_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM1_SYSCALL_REG) +#define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM1_SYSCALL_REG) +#endif +#ifndef PT_REGS_PARM2_SYSCALL +#define PT_REGS_PARM2_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM2_SYSCALL_REG) +#define PT_REGS_PARM2_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM2_SYSCALL_REG) +#endif +#ifndef PT_REGS_PARM3_SYSCALL +#define PT_REGS_PARM3_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM3_SYSCALL_REG) +#define PT_REGS_PARM3_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM3_SYSCALL_REG) #endif -#define PT_REGS_PARM2_SYSCALL(x) PT_REGS_PARM2(x) -#define PT_REGS_PARM3_SYSCALL(x) PT_REGS_PARM3(x) #ifndef PT_REGS_PARM4_SYSCALL -#define PT_REGS_PARM4_SYSCALL(x) PT_REGS_PARM4(x) +#define PT_REGS_PARM4_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM4_SYSCALL_REG) +#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM4_SYSCALL_REG) #endif -#define PT_REGS_PARM5_SYSCALL(x) PT_REGS_PARM5(x) - -#ifndef PT_REGS_PARM1_CORE_SYSCALL -#define PT_REGS_PARM1_CORE_SYSCALL(x) PT_REGS_PARM1_CORE(x) +#ifndef PT_REGS_PARM5_SYSCALL +#define PT_REGS_PARM5_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM5_SYSCALL_REG) +#define PT_REGS_PARM5_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM5_SYSCALL_REG) #endif -#define PT_REGS_PARM2_CORE_SYSCALL(x) PT_REGS_PARM2_CORE(x) -#define PT_REGS_PARM3_CORE_SYSCALL(x) PT_REGS_PARM3_CORE(x) -#ifndef PT_REGS_PARM4_CORE_SYSCALL -#define PT_REGS_PARM4_CORE_SYSCALL(x) PT_REGS_PARM4_CORE(x) +#ifndef PT_REGS_PARM6_SYSCALL +#define PT_REGS_PARM6_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM6_SYSCALL_REG) +#define PT_REGS_PARM6_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM6_SYSCALL_REG) +#endif +#ifndef PT_REGS_PARM7_SYSCALL +#define PT_REGS_PARM7_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM7_SYSCALL_REG) +#define PT_REGS_PARM7_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM7_SYSCALL_REG) #endif -#define PT_REGS_PARM5_CORE_SYSCALL(x) PT_REGS_PARM5_CORE(x) #else /* defined(bpf_target_defined) */ @@ -337,6 +563,9 @@ struct pt_regs; #define PT_REGS_PARM3(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM4(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM5(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM6(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM7(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM8(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_RET(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_FP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_RC(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) @@ -348,6 +577,9 @@ struct pt_regs; #define PT_REGS_PARM3_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM4_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM5_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM6_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM7_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM8_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_RET_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_FP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_RC_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) @@ -362,12 +594,16 @@ struct pt_regs; #define PT_REGS_PARM3_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM4_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM5_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM6_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM7_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM1_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM2_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM3_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM4_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #define PT_REGS_PARM5_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM6_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) +#define PT_REGS_PARM7_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; }) #endif /* defined(bpf_target_defined) */ @@ -553,6 +789,9 @@ struct pt_regs; #define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx) #define ___bpf_kprobe_args4(x, args...) ___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx) #define ___bpf_kprobe_args5(x, args...) ___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx) +#define ___bpf_kprobe_args6(x, args...) ___bpf_kprobe_args5(args), (void *)PT_REGS_PARM6(ctx) +#define ___bpf_kprobe_args7(x, args...) ___bpf_kprobe_args6(args), (void *)PT_REGS_PARM7(ctx) +#define ___bpf_kprobe_args8(x, args...) ___bpf_kprobe_args7(args), (void *)PT_REGS_PARM8(ctx) #define ___bpf_kprobe_args(args...) ___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args) /* @@ -609,6 +848,8 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args) #define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (void *)PT_REGS_PARM3_SYSCALL(regs) #define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (void *)PT_REGS_PARM4_SYSCALL(regs) #define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (void *)PT_REGS_PARM5_SYSCALL(regs) +#define ___bpf_syscall_args6(x, args...) ___bpf_syscall_args5(args), (void *)PT_REGS_PARM6_SYSCALL(regs) +#define ___bpf_syscall_args7(x, args...) ___bpf_syscall_args6(args), (void *)PT_REGS_PARM7_SYSCALL(regs) #define ___bpf_syscall_args(args...) ___bpf_apply(___bpf_syscall_args, ___bpf_narg(args))(args) /* If kernel doesn't have CONFIG_ARCH_HAS_SYSCALL_WRAPPER, we have to BPF_CORE_READ from pt_regs */ @@ -618,6 +859,8 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args) #define ___bpf_syswrap_args3(x, args...) ___bpf_syswrap_args2(args), (void *)PT_REGS_PARM3_CORE_SYSCALL(regs) #define ___bpf_syswrap_args4(x, args...) ___bpf_syswrap_args3(args), (void *)PT_REGS_PARM4_CORE_SYSCALL(regs) #define ___bpf_syswrap_args5(x, args...) ___bpf_syswrap_args4(args), (void *)PT_REGS_PARM5_CORE_SYSCALL(regs) +#define ___bpf_syswrap_args6(x, args...) ___bpf_syswrap_args5(args), (void *)PT_REGS_PARM6_CORE_SYSCALL(regs) +#define ___bpf_syswrap_args7(x, args...) ___bpf_syswrap_args6(args), (void *)PT_REGS_PARM7_CORE_SYSCALL(regs) #define ___bpf_syswrap_args(args...) ___bpf_apply(___bpf_syswrap_args, ___bpf_narg(args))(args) /* @@ -667,4 +910,11 @@ ____##name(struct pt_regs *ctx, ##args) #define BPF_KPROBE_SYSCALL BPF_KSYSCALL +/* BPF_UPROBE and BPF_URETPROBE are identical to BPF_KPROBE and BPF_KRETPROBE, + * but are named way less confusingly for SEC("uprobe") and SEC("uretprobe") + * use cases. + */ +#define BPF_UPROBE(name, args...) BPF_KPROBE(name, ##args) +#define BPF_URETPROBE(name, args...) BPF_KRETPROBE(name, ##args) + #endif diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 71e165b09ed5..9181d36118d2 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -688,8 +688,21 @@ int btf__align_of(const struct btf *btf, __u32 id) if (align <= 0) return libbpf_err(align); max_align = max(max_align, align); + + /* if field offset isn't aligned according to field + * type's alignment, then struct must be packed + */ + if (btf_member_bitfield_size(t, i) == 0 && + (m->offset % (8 * align)) != 0) + return 1; } + /* if struct/union size isn't a multiple of its alignment, + * then struct must be packed + */ + if ((t->size % max_align) != 0) + return 1; + return max_align; } default: @@ -990,7 +1003,8 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf, err = 0; if (!btf_data) { - err = -ENOENT; + pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path); + err = -ENODATA; goto done; } btf = btf_new(btf_data->d_buf, btf_data->d_size, base_btf); @@ -1336,9 +1350,9 @@ struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) void *ptr; int err; - /* we won't know btf_size until we call bpf_obj_get_info_by_fd(). so + /* we won't know btf_size until we call bpf_btf_get_info_by_fd(). so * let's start with a sane default - 4KiB here - and resize it only if - * bpf_obj_get_info_by_fd() needs a bigger buffer. + * bpf_btf_get_info_by_fd() needs a bigger buffer. */ last_size = 4096; ptr = malloc(last_size); @@ -1348,7 +1362,7 @@ struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) memset(&btf_info, 0, sizeof(btf_info)); btf_info.btf = ptr_to_u64(ptr); btf_info.btf_size = last_size; - err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len); + err = bpf_btf_get_info_by_fd(btf_fd, &btf_info, &len); if (!err && btf_info.btf_size > last_size) { void *temp_ptr; @@ -1366,7 +1380,7 @@ struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf) btf_info.btf = ptr_to_u64(ptr); btf_info.btf_size = last_size; - err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len); + err = bpf_btf_get_info_by_fd(btf_fd, &btf_info, &len); } if (err || btf_info.btf_size > last_size) { diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index deb2bc9a0a7b..580985ee5545 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -13,6 +13,7 @@ #include <ctype.h> #include <endian.h> #include <errno.h> +#include <limits.h> #include <linux/err.h> #include <linux/btf.h> #include <linux/kernel.h> @@ -833,14 +834,9 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id, const struct btf_type *t) { const struct btf_member *m; - int align, i, bit_sz; + int max_align = 1, align, i, bit_sz; __u16 vlen; - align = btf__align_of(btf, id); - /* size of a non-packed struct has to be a multiple of its alignment*/ - if (align && t->size % align) - return true; - m = btf_members(t); vlen = btf_vlen(t); /* all non-bitfield fields have to be naturally aligned */ @@ -849,8 +845,11 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id, bit_sz = btf_member_bitfield_size(t, i); if (align && bit_sz == 0 && m->offset % (8 * align) != 0) return true; + max_align = max(align, max_align); } - + /* size of a non-packed struct has to be a multiple of its alignment */ + if (t->size % max_align != 0) + return true; /* * if original struct was marked as packed, but its layout is * naturally aligned, we'll detect that it's not packed @@ -858,44 +857,97 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id, return false; } -static int chip_away_bits(int total, int at_most) -{ - return total % at_most ? : at_most; -} - static void btf_dump_emit_bit_padding(const struct btf_dump *d, - int cur_off, int m_off, int m_bit_sz, - int align, int lvl) + int cur_off, int next_off, int next_align, + bool in_bitfield, int lvl) { - int off_diff = m_off - cur_off; - int ptr_bits = d->ptr_sz * 8; + const struct { + const char *name; + int bits; + } pads[] = { + {"long", d->ptr_sz * 8}, {"int", 32}, {"short", 16}, {"char", 8} + }; + int new_off, pad_bits, bits, i; + const char *pad_type; + + if (cur_off >= next_off) + return; /* no gap */ + + /* For filling out padding we want to take advantage of + * natural alignment rules to minimize unnecessary explicit + * padding. First, we find the largest type (among long, int, + * short, or char) that can be used to force naturally aligned + * boundary. Once determined, we'll use such type to fill in + * the remaining padding gap. In some cases we can rely on + * compiler filling some gaps, but sometimes we need to force + * alignment to close natural alignment with markers like + * `long: 0` (this is always the case for bitfields). Note + * that even if struct itself has, let's say 4-byte alignment + * (i.e., it only uses up to int-aligned types), using `long: + * X;` explicit padding doesn't actually change struct's + * overall alignment requirements, but compiler does take into + * account that type's (long, in this example) natural + * alignment requirements when adding implicit padding. We use + * this fact heavily and don't worry about ruining correct + * struct alignment requirement. + */ + for (i = 0; i < ARRAY_SIZE(pads); i++) { + pad_bits = pads[i].bits; + pad_type = pads[i].name; - if (off_diff <= 0) - /* no gap */ - return; - if (m_bit_sz == 0 && off_diff < align * 8) - /* natural padding will take care of a gap */ - return; + new_off = roundup(cur_off, pad_bits); + if (new_off <= next_off) + break; + } - while (off_diff > 0) { - const char *pad_type; - int pad_bits; - - if (ptr_bits > 32 && off_diff > 32) { - pad_type = "long"; - pad_bits = chip_away_bits(off_diff, ptr_bits); - } else if (off_diff > 16) { - pad_type = "int"; - pad_bits = chip_away_bits(off_diff, 32); - } else if (off_diff > 8) { - pad_type = "short"; - pad_bits = chip_away_bits(off_diff, 16); - } else { - pad_type = "char"; - pad_bits = chip_away_bits(off_diff, 8); + if (new_off > cur_off && new_off <= next_off) { + /* We need explicit `<type>: 0` aligning mark if next + * field is right on alignment offset and its + * alignment requirement is less strict than <type>'s + * alignment (so compiler won't naturally align to the + * offset we expect), or if subsequent `<type>: X`, + * will actually completely fit in the remaining hole, + * making compiler basically ignore `<type>: X` + * completely. + */ + if (in_bitfield || + (new_off == next_off && roundup(cur_off, next_align * 8) != new_off) || + (new_off != next_off && next_off - new_off <= new_off - cur_off)) + /* but for bitfields we'll emit explicit bit count */ + btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, + in_bitfield ? new_off - cur_off : 0); + cur_off = new_off; + } + + /* Now we know we start at naturally aligned offset for a chosen + * padding type (long, int, short, or char), and so the rest is just + * a straightforward filling of remaining padding gap with full + * `<type>: sizeof(<type>);` markers, except for the last one, which + * might need smaller than sizeof(<type>) padding. + */ + while (cur_off != next_off) { + bits = min(next_off - cur_off, pad_bits); + if (bits == pad_bits) { + btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, pad_bits); + cur_off += bits; + continue; + } + /* For the remainder padding that doesn't cover entire + * pad_type bit length, we pick the smallest necessary type. + * This is pure aesthetics, we could have just used `long`, + * but having smallest necessary one communicates better the + * scale of the padding gap. + */ + for (i = ARRAY_SIZE(pads) - 1; i >= 0; i--) { + pad_type = pads[i].name; + pad_bits = pads[i].bits; + if (pad_bits < bits) + continue; + + btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, bits); + cur_off += bits; + break; } - btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, pad_bits); - off_diff -= pad_bits; } } @@ -915,9 +967,11 @@ static void btf_dump_emit_struct_def(struct btf_dump *d, { const struct btf_member *m = btf_members(t); bool is_struct = btf_is_struct(t); - int align, i, packed, off = 0; + bool packed, prev_bitfield = false; + int align, i, off = 0; __u16 vlen = btf_vlen(t); + align = btf__align_of(d->btf, id); packed = is_struct ? btf_is_struct_packed(d->btf, id, t) : 0; btf_dump_printf(d, "%s%s%s {", @@ -927,41 +981,47 @@ static void btf_dump_emit_struct_def(struct btf_dump *d, for (i = 0; i < vlen; i++, m++) { const char *fname; - int m_off, m_sz; + int m_off, m_sz, m_align; + bool in_bitfield; fname = btf_name_of(d, m->name_off); m_sz = btf_member_bitfield_size(t, i); m_off = btf_member_bit_offset(t, i); - align = packed ? 1 : btf__align_of(d->btf, m->type); + m_align = packed ? 1 : btf__align_of(d->btf, m->type); + + in_bitfield = prev_bitfield && m_sz != 0; - btf_dump_emit_bit_padding(d, off, m_off, m_sz, align, lvl + 1); + btf_dump_emit_bit_padding(d, off, m_off, m_align, in_bitfield, lvl + 1); btf_dump_printf(d, "\n%s", pfx(lvl + 1)); btf_dump_emit_type_decl(d, m->type, fname, lvl + 1); if (m_sz) { btf_dump_printf(d, ": %d", m_sz); off = m_off + m_sz; + prev_bitfield = true; } else { m_sz = max((__s64)0, btf__resolve_size(d->btf, m->type)); off = m_off + m_sz * 8; + prev_bitfield = false; } + btf_dump_printf(d, ";"); } /* pad at the end, if necessary */ - if (is_struct) { - align = packed ? 1 : btf__align_of(d->btf, id); - btf_dump_emit_bit_padding(d, off, t->size * 8, 0, align, - lvl + 1); - } + if (is_struct) + btf_dump_emit_bit_padding(d, off, t->size * 8, align, false, lvl + 1); /* * Keep `struct empty {}` on a single line, * only print newline when there are regular or padding fields. */ - if (vlen || t->size) + if (vlen || t->size) { btf_dump_printf(d, "\n"); - btf_dump_printf(d, "%s}", pfx(lvl)); + btf_dump_printf(d, "%s}", pfx(lvl)); + } else { + btf_dump_printf(d, "}"); + } if (packed) btf_dump_printf(d, " __attribute__((packed))"); } @@ -1073,6 +1133,43 @@ static void btf_dump_emit_enum_def(struct btf_dump *d, __u32 id, else btf_dump_emit_enum64_val(d, t, lvl, vlen); btf_dump_printf(d, "\n%s}", pfx(lvl)); + + /* special case enums with special sizes */ + if (t->size == 1) { + /* one-byte enums can be forced with mode(byte) attribute */ + btf_dump_printf(d, " __attribute__((mode(byte)))"); + } else if (t->size == 8 && d->ptr_sz == 8) { + /* enum can be 8-byte sized if one of the enumerator values + * doesn't fit in 32-bit integer, or by adding mode(word) + * attribute (but probably only on 64-bit architectures); do + * our best here to try to satisfy the contract without adding + * unnecessary attributes + */ + bool needs_word_mode; + + if (btf_is_enum(t)) { + /* enum can't represent 64-bit values, so we need word mode */ + needs_word_mode = true; + } else { + /* enum64 needs mode(word) if none of its values has + * non-zero upper 32-bits (which means that all values + * fit in 32-bit integers and won't cause compiler to + * bump enum to be 64-bit naturally + */ + int i; + + needs_word_mode = true; + for (i = 0; i < vlen; i++) { + if (btf_enum64(t)[i].val_hi32 != 0) { + needs_word_mode = false; + break; + } + } + } + if (needs_word_mode) + btf_dump_printf(d, " __attribute__((mode(word)))"); + } + } static void btf_dump_emit_fwd_def(struct btf_dump *d, __u32 id, diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2a82f49ce16f..05c4db355f28 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -34,7 +34,6 @@ #include <linux/limits.h> #include <linux/perf_event.h> #include <linux/ring_buffer.h> -#include <linux/version.h> #include <sys/epoll.h> #include <sys/ioctl.h> #include <sys/mman.h> @@ -870,42 +869,6 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, return 0; } -__u32 get_kernel_version(void) -{ - /* On Ubuntu LINUX_VERSION_CODE doesn't correspond to info.release, - * but Ubuntu provides /proc/version_signature file, as described at - * https://ubuntu.com/kernel, with an example contents below, which we - * can use to get a proper LINUX_VERSION_CODE. - * - * Ubuntu 5.4.0-12.15-generic 5.4.8 - * - * In the above, 5.4.8 is what kernel is actually expecting, while - * uname() call will return 5.4.0 in info.release. - */ - const char *ubuntu_kver_file = "/proc/version_signature"; - __u32 major, minor, patch; - struct utsname info; - - if (faccessat(AT_FDCWD, ubuntu_kver_file, R_OK, AT_EACCESS) == 0) { - FILE *f; - - f = fopen(ubuntu_kver_file, "r"); - if (f) { - if (fscanf(f, "%*s %*s %d.%d.%d\n", &major, &minor, &patch) == 3) { - fclose(f); - return KERNEL_VERSION(major, minor, patch); - } - fclose(f); - } - /* something went wrong, fall back to uname() approach */ - } - - uname(&info); - if (sscanf(info.release, "%u.%u.%u", &major, &minor, &patch) != 3) - return 0; - return KERNEL_VERSION(major, minor, patch); -} - static const struct btf_member * find_member_by_offset(const struct btf_type *t, __u32 bit_offset) { @@ -4382,7 +4345,7 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd) char *new_name; memset(&info, 0, len); - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_map_get_info_by_fd(fd, &info, &len); if (err && errno == EINVAL) err = bpf_get_map_info_from_fdinfo(fd, &info); if (err) @@ -4766,7 +4729,7 @@ static int probe_module_btf(void) * kernel's module BTF support coincides with support for * name/name_len fields in struct bpf_btf_info. */ - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_btf_get_info_by_fd(fd, &info, &len); close(fd); return !err; } @@ -4929,7 +4892,7 @@ static bool map_is_reuse_compat(const struct bpf_map *map, int map_fd) int err; memset(&map_info, 0, map_info_len); - err = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len); + err = bpf_map_get_info_by_fd(map_fd, &map_info, &map_info_len); if (err && errno == EINVAL) err = bpf_get_map_info_from_fdinfo(map_fd, &map_info); if (err) { @@ -5474,7 +5437,7 @@ static int load_module_btfs(struct bpf_object *obj) info.name = ptr_to_u64(name); info.name_len = sizeof(name); - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_btf_get_info_by_fd(fd, &info, &len); if (err) { err = -errno; pr_warn("failed to get BTF object #%d info: %d\n", id, err); @@ -7355,7 +7318,7 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj) if (!bpf_map__is_internal(m)) continue; if (!kernel_supports(obj, FEAT_ARRAY_MMAP)) - m->def.map_flags ^= BPF_F_MMAPABLE; + m->def.map_flags &= ~BPF_F_MMAPABLE; } return 0; @@ -8605,6 +8568,7 @@ static const struct bpf_sec_def section_defs[] = { SEC_DEF("cgroup/setsockopt", CGROUP_SOCKOPT, BPF_CGROUP_SETSOCKOPT, SEC_ATTACHABLE), SEC_DEF("cgroup/dev", CGROUP_DEVICE, BPF_CGROUP_DEVICE, SEC_ATTACHABLE_OPT), SEC_DEF("struct_ops+", STRUCT_OPS, 0, SEC_NONE), + SEC_DEF("struct_ops.s+", STRUCT_OPS, 0, SEC_SLEEPABLE), SEC_DEF("sk_lookup", SK_LOOKUP, BPF_SK_LOOKUP, SEC_ATTACHABLE), }; @@ -9066,9 +9030,9 @@ static int libbpf_find_prog_btf_id(const char *name, __u32 attach_prog_fd) int err; memset(&info, 0, info_len); - err = bpf_obj_get_info_by_fd(attach_prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(attach_prog_fd, &info, &info_len); if (err) { - pr_warn("failed bpf_obj_get_info_by_fd for FD %d: %d\n", + pr_warn("failed bpf_prog_get_info_by_fd for FD %d: %d\n", attach_prog_fd, err); return err; } @@ -9903,7 +9867,7 @@ static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name, char errmsg[STRERR_BUFSIZE]; int type, pfd; - if (ref_ctr_off >= (1ULL << PERF_UPROBE_REF_CTR_OFFSET_BITS)) + if ((__u64)ref_ctr_off >= (1ULL << PERF_UPROBE_REF_CTR_OFFSET_BITS)) return -EINVAL; memset(&attr, 0, attr_sz); @@ -9994,9 +9958,16 @@ static void gen_kprobe_legacy_event_name(char *buf, size_t buf_sz, const char *kfunc_name, size_t offset) { static int index = 0; + int i; snprintf(buf, buf_sz, "libbpf_%u_%s_0x%zx_%d", getpid(), kfunc_name, offset, __sync_fetch_and_add(&index, 1)); + + /* sanitize binary_path in the probe name */ + for (i = 0; buf[i]; i++) { + if (!isalnum(buf[i])) + buf[i] = '_'; + } } static int add_kprobe_event_legacy(const char *probe_name, bool retprobe, @@ -11702,17 +11673,22 @@ struct perf_buffer *perf_buffer__new(int map_fd, size_t page_cnt, const size_t attr_sz = sizeof(struct perf_event_attr); struct perf_buffer_params p = {}; struct perf_event_attr attr; + __u32 sample_period; if (!OPTS_VALID(opts, perf_buffer_opts)) return libbpf_err_ptr(-EINVAL); + sample_period = OPTS_GET(opts, sample_period, 1); + if (!sample_period) + sample_period = 1; + memset(&attr, 0, attr_sz); attr.size = attr_sz; attr.config = PERF_COUNT_SW_BPF_OUTPUT; attr.type = PERF_TYPE_SOFTWARE; attr.sample_type = PERF_SAMPLE_RAW; - attr.sample_period = 1; - attr.wakeup_events = 1; + attr.sample_period = sample_period; + attr.wakeup_events = sample_period; p.attr = &attr; p.sample_cb = sample_cb; @@ -11765,7 +11741,7 @@ static struct perf_buffer *__perf_buffer__new(int map_fd, size_t page_cnt, /* best-effort sanity checks */ memset(&map, 0, sizeof(map)); map_info_len = sizeof(map); - err = bpf_obj_get_info_by_fd(map_fd, &map, &map_info_len); + err = bpf_map_get_info_by_fd(map_fd, &map, &map_info_len); if (err) { err = -errno; /* if BPF_OBJ_GET_INFO_BY_FD is supported, will return diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index eee883f007f9..2efd80f6f7b9 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -96,6 +96,12 @@ enum libbpf_print_level { typedef int (*libbpf_print_fn_t)(enum libbpf_print_level level, const char *, va_list ap); +/** + * @brief **libbpf_set_print()** sets user-provided log callback function to + * be used for libbpf warnings and informational messages. + * @param fn The log print function. If NULL, libbpf won't print anything. + * @return Pointer to old print function. + */ LIBBPF_API libbpf_print_fn_t libbpf_set_print(libbpf_print_fn_t fn); /* Hide internal to user */ @@ -174,6 +180,14 @@ struct bpf_object_open_opts { }; #define bpf_object_open_opts__last_field kernel_log_level +/** + * @brief **bpf_object__open()** creates a bpf_object by opening + * the BPF ELF object file pointed to by the passed path and loading it + * into memory. + * @param path BPF object file path. + * @return pointer to the new bpf_object; or NULL is returned on error, + * error code is stored in errno + */ LIBBPF_API struct bpf_object *bpf_object__open(const char *path); /** @@ -203,16 +217,46 @@ LIBBPF_API struct bpf_object * bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz, const struct bpf_object_open_opts *opts); -/* Load/unload object into/from kernel */ +/** + * @brief **bpf_object__load()** loads BPF object into kernel. + * @param obj Pointer to a valid BPF object instance returned by + * **bpf_object__open*()** APIs + * @return 0, on success; negative error code, otherwise, error code is + * stored in errno + */ LIBBPF_API int bpf_object__load(struct bpf_object *obj); -LIBBPF_API void bpf_object__close(struct bpf_object *object); +/** + * @brief **bpf_object__close()** closes a BPF object and releases all + * resources. + * @param obj Pointer to a valid BPF object + */ +LIBBPF_API void bpf_object__close(struct bpf_object *obj); -/* pin_maps and unpin_maps can both be called with a NULL path, in which case - * they will use the pin_path attribute of each map (and ignore all maps that - * don't have a pin_path set). +/** + * @brief **bpf_object__pin_maps()** pins each map contained within + * the BPF object at the passed directory. + * @param obj Pointer to a valid BPF object + * @param path A directory where maps should be pinned. + * @return 0, on success; negative error code, otherwise + * + * If `path` is NULL `bpf_map__pin` (which is being used on each map) + * will use the pin_path attribute of each map. In this case, maps that + * don't have a pin_path set will be ignored. */ LIBBPF_API int bpf_object__pin_maps(struct bpf_object *obj, const char *path); + +/** + * @brief **bpf_object__unpin_maps()** unpins each map contained within + * the BPF object found in the passed directory. + * @param obj Pointer to a valid BPF object + * @param path A directory where pinned maps should be searched for. + * @return 0, on success; negative error code, otherwise + * + * If `path` is NULL `bpf_map__unpin` (which is being used on each map) + * will use the pin_path attribute of each map. In this case, maps that + * don't have a pin_path set will be ignored. + */ LIBBPF_API int bpf_object__unpin_maps(struct bpf_object *obj, const char *path); LIBBPF_API int bpf_object__pin_programs(struct bpf_object *obj, @@ -823,10 +867,57 @@ LIBBPF_API const void *bpf_map__initial_value(struct bpf_map *map, size_t *psize * @return true, if the map is an internal map; false, otherwise */ LIBBPF_API bool bpf_map__is_internal(const struct bpf_map *map); + +/** + * @brief **bpf_map__set_pin_path()** sets the path attribute that tells where the + * BPF map should be pinned. This does not actually create the 'pin'. + * @param map The bpf_map + * @param path The path + * @return 0, on success; negative error, otherwise + */ LIBBPF_API int bpf_map__set_pin_path(struct bpf_map *map, const char *path); + +/** + * @brief **bpf_map__pin_path()** gets the path attribute that tells where the + * BPF map should be pinned. + * @param map The bpf_map + * @return The path string; which can be NULL + */ LIBBPF_API const char *bpf_map__pin_path(const struct bpf_map *map); + +/** + * @brief **bpf_map__is_pinned()** tells the caller whether or not the + * passed map has been pinned via a 'pin' file. + * @param map The bpf_map + * @return true, if the map is pinned; false, otherwise + */ LIBBPF_API bool bpf_map__is_pinned(const struct bpf_map *map); + +/** + * @brief **bpf_map__pin()** creates a file that serves as a 'pin' + * for the BPF map. This increments the reference count on the + * BPF map which will keep the BPF map loaded even after the + * userspace process which loaded it has exited. + * @param map The bpf_map to pin + * @param path A file path for the 'pin' + * @return 0, on success; negative error, otherwise + * + * If `path` is NULL the maps `pin_path` attribute will be used. If this is + * also NULL, an error will be returned and the map will not be pinned. + */ LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path); + +/** + * @brief **bpf_map__unpin()** removes the file that serves as a + * 'pin' for the BPF map. + * @param map The bpf_map to unpin + * @param path A file path for the 'pin' + * @return 0, on success; negative error, otherwise + * + * The `path` parameter can be NULL, in which case the `pin_path` + * map attribute is unpinned. If both the `path` parameter and + * `pin_path` map attribute are set, they must be equal. + */ LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path); LIBBPF_API int bpf_map__set_inner_map_fd(struct bpf_map *map, int fd); @@ -957,9 +1048,10 @@ struct bpf_xdp_query_opts { __u32 hw_prog_id; /* output */ __u32 skb_prog_id; /* output */ __u8 attach_mode; /* output */ + __u64 feature_flags; /* output */ size_t :0; }; -#define bpf_xdp_query_opts__last_field attach_mode +#define bpf_xdp_query_opts__last_field feature_flags LIBBPF_API int bpf_xdp_attach(int ifindex, int prog_fd, __u32 flags, const struct bpf_xdp_attach_opts *opts); @@ -1039,7 +1131,8 @@ struct user_ring_buffer_opts { #define user_ring_buffer_opts__last_field sz -/* @brief **user_ring_buffer__new()** creates a new instance of a user ring +/** + * @brief **user_ring_buffer__new()** creates a new instance of a user ring * buffer. * * @param map_fd A file descriptor to a BPF_MAP_TYPE_USER_RINGBUF map. @@ -1050,7 +1143,8 @@ struct user_ring_buffer_opts { LIBBPF_API struct user_ring_buffer * user_ring_buffer__new(int map_fd, const struct user_ring_buffer_opts *opts); -/* @brief **user_ring_buffer__reserve()** reserves a pointer to a sample in the +/** + * @brief **user_ring_buffer__reserve()** reserves a pointer to a sample in the * user ring buffer. * @param rb A pointer to a user ring buffer. * @param size The size of the sample, in bytes. @@ -1070,7 +1164,8 @@ user_ring_buffer__new(int map_fd, const struct user_ring_buffer_opts *opts); */ LIBBPF_API void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size); -/* @brief **user_ring_buffer__reserve_blocking()** reserves a record in the +/** + * @brief **user_ring_buffer__reserve_blocking()** reserves a record in the * ring buffer, possibly blocking for up to @timeout_ms until a sample becomes * available. * @param rb The user ring buffer. @@ -1114,7 +1209,8 @@ LIBBPF_API void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb, __u32 size, int timeout_ms); -/* @brief **user_ring_buffer__submit()** submits a previously reserved sample +/** + * @brief **user_ring_buffer__submit()** submits a previously reserved sample * into the ring buffer. * @param rb The user ring buffer. * @param sample A reserved sample. @@ -1124,7 +1220,8 @@ LIBBPF_API void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb, */ LIBBPF_API void user_ring_buffer__submit(struct user_ring_buffer *rb, void *sample); -/* @brief **user_ring_buffer__discard()** discards a previously reserved sample. +/** + * @brief **user_ring_buffer__discard()** discards a previously reserved sample. * @param rb The user ring buffer. * @param sample A reserved sample. * @@ -1133,7 +1230,8 @@ LIBBPF_API void user_ring_buffer__submit(struct user_ring_buffer *rb, void *samp */ LIBBPF_API void user_ring_buffer__discard(struct user_ring_buffer *rb, void *sample); -/* @brief **user_ring_buffer__free()** frees a ring buffer that was previously +/** + * @brief **user_ring_buffer__free()** frees a ring buffer that was previously * created with **user_ring_buffer__new()**. * @param rb The user ring buffer being freed. */ @@ -1149,8 +1247,10 @@ typedef void (*perf_buffer_lost_fn)(void *ctx, int cpu, __u64 cnt); /* common use perf buffer options */ struct perf_buffer_opts { size_t sz; + __u32 sample_period; + size_t :0; }; -#define perf_buffer_opts__last_field sz +#define perf_buffer_opts__last_field sample_period /** * @brief **perf_buffer__new()** creates BPF perfbuf manager for a specified diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 71bf5691a689..50dde1f6521e 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -382,3 +382,11 @@ LIBBPF_1.1.0 { user_ring_buffer__reserve_blocking; user_ring_buffer__submit; } LIBBPF_1.0.0; + +LIBBPF_1.2.0 { + global: + bpf_btf_get_info_by_fd; + bpf_link_get_info_by_fd; + bpf_map_get_info_by_fd; + bpf_prog_get_info_by_fd; +} LIBBPF_1.1.0; diff --git a/tools/lib/bpf/libbpf_errno.c b/tools/lib/bpf/libbpf_errno.c index 96f67a772a1b..6b180172ec6b 100644 --- a/tools/lib/bpf/libbpf_errno.c +++ b/tools/lib/bpf/libbpf_errno.c @@ -39,14 +39,14 @@ static const char *libbpf_strerror_table[NR_ERRNO] = { int libbpf_strerror(int err, char *buf, size_t size) { + int ret; + if (!buf || !size) return libbpf_err(-EINVAL); err = err > 0 ? err : -err; if (err < __LIBBPF_ERRNO__START) { - int ret; - ret = strerror_r(err, buf, size); buf[size - 1] = '\0'; return libbpf_err_errno(ret); @@ -56,12 +56,20 @@ int libbpf_strerror(int err, char *buf, size_t size) const char *msg; msg = libbpf_strerror_table[ERRNO_OFFSET(err)]; - snprintf(buf, size, "%s", msg); + ret = snprintf(buf, size, "%s", msg); buf[size - 1] = '\0'; + /* The length of the buf and msg is positive. + * A negative number may be returned only when the + * size exceeds INT_MAX. Not likely to appear. + */ + if (ret >= size) + return libbpf_err(-ERANGE); return 0; } - snprintf(buf, size, "Unknown libbpf error %d", err); + ret = snprintf(buf, size, "Unknown libbpf error %d", err); buf[size - 1] = '\0'; + if (ret >= size) + return libbpf_err(-ERANGE); return libbpf_err(-ENOENT); } diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 377642ff51fc..fbaf68335394 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -20,8 +20,8 @@ /* make sure libbpf doesn't use kernel-only integer typedefs */ #pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64 -/* prevent accidental re-addition of reallocarray() */ -#pragma GCC poison reallocarray +/* prevent accidental re-addition of reallocarray()/strlcpy() */ +#pragma GCC poison reallocarray strlcpy #include "libbpf.h" #include "btf.h" @@ -543,6 +543,7 @@ static inline int ensure_good_fd(int fd) fd = fcntl(fd, F_DUPFD_CLOEXEC, 3); saved_errno = errno; close(old_fd); + errno = saved_errno; if (fd < 0) { pr_warn("failed to dup FD %d to FD > 2: %d\n", old_fd, -saved_errno); errno = saved_errno; diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c index b44fcbb4b42e..4f3bc968ff8e 100644 --- a/tools/lib/bpf/libbpf_probes.c +++ b/tools/lib/bpf/libbpf_probes.c @@ -12,11 +12,94 @@ #include <linux/btf.h> #include <linux/filter.h> #include <linux/kernel.h> +#include <linux/version.h> #include "bpf.h" #include "libbpf.h" #include "libbpf_internal.h" +/* On Ubuntu LINUX_VERSION_CODE doesn't correspond to info.release, + * but Ubuntu provides /proc/version_signature file, as described at + * https://ubuntu.com/kernel, with an example contents below, which we + * can use to get a proper LINUX_VERSION_CODE. + * + * Ubuntu 5.4.0-12.15-generic 5.4.8 + * + * In the above, 5.4.8 is what kernel is actually expecting, while + * uname() call will return 5.4.0 in info.release. + */ +static __u32 get_ubuntu_kernel_version(void) +{ + const char *ubuntu_kver_file = "/proc/version_signature"; + __u32 major, minor, patch; + int ret; + FILE *f; + + if (faccessat(AT_FDCWD, ubuntu_kver_file, R_OK, AT_EACCESS) != 0) + return 0; + + f = fopen(ubuntu_kver_file, "r"); + if (!f) + return 0; + + ret = fscanf(f, "%*s %*s %u.%u.%u\n", &major, &minor, &patch); + fclose(f); + if (ret != 3) + return 0; + + return KERNEL_VERSION(major, minor, patch); +} + +/* On Debian LINUX_VERSION_CODE doesn't correspond to info.release. + * Instead, it is provided in info.version. An example content of + * Debian 10 looks like the below. + * + * utsname::release 4.19.0-22-amd64 + * utsname::version #1 SMP Debian 4.19.260-1 (2022-09-29) + * + * In the above, 4.19.260 is what kernel is actually expecting, while + * uname() call will return 4.19.0 in info.release. + */ +static __u32 get_debian_kernel_version(struct utsname *info) +{ + __u32 major, minor, patch; + char *p; + + p = strstr(info->version, "Debian "); + if (!p) { + /* This is not a Debian kernel. */ + return 0; + } + + if (sscanf(p, "Debian %u.%u.%u", &major, &minor, &patch) != 3) + return 0; + + return KERNEL_VERSION(major, minor, patch); +} + +__u32 get_kernel_version(void) +{ + __u32 major, minor, patch, version; + struct utsname info; + + /* Check if this is an Ubuntu kernel. */ + version = get_ubuntu_kernel_version(); + if (version != 0) + return version; + + uname(&info); + + /* Check if this is a Debian kernel. */ + version = get_debian_kernel_version(&info); + if (version != 0) + return version; + + if (sscanf(info.release, "%u.%u.%u", &major, &minor, &patch) != 3) + return 0; + + return KERNEL_VERSION(major, minor, patch); +} + static int probe_prog_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, size_t insns_cnt, char *log_buf, size_t log_buf_sz) diff --git a/tools/lib/bpf/libbpf_version.h b/tools/lib/bpf/libbpf_version.h index e944f5bce728..1fd2eeac5cfc 100644 --- a/tools/lib/bpf/libbpf_version.h +++ b/tools/lib/bpf/libbpf_version.h @@ -4,6 +4,6 @@ #define __LIBBPF_VERSION_H #define LIBBPF_MAJOR_VERSION 1 -#define LIBBPF_MINOR_VERSION 1 +#define LIBBPF_MINOR_VERSION 2 #endif /* __LIBBPF_VERSION_H */ diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c index 35104580870c..1653e7a8b0a1 100644 --- a/tools/lib/bpf/netlink.c +++ b/tools/lib/bpf/netlink.c @@ -9,6 +9,7 @@ #include <linux/if_ether.h> #include <linux/pkt_cls.h> #include <linux/rtnetlink.h> +#include <linux/netdev.h> #include <sys/socket.h> #include <errno.h> #include <time.h> @@ -39,9 +40,15 @@ struct xdp_id_md { int ifindex; __u32 flags; struct xdp_link_info info; + __u64 feature_flags; }; -static int libbpf_netlink_open(__u32 *nl_pid) +struct xdp_features_md { + int ifindex; + __u64 flags; +}; + +static int libbpf_netlink_open(__u32 *nl_pid, int proto) { struct sockaddr_nl sa; socklen_t addrlen; @@ -51,7 +58,7 @@ static int libbpf_netlink_open(__u32 *nl_pid) memset(&sa, 0, sizeof(sa)); sa.nl_family = AF_NETLINK; - sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE); + sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, proto); if (sock < 0) return -errno; @@ -212,14 +219,14 @@ done: } static int libbpf_netlink_send_recv(struct libbpf_nla_req *req, - __dump_nlmsg_t parse_msg, + int proto, __dump_nlmsg_t parse_msg, libbpf_dump_nlmsg_t parse_attr, void *cookie) { __u32 nl_pid = 0; int sock, ret; - sock = libbpf_netlink_open(&nl_pid); + sock = libbpf_netlink_open(&nl_pid, proto); if (sock < 0) return sock; @@ -238,6 +245,43 @@ out: return ret; } +static int parse_genl_family_id(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn, + void *cookie) +{ + struct genlmsghdr *gnl = NLMSG_DATA(nh); + struct nlattr *na = (struct nlattr *)((void *)gnl + GENL_HDRLEN); + struct nlattr *tb[CTRL_ATTR_FAMILY_ID + 1]; + __u16 *id = cookie; + + libbpf_nla_parse(tb, CTRL_ATTR_FAMILY_ID, na, + NLMSG_PAYLOAD(nh, sizeof(*gnl)), NULL); + if (!tb[CTRL_ATTR_FAMILY_ID]) + return NL_CONT; + + *id = libbpf_nla_getattr_u16(tb[CTRL_ATTR_FAMILY_ID]); + return NL_DONE; +} + +static int libbpf_netlink_resolve_genl_family_id(const char *name, + __u16 len, __u16 *id) +{ + struct libbpf_nla_req req = { + .nh.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN), + .nh.nlmsg_type = GENL_ID_CTRL, + .nh.nlmsg_flags = NLM_F_REQUEST, + .gnl.cmd = CTRL_CMD_GETFAMILY, + .gnl.version = 2, + }; + int err; + + err = nlattr_add(&req, CTRL_ATTR_FAMILY_NAME, name, len); + if (err < 0) + return err; + + return libbpf_netlink_send_recv(&req, NETLINK_GENERIC, + parse_genl_family_id, NULL, id); +} + static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd, __u32 flags) { @@ -271,7 +315,7 @@ static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd, } nlattr_end_nested(&req, nla); - return libbpf_netlink_send_recv(&req, NULL, NULL, NULL); + return libbpf_netlink_send_recv(&req, NETLINK_ROUTE, NULL, NULL, NULL); } int bpf_xdp_attach(int ifindex, int prog_fd, __u32 flags, const struct bpf_xdp_attach_opts *opts) @@ -357,6 +401,29 @@ static int get_xdp_info(void *cookie, void *msg, struct nlattr **tb) return 0; } +static int parse_xdp_features(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn, + void *cookie) +{ + struct genlmsghdr *gnl = NLMSG_DATA(nh); + struct nlattr *na = (struct nlattr *)((void *)gnl + GENL_HDRLEN); + struct nlattr *tb[NETDEV_CMD_MAX + 1]; + struct xdp_features_md *md = cookie; + __u32 ifindex; + + libbpf_nla_parse(tb, NETDEV_CMD_MAX, na, + NLMSG_PAYLOAD(nh, sizeof(*gnl)), NULL); + + if (!tb[NETDEV_A_DEV_IFINDEX] || !tb[NETDEV_A_DEV_XDP_FEATURES]) + return NL_CONT; + + ifindex = libbpf_nla_getattr_u32(tb[NETDEV_A_DEV_IFINDEX]); + if (ifindex != md->ifindex) + return NL_CONT; + + md->flags = libbpf_nla_getattr_u64(tb[NETDEV_A_DEV_XDP_FEATURES]); + return NL_DONE; +} + int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts) { struct libbpf_nla_req req = { @@ -366,6 +433,10 @@ int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts) .ifinfo.ifi_family = AF_PACKET, }; struct xdp_id_md xdp_id = {}; + struct xdp_features_md md = { + .ifindex = ifindex, + }; + __u16 id; int err; if (!OPTS_VALID(opts, bpf_xdp_query_opts)) @@ -382,7 +453,7 @@ int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts) xdp_id.ifindex = ifindex; xdp_id.flags = xdp_flags; - err = libbpf_netlink_send_recv(&req, __dump_link_nlmsg, + err = libbpf_netlink_send_recv(&req, NETLINK_ROUTE, __dump_link_nlmsg, get_xdp_info, &xdp_id); if (err) return libbpf_err(err); @@ -393,6 +464,31 @@ int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts) OPTS_SET(opts, skb_prog_id, xdp_id.info.skb_prog_id); OPTS_SET(opts, attach_mode, xdp_id.info.attach_mode); + if (!OPTS_HAS(opts, feature_flags)) + return 0; + + err = libbpf_netlink_resolve_genl_family_id("netdev", sizeof("netdev"), &id); + if (err < 0) + return libbpf_err(err); + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN); + req.nh.nlmsg_flags = NLM_F_REQUEST; + req.nh.nlmsg_type = id; + req.gnl.cmd = NETDEV_CMD_DEV_GET; + req.gnl.version = 2; + + err = nlattr_add(&req, NETDEV_A_DEV_IFINDEX, &ifindex, sizeof(ifindex)); + if (err < 0) + return libbpf_err(err); + + err = libbpf_netlink_send_recv(&req, NETLINK_GENERIC, + parse_xdp_features, NULL, &md); + if (err) + return libbpf_err(err); + + opts->feature_flags = md.flags; + return 0; } @@ -493,7 +589,7 @@ static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags) if (ret < 0) return ret; - return libbpf_netlink_send_recv(&req, NULL, NULL, NULL); + return libbpf_netlink_send_recv(&req, NETLINK_ROUTE, NULL, NULL, NULL); } static int tc_qdisc_create_excl(struct bpf_tc_hook *hook) @@ -593,7 +689,7 @@ static int tc_add_fd_and_name(struct libbpf_nla_req *req, int fd) int len, ret; memset(&info, 0, info_len); - ret = bpf_obj_get_info_by_fd(fd, &info, &info_len); + ret = bpf_prog_get_info_by_fd(fd, &info, &info_len); if (ret < 0) return ret; @@ -673,7 +769,8 @@ int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) info.opts = opts; - ret = libbpf_netlink_send_recv(&req, get_tc_info, NULL, &info); + ret = libbpf_netlink_send_recv(&req, NETLINK_ROUTE, get_tc_info, NULL, + &info); if (ret < 0) return libbpf_err(ret); if (!info.processed) @@ -739,7 +836,7 @@ static int __bpf_tc_detach(const struct bpf_tc_hook *hook, return ret; } - return libbpf_netlink_send_recv(&req, NULL, NULL, NULL); + return libbpf_netlink_send_recv(&req, NETLINK_ROUTE, NULL, NULL, NULL); } int bpf_tc_detach(const struct bpf_tc_hook *hook, @@ -804,7 +901,8 @@ int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts) info.opts = opts; - ret = libbpf_netlink_send_recv(&req, get_tc_info, NULL, &info); + ret = libbpf_netlink_send_recv(&req, NETLINK_ROUTE, get_tc_info, NULL, + &info); if (ret < 0) return libbpf_err(ret); if (!info.processed) diff --git a/tools/lib/bpf/nlattr.c b/tools/lib/bpf/nlattr.c index 3900d052ed19..975e265eab3b 100644 --- a/tools/lib/bpf/nlattr.c +++ b/tools/lib/bpf/nlattr.c @@ -178,7 +178,7 @@ int libbpf_nla_dump_errormsg(struct nlmsghdr *nlh) hlen += nlmsg_len(&err->msg); attr = (struct nlattr *) ((void *) err + hlen); - alen = nlh->nlmsg_len - hlen; + alen = (void *)nlh + nlh->nlmsg_len - (void *)attr; if (libbpf_nla_parse(tb, NLMSGERR_ATTR_MAX, attr, alen, extack_policy) != 0) { diff --git a/tools/lib/bpf/nlattr.h b/tools/lib/bpf/nlattr.h index 4d15ae2ff812..d92d1c1de700 100644 --- a/tools/lib/bpf/nlattr.h +++ b/tools/lib/bpf/nlattr.h @@ -14,6 +14,7 @@ #include <errno.h> #include <linux/netlink.h> #include <linux/rtnetlink.h> +#include <linux/genetlink.h> /* avoid multiple definition of netlink features */ #define __LINUX_NETLINK_H @@ -58,6 +59,7 @@ struct libbpf_nla_req { union { struct ifinfomsg ifinfo; struct tcmsg tc; + struct genlmsghdr gnl; }; char buf[128]; }; @@ -89,11 +91,21 @@ static inline uint8_t libbpf_nla_getattr_u8(const struct nlattr *nla) return *(uint8_t *)libbpf_nla_data(nla); } +static inline uint16_t libbpf_nla_getattr_u16(const struct nlattr *nla) +{ + return *(uint16_t *)libbpf_nla_data(nla); +} + static inline uint32_t libbpf_nla_getattr_u32(const struct nlattr *nla) { return *(uint32_t *)libbpf_nla_data(nla); } +static inline uint64_t libbpf_nla_getattr_u64(const struct nlattr *nla) +{ + return *(uint64_t *)libbpf_nla_data(nla); +} + static inline const char *libbpf_nla_getattr_str(const struct nlattr *nla) { return (const char *)libbpf_nla_data(nla); diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c index 47855af25f3b..02199364db13 100644 --- a/tools/lib/bpf/ringbuf.c +++ b/tools/lib/bpf/ringbuf.c @@ -83,7 +83,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(map_fd, &info, &len); + err = bpf_map_get_info_by_fd(map_fd, &info, &len); if (err) { err = -errno; pr_warn("ringbuf: failed to get map info for fd=%d: %d\n", @@ -359,7 +359,7 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd) memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(map_fd, &info, &len); + err = bpf_map_get_info_by_fd(map_fd, &info, &len); if (err) { err = -errno; pr_warn("user ringbuf: failed to get map info for fd=%d: %d\n", map_fd, err); diff --git a/tools/lib/bpf/usdt.bpf.h b/tools/lib/bpf/usdt.bpf.h index fdfd235e52c4..0bd4c135acc2 100644 --- a/tools/lib/bpf/usdt.bpf.h +++ b/tools/lib/bpf/usdt.bpf.h @@ -130,7 +130,10 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res) if (!spec) return -ESRCH; - if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt) + if (arg_num >= BPF_USDT_MAX_ARG_CNT) + return -ENOENT; + barrier_var(arg_num); + if (arg_num >= spec->arg_cnt) return -ENOENT; arg_spec = &spec->args[arg_num]; diff --git a/tools/net/ynl/cli.py b/tools/net/ynl/cli.py new file mode 100755 index 000000000000..db410b74d539 --- /dev/null +++ b/tools/net/ynl/cli.py @@ -0,0 +1,52 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause + +import argparse +import json +import pprint +import time + +from lib import YnlFamily + + +def main(): + parser = argparse.ArgumentParser(description='YNL CLI sample') + parser.add_argument('--spec', dest='spec', type=str, required=True) + parser.add_argument('--schema', dest='schema', type=str) + parser.add_argument('--no-schema', action='store_true') + parser.add_argument('--json', dest='json_text', type=str) + parser.add_argument('--do', dest='do', type=str) + parser.add_argument('--dump', dest='dump', type=str) + parser.add_argument('--sleep', dest='sleep', type=int) + parser.add_argument('--subscribe', dest='ntf', type=str) + args = parser.parse_args() + + if args.no_schema: + args.schema = '' + + attrs = {} + if args.json_text: + attrs = json.loads(args.json_text) + + ynl = YnlFamily(args.spec, args.schema) + + if args.ntf: + ynl.ntf_subscribe(args.ntf) + + if args.sleep: + time.sleep(args.sleep) + + if args.do: + reply = ynl.do(args.do, attrs) + pprint.PrettyPrinter().pprint(reply) + if args.dump: + reply = ynl.dump(args.dump, attrs) + pprint.PrettyPrinter().pprint(reply) + + if args.ntf: + ynl.check_ntf() + pprint.PrettyPrinter().pprint(ynl.async_msg_queue) + + +if __name__ == "__main__": + main() diff --git a/tools/net/ynl/lib/__init__.py b/tools/net/ynl/lib/__init__.py new file mode 100644 index 000000000000..3c73f59eabab --- /dev/null +++ b/tools/net/ynl/lib/__init__.py @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: BSD-3-Clause + +from .nlspec import SpecAttr, SpecAttrSet, SpecFamily, SpecOperation +from .ynl import YnlFamily + +__all__ = ["SpecAttr", "SpecAttrSet", "SpecFamily", "SpecOperation", + "YnlFamily"] diff --git a/tools/net/ynl/lib/nlspec.py b/tools/net/ynl/lib/nlspec.py new file mode 100644 index 000000000000..e204679ad8b7 --- /dev/null +++ b/tools/net/ynl/lib/nlspec.py @@ -0,0 +1,310 @@ +# SPDX-License-Identifier: BSD-3-Clause + +import collections +import importlib +import os +import traceback +import yaml + + +# To be loaded dynamically as needed +jsonschema = None + + +class SpecElement: + """Netlink spec element. + + Abstract element of the Netlink spec. Implements the dictionary interface + for access to the raw spec. Supports iterative resolution of dependencies + across elements and class inheritance levels. The elements of the spec + may refer to each other, and although loops should be very rare, having + to maintain correct ordering of instantiation is painful, so the resolve() + method should be used to perform parts of init which require access to + other parts of the spec. + + Attributes: + yaml raw spec as loaded from the spec file + family back reference to the full family + + name name of the entity as listed in the spec (optional) + ident_name name which can be safely used as identifier in code (optional) + """ + def __init__(self, family, yaml): + self.yaml = yaml + self.family = family + + if 'name' in self.yaml: + self.name = self.yaml['name'] + self.ident_name = self.name.replace('-', '_') + + self._super_resolved = False + family.add_unresolved(self) + + def __getitem__(self, key): + return self.yaml[key] + + def __contains__(self, key): + return key in self.yaml + + def get(self, key, default=None): + return self.yaml.get(key, default) + + def resolve_up(self, up): + if not self._super_resolved: + up.resolve() + self._super_resolved = True + + def resolve(self): + pass + + +class SpecAttr(SpecElement): + """ Single Netlink atttribute type + + Represents a single attribute type within an attr space. + + Attributes: + value numerical ID when serialized + attr_set Attribute Set containing this attr + """ + def __init__(self, family, attr_set, yaml, value): + super().__init__(family, yaml) + + self.value = value + self.attr_set = attr_set + self.is_multi = yaml.get('multi-attr', False) + + +class SpecAttrSet(SpecElement): + """ Netlink Attribute Set class. + + Represents a ID space of attributes within Netlink. + + Note that unlike other elements, which expose contents of the raw spec + via the dictionary interface Attribute Set exposes attributes by name. + + Attributes: + attrs ordered dict of all attributes (indexed by name) + attrs_by_val ordered dict of all attributes (indexed by value) + subset_of parent set if this is a subset, otherwise None + """ + def __init__(self, family, yaml): + super().__init__(family, yaml) + + self.subset_of = self.yaml.get('subset-of', None) + + self.attrs = collections.OrderedDict() + self.attrs_by_val = collections.OrderedDict() + + val = 0 + for elem in self.yaml['attributes']: + if 'value' in elem: + val = elem['value'] + + attr = self.new_attr(elem, val) + self.attrs[attr.name] = attr + self.attrs_by_val[attr.value] = attr + val += 1 + + def new_attr(self, elem, value): + return SpecAttr(self.family, self, elem, value) + + def __getitem__(self, key): + return self.attrs[key] + + def __contains__(self, key): + return key in self.attrs + + def __iter__(self): + yield from self.attrs + + def items(self): + return self.attrs.items() + + +class SpecOperation(SpecElement): + """Netlink Operation + + Information about a single Netlink operation. + + Attributes: + value numerical ID when serialized, None if req/rsp values differ + + req_value numerical ID when serialized, user -> kernel + rsp_value numerical ID when serialized, user <- kernel + is_call bool, whether the operation is a call + is_async bool, whether the operation is a notification + is_resv bool, whether the operation does not exist (it's just a reserved ID) + attr_set attribute set name + + yaml raw spec as loaded from the spec file + """ + def __init__(self, family, yaml, req_value, rsp_value): + super().__init__(family, yaml) + + self.value = req_value if req_value == rsp_value else None + self.req_value = req_value + self.rsp_value = rsp_value + + self.is_call = 'do' in yaml or 'dump' in yaml + self.is_async = 'notify' in yaml or 'event' in yaml + self.is_resv = not self.is_async and not self.is_call + + # Added by resolve: + self.attr_set = None + delattr(self, "attr_set") + + def resolve(self): + self.resolve_up(super()) + + if 'attribute-set' in self.yaml: + attr_set_name = self.yaml['attribute-set'] + elif 'notify' in self.yaml: + msg = self.family.msgs[self.yaml['notify']] + attr_set_name = msg['attribute-set'] + elif self.is_resv: + attr_set_name = '' + else: + raise Exception(f"Can't resolve attribute set for op '{self.name}'") + if attr_set_name: + self.attr_set = self.family.attr_sets[attr_set_name] + + +class SpecFamily(SpecElement): + """ Netlink Family Spec class. + + Netlink family information loaded from a spec (e.g. in YAML). + Takes care of unfolding implicit information which can be skipped + in the spec itself for brevity. + + The class can be used like a dictionary to access the raw spec + elements but that's usually a bad idea. + + Attributes: + proto protocol type (e.g. genetlink) + + attr_sets dict of attribute sets + msgs dict of all messages (index by name) + msgs_by_value dict of all messages (indexed by name) + ops dict of all valid requests / responses + """ + def __init__(self, spec_path, schema_path=None): + with open(spec_path, "r") as stream: + spec = yaml.safe_load(stream) + + self._resolution_list = [] + + super().__init__(self, spec) + + self.proto = self.yaml.get('protocol', 'genetlink') + + if schema_path is None: + schema_path = os.path.dirname(os.path.dirname(spec_path)) + f'/{self.proto}.yaml' + if schema_path: + global jsonschema + + with open(schema_path, "r") as stream: + schema = yaml.safe_load(stream) + + if jsonschema is None: + jsonschema = importlib.import_module("jsonschema") + + jsonschema.validate(self.yaml, schema) + + self.attr_sets = collections.OrderedDict() + self.msgs = collections.OrderedDict() + self.req_by_value = collections.OrderedDict() + self.rsp_by_value = collections.OrderedDict() + self.ops = collections.OrderedDict() + + last_exception = None + while len(self._resolution_list) > 0: + resolved = [] + unresolved = self._resolution_list + self._resolution_list = [] + + for elem in unresolved: + try: + elem.resolve() + except (KeyError, AttributeError) as e: + self._resolution_list.append(elem) + last_exception = e + continue + + resolved.append(elem) + + if len(resolved) == 0: + traceback.print_exception(last_exception) + raise Exception("Could not resolve any spec element, infinite loop?") + + def new_attr_set(self, elem): + return SpecAttrSet(self, elem) + + def new_operation(self, elem, req_val, rsp_val): + return SpecOperation(self, elem, req_val, rsp_val) + + def add_unresolved(self, elem): + self._resolution_list.append(elem) + + def _dictify_ops_unified(self): + val = 0 + for elem in self.yaml['operations']['list']: + if 'value' in elem: + val = elem['value'] + + op = self.new_operation(elem, val, val) + val += 1 + + self.msgs[op.name] = op + + def _dictify_ops_directional(self): + req_val = rsp_val = 0 + for elem in self.yaml['operations']['list']: + if 'notify' in elem: + if 'value' in elem: + rsp_val = elem['value'] + req_val_next = req_val + rsp_val_next = rsp_val + 1 + req_val = None + elif 'do' in elem or 'dump' in elem: + mode = elem['do'] if 'do' in elem else elem['dump'] + + v = mode.get('request', {}).get('value', None) + if v: + req_val = v + v = mode.get('reply', {}).get('value', None) + if v: + rsp_val = v + + rsp_inc = 1 if 'reply' in mode else 0 + req_val_next = req_val + 1 + rsp_val_next = rsp_val + rsp_inc + else: + raise Exception("Can't parse directional ops") + + op = self.new_operation(elem, req_val, rsp_val) + req_val = req_val_next + rsp_val = rsp_val_next + + self.msgs[op.name] = op + + def resolve(self): + self.resolve_up(super()) + + for elem in self.yaml['attribute-sets']: + attr_set = self.new_attr_set(elem) + self.attr_sets[elem['name']] = attr_set + + msg_id_model = self.yaml['operations'].get('enum-model', 'unified') + if msg_id_model == 'unified': + self._dictify_ops_unified() + elif msg_id_model == 'directional': + self._dictify_ops_directional() + + for op in self.msgs.values(): + if op.req_value is not None: + self.req_by_value[op.req_value] = op + if op.rsp_value is not None: + self.rsp_by_value[op.rsp_value] = op + if not op.is_async and 'attribute-set' in op: + self.ops[op.name] = op diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py new file mode 100644 index 000000000000..1c7411ee04dc --- /dev/null +++ b/tools/net/ynl/lib/ynl.py @@ -0,0 +1,528 @@ +# SPDX-License-Identifier: BSD-3-Clause + +import functools +import os +import random +import socket +import struct +import yaml + +from .nlspec import SpecFamily + +# +# Generic Netlink code which should really be in some library, but I can't quickly find one. +# + + +class Netlink: + # Netlink socket + SOL_NETLINK = 270 + + NETLINK_ADD_MEMBERSHIP = 1 + NETLINK_CAP_ACK = 10 + NETLINK_EXT_ACK = 11 + + # Netlink message + NLMSG_ERROR = 2 + NLMSG_DONE = 3 + + NLM_F_REQUEST = 1 + NLM_F_ACK = 4 + NLM_F_ROOT = 0x100 + NLM_F_MATCH = 0x200 + NLM_F_APPEND = 0x800 + + NLM_F_CAPPED = 0x100 + NLM_F_ACK_TLVS = 0x200 + + NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH + + NLA_F_NESTED = 0x8000 + NLA_F_NET_BYTEORDER = 0x4000 + + NLA_TYPE_MASK = NLA_F_NESTED | NLA_F_NET_BYTEORDER + + # Genetlink defines + NETLINK_GENERIC = 16 + + GENL_ID_CTRL = 0x10 + + # nlctrl + CTRL_CMD_GETFAMILY = 3 + + CTRL_ATTR_FAMILY_ID = 1 + CTRL_ATTR_FAMILY_NAME = 2 + CTRL_ATTR_MAXATTR = 5 + CTRL_ATTR_MCAST_GROUPS = 7 + + CTRL_ATTR_MCAST_GRP_NAME = 1 + CTRL_ATTR_MCAST_GRP_ID = 2 + + # Extack types + NLMSGERR_ATTR_MSG = 1 + NLMSGERR_ATTR_OFFS = 2 + NLMSGERR_ATTR_COOKIE = 3 + NLMSGERR_ATTR_POLICY = 4 + NLMSGERR_ATTR_MISS_TYPE = 5 + NLMSGERR_ATTR_MISS_NEST = 6 + + +class NlAttr: + def __init__(self, raw, offset): + self._len, self._type = struct.unpack("HH", raw[offset:offset + 4]) + self.type = self._type & ~Netlink.NLA_TYPE_MASK + self.payload_len = self._len + self.full_len = (self.payload_len + 3) & ~3 + self.raw = raw[offset + 4:offset + self.payload_len] + + def as_u8(self): + return struct.unpack("B", self.raw)[0] + + def as_u16(self): + return struct.unpack("H", self.raw)[0] + + def as_u32(self): + return struct.unpack("I", self.raw)[0] + + def as_u64(self): + return struct.unpack("Q", self.raw)[0] + + def as_strz(self): + return self.raw.decode('ascii')[:-1] + + def as_bin(self): + return self.raw + + def __repr__(self): + return f"[type:{self.type} len:{self._len}] {self.raw}" + + +class NlAttrs: + def __init__(self, msg): + self.attrs = [] + + offset = 0 + while offset < len(msg): + attr = NlAttr(msg, offset) + offset += attr.full_len + self.attrs.append(attr) + + def __iter__(self): + yield from self.attrs + + def __repr__(self): + msg = '' + for a in self.attrs: + if msg: + msg += '\n' + msg += repr(a) + return msg + + +class NlMsg: + def __init__(self, msg, offset, attr_space=None): + self.hdr = msg[offset:offset + 16] + + self.nl_len, self.nl_type, self.nl_flags, self.nl_seq, self.nl_portid = \ + struct.unpack("IHHII", self.hdr) + + self.raw = msg[offset + 16:offset + self.nl_len] + + self.error = 0 + self.done = 0 + + extack_off = None + if self.nl_type == Netlink.NLMSG_ERROR: + self.error = struct.unpack("i", self.raw[0:4])[0] + self.done = 1 + extack_off = 20 + elif self.nl_type == Netlink.NLMSG_DONE: + self.done = 1 + extack_off = 4 + + self.extack = None + if self.nl_flags & Netlink.NLM_F_ACK_TLVS and extack_off: + self.extack = dict() + extack_attrs = NlAttrs(self.raw[extack_off:]) + for extack in extack_attrs: + if extack.type == Netlink.NLMSGERR_ATTR_MSG: + self.extack['msg'] = extack.as_strz() + elif extack.type == Netlink.NLMSGERR_ATTR_MISS_TYPE: + self.extack['miss-type'] = extack.as_u32() + elif extack.type == Netlink.NLMSGERR_ATTR_MISS_NEST: + self.extack['miss-nest'] = extack.as_u32() + elif extack.type == Netlink.NLMSGERR_ATTR_OFFS: + self.extack['bad-attr-offs'] = extack.as_u32() + else: + if 'unknown' not in self.extack: + self.extack['unknown'] = [] + self.extack['unknown'].append(extack) + + if attr_space: + # We don't have the ability to parse nests yet, so only do global + if 'miss-type' in self.extack and 'miss-nest' not in self.extack: + miss_type = self.extack['miss-type'] + if miss_type in attr_space.attrs_by_val: + spec = attr_space.attrs_by_val[miss_type] + desc = spec['name'] + if 'doc' in spec: + desc += f" ({spec['doc']})" + self.extack['miss-type'] = desc + + def __repr__(self): + msg = f"nl_len = {self.nl_len} ({len(self.raw)}) nl_flags = 0x{self.nl_flags:x} nl_type = {self.nl_type}\n" + if self.error: + msg += '\terror: ' + str(self.error) + if self.extack: + msg += '\textack: ' + repr(self.extack) + return msg + + +class NlMsgs: + def __init__(self, data, attr_space=None): + self.msgs = [] + + offset = 0 + while offset < len(data): + msg = NlMsg(data, offset, attr_space=attr_space) + offset += msg.nl_len + self.msgs.append(msg) + + def __iter__(self): + yield from self.msgs + + +genl_family_name_to_id = None + + +def _genl_msg(nl_type, nl_flags, genl_cmd, genl_version, seq=None): + # we prepend length in _genl_msg_finalize() + if seq is None: + seq = random.randint(1, 1024) + nlmsg = struct.pack("HHII", nl_type, nl_flags, seq, 0) + genlmsg = struct.pack("bbH", genl_cmd, genl_version, 0) + return nlmsg + genlmsg + + +def _genl_msg_finalize(msg): + return struct.pack("I", len(msg) + 4) + msg + + +def _genl_load_families(): + with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, Netlink.NETLINK_GENERIC) as sock: + sock.setsockopt(Netlink.SOL_NETLINK, Netlink.NETLINK_CAP_ACK, 1) + + msg = _genl_msg(Netlink.GENL_ID_CTRL, + Netlink.NLM_F_REQUEST | Netlink.NLM_F_ACK | Netlink.NLM_F_DUMP, + Netlink.CTRL_CMD_GETFAMILY, 1) + msg = _genl_msg_finalize(msg) + + sock.send(msg, 0) + + global genl_family_name_to_id + genl_family_name_to_id = dict() + + while True: + reply = sock.recv(128 * 1024) + nms = NlMsgs(reply) + for nl_msg in nms: + if nl_msg.error: + print("Netlink error:", nl_msg.error) + return + if nl_msg.done: + return + + gm = GenlMsg(nl_msg) + fam = dict() + for attr in gm.raw_attrs: + if attr.type == Netlink.CTRL_ATTR_FAMILY_ID: + fam['id'] = attr.as_u16() + elif attr.type == Netlink.CTRL_ATTR_FAMILY_NAME: + fam['name'] = attr.as_strz() + elif attr.type == Netlink.CTRL_ATTR_MAXATTR: + fam['maxattr'] = attr.as_u32() + elif attr.type == Netlink.CTRL_ATTR_MCAST_GROUPS: + fam['mcast'] = dict() + for entry in NlAttrs(attr.raw): + mcast_name = None + mcast_id = None + for entry_attr in NlAttrs(entry.raw): + if entry_attr.type == Netlink.CTRL_ATTR_MCAST_GRP_NAME: + mcast_name = entry_attr.as_strz() + elif entry_attr.type == Netlink.CTRL_ATTR_MCAST_GRP_ID: + mcast_id = entry_attr.as_u32() + if mcast_name and mcast_id is not None: + fam['mcast'][mcast_name] = mcast_id + if 'name' in fam and 'id' in fam: + genl_family_name_to_id[fam['name']] = fam + + +class GenlMsg: + def __init__(self, nl_msg): + self.nl = nl_msg + + self.hdr = nl_msg.raw[0:4] + self.raw = nl_msg.raw[4:] + + self.genl_cmd, self.genl_version, _ = struct.unpack("bbH", self.hdr) + + self.raw_attrs = NlAttrs(self.raw) + + def __repr__(self): + msg = repr(self.nl) + msg += f"\tgenl_cmd = {self.genl_cmd} genl_ver = {self.genl_version}\n" + for a in self.raw_attrs: + msg += '\t\t' + repr(a) + '\n' + return msg + + +class GenlFamily: + def __init__(self, family_name): + self.family_name = family_name + + global genl_family_name_to_id + if genl_family_name_to_id is None: + _genl_load_families() + + self.genl_family = genl_family_name_to_id[family_name] + self.family_id = genl_family_name_to_id[family_name]['id'] + + +# +# YNL implementation details. +# + + +class YnlFamily(SpecFamily): + def __init__(self, def_path, schema=None): + super().__init__(def_path, schema) + + self.include_raw = False + + self.sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, Netlink.NETLINK_GENERIC) + self.sock.setsockopt(Netlink.SOL_NETLINK, Netlink.NETLINK_CAP_ACK, 1) + self.sock.setsockopt(Netlink.SOL_NETLINK, Netlink.NETLINK_EXT_ACK, 1) + + self._types = dict() + + for elem in self.yaml.get('definitions', []): + self._types[elem['name']] = elem + + self.async_msg_ids = set() + self.async_msg_queue = [] + + for msg in self.msgs.values(): + if msg.is_async: + self.async_msg_ids.add(msg.rsp_value) + + for op_name, op in self.ops.items(): + bound_f = functools.partial(self._op, op_name) + setattr(self, op.ident_name, bound_f) + + self.family = GenlFamily(self.yaml['name']) + + def ntf_subscribe(self, mcast_name): + if mcast_name not in self.family.genl_family['mcast']: + raise Exception(f'Multicast group "{mcast_name}" not present in the family') + + self.sock.bind((0, 0)) + self.sock.setsockopt(Netlink.SOL_NETLINK, Netlink.NETLINK_ADD_MEMBERSHIP, + self.family.genl_family['mcast'][mcast_name]) + + def _add_attr(self, space, name, value): + attr = self.attr_sets[space][name] + nl_type = attr.value + if attr["type"] == 'nest': + nl_type |= Netlink.NLA_F_NESTED + attr_payload = b'' + for subname, subvalue in value.items(): + attr_payload += self._add_attr(attr['nested-attributes'], subname, subvalue) + elif attr["type"] == 'flag': + attr_payload = b'' + elif attr["type"] == 'u32': + attr_payload = struct.pack("I", int(value)) + elif attr["type"] == 'string': + attr_payload = str(value).encode('ascii') + b'\x00' + elif attr["type"] == 'binary': + attr_payload = value + else: + raise Exception(f'Unknown type at {space} {name} {value} {attr["type"]}') + + pad = b'\x00' * ((4 - len(attr_payload) % 4) % 4) + return struct.pack('HH', len(attr_payload) + 4, nl_type) + attr_payload + pad + + def _decode_enum(self, rsp, attr_spec): + raw = rsp[attr_spec['name']] + enum = self._types[attr_spec['enum']] + i = attr_spec.get('value-start', 0) + if 'enum-as-flags' in attr_spec and attr_spec['enum-as-flags']: + value = set() + while raw: + if raw & 1: + value.add(enum['entries'][i]) + raw >>= 1 + i += 1 + else: + value = enum['entries'][raw - i] + rsp[attr_spec['name']] = value + + def _decode(self, attrs, space): + attr_space = self.attr_sets[space] + rsp = dict() + for attr in attrs: + attr_spec = attr_space.attrs_by_val[attr.type] + if attr_spec["type"] == 'nest': + subdict = self._decode(NlAttrs(attr.raw), attr_spec['nested-attributes']) + decoded = subdict + elif attr_spec['type'] == 'u8': + decoded = attr.as_u8() + elif attr_spec['type'] == 'u32': + decoded = attr.as_u32() + elif attr_spec['type'] == 'u64': + decoded = attr.as_u64() + elif attr_spec["type"] == 'string': + decoded = attr.as_strz() + elif attr_spec["type"] == 'binary': + decoded = attr.as_bin() + elif attr_spec["type"] == 'flag': + decoded = True + else: + raise Exception(f'Unknown {attr.type} {attr_spec["name"]} {attr_spec["type"]}') + + if not attr_spec.is_multi: + rsp[attr_spec['name']] = decoded + elif attr_spec.name in rsp: + rsp[attr_spec.name].append(decoded) + else: + rsp[attr_spec.name] = [decoded] + + if 'enum' in attr_spec: + self._decode_enum(rsp, attr_spec) + return rsp + + def _decode_extack_path(self, attrs, attr_set, offset, target): + for attr in attrs: + attr_spec = attr_set.attrs_by_val[attr.type] + if offset > target: + break + if offset == target: + return '.' + attr_spec.name + + if offset + attr.full_len <= target: + offset += attr.full_len + continue + if attr_spec['type'] != 'nest': + raise Exception(f"Can't dive into {attr.type} ({attr_spec['name']}) for extack") + offset += 4 + subpath = self._decode_extack_path(NlAttrs(attr.raw), + self.attr_sets[attr_spec['nested-attributes']], + offset, target) + if subpath is None: + return None + return '.' + attr_spec.name + subpath + + return None + + def _decode_extack(self, request, attr_space, extack): + if 'bad-attr-offs' not in extack: + return + + genl_req = GenlMsg(NlMsg(request, 0, attr_space=attr_space)) + path = self._decode_extack_path(genl_req.raw_attrs, attr_space, + 20, extack['bad-attr-offs']) + if path: + del extack['bad-attr-offs'] + extack['bad-attr'] = path + + def handle_ntf(self, nl_msg, genl_msg): + msg = dict() + if self.include_raw: + msg['nlmsg'] = nl_msg + msg['genlmsg'] = genl_msg + op = self.rsp_by_value[genl_msg.genl_cmd] + msg['name'] = op['name'] + msg['msg'] = self._decode(genl_msg.raw_attrs, op.attr_set.name) + self.async_msg_queue.append(msg) + + def check_ntf(self): + while True: + try: + reply = self.sock.recv(128 * 1024, socket.MSG_DONTWAIT) + except BlockingIOError: + return + + nms = NlMsgs(reply) + for nl_msg in nms: + if nl_msg.error: + print("Netlink error in ntf!?", os.strerror(-nl_msg.error)) + print(nl_msg) + continue + if nl_msg.done: + print("Netlink done while checking for ntf!?") + continue + + gm = GenlMsg(nl_msg) + if gm.genl_cmd not in self.async_msg_ids: + print("Unexpected msg id done while checking for ntf", gm) + continue + + self.handle_ntf(nl_msg, gm) + + def _op(self, method, vals, dump=False): + op = self.ops[method] + + nl_flags = Netlink.NLM_F_REQUEST | Netlink.NLM_F_ACK + if dump: + nl_flags |= Netlink.NLM_F_DUMP + + req_seq = random.randint(1024, 65535) + msg = _genl_msg(self.family.family_id, nl_flags, op.req_value, 1, req_seq) + for name, value in vals.items(): + msg += self._add_attr(op.attr_set.name, name, value) + msg = _genl_msg_finalize(msg) + + self.sock.send(msg, 0) + + done = False + rsp = [] + while not done: + reply = self.sock.recv(128 * 1024) + nms = NlMsgs(reply, attr_space=op.attr_set) + for nl_msg in nms: + if nl_msg.extack: + self._decode_extack(msg, op.attr_set, nl_msg.extack) + + if nl_msg.error: + print("Netlink error:", os.strerror(-nl_msg.error)) + print(nl_msg) + return + if nl_msg.done: + if nl_msg.extack: + print("Netlink warning:") + print(nl_msg) + done = True + break + + gm = GenlMsg(nl_msg) + # Check if this is a reply to our request + if nl_msg.nl_seq != req_seq or gm.genl_cmd != op.rsp_value: + if gm.genl_cmd in self.async_msg_ids: + self.handle_ntf(nl_msg, gm) + continue + else: + print('Unexpected message: ' + repr(gm)) + continue + + rsp.append(self._decode(gm.raw_attrs, op.attr_set.name)) + + if not rsp: + return None + if not dump and len(rsp) == 1: + return rsp[0] + return rsp + + def do(self, method, vals): + return self._op(method, vals) + + def dump(self, method, vals): + return self._op(method, vals, dump=True) diff --git a/tools/net/ynl/ynl-gen-c.py b/tools/net/ynl/ynl-gen-c.py new file mode 100755 index 000000000000..3942f24b9163 --- /dev/null +++ b/tools/net/ynl/ynl-gen-c.py @@ -0,0 +1,2357 @@ +#!/usr/bin/env python3 + +import argparse +import collections +import os +import yaml + +from lib import SpecFamily, SpecAttrSet, SpecAttr, SpecOperation + + +def c_upper(name): + return name.upper().replace('-', '_') + + +def c_lower(name): + return name.lower().replace('-', '_') + + +class BaseNlLib: + def get_family_id(self): + return 'ys->family_id' + + def parse_cb_run(self, cb, data, is_dump=False, indent=1): + ind = '\n\t\t' + '\t' * indent + ' ' + if is_dump: + return f"mnl_cb_run2(ys->rx_buf, len, 0, 0, {cb}, {data},{ind}ynl_cb_array, NLMSG_MIN_TYPE)" + else: + return f"mnl_cb_run2(ys->rx_buf, len, ys->seq, ys->portid,{ind}{cb}, {data},{ind}" + \ + "ynl_cb_array, NLMSG_MIN_TYPE)" + + +class Type(SpecAttr): + def __init__(self, family, attr_set, attr, value): + super().__init__(family, attr_set, attr, value) + + self.attr = attr + self.attr_set = attr_set + self.type = attr['type'] + self.checks = attr.get('checks', {}) + + if 'len' in attr: + self.len = attr['len'] + if 'nested-attributes' in attr: + self.nested_attrs = attr['nested-attributes'] + if self.nested_attrs == family.name: + self.nested_render_name = f"{family.name}" + else: + self.nested_render_name = f"{family.name}_{c_lower(self.nested_attrs)}" + + self.c_name = c_lower(self.name) + if self.c_name in _C_KW: + self.c_name += '_' + + # Added by resolve(): + self.enum_name = None + delattr(self, "enum_name") + + def resolve(self): + self.enum_name = f"{self.attr_set.name_prefix}{self.name}" + self.enum_name = c_upper(self.enum_name) + + def is_multi_val(self): + return None + + def is_scalar(self): + return self.type in {'u8', 'u16', 'u32', 'u64', 's32', 's64'} + + def presence_type(self): + return 'bit' + + def presence_member(self, space, type_filter): + if self.presence_type() != type_filter: + return + + if self.presence_type() == 'bit': + pfx = '__' if space == 'user' else '' + return f"{pfx}u32 {self.c_name}:1;" + + if self.presence_type() == 'len': + pfx = '__' if space == 'user' else '' + return f"{pfx}u32 {self.c_name}_len;" + + def _complex_member_type(self, ri): + return None + + def free_needs_iter(self): + return False + + def free(self, ri, var, ref): + if self.is_multi_val() or self.presence_type() == 'len': + ri.cw.p(f'free({var}->{ref}{self.c_name});') + + def arg_member(self, ri): + member = self._complex_member_type(ri) + if member: + return [member + ' *' + self.c_name] + raise Exception(f"Struct member not implemented for class type {self.type}") + + def struct_member(self, ri): + if self.is_multi_val(): + ri.cw.p(f"unsigned int n_{self.c_name};") + member = self._complex_member_type(ri) + if member: + ptr = '*' if self.is_multi_val() else '' + ri.cw.p(f"{member} {ptr}{self.c_name};") + return + members = self.arg_member(ri) + for one in members: + ri.cw.p(one + ';') + + def _attr_policy(self, policy): + return '{ .type = ' + policy + ', }' + + def attr_policy(self, cw): + policy = c_upper('nla-' + self.attr['type']) + + spec = self._attr_policy(policy) + cw.p(f"\t[{self.enum_name}] = {spec},") + + def _attr_typol(self): + raise Exception(f"Type policy not implemented for class type {self.type}") + + def attr_typol(self, cw): + typol = self._attr_typol() + cw.p(f'[{self.enum_name}] = {"{"} .name = "{self.name}", {typol}{"}"},') + + def _attr_put_line(self, ri, var, line): + if self.presence_type() == 'bit': + ri.cw.p(f"if ({var}->_present.{self.c_name})") + elif self.presence_type() == 'len': + ri.cw.p(f"if ({var}->_present.{self.c_name}_len)") + ri.cw.p(f"{line};") + + def _attr_put_simple(self, ri, var, put_type): + line = f"mnl_attr_put_{put_type}(nlh, {self.enum_name}, {var}->{self.c_name})" + self._attr_put_line(ri, var, line) + + def attr_put(self, ri, var): + raise Exception(f"Put not implemented for class type {self.type}") + + def _attr_get(self, ri, var): + raise Exception(f"Attr get not implemented for class type {self.type}") + + def attr_get(self, ri, var, first): + lines, init_lines, local_vars = self._attr_get(ri, var) + if type(lines) is str: + lines = [lines] + if type(init_lines) is str: + init_lines = [init_lines] + + kw = 'if' if first else 'else if' + ri.cw.block_start(line=f"{kw} (mnl_attr_get_type(attr) == {self.enum_name})") + if local_vars: + for local in local_vars: + ri.cw.p(local) + ri.cw.nl() + + if not self.is_multi_val(): + ri.cw.p("if (ynl_attr_validate(yarg, attr))") + ri.cw.p("return MNL_CB_ERROR;") + if self.presence_type() == 'bit': + ri.cw.p(f"{var}->_present.{self.c_name} = 1;") + + if init_lines: + ri.cw.nl() + for line in init_lines: + ri.cw.p(line) + + for line in lines: + ri.cw.p(line) + ri.cw.block_end() + + def _setter_lines(self, ri, member, presence): + raise Exception(f"Setter not implemented for class type {self.type}") + + def setter(self, ri, space, direction, deref=False, ref=None): + ref = (ref if ref else []) + [self.c_name] + var = "req" + member = f"{var}->{'.'.join(ref)}" + + code = [] + presence = '' + for i in range(0, len(ref)): + presence = f"{var}->{'.'.join(ref[:i] + [''])}_present.{ref[i]}" + if self.presence_type() == 'bit': + code.append(presence + ' = 1;') + code += self._setter_lines(ri, member, presence) + + ri.cw.write_func('static inline void', + f"{op_prefix(ri, direction, deref=deref)}_set_{'_'.join(ref)}", + body=code, + args=[f'{type_name(ri, direction, deref=deref)} *{var}'] + self.arg_member(ri)) + + +class TypeUnused(Type): + def presence_type(self): + return '' + + def _attr_typol(self): + return '.type = YNL_PT_REJECT, ' + + def attr_policy(self, cw): + pass + + +class TypePad(Type): + def presence_type(self): + return '' + + def _attr_typol(self): + return '.type = YNL_PT_REJECT, ' + + def attr_policy(self, cw): + pass + + +class TypeScalar(Type): + def __init__(self, family, attr_set, attr, value): + super().__init__(family, attr_set, attr, value) + + self.byte_order_comment = '' + if 'byte-order' in attr: + self.byte_order_comment = f" /* {attr['byte-order']} */" + + # Added by resolve(): + self.is_bitfield = None + delattr(self, "is_bitfield") + self.type_name = None + delattr(self, "type_name") + + def resolve(self): + self.resolve_up(super()) + + if 'enum-as-flags' in self.attr and self.attr['enum-as-flags']: + self.is_bitfield = True + elif 'enum' in self.attr: + self.is_bitfield = self.family.consts[self.attr['enum']]['type'] == 'flags' + else: + self.is_bitfield = False + + if 'enum' in self.attr and not self.is_bitfield: + self.type_name = f"enum {self.family.name}_{c_lower(self.attr['enum'])}" + else: + self.type_name = '__' + self.type + + def _mnl_type(self): + t = self.type + # mnl does not have a helper for signed types + if t[0] == 's': + t = 'u' + t[1:] + return t + + def _attr_policy(self, policy): + if 'flags-mask' in self.checks or self.is_bitfield: + if self.is_bitfield: + mask = self.family.consts[self.attr['enum']].get_mask() + else: + flags = self.family.consts[self.checks['flags-mask']] + flag_cnt = len(flags['entries']) + mask = (1 << flag_cnt) - 1 + return f"NLA_POLICY_MASK({policy}, 0x{mask:x})" + elif 'min' in self.checks: + return f"NLA_POLICY_MIN({policy}, {self.checks['min']})" + elif 'enum' in self.attr: + enum = self.family.consts[self.attr['enum']] + cnt = len(enum['entries']) + return f"NLA_POLICY_MAX({policy}, {cnt - 1})" + return super()._attr_policy(policy) + + def _attr_typol(self): + return f'.type = YNL_PT_U{self.type[1:]}, ' + + def arg_member(self, ri): + return [f'{self.type_name} {self.c_name}{self.byte_order_comment}'] + + def attr_put(self, ri, var): + self._attr_put_simple(ri, var, self._mnl_type()) + + def _attr_get(self, ri, var): + return f"{var}->{self.c_name} = mnl_attr_get_{self._mnl_type()}(attr);", None, None + + def _setter_lines(self, ri, member, presence): + return [f"{member} = {self.c_name};"] + + +class TypeFlag(Type): + def arg_member(self, ri): + return [] + + def _attr_typol(self): + return '.type = YNL_PT_FLAG, ' + + def attr_put(self, ri, var): + self._attr_put_line(ri, var, f"mnl_attr_put(nlh, {self.enum_name}, 0, NULL)") + + def _attr_get(self, ri, var): + return [], None, None + + def _setter_lines(self, ri, member, presence): + return [] + + +class TypeString(Type): + def arg_member(self, ri): + return [f"const char *{self.c_name}"] + + def presence_type(self): + return 'len' + + def struct_member(self, ri): + ri.cw.p(f"char *{self.c_name};") + + def _attr_typol(self): + return f'.type = YNL_PT_NUL_STR, ' + + def _attr_policy(self, policy): + mem = '{ .type = ' + policy + if 'max-len' in self.checks: + mem += ', .len = ' + str(self.checks['max-len']) + mem += ', }' + return mem + + def attr_policy(self, cw): + if self.checks.get('unterminated-ok', False): + policy = 'NLA_STRING' + else: + policy = 'NLA_NUL_STRING' + + spec = self._attr_policy(policy) + cw.p(f"\t[{self.enum_name}] = {spec},") + + def attr_put(self, ri, var): + self._attr_put_simple(ri, var, 'strz') + + def _attr_get(self, ri, var): + len_mem = var + '->_present.' + self.c_name + '_len' + return [f"{len_mem} = len;", + f"{var}->{self.c_name} = malloc(len + 1);", + f"memcpy({var}->{self.c_name}, mnl_attr_get_str(attr), len);", + f"{var}->{self.c_name}[len] = 0;"], \ + ['len = strnlen(mnl_attr_get_str(attr), mnl_attr_get_payload_len(attr));'], \ + ['unsigned int len;'] + + def _setter_lines(self, ri, member, presence): + return [f"free({member});", + f"{presence}_len = strlen({self.c_name});", + f"{member} = malloc({presence}_len + 1);", + f'memcpy({member}, {self.c_name}, {presence}_len);', + f'{member}[{presence}_len] = 0;'] + + +class TypeBinary(Type): + def arg_member(self, ri): + return [f"const void *{self.c_name}", 'size_t len'] + + def presence_type(self): + return 'len' + + def struct_member(self, ri): + ri.cw.p(f"void *{self.c_name};") + + def _attr_typol(self): + return f'.type = YNL_PT_BINARY,' + + def _attr_policy(self, policy): + mem = '{ ' + if len(self.checks) == 1 and 'min-len' in self.checks: + mem += '.len = ' + str(self.checks['min-len']) + elif len(self.checks) == 0: + mem += '.type = NLA_BINARY' + else: + raise Exception('One or more of binary type checks not implemented, yet') + mem += ', }' + return mem + + def attr_put(self, ri, var): + self._attr_put_line(ri, var, f"mnl_attr_put(nlh, {self.enum_name}, " + + f"{var}->_present.{self.c_name}_len, {var}->{self.c_name})") + + def _attr_get(self, ri, var): + len_mem = var + '->_present.' + self.c_name + '_len' + return [f"{len_mem} = len;", + f"{var}->{self.c_name} = malloc(len);", + f"memcpy({var}->{self.c_name}, mnl_attr_get_payload(attr), len);"], \ + ['len = mnl_attr_get_payload_len(attr);'], \ + ['unsigned int len;'] + + def _setter_lines(self, ri, member, presence): + return [f"free({member});", + f"{member} = malloc({presence}_len);", + f'memcpy({member}, {self.c_name}, {presence}_len);'] + + +class TypeNest(Type): + def _complex_member_type(self, ri): + return f"struct {self.nested_render_name}" + + def free(self, ri, var, ref): + ri.cw.p(f'{self.nested_render_name}_free(&{var}->{ref}{self.c_name});') + + def _attr_typol(self): + return f'.type = YNL_PT_NEST, .nest = &{self.nested_render_name}_nest, ' + + def _attr_policy(self, policy): + return 'NLA_POLICY_NESTED(' + self.nested_render_name + '_nl_policy)' + + def attr_put(self, ri, var): + self._attr_put_line(ri, var, f"{self.nested_render_name}_put(nlh, " + + f"{self.enum_name}, &{var}->{self.c_name})") + + def _attr_get(self, ri, var): + get_lines = [f"{self.nested_render_name}_parse(&parg, attr);"] + init_lines = [f"parg.rsp_policy = &{self.nested_render_name}_nest;", + f"parg.data = &{var}->{self.c_name};"] + return get_lines, init_lines, None + + def setter(self, ri, space, direction, deref=False, ref=None): + ref = (ref if ref else []) + [self.c_name] + + for _, attr in ri.family.pure_nested_structs[self.nested_attrs].member_list(): + attr.setter(ri, self.nested_attrs, direction, deref=deref, ref=ref) + + +class TypeMultiAttr(Type): + def is_multi_val(self): + return True + + def presence_type(self): + return 'count' + + def _complex_member_type(self, ri): + if 'type' not in self.attr or self.attr['type'] == 'nest': + return f"struct {self.nested_render_name}" + elif self.attr['type'] in scalars: + scalar_pfx = '__' if ri.ku_space == 'user' else '' + return scalar_pfx + self.attr['type'] + else: + raise Exception(f"Sub-type {self.attr['type']} not supported yet") + + def free_needs_iter(self): + return 'type' not in self.attr or self.attr['type'] == 'nest' + + def free(self, ri, var, ref): + if 'type' not in self.attr or self.attr['type'] == 'nest': + ri.cw.p(f"for (i = 0; i < {var}->{ref}n_{self.c_name}; i++)") + ri.cw.p(f'{self.nested_render_name}_free(&{var}->{ref}{self.c_name}[i]);') + + def _attr_typol(self): + if 'type' not in self.attr or self.attr['type'] == 'nest': + return f'.type = YNL_PT_NEST, .nest = &{self.nested_render_name}_nest, ' + elif self.attr['type'] in scalars: + return f".type = YNL_PT_U{self.attr['type'][1:]}, " + else: + raise Exception(f"Sub-type {self.attr['type']} not supported yet") + + def _attr_get(self, ri, var): + return f'{var}->n_{self.c_name}++;', None, None + + +class TypeArrayNest(Type): + def is_multi_val(self): + return True + + def presence_type(self): + return 'count' + + def _complex_member_type(self, ri): + if 'sub-type' not in self.attr or self.attr['sub-type'] == 'nest': + return f"struct {self.nested_render_name}" + elif self.attr['sub-type'] in scalars: + scalar_pfx = '__' if ri.ku_space == 'user' else '' + return scalar_pfx + self.attr['sub-type'] + else: + raise Exception(f"Sub-type {self.attr['sub-type']} not supported yet") + + def _attr_typol(self): + return f'.type = YNL_PT_NEST, .nest = &{self.nested_render_name}_nest, ' + + def _attr_get(self, ri, var): + local_vars = ['const struct nlattr *attr2;'] + get_lines = [f'attr_{self.c_name} = attr;', + 'mnl_attr_for_each_nested(attr2, attr)', + f'\t{var}->n_{self.c_name}++;'] + return get_lines, None, local_vars + + +class TypeNestTypeValue(Type): + def _complex_member_type(self, ri): + return f"struct {self.nested_render_name}" + + def _attr_typol(self): + return f'.type = YNL_PT_NEST, .nest = &{self.nested_render_name}_nest, ' + + def _attr_get(self, ri, var): + prev = 'attr' + tv_args = '' + get_lines = [] + local_vars = [] + init_lines = [f"parg.rsp_policy = &{self.nested_render_name}_nest;", + f"parg.data = &{var}->{self.c_name};"] + if 'type-value' in self.attr: + tv_names = [c_lower(x) for x in self.attr["type-value"]] + local_vars += [f'const struct nlattr *attr_{", *attr_".join(tv_names)};'] + local_vars += [f'__u32 {", ".join(tv_names)};'] + for level in self.attr["type-value"]: + level = c_lower(level) + get_lines += [f'attr_{level} = mnl_attr_get_payload({prev});'] + get_lines += [f'{level} = mnl_attr_get_type(attr_{level});'] + prev = 'attr_' + level + + tv_args = f", {', '.join(tv_names)}" + + get_lines += [f"{self.nested_render_name}_parse(&parg, {prev}{tv_args});"] + return get_lines, init_lines, local_vars + + +class Struct: + def __init__(self, family, space_name, type_list=None, inherited=None): + self.family = family + self.space_name = space_name + self.attr_set = family.attr_sets[space_name] + # Use list to catch comparisons with empty sets + self._inherited = inherited if inherited is not None else [] + self.inherited = [] + + self.nested = type_list is None + if family.name == c_lower(space_name): + self.render_name = f"{family.name}" + else: + self.render_name = f"{family.name}_{c_lower(space_name)}" + self.struct_name = 'struct ' + self.render_name + self.ptr_name = self.struct_name + ' *' + + self.request = False + self.reply = False + + self.attr_list = [] + self.attrs = dict() + if type_list: + for t in type_list: + self.attr_list.append((t, self.attr_set[t]),) + else: + for t in self.attr_set: + self.attr_list.append((t, self.attr_set[t]),) + + max_val = 0 + self.attr_max_val = None + for name, attr in self.attr_list: + if attr.value > max_val: + max_val = attr.value + self.attr_max_val = attr + self.attrs[name] = attr + + def __iter__(self): + yield from self.attrs + + def __getitem__(self, key): + return self.attrs[key] + + def member_list(self): + return self.attr_list + + def set_inherited(self, new_inherited): + if self._inherited != new_inherited: + raise Exception("Inheriting different members not supported") + self.inherited = [c_lower(x) for x in sorted(self._inherited)] + + +class EnumEntry: + def __init__(self, enum_set, yaml, prev, value_start): + if isinstance(yaml, str): + self.name = yaml + yaml = {} + self.doc = '' + else: + self.name = yaml['name'] + self.doc = yaml.get('doc', '') + + self.yaml = yaml + self.enum_set = enum_set + self.c_name = c_upper(enum_set.value_pfx + self.name) + + if 'value' in yaml: + self.value = yaml['value'] + if prev: + self.value_change = (self.value != prev.value + 1) + elif prev: + self.value_change = False + self.value = prev.value + 1 + else: + self.value = value_start + self.value_change = (self.value != 0) + + self.value_change = self.value_change or self.enum_set['type'] == 'flags' + + def __getitem__(self, key): + return self.yaml[key] + + def __contains__(self, key): + return key in self.yaml + + def has_doc(self): + return bool(self.doc) + + # raw value, i.e. the id in the enum, unlike user value which is a mask for flags + def raw_value(self): + return self.value + + # user value, same as raw value for enums, for flags it's the mask + def user_value(self): + if self.enum_set['type'] == 'flags': + return 1 << self.value + else: + return self.value + + +class EnumSet: + def __init__(self, family, yaml): + self.yaml = yaml + self.family = family + + self.render_name = c_lower(family.name + '-' + yaml['name']) + self.enum_name = 'enum ' + self.render_name + + self.value_pfx = yaml.get('name-prefix', f"{family.name}-{yaml['name']}-") + + self.type = yaml['type'] + + prev_entry = None + value_start = self.yaml.get('value-start', 0) + self.entries = {} + self.entry_list = [] + for entry in self.yaml['entries']: + e = EnumEntry(self, entry, prev_entry, value_start) + self.entries[e.name] = e + self.entry_list.append(e) + prev_entry = e + + def __getitem__(self, key): + return self.yaml[key] + + def __contains__(self, key): + return key in self.yaml + + def has_doc(self): + if 'doc' in self.yaml: + return True + for entry in self.entry_list: + if entry.has_doc(): + return True + return False + + def get_mask(self): + mask = 0 + idx = self.yaml.get('value-start', 0) + for _ in self.entry_list: + mask |= 1 << idx + idx += 1 + return mask + + +class AttrSet(SpecAttrSet): + def __init__(self, family, yaml): + super().__init__(family, yaml) + + if self.subset_of is None: + if 'name-prefix' in yaml: + pfx = yaml['name-prefix'] + elif self.name == family.name: + pfx = family.name + '-a-' + else: + pfx = f"{family.name}-a-{self.name}-" + self.name_prefix = c_upper(pfx) + self.max_name = c_upper(self.yaml.get('attr-max-name', f"{self.name_prefix}max")) + else: + self.name_prefix = family.attr_sets[self.subset_of].name_prefix + self.max_name = family.attr_sets[self.subset_of].max_name + + # Added by resolve: + self.c_name = None + delattr(self, "c_name") + + def resolve(self): + self.c_name = c_lower(self.name) + if self.c_name in _C_KW: + self.c_name += '_' + if self.c_name == self.family.c_name: + self.c_name = '' + + def new_attr(self, elem, value): + if 'multi-attr' in elem and elem['multi-attr']: + return TypeMultiAttr(self.family, self, elem, value) + elif elem['type'] in scalars: + return TypeScalar(self.family, self, elem, value) + elif elem['type'] == 'unused': + return TypeUnused(self.family, self, elem, value) + elif elem['type'] == 'pad': + return TypePad(self.family, self, elem, value) + elif elem['type'] == 'flag': + return TypeFlag(self.family, self, elem, value) + elif elem['type'] == 'string': + return TypeString(self.family, self, elem, value) + elif elem['type'] == 'binary': + return TypeBinary(self.family, self, elem, value) + elif elem['type'] == 'nest': + return TypeNest(self.family, self, elem, value) + elif elem['type'] == 'array-nest': + return TypeArrayNest(self.family, self, elem, value) + elif elem['type'] == 'nest-type-value': + return TypeNestTypeValue(self.family, self, elem, value) + else: + raise Exception(f"No typed class for type {elem['type']}") + + +class Operation(SpecOperation): + def __init__(self, family, yaml, req_value, rsp_value): + super().__init__(family, yaml, req_value, rsp_value) + + if req_value != rsp_value: + raise Exception("Directional messages not supported by codegen") + + self.render_name = family.name + '_' + c_lower(self.name) + + self.dual_policy = ('do' in yaml and 'request' in yaml['do']) and \ + ('dump' in yaml and 'request' in yaml['dump']) + + # Added by resolve: + self.enum_name = None + delattr(self, "enum_name") + + def resolve(self): + self.resolve_up(super()) + + if not self.is_async: + self.enum_name = self.family.op_prefix + c_upper(self.name) + else: + self.enum_name = self.family.async_op_prefix + c_upper(self.name) + + def add_notification(self, op): + if 'notify' not in self.yaml: + self.yaml['notify'] = dict() + self.yaml['notify']['reply'] = self.yaml['do']['reply'] + self.yaml['notify']['cmds'] = [] + self.yaml['notify']['cmds'].append(op) + + +class Family(SpecFamily): + def __init__(self, file_name): + # Added by resolve: + self.c_name = None + delattr(self, "c_name") + self.op_prefix = None + delattr(self, "op_prefix") + self.async_op_prefix = None + delattr(self, "async_op_prefix") + self.mcgrps = None + delattr(self, "mcgrps") + self.consts = None + delattr(self, "consts") + self.hooks = None + delattr(self, "hooks") + + super().__init__(file_name) + + self.fam_key = c_upper(self.yaml.get('c-family-name', self.yaml["name"] + '_FAMILY_NAME')) + self.ver_key = c_upper(self.yaml.get('c-version-name', self.yaml["name"] + '_FAMILY_VERSION')) + + if 'definitions' not in self.yaml: + self.yaml['definitions'] = [] + + if 'uapi-header' in self.yaml: + self.uapi_header = self.yaml['uapi-header'] + else: + self.uapi_header = f"linux/{self.name}.h" + + def resolve(self): + self.resolve_up(super()) + + if self.yaml.get('protocol', 'genetlink') not in {'genetlink', 'genetlink-c', 'genetlink-legacy'}: + raise Exception("Codegen only supported for genetlink") + + self.c_name = c_lower(self.name) + if 'name-prefix' in self.yaml['operations']: + self.op_prefix = c_upper(self.yaml['operations']['name-prefix']) + else: + self.op_prefix = c_upper(self.yaml['name'] + '-cmd-') + if 'async-prefix' in self.yaml['operations']: + self.async_op_prefix = c_upper(self.yaml['operations']['async-prefix']) + else: + self.async_op_prefix = self.op_prefix + + self.mcgrps = self.yaml.get('mcast-groups', {'list': []}) + + self.consts = dict() + + self.hooks = dict() + for when in ['pre', 'post']: + self.hooks[when] = dict() + for op_mode in ['do', 'dump']: + self.hooks[when][op_mode] = dict() + self.hooks[when][op_mode]['set'] = set() + self.hooks[when][op_mode]['list'] = [] + + # dict space-name -> 'request': set(attrs), 'reply': set(attrs) + self.root_sets = dict() + # dict space-name -> set('request', 'reply') + self.pure_nested_structs = dict() + self.all_notify = dict() + + self._mock_up_events() + + self._dictify() + self._load_root_sets() + self._load_nested_sets() + self._load_all_notify() + self._load_hooks() + + self.kernel_policy = self.yaml.get('kernel-policy', 'split') + if self.kernel_policy == 'global': + self._load_global_policy() + + def new_attr_set(self, elem): + return AttrSet(self, elem) + + def new_operation(self, elem, req_value, rsp_value): + return Operation(self, elem, req_value, rsp_value) + + # Fake a 'do' equivalent of all events, so that we can render their response parsing + def _mock_up_events(self): + for op in self.yaml['operations']['list']: + if 'event' in op: + op['do'] = { + 'reply': { + 'attributes': op['event']['attributes'] + } + } + + def _dictify(self): + for elem in self.yaml['definitions']: + if elem['type'] == 'enum' or elem['type'] == 'flags': + self.consts[elem['name']] = EnumSet(self, elem) + else: + self.consts[elem['name']] = elem + + ntf = [] + for msg in self.msgs.values(): + if 'notify' in msg: + ntf.append(msg) + for n in ntf: + self.ops[n['notify']].add_notification(n) + + def _load_root_sets(self): + for op_name, op in self.ops.items(): + if 'attribute-set' not in op: + continue + + req_attrs = set() + rsp_attrs = set() + for op_mode in ['do', 'dump']: + if op_mode in op and 'request' in op[op_mode]: + req_attrs.update(set(op[op_mode]['request']['attributes'])) + if op_mode in op and 'reply' in op[op_mode]: + rsp_attrs.update(set(op[op_mode]['reply']['attributes'])) + + if op['attribute-set'] not in self.root_sets: + self.root_sets[op['attribute-set']] = {'request': req_attrs, 'reply': rsp_attrs} + else: + self.root_sets[op['attribute-set']]['request'].update(req_attrs) + self.root_sets[op['attribute-set']]['reply'].update(rsp_attrs) + + def _load_nested_sets(self): + for root_set, rs_members in self.root_sets.items(): + for attr, spec in self.attr_sets[root_set].items(): + if 'nested-attributes' in spec: + inherit = set() + nested = spec['nested-attributes'] + if nested not in self.root_sets: + self.pure_nested_structs[nested] = Struct(self, nested, inherited=inherit) + if attr in rs_members['request']: + self.pure_nested_structs[nested].request = True + if attr in rs_members['reply']: + self.pure_nested_structs[nested].reply = True + + if 'type-value' in spec: + if nested in self.root_sets: + raise Exception("Inheriting members to a space used as root not supported") + inherit.update(set(spec['type-value'])) + elif spec['type'] == 'array-nest': + inherit.add('idx') + self.pure_nested_structs[nested].set_inherited(inherit) + + def _load_all_notify(self): + for op_name, op in self.ops.items(): + if not op: + continue + + if 'notify' in op: + self.all_notify[op_name] = op['notify']['cmds'] + + def _load_global_policy(self): + global_set = set() + attr_set_name = None + for op_name, op in self.ops.items(): + if not op: + continue + if 'attribute-set' not in op: + continue + + if attr_set_name is None: + attr_set_name = op['attribute-set'] + if attr_set_name != op['attribute-set']: + raise Exception('For a global policy all ops must use the same set') + + for op_mode in ['do', 'dump']: + if op_mode in op: + global_set.update(op[op_mode].get('request', [])) + + self.global_policy = [] + self.global_policy_set = attr_set_name + for attr in self.attr_sets[attr_set_name]: + if attr in global_set: + self.global_policy.append(attr) + + def _load_hooks(self): + for op in self.ops.values(): + for op_mode in ['do', 'dump']: + if op_mode not in op: + continue + for when in ['pre', 'post']: + if when not in op[op_mode]: + continue + name = op[op_mode][when] + if name in self.hooks[when][op_mode]['set']: + continue + self.hooks[when][op_mode]['set'].add(name) + self.hooks[when][op_mode]['list'].append(name) + + +class RenderInfo: + def __init__(self, cw, family, ku_space, op, op_name, op_mode, attr_set=None): + self.family = family + self.nl = cw.nlib + self.ku_space = ku_space + self.op = op + self.op_name = op_name + self.op_mode = op_mode + + # 'do' and 'dump' response parsing is identical + if op_mode != 'do' and 'dump' in op and 'do' in op and 'reply' in op['do'] and \ + op["do"]["reply"] == op["dump"]["reply"]: + self.type_consistent = True + else: + self.type_consistent = op_mode == 'event' + + self.attr_set = attr_set + if not self.attr_set: + self.attr_set = op['attribute-set'] + + if op: + self.type_name = c_lower(op_name) + else: + self.type_name = c_lower(attr_set) + + self.cw = cw + + self.struct = dict() + for op_dir in ['request', 'reply']: + if op and op_dir in op[op_mode]: + self.struct[op_dir] = Struct(family, self.attr_set, + type_list=op[op_mode][op_dir]['attributes']) + if op_mode == 'event': + self.struct['reply'] = Struct(family, self.attr_set, type_list=op['event']['attributes']) + + +class CodeWriter: + def __init__(self, nlib, out_file): + self.nlib = nlib + + self._nl = False + self._silent_block = False + self._ind = 0 + self._out = out_file + + @classmethod + def _is_cond(cls, line): + return line.startswith('if') or line.startswith('while') or line.startswith('for') + + def p(self, line, add_ind=0): + if self._nl: + self._out.write('\n') + self._nl = False + ind = self._ind + if line[-1] == ':': + ind -= 1 + if self._silent_block: + ind += 1 + self._silent_block = line.endswith(')') and CodeWriter._is_cond(line) + if add_ind: + ind += add_ind + self._out.write('\t' * ind + line + '\n') + + def nl(self): + self._nl = True + + def block_start(self, line=''): + if line: + line = line + ' ' + self.p(line + '{') + self._ind += 1 + + def block_end(self, line=''): + if line and line[0] not in {';', ','}: + line = ' ' + line + self._ind -= 1 + self.p('}' + line) + + def write_doc_line(self, doc, indent=True): + words = doc.split() + line = ' *' + for word in words: + if len(line) + len(word) >= 79: + self.p(line) + line = ' *' + if indent: + line += ' ' + line += ' ' + word + self.p(line) + + def write_func_prot(self, qual_ret, name, args=None, doc=None, suffix=''): + if not args: + args = ['void'] + + if doc: + self.p('/*') + self.p(' * ' + doc) + self.p(' */') + + oneline = qual_ret + if qual_ret[-1] != '*': + oneline += ' ' + oneline += f"{name}({', '.join(args)}){suffix}" + + if len(oneline) < 80: + self.p(oneline) + return + + v = qual_ret + if len(v) > 3: + self.p(v) + v = '' + elif qual_ret[-1] != '*': + v += ' ' + v += name + '(' + ind = '\t' * (len(v) // 8) + ' ' * (len(v) % 8) + delta_ind = len(v) - len(ind) + v += args[0] + i = 1 + while i < len(args): + next_len = len(v) + len(args[i]) + if v[0] == '\t': + next_len += delta_ind + if next_len > 76: + self.p(v + ',') + v = ind + else: + v += ', ' + v += args[i] + i += 1 + self.p(v + ')' + suffix) + + def write_func_lvar(self, local_vars): + if not local_vars: + return + + if type(local_vars) is str: + local_vars = [local_vars] + + local_vars.sort(key=len, reverse=True) + for var in local_vars: + self.p(var) + self.nl() + + def write_func(self, qual_ret, name, body, args=None, local_vars=None): + self.write_func_prot(qual_ret=qual_ret, name=name, args=args) + self.write_func_lvar(local_vars=local_vars) + + self.block_start() + for line in body: + self.p(line) + self.block_end() + + def writes_defines(self, defines): + longest = 0 + for define in defines: + if len(define[0]) > longest: + longest = len(define[0]) + longest = ((longest + 8) // 8) * 8 + for define in defines: + line = '#define ' + define[0] + line += '\t' * ((longest - len(define[0]) + 7) // 8) + if type(define[1]) is int: + line += str(define[1]) + elif type(define[1]) is str: + line += '"' + define[1] + '"' + self.p(line) + + def write_struct_init(self, members): + longest = max([len(x[0]) for x in members]) + longest += 1 # because we prepend a . + longest = ((longest + 8) // 8) * 8 + for one in members: + line = '.' + one[0] + line += '\t' * ((longest - len(one[0]) - 1 + 7) // 8) + line += '= ' + one[1] + ',' + self.p(line) + + +scalars = {'u8', 'u16', 'u32', 'u64', 's32', 's64'} + +direction_to_suffix = { + 'reply': '_rsp', + 'request': '_req', + '': '' +} + +op_mode_to_wrapper = { + 'do': '', + 'dump': '_list', + 'notify': '_ntf', + 'event': '', +} + +_C_KW = { + 'do' +} + + +def rdir(direction): + if direction == 'reply': + return 'request' + if direction == 'request': + return 'reply' + return direction + + +def op_prefix(ri, direction, deref=False): + suffix = f"_{ri.type_name}" + + if not ri.op_mode or ri.op_mode == 'do': + suffix += f"{direction_to_suffix[direction]}" + else: + if direction == 'request': + suffix += '_req_dump' + else: + if ri.type_consistent: + if deref: + suffix += f"{direction_to_suffix[direction]}" + else: + suffix += op_mode_to_wrapper[ri.op_mode] + else: + suffix += '_rsp' + suffix += '_dump' if deref else '_list' + + return f"{ri.family['name']}{suffix}" + + +def type_name(ri, direction, deref=False): + return f"struct {op_prefix(ri, direction, deref=deref)}" + + +def print_prototype(ri, direction, terminate=True, doc=None): + suffix = ';' if terminate else '' + + fname = ri.op.render_name + if ri.op_mode == 'dump': + fname += '_dump' + + args = ['struct ynl_sock *ys'] + if 'request' in ri.op[ri.op_mode]: + args.append(f"{type_name(ri, direction)} *" + f"{direction_to_suffix[direction][1:]}") + + ret = 'int' + if 'reply' in ri.op[ri.op_mode]: + ret = f"{type_name(ri, rdir(direction))} *" + + ri.cw.write_func_prot(ret, fname, args, doc=doc, suffix=suffix) + + +def print_req_prototype(ri): + print_prototype(ri, "request", doc=ri.op['doc']) + + +def print_dump_prototype(ri): + print_prototype(ri, "request") + + +def put_typol_fwd(cw, struct): + cw.p(f'extern struct ynl_policy_nest {struct.render_name}_nest;') + + +def put_typol(cw, struct): + type_max = struct.attr_set.max_name + cw.block_start(line=f'struct ynl_policy_attr {struct.render_name}_policy[{type_max} + 1] =') + + for _, arg in struct.member_list(): + arg.attr_typol(cw) + + cw.block_end(line=';') + cw.nl() + + cw.block_start(line=f'struct ynl_policy_nest {struct.render_name}_nest =') + cw.p(f'.max_attr = {type_max},') + cw.p(f'.table = {struct.render_name}_policy,') + cw.block_end(line=';') + cw.nl() + + +def put_req_nested(ri, struct): + func_args = ['struct nlmsghdr *nlh', + 'unsigned int attr_type', + f'{struct.ptr_name}obj'] + + ri.cw.write_func_prot('int', f'{struct.render_name}_put', func_args) + ri.cw.block_start() + ri.cw.write_func_lvar('struct nlattr *nest;') + + ri.cw.p("nest = mnl_attr_nest_start(nlh, attr_type);") + + for _, arg in struct.member_list(): + arg.attr_put(ri, "obj") + + ri.cw.p("mnl_attr_nest_end(nlh, nest);") + + ri.cw.nl() + ri.cw.p('return 0;') + ri.cw.block_end() + ri.cw.nl() + + +def _multi_parse(ri, struct, init_lines, local_vars): + if struct.nested: + iter_line = "mnl_attr_for_each_nested(attr, nested)" + else: + iter_line = "mnl_attr_for_each(attr, nlh, sizeof(struct genlmsghdr))" + + array_nests = set() + multi_attrs = set() + needs_parg = False + for arg, aspec in struct.member_list(): + if aspec['type'] == 'array-nest': + local_vars.append(f'const struct nlattr *attr_{aspec.c_name};') + array_nests.add(arg) + if 'multi-attr' in aspec: + multi_attrs.add(arg) + needs_parg |= 'nested-attributes' in aspec + if array_nests or multi_attrs: + local_vars.append('int i;') + if needs_parg: + local_vars.append('struct ynl_parse_arg parg;') + init_lines.append('parg.ys = yarg->ys;') + + ri.cw.block_start() + ri.cw.write_func_lvar(local_vars) + + for line in init_lines: + ri.cw.p(line) + ri.cw.nl() + + for arg in struct.inherited: + ri.cw.p(f'dst->{arg} = {arg};') + + ri.cw.nl() + ri.cw.block_start(line=iter_line) + + first = True + for _, arg in struct.member_list(): + arg.attr_get(ri, 'dst', first=first) + first = False + + ri.cw.block_end() + ri.cw.nl() + + for anest in sorted(array_nests): + aspec = struct[anest] + + ri.cw.block_start(line=f"if (dst->n_{aspec.c_name})") + ri.cw.p(f"dst->{aspec.c_name} = calloc(dst->n_{aspec.c_name}, sizeof(*dst->{aspec.c_name}));") + ri.cw.p('i = 0;') + ri.cw.p(f"parg.rsp_policy = &{aspec.nested_render_name}_nest;") + ri.cw.block_start(line=f"mnl_attr_for_each_nested(attr, attr_{aspec.c_name})") + ri.cw.p(f"parg.data = &dst->{aspec.c_name}[i];") + ri.cw.p(f"if ({aspec.nested_render_name}_parse(&parg, attr, mnl_attr_get_type(attr)))") + ri.cw.p('return MNL_CB_ERROR;') + ri.cw.p('i++;') + ri.cw.block_end() + ri.cw.block_end() + ri.cw.nl() + + for anest in sorted(multi_attrs): + aspec = struct[anest] + ri.cw.block_start(line=f"if (dst->n_{aspec.c_name})") + ri.cw.p(f"dst->{aspec.c_name} = calloc(dst->n_{aspec.c_name}, sizeof(*dst->{aspec.c_name}));") + ri.cw.p('i = 0;') + if 'nested-attributes' in aspec: + ri.cw.p(f"parg.rsp_policy = &{aspec.nested_render_name}_nest;") + ri.cw.block_start(line=iter_line) + ri.cw.block_start(line=f"if (mnl_attr_get_type(attr) == {aspec.enum_name})") + if 'nested-attributes' in aspec: + ri.cw.p(f"parg.data = &dst->{aspec.c_name}[i];") + ri.cw.p(f"if ({aspec.nested_render_name}_parse(&parg, attr))") + ri.cw.p('return MNL_CB_ERROR;') + elif aspec['type'] in scalars: + t = aspec['type'] + if t[0] == 's': + t = 'u' + t[1:] + ri.cw.p(f"dst->{aspec.c_name}[i] = mnl_attr_get_{t}(attr);") + else: + raise Exception('Nest parsing type not supported yet') + ri.cw.p('i++;') + ri.cw.block_end() + ri.cw.block_end() + ri.cw.block_end() + ri.cw.nl() + + if struct.nested: + ri.cw.p('return 0;') + else: + ri.cw.p('return MNL_CB_OK;') + ri.cw.block_end() + ri.cw.nl() + + +def parse_rsp_nested(ri, struct): + func_args = ['struct ynl_parse_arg *yarg', + 'const struct nlattr *nested'] + for arg in struct.inherited: + func_args.append('__u32 ' + arg) + + local_vars = ['const struct nlattr *attr;', + f'{struct.ptr_name}dst = yarg->data;'] + init_lines = [] + + ri.cw.write_func_prot('int', f'{struct.render_name}_parse', func_args) + + _multi_parse(ri, struct, init_lines, local_vars) + + +def parse_rsp_msg(ri, deref=False): + if 'reply' not in ri.op[ri.op_mode] and ri.op_mode != 'event': + return + + func_args = ['const struct nlmsghdr *nlh', + 'void *data'] + + local_vars = [f'{type_name(ri, "reply", deref=deref)} *dst;', + 'struct ynl_parse_arg *yarg = data;', + 'const struct nlattr *attr;'] + init_lines = ['dst = yarg->data;'] + + ri.cw.write_func_prot('int', f'{op_prefix(ri, "reply", deref=deref)}_parse', func_args) + + _multi_parse(ri, ri.struct["reply"], init_lines, local_vars) + + +def print_req(ri): + ret_ok = '0' + ret_err = '-1' + direction = "request" + local_vars = ['struct nlmsghdr *nlh;', + 'int len, err;'] + + if 'reply' in ri.op[ri.op_mode]: + ret_ok = 'rsp' + ret_err = 'NULL' + local_vars += [f'{type_name(ri, rdir(direction))} *rsp;', + 'struct ynl_parse_arg yarg = { .ys = ys, };'] + + print_prototype(ri, direction, terminate=False) + ri.cw.block_start() + ri.cw.write_func_lvar(local_vars) + + ri.cw.p(f"nlh = ynl_gemsg_start_req(ys, {ri.nl.get_family_id()}, {ri.op.enum_name}, 1);") + + ri.cw.p(f"ys->req_policy = &{ri.struct['request'].render_name}_nest;") + if 'reply' in ri.op[ri.op_mode]: + ri.cw.p(f"yarg.rsp_policy = &{ri.struct['reply'].render_name}_nest;") + ri.cw.nl() + for _, attr in ri.struct["request"].member_list(): + attr.attr_put(ri, "req") + ri.cw.nl() + + ri.cw.p('err = mnl_socket_sendto(ys->sock, nlh, nlh->nlmsg_len);') + ri.cw.p('if (err < 0)') + ri.cw.p(f"return {ret_err};") + ri.cw.nl() + ri.cw.p('len = mnl_socket_recvfrom(ys->sock, ys->rx_buf, MNL_SOCKET_BUFFER_SIZE);') + ri.cw.p('if (len < 0)') + ri.cw.p(f"return {ret_err};") + ri.cw.nl() + + if 'reply' in ri.op[ri.op_mode]: + ri.cw.p('rsp = calloc(1, sizeof(*rsp));') + ri.cw.p('yarg.data = rsp;') + ri.cw.nl() + ri.cw.p(f"err = {ri.nl.parse_cb_run(op_prefix(ri, 'reply') + '_parse', '&yarg', False)};") + ri.cw.p('if (err < 0)') + ri.cw.p('goto err_free;') + ri.cw.nl() + + ri.cw.p('err = ynl_recv_ack(ys, err);') + ri.cw.p('if (err)') + ri.cw.p('goto err_free;') + ri.cw.nl() + ri.cw.p(f"return {ret_ok};") + ri.cw.nl() + ri.cw.p('err_free:') + + if 'reply' in ri.op[ri.op_mode]: + ri.cw.p(f"{call_free(ri, rdir(direction), 'rsp')}") + ri.cw.p(f"return {ret_err};") + ri.cw.block_end() + + +def print_dump(ri): + direction = "request" + print_prototype(ri, direction, terminate=False) + ri.cw.block_start() + local_vars = ['struct ynl_dump_state yds = {};', + 'struct nlmsghdr *nlh;', + 'int len, err;'] + + for var in local_vars: + ri.cw.p(f'{var}') + ri.cw.nl() + + ri.cw.p('yds.ys = ys;') + ri.cw.p(f"yds.alloc_sz = sizeof({type_name(ri, rdir(direction))});") + ri.cw.p(f"yds.cb = {op_prefix(ri, 'reply', deref=True)}_parse;") + ri.cw.p(f"yds.rsp_policy = &{ri.struct['reply'].render_name}_nest;") + ri.cw.nl() + ri.cw.p(f"nlh = ynl_gemsg_start_dump(ys, {ri.nl.get_family_id()}, {ri.op.enum_name}, 1);") + + if "request" in ri.op[ri.op_mode]: + ri.cw.p(f"ys->req_policy = &{ri.struct['request'].render_name}_nest;") + ri.cw.nl() + for _, attr in ri.struct["request"].member_list(): + attr.attr_put(ri, "req") + ri.cw.nl() + + ri.cw.p('err = mnl_socket_sendto(ys->sock, nlh, nlh->nlmsg_len);') + ri.cw.p('if (err < 0)') + ri.cw.p('return NULL;') + ri.cw.nl() + + ri.cw.block_start(line='do') + ri.cw.p('len = mnl_socket_recvfrom(ys->sock, ys->rx_buf, MNL_SOCKET_BUFFER_SIZE);') + ri.cw.p('if (len < 0)') + ri.cw.p('goto free_list;') + ri.cw.nl() + ri.cw.p(f"err = {ri.nl.parse_cb_run('ynl_dump_trampoline', '&yds', False, indent=2)};") + ri.cw.p('if (err < 0)') + ri.cw.p('goto free_list;') + ri.cw.block_end(line='while (err > 0);') + ri.cw.nl() + + ri.cw.p('return yds.first;') + ri.cw.nl() + ri.cw.p('free_list:') + ri.cw.p(call_free(ri, rdir(direction), 'yds.first')) + ri.cw.p('return NULL;') + ri.cw.block_end() + + +def call_free(ri, direction, var): + return f"{op_prefix(ri, direction)}_free({var});" + + +def free_arg_name(direction): + if direction: + return direction_to_suffix[direction][1:] + return 'obj' + + +def print_free_prototype(ri, direction, suffix=';'): + name = op_prefix(ri, direction) + arg = free_arg_name(direction) + ri.cw.write_func_prot('void', f"{name}_free", [f"struct {name} *{arg}"], suffix=suffix) + + +def _print_type(ri, direction, struct): + suffix = f'_{ri.type_name}{direction_to_suffix[direction]}' + + if ri.op_mode == 'dump': + suffix += '_dump' + + ri.cw.block_start(line=f"struct {ri.family['name']}{suffix}") + + meta_started = False + for _, attr in struct.member_list(): + for type_filter in ['len', 'bit']: + line = attr.presence_member(ri.ku_space, type_filter) + if line: + if not meta_started: + ri.cw.block_start(line=f"struct") + meta_started = True + ri.cw.p(line) + if meta_started: + ri.cw.block_end(line='_present;') + ri.cw.nl() + + for arg in struct.inherited: + ri.cw.p(f"__u32 {arg};") + + for _, attr in struct.member_list(): + attr.struct_member(ri) + + ri.cw.block_end(line=';') + ri.cw.nl() + + +def print_type(ri, direction): + _print_type(ri, direction, ri.struct[direction]) + + +def print_type_full(ri, struct): + _print_type(ri, "", struct) + + +def print_type_helpers(ri, direction, deref=False): + print_free_prototype(ri, direction) + + if ri.ku_space == 'user' and direction == 'request': + for _, attr in ri.struct[direction].member_list(): + attr.setter(ri, ri.attr_set, direction, deref=deref) + ri.cw.nl() + + +def print_req_type_helpers(ri): + print_type_helpers(ri, "request") + + +def print_rsp_type_helpers(ri): + if 'reply' not in ri.op[ri.op_mode]: + return + print_type_helpers(ri, "reply") + + +def print_parse_prototype(ri, direction, terminate=True): + suffix = "_rsp" if direction == "reply" else "_req" + term = ';' if terminate else '' + + ri.cw.write_func_prot('void', f"{ri.op.render_name}{suffix}_parse", + ['const struct nlattr **tb', + f"struct {ri.op.render_name}{suffix} *req"], + suffix=term) + + +def print_req_type(ri): + print_type(ri, "request") + + +def print_rsp_type(ri): + if (ri.op_mode == 'do' or ri.op_mode == 'dump') and 'reply' in ri.op[ri.op_mode]: + direction = 'reply' + elif ri.op_mode == 'event': + direction = 'reply' + else: + return + print_type(ri, direction) + + +def print_wrapped_type(ri): + ri.cw.block_start(line=f"{type_name(ri, 'reply')}") + if ri.op_mode == 'dump': + ri.cw.p(f"{type_name(ri, 'reply')} *next;") + elif ri.op_mode == 'notify' or ri.op_mode == 'event': + ri.cw.p('__u16 family;') + ri.cw.p('__u8 cmd;') + ri.cw.p(f"void (*free)({type_name(ri, 'reply')} *ntf);") + ri.cw.p(f"{type_name(ri, 'reply', deref=True)} obj __attribute__ ((aligned (8)));") + ri.cw.block_end(line=';') + ri.cw.nl() + print_free_prototype(ri, 'reply') + ri.cw.nl() + + +def _free_type_members_iter(ri, struct): + for _, attr in struct.member_list(): + if attr.free_needs_iter(): + ri.cw.p('unsigned int i;') + ri.cw.nl() + break + + +def _free_type_members(ri, var, struct, ref=''): + for _, attr in struct.member_list(): + attr.free(ri, var, ref) + + +def _free_type(ri, direction, struct): + var = free_arg_name(direction) + + print_free_prototype(ri, direction, suffix='') + ri.cw.block_start() + _free_type_members_iter(ri, struct) + _free_type_members(ri, var, struct) + if direction: + ri.cw.p(f'free({var});') + ri.cw.block_end() + ri.cw.nl() + + +def free_rsp_nested(ri, struct): + _free_type(ri, "", struct) + + +def print_rsp_free(ri): + if 'reply' not in ri.op[ri.op_mode]: + return + _free_type(ri, 'reply', ri.struct['reply']) + + +def print_dump_type_free(ri): + sub_type = type_name(ri, 'reply') + + print_free_prototype(ri, 'reply', suffix='') + ri.cw.block_start() + ri.cw.p(f"{sub_type} *next = rsp;") + ri.cw.nl() + ri.cw.block_start(line='while (next)') + _free_type_members_iter(ri, ri.struct['reply']) + ri.cw.p('rsp = next;') + ri.cw.p('next = rsp->next;') + ri.cw.nl() + + _free_type_members(ri, 'rsp', ri.struct['reply'], ref='obj.') + ri.cw.p(f'free(rsp);') + ri.cw.block_end() + ri.cw.block_end() + ri.cw.nl() + + +def print_ntf_type_free(ri): + print_free_prototype(ri, 'reply', suffix='') + ri.cw.block_start() + _free_type_members_iter(ri, ri.struct['reply']) + _free_type_members(ri, 'rsp', ri.struct['reply'], ref='obj.') + ri.cw.p(f'free(rsp);') + ri.cw.block_end() + ri.cw.nl() + + +def print_ntf_parse_prototype(family, cw, suffix=';'): + cw.write_func_prot('struct ynl_ntf_base_type *', f"{family['name']}_ntf_parse", + ['struct ynl_sock *ys'], suffix=suffix) + + +def print_ntf_type_parse(family, cw, ku_mode): + print_ntf_parse_prototype(family, cw, suffix='') + cw.block_start() + cw.write_func_lvar(['struct genlmsghdr *genlh;', + 'struct nlmsghdr *nlh;', + 'struct ynl_parse_arg yarg = { .ys = ys, };', + 'struct ynl_ntf_base_type *rsp;', + 'int len, err;', + 'mnl_cb_t parse;']) + cw.p('len = mnl_socket_recvfrom(ys->sock, ys->rx_buf, MNL_SOCKET_BUFFER_SIZE);') + cw.p('if (len < (ssize_t)(sizeof(*nlh) + sizeof(*genlh)))') + cw.p('return NULL;') + cw.nl() + cw.p('nlh = (struct nlmsghdr *)ys->rx_buf;') + cw.p('genlh = mnl_nlmsg_get_payload(nlh);') + cw.nl() + cw.block_start(line='switch (genlh->cmd)') + for ntf_op in sorted(family.all_notify.keys()): + op = family.ops[ntf_op] + ri = RenderInfo(cw, family, ku_mode, op, ntf_op, "notify") + for ntf in op['notify']['cmds']: + cw.p(f"case {ntf.enum_name}:") + cw.p(f"rsp = calloc(1, sizeof({type_name(ri, 'notify')}));") + cw.p(f"parse = {op_prefix(ri, 'reply', deref=True)}_parse;") + cw.p(f"yarg.rsp_policy = &{ri.struct['reply'].render_name}_nest;") + cw.p(f"rsp->free = (void *){op_prefix(ri, 'notify')}_free;") + cw.p('break;') + for op_name, op in family.ops.items(): + if 'event' not in op: + continue + ri = RenderInfo(cw, family, ku_mode, op, op_name, "event") + cw.p(f"case {op.enum_name}:") + cw.p(f"rsp = calloc(1, sizeof({type_name(ri, 'event')}));") + cw.p(f"parse = {op_prefix(ri, 'reply', deref=True)}_parse;") + cw.p(f"yarg.rsp_policy = &{ri.struct['reply'].render_name}_nest;") + cw.p(f"rsp->free = (void *){op_prefix(ri, 'notify')}_free;") + cw.p('break;') + cw.p('default:') + cw.p('ynl_error_unknown_notification(ys, genlh->cmd);') + cw.p('return NULL;') + cw.block_end() + cw.nl() + cw.p('yarg.data = rsp->data;') + cw.nl() + cw.p(f"err = {cw.nlib.parse_cb_run('parse', '&yarg', True)};") + cw.p('if (err < 0)') + cw.p('goto err_free;') + cw.nl() + cw.p('rsp->family = nlh->nlmsg_type;') + cw.p('rsp->cmd = genlh->cmd;') + cw.p('return rsp;') + cw.nl() + cw.p('err_free:') + cw.p('free(rsp);') + cw.p('return NULL;') + cw.block_end() + cw.nl() + + +def print_req_policy_fwd(cw, struct, ri=None, terminate=True): + if terminate and ri and kernel_can_gen_family_struct(struct.family): + return + + if terminate: + prefix = 'extern ' + else: + if kernel_can_gen_family_struct(struct.family) and ri: + prefix = 'static ' + else: + prefix = '' + + suffix = ';' if terminate else ' = {' + + max_attr = struct.attr_max_val + if ri: + name = ri.op.render_name + if ri.op.dual_policy: + name += '_' + ri.op_mode + else: + name = struct.render_name + cw.p(f"{prefix}const struct nla_policy {name}_nl_policy[{max_attr.enum_name} + 1]{suffix}") + + +def print_req_policy(cw, struct, ri=None): + print_req_policy_fwd(cw, struct, ri=ri, terminate=False) + for _, arg in struct.member_list(): + arg.attr_policy(cw) + cw.p("};") + + +def kernel_can_gen_family_struct(family): + return family.proto == 'genetlink' + + +def print_kernel_op_table_fwd(family, cw, terminate): + exported = not kernel_can_gen_family_struct(family) + + if not terminate or exported: + cw.p(f"/* Ops table for {family.name} */") + + pol_to_struct = {'global': 'genl_small_ops', + 'per-op': 'genl_ops', + 'split': 'genl_split_ops'} + struct_type = pol_to_struct[family.kernel_policy] + + if family.kernel_policy == 'split': + cnt = 0 + for op in family.ops.values(): + if 'do' in op: + cnt += 1 + if 'dump' in op: + cnt += 1 + else: + cnt = len(family.ops) + + qual = 'static const' if not exported else 'const' + line = f"{qual} struct {struct_type} {family.name}_nl_ops[{cnt}]" + if terminate: + cw.p(f"extern {line};") + else: + cw.block_start(line=line + ' =') + + if not terminate: + return + + cw.nl() + for name in family.hooks['pre']['do']['list']: + cw.write_func_prot('int', c_lower(name), + ['const struct genl_split_ops *ops', + 'struct sk_buff *skb', 'struct genl_info *info'], suffix=';') + for name in family.hooks['post']['do']['list']: + cw.write_func_prot('void', c_lower(name), + ['const struct genl_split_ops *ops', + 'struct sk_buff *skb', 'struct genl_info *info'], suffix=';') + for name in family.hooks['pre']['dump']['list']: + cw.write_func_prot('int', c_lower(name), + ['struct netlink_callback *cb'], suffix=';') + for name in family.hooks['post']['dump']['list']: + cw.write_func_prot('int', c_lower(name), + ['struct netlink_callback *cb'], suffix=';') + + cw.nl() + + for op_name, op in family.ops.items(): + if op.is_async: + continue + + if 'do' in op: + name = c_lower(f"{family.name}-nl-{op_name}-doit") + cw.write_func_prot('int', name, + ['struct sk_buff *skb', 'struct genl_info *info'], suffix=';') + + if 'dump' in op: + name = c_lower(f"{family.name}-nl-{op_name}-dumpit") + cw.write_func_prot('int', name, + ['struct sk_buff *skb', 'struct netlink_callback *cb'], suffix=';') + cw.nl() + + +def print_kernel_op_table_hdr(family, cw): + print_kernel_op_table_fwd(family, cw, terminate=True) + + +def print_kernel_op_table(family, cw): + print_kernel_op_table_fwd(family, cw, terminate=False) + if family.kernel_policy == 'global' or family.kernel_policy == 'per-op': + for op_name, op in family.ops.items(): + if op.is_async: + continue + + cw.block_start() + members = [('cmd', op.enum_name)] + if 'dont-validate' in op: + members.append(('validate', + ' | '.join([c_upper('genl-dont-validate-' + x) + for x in op['dont-validate']])), ) + for op_mode in ['do', 'dump']: + if op_mode in op: + name = c_lower(f"{family.name}-nl-{op_name}-{op_mode}it") + members.append((op_mode + 'it', name)) + if family.kernel_policy == 'per-op': + struct = Struct(family, op['attribute-set'], + type_list=op['do']['request']['attributes']) + + name = c_lower(f"{family.name}-{op_name}-nl-policy") + members.append(('policy', name)) + members.append(('maxattr', struct.attr_max_val.enum_name)) + if 'flags' in op: + members.append(('flags', ' | '.join([c_upper('genl-' + x) for x in op['flags']]))) + cw.write_struct_init(members) + cw.block_end(line=',') + elif family.kernel_policy == 'split': + cb_names = {'do': {'pre': 'pre_doit', 'post': 'post_doit'}, + 'dump': {'pre': 'start', 'post': 'done'}} + + for op_name, op in family.ops.items(): + for op_mode in ['do', 'dump']: + if op.is_async or op_mode not in op: + continue + + cw.block_start() + members = [('cmd', op.enum_name)] + if 'dont-validate' in op: + members.append(('validate', + ' | '.join([c_upper('genl-dont-validate-' + x) + for x in op['dont-validate']])), ) + name = c_lower(f"{family.name}-nl-{op_name}-{op_mode}it") + if 'pre' in op[op_mode]: + members.append((cb_names[op_mode]['pre'], c_lower(op[op_mode]['pre']))) + members.append((op_mode + 'it', name)) + if 'post' in op[op_mode]: + members.append((cb_names[op_mode]['post'], c_lower(op[op_mode]['post']))) + if 'request' in op[op_mode]: + struct = Struct(family, op['attribute-set'], + type_list=op[op_mode]['request']['attributes']) + + if op.dual_policy: + name = c_lower(f"{family.name}-{op_name}-{op_mode}-nl-policy") + else: + name = c_lower(f"{family.name}-{op_name}-nl-policy") + members.append(('policy', name)) + members.append(('maxattr', struct.attr_max_val.enum_name)) + flags = (op['flags'] if 'flags' in op else []) + ['cmd-cap-' + op_mode] + members.append(('flags', ' | '.join([c_upper('genl-' + x) for x in flags]))) + cw.write_struct_init(members) + cw.block_end(line=',') + + cw.block_end(line=';') + cw.nl() + + +def print_kernel_mcgrp_hdr(family, cw): + if not family.mcgrps['list']: + return + + cw.block_start('enum') + for grp in family.mcgrps['list']: + grp_id = c_upper(f"{family.name}-nlgrp-{grp['name']},") + cw.p(grp_id) + cw.block_end(';') + cw.nl() + + +def print_kernel_mcgrp_src(family, cw): + if not family.mcgrps['list']: + return + + cw.block_start('static const struct genl_multicast_group ' + family.name + '_nl_mcgrps[] =') + for grp in family.mcgrps['list']: + name = grp['name'] + grp_id = c_upper(f"{family.name}-nlgrp-{name}") + cw.p('[' + grp_id + '] = { "' + name + '", },') + cw.block_end(';') + cw.nl() + + +def print_kernel_family_struct_hdr(family, cw): + if not kernel_can_gen_family_struct(family): + return + + cw.p(f"extern struct genl_family {family.name}_nl_family;") + cw.nl() + + +def print_kernel_family_struct_src(family, cw): + if not kernel_can_gen_family_struct(family): + return + + cw.block_start(f"struct genl_family {family.name}_nl_family __ro_after_init =") + cw.p('.name\t\t= ' + family.fam_key + ',') + cw.p('.version\t= ' + family.ver_key + ',') + cw.p('.netnsok\t= true,') + cw.p('.parallel_ops\t= true,') + cw.p('.module\t\t= THIS_MODULE,') + if family.kernel_policy == 'per-op': + cw.p(f'.ops\t\t= {family.name}_nl_ops,') + cw.p(f'.n_ops\t\t= ARRAY_SIZE({family.name}_nl_ops),') + elif family.kernel_policy == 'split': + cw.p(f'.split_ops\t= {family.name}_nl_ops,') + cw.p(f'.n_split_ops\t= ARRAY_SIZE({family.name}_nl_ops),') + if family.mcgrps['list']: + cw.p(f'.mcgrps\t\t= {family.name}_nl_mcgrps,') + cw.p(f'.n_mcgrps\t= ARRAY_SIZE({family.name}_nl_mcgrps),') + cw.block_end(';') + + +def uapi_enum_start(family, cw, obj, ckey='', enum_name='enum-name'): + start_line = 'enum' + if enum_name in obj: + if obj[enum_name]: + start_line = 'enum ' + c_lower(obj[enum_name]) + elif ckey and ckey in obj: + start_line = 'enum ' + family.name + '_' + c_lower(obj[ckey]) + cw.block_start(line=start_line) + + +def render_uapi(family, cw): + hdr_prot = f"_UAPI_LINUX_{family.name.upper()}_H" + cw.p('#ifndef ' + hdr_prot) + cw.p('#define ' + hdr_prot) + cw.nl() + + defines = [(family.fam_key, family["name"]), + (family.ver_key, family.get('version', 1))] + cw.writes_defines(defines) + cw.nl() + + defines = [] + for const in family['definitions']: + if const['type'] != 'const': + cw.writes_defines(defines) + defines = [] + cw.nl() + + # Write kdoc for enum and flags (one day maybe also structs) + if const['type'] == 'enum' or const['type'] == 'flags': + enum = family.consts[const['name']] + + if enum.has_doc(): + cw.p('/**') + doc = '' + if 'doc' in enum: + doc = ' - ' + enum['doc'] + cw.write_doc_line(enum.enum_name + doc) + for entry in enum.entry_list: + if entry.has_doc(): + doc = '@' + entry.c_name + ': ' + entry['doc'] + cw.write_doc_line(doc) + cw.p(' */') + + uapi_enum_start(family, cw, const, 'name') + name_pfx = const.get('name-prefix', f"{family.name}-{const['name']}-") + for entry in enum.entry_list: + suffix = ',' + if entry.value_change: + suffix = f" = {entry.user_value()}" + suffix + cw.p(entry.c_name + suffix) + + if const.get('render-max', False): + cw.nl() + max_name = c_upper(name_pfx + 'max') + cw.p('__' + max_name + ',') + cw.p(max_name + ' = (__' + max_name + ' - 1)') + cw.block_end(line=';') + cw.nl() + elif const['type'] == 'const': + defines.append([c_upper(family.get('c-define-name', + f"{family.name}-{const['name']}")), + const['value']]) + + if defines: + cw.writes_defines(defines) + cw.nl() + + max_by_define = family.get('max-by-define', False) + + for _, attr_set in family.attr_sets.items(): + if attr_set.subset_of: + continue + + cnt_name = c_upper(family.get('attr-cnt-name', f"__{attr_set.name_prefix}MAX")) + max_value = f"({cnt_name} - 1)" + + val = 0 + uapi_enum_start(family, cw, attr_set.yaml, 'enum-name') + for _, attr in attr_set.items(): + suffix = ',' + if attr.value != val: + suffix = f" = {attr.value}," + val = attr.value + val += 1 + cw.p(attr.enum_name + suffix) + cw.nl() + cw.p(cnt_name + ('' if max_by_define else ',')) + if not max_by_define: + cw.p(f"{attr_set.max_name} = {max_value}") + cw.block_end(line=';') + if max_by_define: + cw.p(f"#define {attr_set.max_name} {max_value}") + cw.nl() + + # Commands + separate_ntf = 'async-prefix' in family['operations'] + + max_name = c_upper(family.get('cmd-max-name', f"{family.op_prefix}MAX")) + cnt_name = c_upper(family.get('cmd-cnt-name', f"__{family.op_prefix}MAX")) + max_value = f"({cnt_name} - 1)" + + uapi_enum_start(family, cw, family['operations'], 'enum-name') + for op in family.msgs.values(): + if separate_ntf and ('notify' in op or 'event' in op): + continue + + suffix = ',' + if 'value' in op: + suffix = f" = {op['value']}," + cw.p(op.enum_name + suffix) + cw.nl() + cw.p(cnt_name + ('' if max_by_define else ',')) + if not max_by_define: + cw.p(f"{max_name} = {max_value}") + cw.block_end(line=';') + if max_by_define: + cw.p(f"#define {max_name} {max_value}") + cw.nl() + + if separate_ntf: + uapi_enum_start(family, cw, family['operations'], enum_name='async-enum') + for op in family.msgs.values(): + if separate_ntf and not ('notify' in op or 'event' in op): + continue + + suffix = ',' + if 'value' in op: + suffix = f" = {op['value']}," + cw.p(op.enum_name + suffix) + cw.block_end(line=';') + cw.nl() + + # Multicast + defines = [] + for grp in family.mcgrps['list']: + name = grp['name'] + defines.append([c_upper(grp.get('c-define-name', f"{family.name}-mcgrp-{name}")), + f'{name}']) + cw.nl() + if defines: + cw.writes_defines(defines) + cw.nl() + + cw.p(f'#endif /* {hdr_prot} */') + + +def find_kernel_root(full_path): + sub_path = '' + while True: + sub_path = os.path.join(os.path.basename(full_path), sub_path) + full_path = os.path.dirname(full_path) + maintainers = os.path.join(full_path, "MAINTAINERS") + if os.path.exists(maintainers): + return full_path, sub_path[:-1] + + +def main(): + parser = argparse.ArgumentParser(description='Netlink simple parsing generator') + parser.add_argument('--mode', dest='mode', type=str, required=True) + parser.add_argument('--spec', dest='spec', type=str, required=True) + parser.add_argument('--header', dest='header', action='store_true', default=None) + parser.add_argument('--source', dest='header', action='store_false') + parser.add_argument('--user-header', nargs='+', default=[]) + parser.add_argument('-o', dest='out_file', type=str) + args = parser.parse_args() + + out_file = open(args.out_file, 'w+') if args.out_file else os.sys.stdout + + if args.header is None: + parser.error("--header or --source is required") + + try: + parsed = Family(args.spec) + except yaml.YAMLError as exc: + print(exc) + os.sys.exit(1) + return + + cw = CodeWriter(BaseNlLib(), out_file) + + _, spec_kernel = find_kernel_root(args.spec) + if args.mode == 'uapi': + cw.p('/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */') + else: + if args.header: + cw.p('/* SPDX-License-Identifier: BSD-3-Clause */') + else: + cw.p('// SPDX-License-Identifier: BSD-3-Clause') + cw.p("/* Do not edit directly, auto-generated from: */") + cw.p(f"/*\t{spec_kernel} */") + cw.p(f"/* YNL-GEN {args.mode} {'header' if args.header else 'source'} */") + cw.nl() + + if args.mode == 'uapi': + render_uapi(parsed, cw) + return + + hdr_prot = f"_LINUX_{parsed.name.upper()}_GEN_H" + if args.header: + cw.p('#ifndef ' + hdr_prot) + cw.p('#define ' + hdr_prot) + cw.nl() + + if args.mode == 'kernel': + cw.p('#include <net/netlink.h>') + cw.p('#include <net/genetlink.h>') + cw.nl() + if not args.header: + if args.out_file: + cw.p(f'#include "{os.path.basename(args.out_file[:-2])}.h"') + cw.nl() + headers = [parsed.uapi_header] + for definition in parsed['definitions']: + if 'header' in definition: + headers.append(definition['header']) + for one in headers: + cw.p(f"#include <{one}>") + cw.nl() + + if args.mode == "user": + if not args.header: + cw.p("#include <stdlib.h>") + cw.p("#include <stdio.h>") + cw.p("#include <string.h>") + cw.p("#include <libmnl/libmnl.h>") + cw.p("#include <linux/genetlink.h>") + cw.nl() + for one in args.user_header: + cw.p(f'#include "{one}"') + else: + cw.p('struct ynl_sock;') + cw.nl() + + if args.mode == "kernel": + if args.header: + for _, struct in sorted(parsed.pure_nested_structs.items()): + if struct.request: + cw.p('/* Common nested types */') + break + for attr_set, struct in sorted(parsed.pure_nested_structs.items()): + if struct.request: + print_req_policy_fwd(cw, struct) + cw.nl() + + if parsed.kernel_policy == 'global': + cw.p(f"/* Global operation policy for {parsed.name} */") + + struct = Struct(parsed, parsed.global_policy_set, type_list=parsed.global_policy) + print_req_policy_fwd(cw, struct) + cw.nl() + + if parsed.kernel_policy in {'per-op', 'split'}: + for op_name, op in parsed.ops.items(): + if 'do' in op and 'event' not in op: + ri = RenderInfo(cw, parsed, args.mode, op, op_name, "do") + print_req_policy_fwd(cw, ri.struct['request'], ri=ri) + cw.nl() + + print_kernel_op_table_hdr(parsed, cw) + print_kernel_mcgrp_hdr(parsed, cw) + print_kernel_family_struct_hdr(parsed, cw) + else: + for _, struct in sorted(parsed.pure_nested_structs.items()): + if struct.request: + cw.p('/* Common nested types */') + break + for attr_set, struct in sorted(parsed.pure_nested_structs.items()): + if struct.request: + print_req_policy(cw, struct) + cw.nl() + + if parsed.kernel_policy == 'global': + cw.p(f"/* Global operation policy for {parsed.name} */") + + struct = Struct(parsed, parsed.global_policy_set, type_list=parsed.global_policy) + print_req_policy(cw, struct) + cw.nl() + + for op_name, op in parsed.ops.items(): + if parsed.kernel_policy in {'per-op', 'split'}: + for op_mode in ['do', 'dump']: + if op_mode in op and 'request' in op[op_mode]: + cw.p(f"/* {op.enum_name} - {op_mode} */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, op_mode) + print_req_policy(cw, ri.struct['request'], ri=ri) + cw.nl() + + print_kernel_op_table(parsed, cw) + print_kernel_mcgrp_src(parsed, cw) + print_kernel_family_struct_src(parsed, cw) + + if args.mode == "user": + has_ntf = False + if args.header: + cw.p('/* Common nested types */') + for attr_set, struct in sorted(parsed.pure_nested_structs.items()): + ri = RenderInfo(cw, parsed, args.mode, "", "", "", attr_set) + print_type_full(ri, struct) + + for op_name, op in parsed.ops.items(): + cw.p(f"/* ============== {op.enum_name} ============== */") + + if 'do' in op and 'event' not in op: + cw.p(f"/* {op.enum_name} - do */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, "do") + print_req_type(ri) + print_req_type_helpers(ri) + cw.nl() + print_rsp_type(ri) + print_rsp_type_helpers(ri) + cw.nl() + print_req_prototype(ri) + cw.nl() + + if 'dump' in op: + cw.p(f"/* {op.enum_name} - dump */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, 'dump') + if 'request' in op['dump']: + print_req_type(ri) + print_req_type_helpers(ri) + if not ri.type_consistent: + print_rsp_type(ri) + print_wrapped_type(ri) + print_dump_prototype(ri) + cw.nl() + + if 'notify' in op: + cw.p(f"/* {op.enum_name} - notify */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, 'notify') + has_ntf = True + if not ri.type_consistent: + raise Exception('Only notifications with consistent types supported') + print_wrapped_type(ri) + + if 'event' in op: + ri = RenderInfo(cw, parsed, args.mode, op, op_name, 'event') + cw.p(f"/* {op.enum_name} - event */") + print_rsp_type(ri) + cw.nl() + print_wrapped_type(ri) + + if has_ntf: + cw.p('/* --------------- Common notification parsing --------------- */') + print_ntf_parse_prototype(parsed, cw) + cw.nl() + else: + cw.p('/* Policies */') + for name, _ in parsed.attr_sets.items(): + struct = Struct(parsed, name) + put_typol_fwd(cw, struct) + cw.nl() + + for name, _ in parsed.attr_sets.items(): + struct = Struct(parsed, name) + put_typol(cw, struct) + + cw.p('/* Common nested types */') + for attr_set, struct in sorted(parsed.pure_nested_structs.items()): + ri = RenderInfo(cw, parsed, args.mode, "", "", "", attr_set) + + free_rsp_nested(ri, struct) + if struct.request: + put_req_nested(ri, struct) + if struct.reply: + parse_rsp_nested(ri, struct) + + for op_name, op in parsed.ops.items(): + cw.p(f"/* ============== {op.enum_name} ============== */") + if 'do' in op and 'event' not in op: + cw.p(f"/* {op.enum_name} - do */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, "do") + print_rsp_free(ri) + parse_rsp_msg(ri) + print_req(ri) + cw.nl() + + if 'dump' in op: + cw.p(f"/* {op.enum_name} - dump */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, "dump") + if not ri.type_consistent: + parse_rsp_msg(ri, deref=True) + print_dump_type_free(ri) + print_dump(ri) + cw.nl() + + if 'notify' in op: + cw.p(f"/* {op.enum_name} - notify */") + ri = RenderInfo(cw, parsed, args.mode, op, op_name, 'notify') + has_ntf = True + if not ri.type_consistent: + raise Exception('Only notifications with consistent types supported') + print_ntf_type_free(ri) + + if 'event' in op: + cw.p(f"/* {op.enum_name} - event */") + has_ntf = True + + ri = RenderInfo(cw, parsed, args.mode, op, op_name, "do") + parse_rsp_msg(ri) + + ri = RenderInfo(cw, parsed, args.mode, op, op_name, "event") + print_ntf_type_free(ri) + + if has_ntf: + cw.p('/* --------------- Common notification parsing --------------- */') + print_ntf_type_parse(parsed, cw, args.mode) + + if args.header: + cw.p(f'#endif /* {hdr_prot} */') + + +if __name__ == "__main__": + main() diff --git a/tools/net/ynl/ynl-regen.sh b/tools/net/ynl/ynl-regen.sh new file mode 100755 index 000000000000..43989ae48ed0 --- /dev/null +++ b/tools/net/ynl/ynl-regen.sh @@ -0,0 +1,30 @@ +#!/bin/bash +# SPDX-License-Identifier: BSD-3-Clause + +TOOL=$(dirname $(realpath $0))/ynl-gen-c.py + +force= + +while [ ! -z "$1" ]; do + case "$1" in + -f ) force=yes; shift ;; + * ) echo "Unrecognized option '$1'"; exit 1 ;; + esac +done + +KDIR=$(dirname $(dirname $(dirname $(dirname $(realpath $0))))) + +files=$(git grep --files-with-matches '^/\* YNL-GEN \(kernel\|uapi\)') +for f in $files; do + # params: 0 1 2 3 + # $YAML YNL-GEN kernel $mode + params=( $(git grep -B1 -h '/\* YNL-GEN' $f | sed 's@/\*\(.*\)\*/@\1@') ) + + if [ $f -nt ${params[0]} -a -z "$force" ]; then + echo -e "\tSKIP $f" + continue + fi + + echo -e "\tGEN ${params[2]}\t$f" + $TOOL --mode ${params[2]} --${params[3]} --spec $KDIR/${params[0]} -o $f +done diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index 401a75844cc0..116fecf80ca1 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -47,3 +47,5 @@ test_cpp xskxceiver xdp_redirect_multi xdp_synproxy +xdp_hw_metadata +xdp_features diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x index 3fc3e54b19aa..b89eb87034e4 100644 --- a/tools/testing/selftests/bpf/DENYLIST.s390x +++ b/tools/testing/selftests/bpf/DENYLIST.s390x @@ -1,89 +1,24 @@ # TEMPORARY # Alphabetical order -atomics # attach(add): actual -524 <= expected 0 (trampoline) bloom_filter_map # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?) bpf_cookie # failed to open_and_load program: -524 (trampoline) -bpf_iter_setsockopt # JIT does not support calling kernel function (kfunc) bpf_loop # attaches to __x64_sys_nanosleep -bpf_mod_race # BPF trampoline -bpf_nf # JIT does not support calling kernel function -bpf_tcp_ca # JIT does not support calling kernel function (kfunc) -cb_refs # expected error message unexpected error: -524 (trampoline) -cgroup_hierarchical_stats # JIT does not support calling kernel function (kfunc) -cgrp_kfunc # JIT does not support calling kernel function cgrp_local_storage # prog_attach unexpected error: -524 (trampoline) -core_read_macros # unknown func bpf_probe_read#4 (overlapping) -d_path # failed to auto-attach program 'prog_stat': -524 (trampoline) -decap_sanity # JIT does not support calling kernel function (kfunc) -deny_namespace # failed to attach: ERROR: strerror_r(-524)=22 (trampoline) -dummy_st_ops # test_run unexpected error: -524 (errno 524) (trampoline) -fentry_fexit # fentry attach failed: -524 (trampoline) -fentry_test # fentry_first_attach unexpected error: -524 (trampoline) -fexit_bpf2bpf # freplace_attach_trace unexpected error: -524 (trampoline) fexit_sleep # fexit_skel_load fexit skeleton failed (trampoline) -fexit_stress # fexit attach failed prog 0 failed: -524 (trampoline) -fexit_test # fexit_first_attach unexpected error: -524 (trampoline) -get_func_args_test # trampoline -get_func_ip_test # get_func_ip_test__attach unexpected error: -524 (trampoline) get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace) -htab_update # failed to attach: ERROR: strerror_r(-524)=22 (trampoline) -kfree_skb # attach fentry unexpected error: -524 (trampoline) -kfunc_call # 'bpf_prog_active': not found in kernel BTF (?) -kfunc_dynptr_param # JIT does not support calling kernel function (kfunc) kprobe_multi_bench_attach # bpf_program__attach_kprobe_multi_opts unexpected error: -95 kprobe_multi_test # relies on fentry ksyms_module # test_ksyms_module__open_and_load unexpected error: -9 (?) ksyms_module_libbpf # JIT does not support calling kernel function (kfunc) ksyms_module_lskel # test_ksyms_module_lskel__open_and_load unexpected error: -9 (?) -libbpf_get_fd_by_id_opts # failed to attach: ERROR: strerror_r(-524)=22 (trampoline) -linked_list # JIT does not support calling kernel function (kfunc) -lookup_key # JIT does not support calling kernel function (kfunc) -lru_bug # prog 'printk': failed to auto-attach: -524 -map_kptr # failed to open_and_load program: -524 (trampoline) -modify_return # modify_return attach failed: -524 (trampoline) module_attach # skel_attach skeleton attach failed: -524 (trampoline) -mptcp -netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?) -probe_user # check_kprobe_res wrong kprobe res from probe read (?) -rcu_read_lock # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?) -recursion # skel_attach unexpected error: -524 (trampoline) ringbuf # skel_load skeleton load failed (?) -select_reuseport # intermittently fails on new s390x setup -send_signal # intermittently fails to receive signal -setget_sockopt # attach unexpected error: -524 (trampoline) -sk_assign # Can't read on server: Invalid argument (?) -sk_lookup # endianness problem -sk_storage_tracing # test_sk_storage_tracing__attach unexpected error: -524 (trampoline) -skc_to_unix_sock # could not attach BPF object unexpected error: -524 (trampoline) -socket_cookie # prog_attach unexpected error: -524 (trampoline) stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?) -tailcalls # tail_calls are not allowed in non-JITed programs with bpf-to-bpf calls (?) -task_kfunc # JIT does not support calling kernel function -task_local_storage # failed to auto-attach program 'trace_exit_creds': -524 (trampoline) -test_bpffs # bpffs test failed 255 (iterator) -test_bprm_opts # failed to auto-attach program 'secure_exec': -524 (trampoline) -test_ima # failed to auto-attach program 'ima': -524 (trampoline) -test_local_storage # failed to auto-attach program 'unlink_hook': -524 (trampoline) test_lsm # attach unexpected error: -524 (trampoline) -test_overhead # attach_fentry unexpected error: -524 (trampoline) -test_profiler # unknown func bpf_probe_read_str#45 (overlapping) -timer # failed to auto-attach program 'test1': -524 (trampoline) -timer_crash # trampoline -timer_mim # failed to auto-attach program 'test1': -524 (trampoline) -trace_ext # failed to auto-attach program 'test_pkt_md_access_new': -524 (trampoline) trace_printk # trace_printk__load unexpected error: -2 (errno 2) (?) trace_vprintk # trace_vprintk__open_and_load unexpected error: -9 (?) -tracing_struct # failed to auto-attach: -524 (trampoline) -trampoline_count # prog 'prog1': failed to attach: ERROR: strerror_r(-524)=22 (trampoline) -type_cast # JIT does not support calling kernel function unpriv_bpf_disabled # fentry user_ringbuf # failed to find kernel BTF type ID of '__s390x_sys_prctl': -3 (?) verif_stats # trace_vprintk__open_and_load unexpected error: -9 (?) -verify_pkcs7_sig # JIT does not support calling kernel function (kfunc) -vmlinux # failed to auto-attach program 'handle__fentry': -524 (trampoline) -xdp_adjust_tail # case-128 err 0 errno 28 retval 1 size 128 expect-size 3520 (?) xdp_bonding # failed to auto-attach program 'trace_on_entry': -524 (trampoline) -xdp_bpf2bpf # failed to auto-attach program 'trace_on_entry': -524 (trampoline) -xdp_do_redirect # prog_run_max_size unexpected error: -22 (errno 22) -xdp_synproxy # JIT does not support calling kernel function (kfunc) -xfrm_info # JIT does not support calling kernel function (kfunc) +xdp_metadata # JIT does not support calling kernel function (kfunc) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index c22c43bbee19..b677dcd0b77a 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -22,10 +22,11 @@ endif BPF_GCC ?= $(shell command -v bpf-gcc;) SAN_CFLAGS ?= +SAN_LDFLAGS ?= $(SAN_CFLAGS) CFLAGS += -g -O0 -rdynamic -Wall -Werror $(GENFLAGS) $(SAN_CFLAGS) \ -I$(CURDIR) -I$(INCLUDE_DIR) -I$(GENDIR) -I$(LIBDIR) \ -I$(TOOLSINCDIR) -I$(APIDIR) -I$(OUTPUT) -LDFLAGS += $(SAN_CFLAGS) +LDFLAGS += $(SAN_LDFLAGS) LDLIBS += -lelf -lz -lrt -lpthread # Silence some warnings when compiled with clang @@ -73,7 +74,8 @@ TEST_PROGS := test_kmod.sh \ test_bpftool.sh \ test_bpftool_metadata.sh \ test_doc_build.sh \ - test_xsk.sh + test_xsk.sh \ + test_xdp_features.sh TEST_PROGS_EXTENDED := with_addr.sh \ with_tunnels.sh ima_setup.sh verify_sig_setup.sh \ @@ -83,7 +85,8 @@ TEST_PROGS_EXTENDED := with_addr.sh \ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \ - xskxceiver xdp_redirect_multi xdp_synproxy veristat + xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \ + xdp_features TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read $(OUTPUT)/sign-file TEST_GEN_FILES += liburandom_read.so @@ -149,14 +152,13 @@ endif # NOTE: Semicolon at the end is critical to override lib.mk's default static # rule for binaries. $(notdir $(TEST_GEN_PROGS) \ - $(TEST_PROGS) \ - $(TEST_PROGS_EXTENDED) \ $(TEST_GEN_PROGS_EXTENDED) \ $(TEST_CUSTOM_PROGS)): %: $(OUTPUT)/% ; # sort removes libbpf duplicates when not cross-building -MAKE_DIRS := $(sort $(BUILD_DIR)/libbpf $(HOST_BUILD_DIR)/libbpf \ - $(HOST_BUILD_DIR)/bpftool $(HOST_BUILD_DIR)/resolve_btfids \ +MAKE_DIRS := $(sort $(BUILD_DIR)/libbpf $(HOST_BUILD_DIR)/libbpf \ + $(BUILD_DIR)/bpftool $(HOST_BUILD_DIR)/bpftool \ + $(HOST_BUILD_DIR)/resolve_btfids \ $(RUNQSLOWER_OUTPUT) $(INCLUDE_DIR)) $(MAKE_DIRS): $(call msg,MKDIR,,$@) @@ -181,14 +183,15 @@ endif # do not fail. Static builds leave urandom_read relying on system-wide shared libraries. $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c $(call msg,LIB,,$@) - $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) \ + $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) \ + $^ $(filter-out -static,$(LDLIBS)) \ -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \ -fPIC -shared -o $@ $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so $(call msg,BINARY,,$@) $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \ - liburandom_read.so $(LDLIBS) \ + -lurandom_read $(filter-out -static,$(LDLIBS)) -L$(OUTPUT) \ -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \ -Wl,-rpath=. -o $@ @@ -205,16 +208,26 @@ $(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_tes $(Q)cp bpf_testmod/bpf_testmod.ko $@ DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool +ifneq ($(CROSS_COMPILE),) +CROSS_BPFTOOL := $(SCRATCH_DIR)/sbin/bpftool +TRUNNER_BPFTOOL := $(CROSS_BPFTOOL) +USE_BOOTSTRAP := "" +else +TRUNNER_BPFTOOL := $(DEFAULT_BPFTOOL) +USE_BOOTSTRAP := "bootstrap/" +endif $(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL) $(RUNQSLOWER_OUTPUT) $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/runqslower \ OUTPUT=$(RUNQSLOWER_OUTPUT) VMLINUX_BTF=$(VMLINUX_BTF) \ BPFTOOL_OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ BPFOBJ_OUTPUT=$(BUILD_DIR)/libbpf \ - BPFOBJ=$(BPFOBJ) BPF_INCLUDE=$(INCLUDE_DIR) && \ + BPFOBJ=$(BPFOBJ) BPF_INCLUDE=$(INCLUDE_DIR) \ + EXTRA_CFLAGS='-g -O0 $(SAN_CFLAGS)' \ + EXTRA_LDFLAGS='$(SAN_LDFLAGS)' && \ cp $(RUNQSLOWER_OUTPUT)runqslower $@ -TEST_GEN_PROGS_EXTENDED += $(DEFAULT_BPFTOOL) +TEST_GEN_PROGS_EXTENDED += $(TRUNNER_BPFTOOL) $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED): $(BPFOBJ) @@ -240,19 +253,30 @@ $(OUTPUT)/flow_dissector_load: $(TESTING_HELPERS) $(OUTPUT)/test_maps: $(TESTING_HELPERS) $(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS) $(OUTPUT)/xsk.o: $(BPFOBJ) -$(OUTPUT)/xskxceiver: $(OUTPUT)/xsk.o BPFTOOL ?= $(DEFAULT_BPFTOOL) $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/bpftool $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) \ - ARCH= CROSS_COMPILE= CC=$(HOSTCC) LD=$(HOSTLD) \ + ARCH= CROSS_COMPILE= CC="$(HOSTCC)" LD="$(HOSTLD)" \ EXTRA_CFLAGS='-g -O0' \ OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \ LIBBPF_OUTPUT=$(HOST_BUILD_DIR)/libbpf/ \ LIBBPF_DESTDIR=$(HOST_SCRATCH_DIR)/ \ prefix= DESTDIR=$(HOST_SCRATCH_DIR)/ install-bin +ifneq ($(CROSS_COMPILE),) +$(CROSS_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ + $(BPFOBJ) | $(BUILD_DIR)/bpftool + $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) \ + ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) \ + EXTRA_CFLAGS='-g -O0' \ + OUTPUT=$(BUILD_DIR)/bpftool/ \ + LIBBPF_OUTPUT=$(BUILD_DIR)/libbpf/ \ + LIBBPF_DESTDIR=$(SCRATCH_DIR)/ \ + prefix= DESTDIR=$(SCRATCH_DIR)/ install-bin +endif + all: docs docs: @@ -269,7 +293,8 @@ $(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ $(APIDIR)/linux/bpf.h \ | $(BUILD_DIR)/libbpf $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/ \ - EXTRA_CFLAGS='-g -O0' \ + EXTRA_CFLAGS='-g -O0 $(SAN_CFLAGS)' \ + EXTRA_LDFLAGS='$(SAN_LDFLAGS)' \ DESTDIR=$(SCRATCH_DIR) prefix= all install_headers ifneq ($(BPFOBJ),$(HOST_BPFOBJ)) @@ -278,7 +303,8 @@ $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ | $(HOST_BUILD_DIR)/libbpf $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) \ EXTRA_CFLAGS='-g -O0' ARCH= CROSS_COMPILE= \ - OUTPUT=$(HOST_BUILD_DIR)/libbpf/ CC=$(HOSTCC) LD=$(HOSTLD) \ + OUTPUT=$(HOST_BUILD_DIR)/libbpf/ \ + CC="$(HOSTCC)" LD="$(HOSTLD)" \ DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers endif @@ -299,7 +325,7 @@ $(RESOLVE_BTFIDS): $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/resolve_btfids \ $(TOOLSDIR)/lib/ctype.c \ $(TOOLSDIR)/lib/str_error_r.c $(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/resolve_btfids \ - CC=$(HOSTCC) LD=$(HOSTLD) AR=$(HOSTAR) \ + CC="$(HOSTCC)" LD="$(HOSTLD)" AR="$(HOSTAR)" \ LIBBPF_INCLUDE=$(HOST_INCLUDE_DIR) \ OUTPUT=$(HOST_BUILD_DIR)/resolve_btfids/ BPFOBJ=$(HOST_BPFOBJ) @@ -383,6 +409,9 @@ linked_maps.skel.h-deps := linked_maps1.bpf.o linked_maps2.bpf.o test_subskeleton.skel.h-deps := test_subskeleton_lib2.bpf.o test_subskeleton_lib.bpf.o test_subskeleton.bpf.o test_subskeleton_lib.skel.h-deps := test_subskeleton_lib2.bpf.o test_subskeleton_lib.bpf.o test_usdt.skel.h-deps := test_usdt.bpf.o test_usdt_multispec.bpf.o +xsk_xdp_progs.skel.h-deps := xsk_xdp_progs.bpf.o +xdp_hw_metadata.skel.h-deps := xdp_hw_metadata.bpf.o +xdp_features.skel.h-deps := xdp_features.bpf.o LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.c,$(foreach skel,$(LINKED_SKELS),$($(skel)-deps))) @@ -513,11 +542,13 @@ endif $(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS) \ $(TRUNNER_EXTRA_OBJS) $$(BPFOBJ) \ $(RESOLVE_BTFIDS) \ + $(TRUNNER_BPFTOOL) \ | $(TRUNNER_BINARY)-extras $$(call msg,BINARY,,$$@) $(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@ $(Q)$(RESOLVE_BTFIDS) --btf $(TRUNNER_OUTPUT)/btf_data.bpf.o $$@ - $(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/bootstrap/bpftool $(if $2,$2/)bpftool + $(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/$(USE_BOOTSTRAP)bpftool \ + $(OUTPUT)/$(if $2,$2/)bpftool endef @@ -527,7 +558,7 @@ TRUNNER_BPF_PROGS_DIR := progs TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ network_helpers.c testing_helpers.c \ btf_helpers.c flow_dissector_load.h \ - cap_helpers.c test_loader.c + cap_helpers.c test_loader.c xsk.c TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(OUTPUT)/liburandom_read.so \ $(OUTPUT)/xdp_synproxy \ @@ -576,6 +607,18 @@ $(OUTPUT)/test_verifier: test_verifier.c verifier/tests.h $(BPFOBJ) | $(OUTPUT) $(call msg,BINARY,,$@) $(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@ +$(OUTPUT)/xskxceiver: xskxceiver.c $(OUTPUT)/xsk.o $(OUTPUT)/xsk_xdp_progs.skel.h $(BPFOBJ) | $(OUTPUT) + $(call msg,BINARY,,$@) + $(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@ + +$(OUTPUT)/xdp_hw_metadata: xdp_hw_metadata.c $(OUTPUT)/network_helpers.o $(OUTPUT)/xsk.o $(OUTPUT)/xdp_hw_metadata.skel.h | $(OUTPUT) + $(call msg,BINARY,,$@) + $(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@ + +$(OUTPUT)/xdp_features: xdp_features.c $(OUTPUT)/network_helpers.o $(OUTPUT)/xdp_features.skel.h | $(OUTPUT) + $(call msg,BINARY,,$@) + $(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@ + # Make sure we are able to include and link libbpf against c++. $(OUTPUT)/test_cpp: test_cpp.cpp $(OUTPUT)/test_core_extern.skel.h $(BPFOBJ) $(call msg,CXX,,$@) @@ -595,6 +638,7 @@ $(OUTPUT)/bench_strncmp.o: $(OUTPUT)/strncmp_bench.skel.h $(OUTPUT)/bench_bpf_hashmap_full_update.o: $(OUTPUT)/bpf_hashmap_full_update_bench.skel.h $(OUTPUT)/bench_local_storage.o: $(OUTPUT)/local_storage_bench.skel.h $(OUTPUT)/bench_local_storage_rcu_tasks_trace.o: $(OUTPUT)/local_storage_rcu_tasks_trace_bench.skel.h +$(OUTPUT)/bench_bpf_hashmap_lookup.o: $(OUTPUT)/bpf_hashmap_lookup.skel.h $(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ) $(OUTPUT)/bench: LDLIBS += -lm $(OUTPUT)/bench: $(OUTPUT)/bench.o \ @@ -609,7 +653,9 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \ $(OUTPUT)/bench_strncmp.o \ $(OUTPUT)/bench_bpf_hashmap_full_update.o \ $(OUTPUT)/bench_local_storage.o \ - $(OUTPUT)/bench_local_storage_rcu_tasks_trace.o + $(OUTPUT)/bench_local_storage_rcu_tasks_trace.o \ + $(OUTPUT)/bench_bpf_hashmap_lookup.o \ + # $(call msg,BINARY,,$@) $(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@ @@ -626,3 +672,6 @@ EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \ liburandom_read.so) .PHONY: docs docs-clean + +# Delete partially updated (corrupted) files on error +.DELETE_ON_ERROR: diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c index c1f20a147462..0b2a53bb8460 100644 --- a/tools/testing/selftests/bpf/bench.c +++ b/tools/testing/selftests/bpf/bench.c @@ -16,6 +16,7 @@ struct env env = { .warmup_sec = 1, .duration_sec = 5, .affinity = false, + .quiet = false, .consumer_cnt = 1, .producer_cnt = 1, }; @@ -262,6 +263,7 @@ static const struct argp_option opts[] = { { "consumers", 'c', "NUM", 0, "Number of consumer threads"}, { "verbose", 'v', NULL, 0, "Verbose debug output"}, { "affinity", 'a', NULL, 0, "Set consumer/producer thread affinity"}, + { "quiet", 'q', NULL, 0, "Be more quiet"}, { "prod-affinity", ARG_PROD_AFFINITY_SET, "CPUSET", 0, "Set of CPUs for producer threads; implies --affinity"}, { "cons-affinity", ARG_CONS_AFFINITY_SET, "CPUSET", 0, @@ -275,6 +277,7 @@ extern struct argp bench_bpf_loop_argp; extern struct argp bench_local_storage_argp; extern struct argp bench_local_storage_rcu_tasks_trace_argp; extern struct argp bench_strncmp_argp; +extern struct argp bench_hashmap_lookup_argp; static const struct argp_child bench_parsers[] = { { &bench_ringbufs_argp, 0, "Ring buffers benchmark", 0 }, @@ -284,13 +287,15 @@ static const struct argp_child bench_parsers[] = { { &bench_strncmp_argp, 0, "bpf_strncmp helper benchmark", 0 }, { &bench_local_storage_rcu_tasks_trace_argp, 0, "local_storage RCU Tasks Trace slowdown benchmark", 0 }, + { &bench_hashmap_lookup_argp, 0, "Hashmap lookup benchmark", 0 }, {}, }; +/* Make pos_args global, so that we can run argp_parse twice, if necessary */ +static int pos_args; + static error_t parse_arg(int key, char *arg, struct argp_state *state) { - static int pos_args; - switch (key) { case 'v': env.verbose = true; @@ -329,6 +334,9 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) case 'a': env.affinity = true; break; + case 'q': + env.quiet = true; + break; case ARG_PROD_AFFINITY_SET: env.affinity = true; if (parse_num_list(arg, &env.prod_cpus.cpus, @@ -359,7 +367,7 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) return 0; } -static void parse_cmdline_args(int argc, char **argv) +static void parse_cmdline_args_init(int argc, char **argv) { static const struct argp argp = { .options = opts, @@ -369,9 +377,25 @@ static void parse_cmdline_args(int argc, char **argv) }; if (argp_parse(&argp, argc, argv, 0, NULL, NULL)) exit(1); - if (!env.list && !env.bench_name) { - argp_help(&argp, stderr, ARGP_HELP_DOC, "bench"); - exit(1); +} + +static void parse_cmdline_args_final(int argc, char **argv) +{ + struct argp_child bench_parsers[2] = {}; + const struct argp argp = { + .options = opts, + .parser = parse_arg, + .doc = argp_program_doc, + .children = bench_parsers, + }; + + /* Parse arguments the second time with the correct set of parsers */ + if (bench->argp) { + bench_parsers[0].argp = bench->argp; + bench_parsers[0].header = bench->name; + pos_args = 0; + if (argp_parse(&argp, argc, argv, 0, NULL, NULL)) + exit(1); } } @@ -490,6 +514,7 @@ extern const struct bench bench_local_storage_cache_seq_get; extern const struct bench bench_local_storage_cache_interleaved_get; extern const struct bench bench_local_storage_cache_hashmap_control; extern const struct bench bench_local_storage_tasks_trace; +extern const struct bench bench_bpf_hashmap_lookup; static const struct bench *benchs[] = { &bench_count_global, @@ -529,17 +554,17 @@ static const struct bench *benchs[] = { &bench_local_storage_cache_interleaved_get, &bench_local_storage_cache_hashmap_control, &bench_local_storage_tasks_trace, + &bench_bpf_hashmap_lookup, }; -static void setup_benchmark() +static void find_benchmark(void) { - int i, err; + int i; if (!env.bench_name) { fprintf(stderr, "benchmark name is not specified\n"); exit(1); } - for (i = 0; i < ARRAY_SIZE(benchs); i++) { if (strcmp(benchs[i]->name, env.bench_name) == 0) { bench = benchs[i]; @@ -550,8 +575,14 @@ static void setup_benchmark() fprintf(stderr, "benchmark '%s' not found\n", env.bench_name); exit(1); } +} - printf("Setting up benchmark '%s'...\n", bench->name); +static void setup_benchmark(void) +{ + int i, err; + + if (!env.quiet) + printf("Setting up benchmark '%s'...\n", bench->name); state.producers = calloc(env.producer_cnt, sizeof(*state.producers)); state.consumers = calloc(env.consumer_cnt, sizeof(*state.consumers)); @@ -597,7 +628,8 @@ static void setup_benchmark() next_cpu(&env.prod_cpus)); } - printf("Benchmark '%s' started.\n", bench->name); + if (!env.quiet) + printf("Benchmark '%s' started.\n", bench->name); } static pthread_mutex_t bench_done_mtx = PTHREAD_MUTEX_INITIALIZER; @@ -621,7 +653,7 @@ static void collect_measurements(long delta_ns) { int main(int argc, char **argv) { - parse_cmdline_args(argc, argv); + parse_cmdline_args_init(argc, argv); if (env.list) { int i; @@ -633,6 +665,9 @@ int main(int argc, char **argv) return 0; } + find_benchmark(); + parse_cmdline_args_final(argc, argv); + setup_benchmark(); setup_timer(); diff --git a/tools/testing/selftests/bpf/bench.h b/tools/testing/selftests/bpf/bench.h index d748255877e2..402729c6a3ac 100644 --- a/tools/testing/selftests/bpf/bench.h +++ b/tools/testing/selftests/bpf/bench.h @@ -24,6 +24,7 @@ struct env { bool verbose; bool list; bool affinity; + bool quiet; int consumer_cnt; int producer_cnt; struct cpu_set prod_cpus; @@ -47,6 +48,7 @@ struct bench_res { struct bench { const char *name; + const struct argp *argp; void (*validate)(void); void (*setup)(void); void *(*producer_thread)(void *ctx); diff --git a/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c b/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c index 5bcb8a8cdeb2..7c8ccc108313 100644 --- a/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c +++ b/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c @@ -428,6 +428,7 @@ static void *consumer(void *input) const struct bench bench_bloom_lookup = { .name = "bloom-lookup", + .argp = &bench_bloom_map_argp, .validate = validate, .setup = bloom_lookup_setup, .producer_thread = producer, @@ -439,6 +440,7 @@ const struct bench bench_bloom_lookup = { const struct bench bench_bloom_update = { .name = "bloom-update", + .argp = &bench_bloom_map_argp, .validate = validate, .setup = bloom_update_setup, .producer_thread = producer, @@ -450,6 +452,7 @@ const struct bench bench_bloom_update = { const struct bench bench_bloom_false_positive = { .name = "bloom-false-positive", + .argp = &bench_bloom_map_argp, .validate = validate, .setup = false_positive_setup, .producer_thread = producer, @@ -461,6 +464,7 @@ const struct bench bench_bloom_false_positive = { const struct bench bench_hashmap_without_bloom = { .name = "hashmap-without-bloom", + .argp = &bench_bloom_map_argp, .validate = validate, .setup = hashmap_no_bloom_setup, .producer_thread = producer, @@ -472,6 +476,7 @@ const struct bench bench_hashmap_without_bloom = { const struct bench bench_hashmap_with_bloom = { .name = "hashmap-with-bloom", + .argp = &bench_bloom_map_argp, .validate = validate, .setup = hashmap_with_bloom_setup, .producer_thread = producer, diff --git a/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c b/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c index cec51e0ff4b8..75abe8137b6c 100644 --- a/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c +++ b/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_full_update.c @@ -1,7 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2022 Bytedance */ -#include <argp.h> #include "bench.h" #include "bpf_hashmap_full_update_bench.skel.h" #include "bpf_util.h" @@ -68,7 +67,7 @@ static void setup(void) bpf_map_update_elem(map_fd, &i, &i, BPF_ANY); } -void hashmap_report_final(struct bench_res res[], int res_cnt) +static void hashmap_report_final(struct bench_res res[], int res_cnt) { unsigned int nr_cpus = bpf_num_possible_cpus(); int i; @@ -85,7 +84,7 @@ void hashmap_report_final(struct bench_res res[], int res_cnt) } const struct bench bench_bpf_hashmap_full_update = { - .name = "bpf-hashmap-ful-update", + .name = "bpf-hashmap-full-update", .validate = validate, .setup = setup, .producer_thread = producer, diff --git a/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_lookup.c b/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_lookup.c new file mode 100644 index 000000000000..8dbb02f75cff --- /dev/null +++ b/tools/testing/selftests/bpf/benchs/bench_bpf_hashmap_lookup.c @@ -0,0 +1,283 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ + +#include <sys/random.h> +#include <argp.h> +#include "bench.h" +#include "bpf_hashmap_lookup.skel.h" +#include "bpf_util.h" + +/* BPF triggering benchmarks */ +static struct ctx { + struct bpf_hashmap_lookup *skel; +} ctx; + +/* only available to kernel, so define it here */ +#define BPF_MAX_LOOPS (1<<23) + +#define MAX_KEY_SIZE 1024 /* the size of the key map */ + +static struct { + __u32 key_size; + __u32 map_flags; + __u32 max_entries; + __u32 nr_entries; + __u32 nr_loops; +} args = { + .key_size = 4, + .map_flags = 0, + .max_entries = 1000, + .nr_entries = 500, + .nr_loops = 1000000, +}; + +enum { + ARG_KEY_SIZE = 8001, + ARG_MAP_FLAGS, + ARG_MAX_ENTRIES, + ARG_NR_ENTRIES, + ARG_NR_LOOPS, +}; + +static const struct argp_option opts[] = { + { "key_size", ARG_KEY_SIZE, "KEY_SIZE", 0, + "The hashmap key size (max 1024)"}, + { "map_flags", ARG_MAP_FLAGS, "MAP_FLAGS", 0, + "The hashmap flags passed to BPF_MAP_CREATE"}, + { "max_entries", ARG_MAX_ENTRIES, "MAX_ENTRIES", 0, + "The hashmap max entries"}, + { "nr_entries", ARG_NR_ENTRIES, "NR_ENTRIES", 0, + "The number of entries to insert/lookup"}, + { "nr_loops", ARG_NR_LOOPS, "NR_LOOPS", 0, + "The number of loops for the benchmark"}, + {}, +}; + +static error_t parse_arg(int key, char *arg, struct argp_state *state) +{ + long ret; + + switch (key) { + case ARG_KEY_SIZE: + ret = strtol(arg, NULL, 10); + if (ret < 1 || ret > MAX_KEY_SIZE) { + fprintf(stderr, "invalid key_size"); + argp_usage(state); + } + args.key_size = ret; + break; + case ARG_MAP_FLAGS: + ret = strtol(arg, NULL, 0); + if (ret < 0 || ret > UINT_MAX) { + fprintf(stderr, "invalid map_flags"); + argp_usage(state); + } + args.map_flags = ret; + break; + case ARG_MAX_ENTRIES: + ret = strtol(arg, NULL, 10); + if (ret < 1 || ret > UINT_MAX) { + fprintf(stderr, "invalid max_entries"); + argp_usage(state); + } + args.max_entries = ret; + break; + case ARG_NR_ENTRIES: + ret = strtol(arg, NULL, 10); + if (ret < 1 || ret > UINT_MAX) { + fprintf(stderr, "invalid nr_entries"); + argp_usage(state); + } + args.nr_entries = ret; + break; + case ARG_NR_LOOPS: + ret = strtol(arg, NULL, 10); + if (ret < 1 || ret > BPF_MAX_LOOPS) { + fprintf(stderr, "invalid nr_loops: %ld (min=1 max=%u)\n", + ret, BPF_MAX_LOOPS); + argp_usage(state); + } + args.nr_loops = ret; + break; + default: + return ARGP_ERR_UNKNOWN; + } + + return 0; +} + +const struct argp bench_hashmap_lookup_argp = { + .options = opts, + .parser = parse_arg, +}; + +static void validate(void) +{ + if (env.consumer_cnt != 1) { + fprintf(stderr, "benchmark doesn't support multi-consumer!\n"); + exit(1); + } + + if (args.nr_entries > args.max_entries) { + fprintf(stderr, "args.nr_entries is too big! (max %u, got %u)\n", + args.max_entries, args.nr_entries); + exit(1); + } +} + +static void *producer(void *input) +{ + while (true) { + /* trigger the bpf program */ + syscall(__NR_getpgid); + } + return NULL; +} + +static void *consumer(void *input) +{ + return NULL; +} + +static void measure(struct bench_res *res) +{ +} + +static inline void patch_key(u32 i, u32 *key) +{ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + *key = i + 1; +#else + *key = __builtin_bswap32(i + 1); +#endif + /* the rest of key is random */ +} + +static void setup(void) +{ + struct bpf_link *link; + int map_fd; + int ret; + int i; + + setup_libbpf(); + + ctx.skel = bpf_hashmap_lookup__open(); + if (!ctx.skel) { + fprintf(stderr, "failed to open skeleton\n"); + exit(1); + } + + bpf_map__set_max_entries(ctx.skel->maps.hash_map_bench, args.max_entries); + bpf_map__set_key_size(ctx.skel->maps.hash_map_bench, args.key_size); + bpf_map__set_value_size(ctx.skel->maps.hash_map_bench, 8); + bpf_map__set_map_flags(ctx.skel->maps.hash_map_bench, args.map_flags); + + ctx.skel->bss->nr_entries = args.nr_entries; + ctx.skel->bss->nr_loops = args.nr_loops / args.nr_entries; + + if (args.key_size > 4) { + for (i = 1; i < args.key_size/4; i++) + ctx.skel->bss->key[i] = 2654435761 * i; + } + + ret = bpf_hashmap_lookup__load(ctx.skel); + if (ret) { + bpf_hashmap_lookup__destroy(ctx.skel); + fprintf(stderr, "failed to load map: %s", strerror(-ret)); + exit(1); + } + + /* fill in the hash_map */ + map_fd = bpf_map__fd(ctx.skel->maps.hash_map_bench); + for (u64 i = 0; i < args.nr_entries; i++) { + patch_key(i, ctx.skel->bss->key); + bpf_map_update_elem(map_fd, ctx.skel->bss->key, &i, BPF_ANY); + } + + link = bpf_program__attach(ctx.skel->progs.benchmark); + if (!link) { + fprintf(stderr, "failed to attach program!\n"); + exit(1); + } +} + +static inline double events_from_time(u64 time) +{ + if (time) + return args.nr_loops * 1000000000llu / time / 1000000.0L; + + return 0; +} + +static int compute_events(u64 *times, double *events_mean, double *events_stddev, u64 *mean_time) +{ + int i, n = 0; + + *events_mean = 0; + *events_stddev = 0; + *mean_time = 0; + + for (i = 0; i < 32; i++) { + if (!times[i]) + break; + *mean_time += times[i]; + *events_mean += events_from_time(times[i]); + n += 1; + } + if (!n) + return 0; + + *mean_time /= n; + *events_mean /= n; + + if (n > 1) { + for (i = 0; i < n; i++) { + double events_i = *events_mean - events_from_time(times[i]); + *events_stddev += events_i * events_i / (n - 1); + } + *events_stddev = sqrt(*events_stddev); + } + + return n; +} + +static void hashmap_report_final(struct bench_res res[], int res_cnt) +{ + unsigned int nr_cpus = bpf_num_possible_cpus(); + double events_mean, events_stddev; + u64 mean_time; + int i, n; + + for (i = 0; i < nr_cpus; i++) { + n = compute_events(ctx.skel->bss->percpu_times[i], &events_mean, + &events_stddev, &mean_time); + if (n == 0) + continue; + + if (env.quiet) { + /* we expect only one cpu to be present */ + if (env.affinity) + printf("%.3lf\n", events_mean); + else + printf("cpu%02d %.3lf\n", i, events_mean); + } else { + printf("cpu%02d: lookup %.3lfM ± %.3lfM events/sec" + " (approximated from %d samples of ~%lums)\n", + i, events_mean, 2*events_stddev, + n, mean_time / 1000000); + } + } +} + +const struct bench bench_bpf_hashmap_lookup = { + .name = "bpf-hashmap-lookup", + .argp = &bench_hashmap_lookup_argp, + .validate = validate, + .setup = setup, + .producer_thread = producer, + .consumer_thread = consumer, + .measure = measure, + .report_progress = NULL, + .report_final = hashmap_report_final, +}; diff --git a/tools/testing/selftests/bpf/benchs/bench_bpf_loop.c b/tools/testing/selftests/bpf/benchs/bench_bpf_loop.c index d0a6572bfab6..d8a0394e10b1 100644 --- a/tools/testing/selftests/bpf/benchs/bench_bpf_loop.c +++ b/tools/testing/selftests/bpf/benchs/bench_bpf_loop.c @@ -95,6 +95,7 @@ static void setup(void) const struct bench bench_bpf_loop = { .name = "bpf-loop", + .argp = &bench_bpf_loop_argp, .validate = validate, .setup = setup, .producer_thread = producer, diff --git a/tools/testing/selftests/bpf/benchs/bench_local_storage.c b/tools/testing/selftests/bpf/benchs/bench_local_storage.c index 5a378c84e81f..d4b2817306d4 100644 --- a/tools/testing/selftests/bpf/benchs/bench_local_storage.c +++ b/tools/testing/selftests/bpf/benchs/bench_local_storage.c @@ -255,6 +255,7 @@ static void *producer(void *input) */ const struct bench bench_local_storage_cache_seq_get = { .name = "local-storage-cache-seq-get", + .argp = &bench_local_storage_argp, .validate = validate, .setup = local_storage_cache_get_setup, .producer_thread = producer, @@ -266,6 +267,7 @@ const struct bench bench_local_storage_cache_seq_get = { const struct bench bench_local_storage_cache_interleaved_get = { .name = "local-storage-cache-int-get", + .argp = &bench_local_storage_argp, .validate = validate, .setup = local_storage_cache_get_interleaved_setup, .producer_thread = producer, @@ -277,6 +279,7 @@ const struct bench bench_local_storage_cache_interleaved_get = { const struct bench bench_local_storage_cache_hashmap_control = { .name = "local-storage-cache-hashmap-control", + .argp = &bench_local_storage_argp, .validate = validate, .setup = hashmap_setup, .producer_thread = producer, diff --git a/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c b/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c index 43f109d93130..d5eb5587f2aa 100644 --- a/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c +++ b/tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c @@ -12,17 +12,14 @@ static struct { __u32 nr_procs; __u32 kthread_pid; - bool quiet; } args = { .nr_procs = 1000, .kthread_pid = 0, - .quiet = false, }; enum { ARG_NR_PROCS = 7000, ARG_KTHREAD_PID = 7001, - ARG_QUIET = 7002, }; static const struct argp_option opts[] = { @@ -30,8 +27,6 @@ static const struct argp_option opts[] = { "Set number of user processes to spin up"}, { "kthread_pid", ARG_KTHREAD_PID, "PID", 0, "Pid of rcu_tasks_trace kthread for ticks tracking"}, - { "quiet", ARG_QUIET, "{0,1}", 0, - "If true, don't report progress"}, {}, }; @@ -56,14 +51,6 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) } args.kthread_pid = ret; break; - case ARG_QUIET: - ret = strtol(arg, NULL, 10); - if (ret < 0 || ret > 1) { - fprintf(stderr, "invalid quiet %ld\n", ret); - argp_usage(state); - } - args.quiet = ret; - break; break; default: return ARGP_ERR_UNKNOWN; @@ -230,7 +217,7 @@ static void report_progress(int iter, struct bench_res *res, long delta_ns) exit(1); } - if (args.quiet) + if (env.quiet) return; printf("Iter %d\t avg tasks_trace grace period latency\t%lf ns\n", @@ -271,6 +258,7 @@ static void report_final(struct bench_res res[], int res_cnt) */ const struct bench bench_local_storage_tasks_trace = { .name = "local-storage-tasks-trace", + .argp = &bench_local_storage_rcu_tasks_trace_argp, .validate = validate, .setup = local_storage_tasks_trace_setup, .producer_thread = producer, diff --git a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c index c2554f9695ff..fc91fdac4faa 100644 --- a/tools/testing/selftests/bpf/benchs/bench_ringbufs.c +++ b/tools/testing/selftests/bpf/benchs/bench_ringbufs.c @@ -518,6 +518,7 @@ static void *perfbuf_custom_consumer(void *input) const struct bench bench_rb_libbpf = { .name = "rb-libbpf", + .argp = &bench_ringbufs_argp, .validate = bufs_validate, .setup = ringbuf_libbpf_setup, .producer_thread = bufs_sample_producer, @@ -529,6 +530,7 @@ const struct bench bench_rb_libbpf = { const struct bench bench_rb_custom = { .name = "rb-custom", + .argp = &bench_ringbufs_argp, .validate = bufs_validate, .setup = ringbuf_custom_setup, .producer_thread = bufs_sample_producer, @@ -540,6 +542,7 @@ const struct bench bench_rb_custom = { const struct bench bench_pb_libbpf = { .name = "pb-libbpf", + .argp = &bench_ringbufs_argp, .validate = bufs_validate, .setup = perfbuf_libbpf_setup, .producer_thread = bufs_sample_producer, @@ -551,6 +554,7 @@ const struct bench bench_pb_libbpf = { const struct bench bench_pb_custom = { .name = "pb-custom", + .argp = &bench_ringbufs_argp, .validate = bufs_validate, .setup = perfbuf_libbpf_setup, .producer_thread = bufs_sample_producer, diff --git a/tools/testing/selftests/bpf/benchs/bench_strncmp.c b/tools/testing/selftests/bpf/benchs/bench_strncmp.c index 494b591c0289..d3fad2ba6916 100644 --- a/tools/testing/selftests/bpf/benchs/bench_strncmp.c +++ b/tools/testing/selftests/bpf/benchs/bench_strncmp.c @@ -140,6 +140,7 @@ static void strncmp_measure(struct bench_res *res) const struct bench bench_strncmp_no_helper = { .name = "strncmp-no-helper", + .argp = &bench_strncmp_argp, .validate = strncmp_validate, .setup = strncmp_no_helper_setup, .producer_thread = strncmp_producer, @@ -151,6 +152,7 @@ const struct bench bench_strncmp_no_helper = { const struct bench bench_strncmp_helper = { .name = "strncmp-helper", + .argp = &bench_strncmp_argp, .validate = strncmp_validate, .setup = strncmp_helper_setup, .producer_thread = strncmp_producer, diff --git a/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh b/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh index 1e2de838f9fa..cd2efd3fdef3 100755 --- a/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh +++ b/tools/testing/selftests/bpf/benchs/run_bench_bpf_hashmap_full_update.sh @@ -6,6 +6,6 @@ source ./benchs/run_common.sh set -eufo pipefail nr_threads=`expr $(cat /proc/cpuinfo | grep "processor"| wc -l) - 1` -summary=$($RUN_BENCH -p $nr_threads bpf-hashmap-ful-update) +summary=$($RUN_BENCH -p $nr_threads bpf-hashmap-full-update) printf "$summary" printf "\n" diff --git a/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh b/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh index 5dac1f02892c..3e8a969f2096 100755 --- a/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh +++ b/tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh @@ -8,4 +8,4 @@ if [ -z $kthread_pid ]; then exit 1 fi -./bench --nr_procs 15000 --kthread_pid $kthread_pid -d 600 --quiet 1 local-storage-tasks-trace +./bench --nr_procs 15000 --kthread_pid $kthread_pid -d 600 --quiet local-storage-tasks-trace diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index 424f7bbbfe9b..dbd2c729781a 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -65,4 +65,28 @@ extern struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head) __ks */ extern struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head) __ksym; +/* Description + * Remove 'node' from rbtree with root 'root' + * Returns + * Pointer to the removed node, or NULL if 'root' didn't contain 'node' + */ +extern struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root, + struct bpf_rb_node *node) __ksym; + +/* Description + * Add 'node' to rbtree with root 'root' using comparator 'less' + * Returns + * Nothing + */ +extern void bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node, + bool (less)(struct bpf_rb_node *a, const struct bpf_rb_node *b)) __ksym; + +/* Description + * Return the first (leftmost) node in input tree + * Returns + * Pointer to the node, which is _not_ removed from the tree. If the tree + * contains no nodes, returns NULL. + */ +extern struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root) __ksym; + #endif diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c index 5085fea3cac5..46500636d8cd 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c @@ -59,7 +59,7 @@ bpf_testmod_test_struct_arg_5(void) { return bpf_testmod_test_struct_arg_result; } -noinline void +__bpf_kfunc void bpf_testmod_test_mod_kfunc(int i) { *(int *)this_cpu_ptr(&bpf_testmod_ksym_percpu) = i; diff --git a/tools/testing/selftests/bpf/map_tests/map_in_map_batch_ops.c b/tools/testing/selftests/bpf/map_tests/map_in_map_batch_ops.c index f472d28ad11a..16f1671e4bde 100644 --- a/tools/testing/selftests/bpf/map_tests/map_in_map_batch_ops.c +++ b/tools/testing/selftests/bpf/map_tests/map_in_map_batch_ops.c @@ -18,7 +18,7 @@ static __u32 get_map_id_from_fd(int map_fd) uint32_t info_len = sizeof(map_info); int ret; - ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &info_len); + ret = bpf_map_get_info_by_fd(map_fd, &map_info, &info_len); CHECK(ret < 0, "Finding map info failed", "error:%s\n", strerror(errno)); diff --git a/tools/testing/selftests/bpf/netcnt_common.h b/tools/testing/selftests/bpf/netcnt_common.h index 0ab1c88041cd..2d4a58e4e39c 100644 --- a/tools/testing/selftests/bpf/netcnt_common.h +++ b/tools/testing/selftests/bpf/netcnt_common.h @@ -8,11 +8,11 @@ /* sizeof(struct bpf_local_storage_elem): * - * It really is about 128 bytes on x86_64, but allocate more to account for - * possible layout changes, different architectures, etc. + * It is about 128 bytes on x86_64 and 512 bytes on s390x, but allocate more to + * account for possible layout changes, different architectures, etc. * The kernel will wrap up to PAGE_SIZE internally anyway. */ -#define SIZEOF_BPF_LOCAL_STORAGE_ELEM 256 +#define SIZEOF_BPF_LOCAL_STORAGE_ELEM 768 /* Try to estimate kernel's BPF_LOCAL_STORAGE_MAX_VALUE_SIZE: */ #define BPF_LOCAL_STORAGE_MAX_VALUE_SIZE (0xFFFF - \ diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c index 9566d9d2f6ee..56374c8b5436 100644 --- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c +++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c @@ -33,8 +33,8 @@ void test_attach_probe(void) struct test_attach_probe* skel; ssize_t uprobe_offset, ref_ctr_offset; struct bpf_link *uprobe_err_link; + FILE *devnull; bool legacy; - char *mem; /* Check if new-style kprobe/uprobe API is supported. * Kernels that support new FD-based kprobe and uprobe BPF attachment @@ -147,7 +147,7 @@ void test_attach_probe(void) /* test attach by name for a library function, using the library * as the binary argument. libc.so.6 will be resolved via dlopen()/dlinfo(). */ - uprobe_opts.func_name = "malloc"; + uprobe_opts.func_name = "fopen"; uprobe_opts.retprobe = false; skel->links.handle_uprobe_byname2 = bpf_program__attach_uprobe_opts(skel->progs.handle_uprobe_byname2, @@ -157,7 +157,7 @@ void test_attach_probe(void) if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname2, "attach_uprobe_byname2")) goto cleanup; - uprobe_opts.func_name = "free"; + uprobe_opts.func_name = "fclose"; uprobe_opts.retprobe = true; skel->links.handle_uretprobe_byname2 = bpf_program__attach_uprobe_opts(skel->progs.handle_uretprobe_byname2, @@ -195,8 +195,8 @@ void test_attach_probe(void) usleep(1); /* trigger & validate shared library u[ret]probes attached by name */ - mem = malloc(1); - free(mem); + devnull = fopen("/dev/null", "r"); + fclose(devnull); /* trigger & validate uprobe & uretprobe */ trigger_func(); diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c index 2be2d61954bc..26b2d1bffdfd 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c @@ -472,6 +472,7 @@ static void lsm_subtest(struct test_bpf_cookie *skel) int prog_fd; int lsm_fd = -1; LIBBPF_OPTS(bpf_link_create_opts, link_opts); + int err; skel->bss->lsm_res = 0; @@ -482,8 +483,9 @@ static void lsm_subtest(struct test_bpf_cookie *skel) if (!ASSERT_GE(lsm_fd, 0, "lsm.link_create")) goto cleanup; - stack_mprotect(); - if (!ASSERT_EQ(errno, EPERM, "stack_mprotect")) + err = stack_mprotect(); + if (!ASSERT_EQ(err, -1, "stack_mprotect") || + !ASSERT_EQ(errno, EPERM, "stack_mprotect")) goto cleanup; usleep(1); diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index 3af6450763e9..1f02168103dd 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -195,8 +195,8 @@ static void check_bpf_link_info(const struct bpf_program *prog) return; info_len = sizeof(info); - err = bpf_obj_get_info_by_fd(bpf_link__fd(link), &info, &info_len); - ASSERT_OK(err, "bpf_obj_get_info_by_fd"); + err = bpf_link_get_info_by_fd(bpf_link__fd(link), &info, &info_len); + ASSERT_OK(err, "bpf_link_get_info_by_fd"); ASSERT_EQ(info.iter.task.tid, getpid(), "check_task_tid"); bpf_link__destroy(link); @@ -684,13 +684,13 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) /* setup filtering map_id in bpf program */ map_info_len = sizeof(map_info); - err = bpf_obj_get_info_by_fd(map1_fd, &map_info, &map_info_len); + err = bpf_map_get_info_by_fd(map1_fd, &map_info, &map_info_len); if (CHECK(err, "get_map_info", "get map info failed: %s\n", strerror(errno))) goto free_map2; skel->bss->map1_id = map_info.id; - err = bpf_obj_get_info_by_fd(map2_fd, &map_info, &map_info_len); + err = bpf_map_get_info_by_fd(map2_fd, &map_info, &map_info_len); if (CHECK(err, "get_map_info", "get map info failed: %s\n", strerror(errno))) goto free_map2; diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c index e1c1e521cca2..675b90b15280 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c @@ -44,7 +44,7 @@ void serial_test_bpf_obj_id(void) CHECK(err >= 0 || errno != ENOENT, "get-fd-by-notexist-link-id", "err %d errno %d\n", err, errno); - /* Check bpf_obj_get_info_by_fd() */ + /* Check bpf_map_get_info_by_fd() */ bzero(zeros, sizeof(zeros)); for (i = 0; i < nr_iters; i++) { now = time(NULL); @@ -79,7 +79,7 @@ void serial_test_bpf_obj_id(void) /* Check getting map info */ info_len = sizeof(struct bpf_map_info) * 2; bzero(&map_infos[i], info_len); - err = bpf_obj_get_info_by_fd(map_fds[i], &map_infos[i], + err = bpf_map_get_info_by_fd(map_fds[i], &map_infos[i], &info_len); if (CHECK(err || map_infos[i].type != BPF_MAP_TYPE_ARRAY || @@ -118,8 +118,8 @@ void serial_test_bpf_obj_id(void) err = clock_gettime(CLOCK_BOOTTIME, &boot_time_ts); if (CHECK_FAIL(err)) goto done; - err = bpf_obj_get_info_by_fd(prog_fds[i], &prog_infos[i], - &info_len); + err = bpf_prog_get_info_by_fd(prog_fds[i], &prog_infos[i], + &info_len); load_time = (real_time_ts.tv_sec - boot_time_ts.tv_sec) + (prog_infos[i].load_time / nsec_per_sec); if (CHECK(err || @@ -161,8 +161,8 @@ void serial_test_bpf_obj_id(void) bzero(&link_infos[i], info_len); link_infos[i].raw_tracepoint.tp_name = ptr_to_u64(&tp_name); link_infos[i].raw_tracepoint.tp_name_len = sizeof(tp_name); - err = bpf_obj_get_info_by_fd(bpf_link__fd(links[i]), - &link_infos[i], &info_len); + err = bpf_link_get_info_by_fd(bpf_link__fd(links[i]), + &link_infos[i], &info_len); if (CHECK(err || link_infos[i].type != BPF_LINK_TYPE_RAW_TRACEPOINT || link_infos[i].prog_id != prog_infos[i].id || @@ -217,7 +217,7 @@ void serial_test_bpf_obj_id(void) * prog_info.map_ids = NULL */ prog_info.nr_map_ids = 1; - err = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &info_len); if (CHECK(!err || errno != EFAULT, "get-prog-fd-bad-nr-map-ids", "err %d errno %d(%d)", err, errno, EFAULT)) @@ -228,7 +228,7 @@ void serial_test_bpf_obj_id(void) saved_map_id = *(int *)((long)prog_infos[i].map_ids); prog_info.map_ids = prog_infos[i].map_ids; prog_info.nr_map_ids = 2; - err = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &info_len); prog_infos[i].jited_prog_insns = 0; prog_infos[i].xlated_prog_insns = 0; CHECK(err || info_len != sizeof(struct bpf_prog_info) || @@ -277,7 +277,7 @@ void serial_test_bpf_obj_id(void) if (CHECK_FAIL(err)) goto done; - err = bpf_obj_get_info_by_fd(map_fd, &map_info, &info_len); + err = bpf_map_get_info_by_fd(map_fd, &map_info, &info_len); CHECK(err || info_len != sizeof(struct bpf_map_info) || memcmp(&map_info, &map_infos[i], info_len) || array_value != array_magic_value, @@ -322,7 +322,7 @@ void serial_test_bpf_obj_id(void) nr_id_found++; - err = bpf_obj_get_info_by_fd(link_fd, &link_info, &info_len); + err = bpf_link_get_info_by_fd(link_fd, &link_info, &info_len); cmp_res = memcmp(&link_info, &link_infos[i], offsetof(struct bpf_link_info, raw_tracepoint)); CHECK(err || info_len != sizeof(link_info) || cmp_res, diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index de1b5b9eb93a..cbb600be943d 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -4422,7 +4422,7 @@ static int test_big_btf_info(unsigned int test_num) info->btf = ptr_to_u64(user_btf); info->btf_size = raw_btf_size; - err = bpf_obj_get_info_by_fd(btf_fd, info, &info_len); + err = bpf_btf_get_info_by_fd(btf_fd, info, &info_len); if (CHECK(!err, "!err")) { err = -1; goto done; @@ -4435,7 +4435,7 @@ static int test_big_btf_info(unsigned int test_num) * to userspace. */ info_garbage.garbage = 0; - err = bpf_obj_get_info_by_fd(btf_fd, info, &info_len); + err = bpf_btf_get_info_by_fd(btf_fd, info, &info_len); if (CHECK(err || info_len != sizeof(*info), "err:%d errno:%d info_len:%u sizeof(*info):%zu", err, errno, info_len, sizeof(*info))) { @@ -4499,7 +4499,7 @@ static int test_btf_id(unsigned int test_num) /* Test BPF_OBJ_GET_INFO_BY_ID on btf_id */ info_len = sizeof(info[0]); - err = bpf_obj_get_info_by_fd(btf_fd[0], &info[0], &info_len); + err = bpf_btf_get_info_by_fd(btf_fd[0], &info[0], &info_len); if (CHECK(err, "errno:%d", errno)) { err = -1; goto done; @@ -4512,7 +4512,7 @@ static int test_btf_id(unsigned int test_num) } ret = 0; - err = bpf_obj_get_info_by_fd(btf_fd[1], &info[1], &info_len); + err = bpf_btf_get_info_by_fd(btf_fd[1], &info[1], &info_len); if (CHECK(err || info[0].id != info[1].id || info[0].btf_size != info[1].btf_size || (ret = memcmp(user_btf[0], user_btf[1], info[0].btf_size)), @@ -4535,7 +4535,7 @@ static int test_btf_id(unsigned int test_num) } info_len = sizeof(map_info); - err = bpf_obj_get_info_by_fd(map_fd, &map_info, &info_len); + err = bpf_map_get_info_by_fd(map_fd, &map_info, &info_len); if (CHECK(err || map_info.btf_id != info[0].id || map_info.btf_key_type_id != 1 || map_info.btf_value_type_id != 2, "err:%d errno:%d info.id:%u btf_id:%u btf_key_type_id:%u btf_value_type_id:%u", @@ -4638,7 +4638,7 @@ static void do_test_get_info(unsigned int test_num) info.btf_size = user_btf_size; ret = 0; - err = bpf_obj_get_info_by_fd(btf_fd, &info, &info_len); + err = bpf_btf_get_info_by_fd(btf_fd, &info, &info_len); if (CHECK(err || !info.id || info_len != sizeof(info) || info.btf_size != raw_btf_size || (ret = memcmp(raw_btf, user_btf, expected_nbytes)), @@ -4755,7 +4755,7 @@ static void do_test_file(unsigned int test_num) /* get necessary program info */ info_len = sizeof(struct bpf_prog_info); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err < 0, "invalid get info (1st) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); @@ -4787,7 +4787,7 @@ static void do_test_file(unsigned int test_num) info.func_info_rec_size = rec_size; info.func_info = ptr_to_u64(func_info); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err < 0, "invalid get info (2nd) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); @@ -6405,7 +6405,7 @@ static int test_get_finfo(const struct prog_info_raw_test *test, /* get necessary lens */ info_len = sizeof(struct bpf_prog_info); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err < 0, "invalid get info (1st) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); return -1; @@ -6435,7 +6435,7 @@ static int test_get_finfo(const struct prog_info_raw_test *test, info.nr_func_info = nr_func_info; info.func_info_rec_size = rec_size; info.func_info = ptr_to_u64(func_info); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err < 0, "invalid get info (2nd) errno:%d", errno)) { fprintf(stderr, "%s\n", btf_log_buf); err = -1; @@ -6499,7 +6499,7 @@ static int test_get_linfo(const struct prog_info_raw_test *test, nr_jited_func_lens = nr_jited_ksyms; info_len = sizeof(struct bpf_prog_info); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err < 0, "err:%d errno:%d", err, errno)) { err = -1; goto done; @@ -6573,7 +6573,7 @@ static int test_get_linfo(const struct prog_info_raw_test *test, info.jited_func_lens = ptr_to_u64(jited_func_lens); } - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); /* * Only recheck the info.*line_info* fields. diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c index eb90a6b8850d..a8b53b8736f0 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c @@ -14,7 +14,7 @@ static __u32 bpf_map_id(struct bpf_map *map) int err; memset(&info, 0, info_len); - err = bpf_obj_get_info_by_fd(bpf_map__fd(map), &info, &info_len); + err = bpf_map_get_info_by_fd(bpf_map__fd(map), &info, &info_len); if (err) return 0; return info.id; diff --git a/tools/testing/selftests/bpf/prog_tests/cgrp_kfunc.c b/tools/testing/selftests/bpf/prog_tests/cgrp_kfunc.c index 973f0c5af965..b3f7985c8504 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgrp_kfunc.c +++ b/tools/testing/selftests/bpf/prog_tests/cgrp_kfunc.c @@ -8,9 +8,6 @@ #include "cgrp_kfunc_failure.skel.h" #include "cgrp_kfunc_success.skel.h" -static size_t log_buf_sz = 1 << 20; /* 1 MB */ -static char obj_log_buf[1048576]; - static struct cgrp_kfunc_success *open_load_cgrp_kfunc_skel(void) { struct cgrp_kfunc_success *skel; @@ -89,65 +86,6 @@ static const char * const success_tests[] = { "test_cgrp_get_ancestors", }; -static struct { - const char *prog_name; - const char *expected_err_msg; -} failure_tests[] = { - {"cgrp_kfunc_acquire_untrusted", "R1 must be referenced or trusted"}, - {"cgrp_kfunc_acquire_fp", "arg#0 pointer type STRUCT cgroup must point"}, - {"cgrp_kfunc_acquire_unsafe_kretprobe", "reg type unsupported for arg#0 function"}, - {"cgrp_kfunc_acquire_trusted_walked", "R1 must be referenced or trusted"}, - {"cgrp_kfunc_acquire_null", "arg#0 pointer type STRUCT cgroup must point"}, - {"cgrp_kfunc_acquire_unreleased", "Unreleased reference"}, - {"cgrp_kfunc_get_non_kptr_param", "arg#0 expected pointer to map value"}, - {"cgrp_kfunc_get_non_kptr_acquired", "arg#0 expected pointer to map value"}, - {"cgrp_kfunc_get_null", "arg#0 expected pointer to map value"}, - {"cgrp_kfunc_xchg_unreleased", "Unreleased reference"}, - {"cgrp_kfunc_get_unreleased", "Unreleased reference"}, - {"cgrp_kfunc_release_untrusted", "arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket"}, - {"cgrp_kfunc_release_fp", "arg#0 pointer type STRUCT cgroup must point"}, - {"cgrp_kfunc_release_null", "arg#0 is ptr_or_null_ expected ptr_ or socket"}, - {"cgrp_kfunc_release_unacquired", "release kernel function bpf_cgroup_release expects"}, -}; - -static void verify_fail(const char *prog_name, const char *expected_err_msg) -{ - LIBBPF_OPTS(bpf_object_open_opts, opts); - struct cgrp_kfunc_failure *skel; - int err, i; - - opts.kernel_log_buf = obj_log_buf; - opts.kernel_log_size = log_buf_sz; - opts.kernel_log_level = 1; - - skel = cgrp_kfunc_failure__open_opts(&opts); - if (!ASSERT_OK_PTR(skel, "cgrp_kfunc_failure__open_opts")) - goto cleanup; - - for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { - struct bpf_program *prog; - const char *curr_name = failure_tests[i].prog_name; - - prog = bpf_object__find_program_by_name(skel->obj, curr_name); - if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) - goto cleanup; - - bpf_program__set_autoload(prog, !strcmp(curr_name, prog_name)); - } - - err = cgrp_kfunc_failure__load(skel); - if (!ASSERT_ERR(err, "unexpected load success")) - goto cleanup; - - if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) { - fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg); - fprintf(stderr, "Verifier output: %s\n", obj_log_buf); - } - -cleanup: - cgrp_kfunc_failure__destroy(skel); -} - void test_cgrp_kfunc(void) { int i, err; @@ -163,12 +101,7 @@ void test_cgrp_kfunc(void) run_success_test(success_tests[i]); } - for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { - if (!test__start_subtest(failure_tests[i].prog_name)) - continue; - - verify_fail(failure_tests[i].prog_name, failure_tests[i].expected_err_msg); - } + RUN_TESTS(cgrp_kfunc_failure); cleanup: cleanup_cgroup_environment(); diff --git a/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c b/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c index 33a2776737e7..2cc759956e3b 100644 --- a/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c @@ -16,7 +16,7 @@ struct socket_cookie { __u64 cookie_key; - __u32 cookie_value; + __u64 cookie_value; }; static void test_tp_btf(int cgroup_fd) diff --git a/tools/testing/selftests/bpf/prog_tests/check_mtu.c b/tools/testing/selftests/bpf/prog_tests/check_mtu.c index 12f4395f18b3..5338d2ea0460 100644 --- a/tools/testing/selftests/bpf/prog_tests/check_mtu.c +++ b/tools/testing/selftests/bpf/prog_tests/check_mtu.c @@ -59,7 +59,7 @@ static void test_check_mtu_xdp_attach(void) memset(&link_info, 0, sizeof(link_info)); fd = bpf_link__fd(link); - err = bpf_obj_get_info_by_fd(fd, &link_info, &link_info_len); + err = bpf_link_get_info_by_fd(fd, &link_info, &link_info_len); if (CHECK(err, "link_info", "failed: %d\n", err)) goto out; diff --git a/tools/testing/selftests/bpf/prog_tests/cpumask.c b/tools/testing/selftests/bpf/prog_tests/cpumask.c new file mode 100644 index 000000000000..5fbe457c4ebe --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/cpumask.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <test_progs.h> +#include "cpumask_failure.skel.h" +#include "cpumask_success.skel.h" + +static const char * const cpumask_success_testcases[] = { + "test_alloc_free_cpumask", + "test_set_clear_cpu", + "test_setall_clear_cpu", + "test_first_firstzero_cpu", + "test_test_and_set_clear", + "test_and_or_xor", + "test_intersects_subset", + "test_copy_any_anyand", + "test_insert_leave", + "test_insert_remove_release", + "test_insert_kptr_get_release", +}; + +static void verify_success(const char *prog_name) +{ + struct cpumask_success *skel; + struct bpf_program *prog; + struct bpf_link *link = NULL; + pid_t child_pid; + int status; + + skel = cpumask_success__open(); + if (!ASSERT_OK_PTR(skel, "cpumask_success__open")) + return; + + skel->bss->pid = getpid(); + skel->bss->nr_cpus = libbpf_num_possible_cpus(); + + cpumask_success__load(skel); + if (!ASSERT_OK_PTR(skel, "cpumask_success__load")) + goto cleanup; + + prog = bpf_object__find_program_by_name(skel->obj, prog_name); + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) + goto cleanup; + + link = bpf_program__attach(prog); + if (!ASSERT_OK_PTR(link, "bpf_program__attach")) + goto cleanup; + + child_pid = fork(); + if (!ASSERT_GT(child_pid, -1, "child_pid")) + goto cleanup; + if (child_pid == 0) + _exit(0); + waitpid(child_pid, &status, 0); + ASSERT_OK(skel->bss->err, "post_wait_err"); + +cleanup: + bpf_link__destroy(link); + cpumask_success__destroy(skel); +} + +void test_cpumask(void) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(cpumask_success_testcases); i++) { + if (!test__start_subtest(cpumask_success_testcases[i])) + continue; + + verify_success(cpumask_success_testcases[i]); + } + + RUN_TESTS(cpumask_failure); +} diff --git a/tools/testing/selftests/bpf/prog_tests/decap_sanity.c b/tools/testing/selftests/bpf/prog_tests/decap_sanity.c index 0b2f73b88c53..2853883b7cbb 100644 --- a/tools/testing/selftests/bpf/prog_tests/decap_sanity.c +++ b/tools/testing/selftests/bpf/prog_tests/decap_sanity.c @@ -80,6 +80,6 @@ fail: bpf_tc_hook_destroy(&qdisc_hook); close_netns(nstoken); } - system("ip netns del " NS_TEST " >& /dev/null"); + system("ip netns del " NS_TEST " &> /dev/null"); decap_sanity__destroy(skel); } diff --git a/tools/testing/selftests/bpf/prog_tests/dummy_st_ops.c b/tools/testing/selftests/bpf/prog_tests/dummy_st_ops.c index c11832657d2b..f43fcb13d2c4 100644 --- a/tools/testing/selftests/bpf/prog_tests/dummy_st_ops.c +++ b/tools/testing/selftests/bpf/prog_tests/dummy_st_ops.c @@ -1,7 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (C) 2021. Huawei Technologies Co., Ltd */ #include <test_progs.h> -#include "dummy_st_ops.skel.h" +#include "dummy_st_ops_success.skel.h" +#include "dummy_st_ops_fail.skel.h" #include "trace_dummy_st_ops.skel.h" /* Need to keep consistent with definition in include/linux/bpf.h */ @@ -11,17 +12,17 @@ struct bpf_dummy_ops_state { static void test_dummy_st_ops_attach(void) { - struct dummy_st_ops *skel; + struct dummy_st_ops_success *skel; struct bpf_link *link; - skel = dummy_st_ops__open_and_load(); + skel = dummy_st_ops_success__open_and_load(); if (!ASSERT_OK_PTR(skel, "dummy_st_ops_load")) return; link = bpf_map__attach_struct_ops(skel->maps.dummy_1); ASSERT_EQ(libbpf_get_error(link), -EOPNOTSUPP, "dummy_st_ops_attach"); - dummy_st_ops__destroy(skel); + dummy_st_ops_success__destroy(skel); } static void test_dummy_init_ret_value(void) @@ -31,10 +32,10 @@ static void test_dummy_init_ret_value(void) .ctx_in = args, .ctx_size_in = sizeof(args), ); - struct dummy_st_ops *skel; + struct dummy_st_ops_success *skel; int fd, err; - skel = dummy_st_ops__open_and_load(); + skel = dummy_st_ops_success__open_and_load(); if (!ASSERT_OK_PTR(skel, "dummy_st_ops_load")) return; @@ -43,7 +44,7 @@ static void test_dummy_init_ret_value(void) ASSERT_OK(err, "test_run"); ASSERT_EQ(attr.retval, 0xf2f3f4f5, "test_ret"); - dummy_st_ops__destroy(skel); + dummy_st_ops_success__destroy(skel); } static void test_dummy_init_ptr_arg(void) @@ -58,10 +59,10 @@ static void test_dummy_init_ptr_arg(void) .ctx_size_in = sizeof(args), ); struct trace_dummy_st_ops *trace_skel; - struct dummy_st_ops *skel; + struct dummy_st_ops_success *skel; int fd, err; - skel = dummy_st_ops__open_and_load(); + skel = dummy_st_ops_success__open_and_load(); if (!ASSERT_OK_PTR(skel, "dummy_st_ops_load")) return; @@ -91,7 +92,7 @@ static void test_dummy_init_ptr_arg(void) ASSERT_EQ(trace_skel->bss->val, exp_retval, "fentry_val"); done: - dummy_st_ops__destroy(skel); + dummy_st_ops_success__destroy(skel); trace_dummy_st_ops__destroy(trace_skel); } @@ -102,12 +103,12 @@ static void test_dummy_multiple_args(void) .ctx_in = args, .ctx_size_in = sizeof(args), ); - struct dummy_st_ops *skel; + struct dummy_st_ops_success *skel; int fd, err; size_t i; char name[8]; - skel = dummy_st_ops__open_and_load(); + skel = dummy_st_ops_success__open_and_load(); if (!ASSERT_OK_PTR(skel, "dummy_st_ops_load")) return; @@ -119,7 +120,28 @@ static void test_dummy_multiple_args(void) ASSERT_EQ(skel->bss->test_2_args[i], args[i], name); } - dummy_st_ops__destroy(skel); + dummy_st_ops_success__destroy(skel); +} + +static void test_dummy_sleepable(void) +{ + __u64 args[1] = {0}; + LIBBPF_OPTS(bpf_test_run_opts, attr, + .ctx_in = args, + .ctx_size_in = sizeof(args), + ); + struct dummy_st_ops_success *skel; + int fd, err; + + skel = dummy_st_ops_success__open_and_load(); + if (!ASSERT_OK_PTR(skel, "dummy_st_ops_load")) + return; + + fd = bpf_program__fd(skel->progs.test_sleepable); + err = bpf_prog_test_run_opts(fd, &attr); + ASSERT_OK(err, "test_run"); + + dummy_st_ops_success__destroy(skel); } void test_dummy_st_ops(void) @@ -132,4 +154,8 @@ void test_dummy_st_ops(void) test_dummy_init_ptr_arg(); if (test__start_subtest("dummy_multiple_args")) test_dummy_multiple_args(); + if (test__start_subtest("dummy_sleepable")) + test_dummy_sleepable(); + + RUN_TESTS(dummy_st_ops_fail); } diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c index 7faaf6d9e0d4..b99264ec0d9c 100644 --- a/tools/testing/selftests/bpf/prog_tests/dynptr.c +++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c @@ -5,14 +5,10 @@ #include "dynptr_fail.skel.h" #include "dynptr_success.skel.h" -static struct { - const char *prog_name; - const char *expected_err_msg; -} dynptr_tests[] = { - /* success cases */ - {"test_read_write", NULL}, - {"test_data_slice", NULL}, - {"test_ringbuf", NULL}, +static const char * const success_tests[] = { + "test_read_write", + "test_data_slice", + "test_ringbuf", }; static void verify_success(const char *prog_name) @@ -53,11 +49,11 @@ void test_dynptr(void) { int i; - for (i = 0; i < ARRAY_SIZE(dynptr_tests); i++) { - if (!test__start_subtest(dynptr_tests[i].prog_name)) + for (i = 0; i < ARRAY_SIZE(success_tests); i++) { + if (!test__start_subtest(success_tests[i])) continue; - verify_success(dynptr_tests[i].prog_name); + verify_success(success_tests[i]); } RUN_TESTS(dynptr_fail); diff --git a/tools/testing/selftests/bpf/prog_tests/enable_stats.c b/tools/testing/selftests/bpf/prog_tests/enable_stats.c index 2cb2085917e7..75f85d0fe74a 100644 --- a/tools/testing/selftests/bpf/prog_tests/enable_stats.c +++ b/tools/testing/selftests/bpf/prog_tests/enable_stats.c @@ -28,7 +28,7 @@ void test_enable_stats(void) prog_fd = bpf_program__fd(skel->progs.test_enable_stats); memset(&info, 0, info_len); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err, "get_prog_info", "failed to get bpf_prog_info for fd %d\n", prog_fd)) goto cleanup; diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index 20f5fa0fcec9..8ec73fdfcdab 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -79,7 +79,7 @@ static void test_fexit_bpf2bpf_common(const char *obj_file, return; info_len = sizeof(prog_info); - err = bpf_obj_get_info_by_fd(tgt_fd, &prog_info, &info_len); + err = bpf_prog_get_info_by_fd(tgt_fd, &prog_info, &info_len); if (!ASSERT_OK(err, "tgt_fd_get_info")) goto close_prog; @@ -136,8 +136,8 @@ static void test_fexit_bpf2bpf_common(const char *obj_file, info_len = sizeof(link_info); memset(&link_info, 0, sizeof(link_info)); - err = bpf_obj_get_info_by_fd(bpf_link__fd(link[i]), - &link_info, &info_len); + err = bpf_link_get_info_by_fd(bpf_link__fd(link[i]), + &link_info, &info_len); ASSERT_OK(err, "link_fd_get_info"); ASSERT_EQ(link_info.tracing.attach_type, bpf_program__expected_attach_type(prog[i]), @@ -417,7 +417,7 @@ static int find_prog_btf_id(const char *name, __u32 attach_prog_fd) struct btf *btf; int ret; - ret = bpf_obj_get_info_by_fd(attach_prog_fd, &info, &info_len); + ret = bpf_prog_get_info_by_fd(attach_prog_fd, &info, &info_len); if (ret) return ret; @@ -483,12 +483,12 @@ static void test_fentry_to_cgroup_bpf(void) if (!ASSERT_GE(fentry_fd, 0, "load_fentry")) goto cleanup; - /* Make sure bpf_obj_get_info_by_fd works correctly when attaching + /* Make sure bpf_prog_get_info_by_fd works correctly when attaching * to another BPF program. */ - ASSERT_OK(bpf_obj_get_info_by_fd(fentry_fd, &info, &info_len), - "bpf_obj_get_info_by_fd"); + ASSERT_OK(bpf_prog_get_info_by_fd(fentry_fd, &info, &info_len), + "bpf_prog_get_info_by_fd"); ASSERT_EQ(info.btf_id, 0, "info.btf_id"); ASSERT_EQ(info.attach_btf_id, btf_id, "info.attach_btf_id"); diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c index 5a7e6011f6bf..596536def43d 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c @@ -2,14 +2,19 @@ /* Copyright (c) 2019 Facebook */ #include <test_progs.h> -/* that's kernel internal BPF_MAX_TRAMP_PROGS define */ -#define CNT 38 - void serial_test_fexit_stress(void) { - int fexit_fd[CNT] = {}; - int link_fd[CNT] = {}; - int err, i; + int bpf_max_tramp_links, err, i; + int *fd, *fexit_fd, *link_fd; + + bpf_max_tramp_links = get_bpf_max_tramp_links(); + if (!ASSERT_GE(bpf_max_tramp_links, 1, "bpf_max_tramp_links")) + return; + fd = calloc(bpf_max_tramp_links * 2, sizeof(*fd)); + if (!ASSERT_OK_PTR(fd, "fd")) + return; + fexit_fd = fd; + link_fd = fd + bpf_max_tramp_links; const struct bpf_insn trace_program[] = { BPF_MOV64_IMM(BPF_REG_0, 0), @@ -28,7 +33,7 @@ void serial_test_fexit_stress(void) goto out; trace_opts.attach_btf_id = err; - for (i = 0; i < CNT; i++) { + for (i = 0; i < bpf_max_tramp_links; i++) { fexit_fd[i] = bpf_prog_load(BPF_PROG_TYPE_TRACING, NULL, "GPL", trace_program, sizeof(trace_program) / sizeof(struct bpf_insn), @@ -44,10 +49,11 @@ void serial_test_fexit_stress(void) ASSERT_OK(err, "bpf_prog_test_run_opts"); out: - for (i = 0; i < CNT; i++) { + for (i = 0; i < bpf_max_tramp_links; i++) { if (link_fd[i]) close(link_fd[i]); if (fexit_fd[i]) close(fexit_fd[i]); } + free(fd); } diff --git a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c new file mode 100644 index 000000000000..61ccddccf485 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c @@ -0,0 +1,187 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <sys/types.h> +#include <net/if.h> + +#include "test_progs.h" +#include "network_helpers.h" +#include "fib_lookup.skel.h" + +#define SYS(fmt, ...) \ + ({ \ + char cmd[1024]; \ + snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__); \ + if (!ASSERT_OK(system(cmd), cmd)) \ + goto fail; \ + }) + +#define NS_TEST "fib_lookup_ns" +#define IPV6_IFACE_ADDR "face::face" +#define IPV6_NUD_FAILED_ADDR "face::1" +#define IPV6_NUD_STALE_ADDR "face::2" +#define IPV4_IFACE_ADDR "10.0.0.254" +#define IPV4_NUD_FAILED_ADDR "10.0.0.1" +#define IPV4_NUD_STALE_ADDR "10.0.0.2" +#define DMAC "11:11:11:11:11:11" +#define DMAC_INIT { 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, } + +struct fib_lookup_test { + const char *desc; + const char *daddr; + int expected_ret; + int lookup_flags; + __u8 dmac[6]; +}; + +static const struct fib_lookup_test tests[] = { + { .desc = "IPv6 failed neigh", + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_NO_NEIGH, }, + { .desc = "IPv6 stale neigh", + .daddr = IPV6_NUD_STALE_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .dmac = DMAC_INIT, }, + { .desc = "IPv6 skip neigh", + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .lookup_flags = BPF_FIB_LOOKUP_SKIP_NEIGH, }, + { .desc = "IPv4 failed neigh", + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_NO_NEIGH, }, + { .desc = "IPv4 stale neigh", + .daddr = IPV4_NUD_STALE_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .dmac = DMAC_INIT, }, + { .desc = "IPv4 skip neigh", + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .lookup_flags = BPF_FIB_LOOKUP_SKIP_NEIGH, }, +}; + +static int ifindex; + +static int setup_netns(void) +{ + int err; + + SYS("ip link add veth1 type veth peer name veth2"); + SYS("ip link set dev veth1 up"); + + SYS("ip addr add %s/64 dev veth1 nodad", IPV6_IFACE_ADDR); + SYS("ip neigh add %s dev veth1 nud failed", IPV6_NUD_FAILED_ADDR); + SYS("ip neigh add %s dev veth1 lladdr %s nud stale", IPV6_NUD_STALE_ADDR, DMAC); + + SYS("ip addr add %s/24 dev veth1 nodad", IPV4_IFACE_ADDR); + SYS("ip neigh add %s dev veth1 nud failed", IPV4_NUD_FAILED_ADDR); + SYS("ip neigh add %s dev veth1 lladdr %s nud stale", IPV4_NUD_STALE_ADDR, DMAC); + + err = write_sysctl("/proc/sys/net/ipv4/conf/veth1/forwarding", "1"); + if (!ASSERT_OK(err, "write_sysctl(net.ipv4.conf.veth1.forwarding)")) + goto fail; + + err = write_sysctl("/proc/sys/net/ipv6/conf/veth1/forwarding", "1"); + if (!ASSERT_OK(err, "write_sysctl(net.ipv6.conf.veth1.forwarding)")) + goto fail; + + return 0; +fail: + return -1; +} + +static int set_lookup_params(struct bpf_fib_lookup *params, const char *daddr) +{ + int ret; + + memset(params, 0, sizeof(*params)); + + params->l4_protocol = IPPROTO_TCP; + params->ifindex = ifindex; + + if (inet_pton(AF_INET6, daddr, params->ipv6_dst) == 1) { + params->family = AF_INET6; + ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) + return -1; + return 0; + } + + ret = inet_pton(AF_INET, daddr, ¶ms->ipv4_dst); + if (!ASSERT_EQ(ret, 1, "convert IP[46] address")) + return -1; + params->family = AF_INET; + ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) + return -1; + + return 0; +} + +static void mac_str(char *b, const __u8 *mac) +{ + sprintf(b, "%02X:%02X:%02X:%02X:%02X:%02X", + mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]); +} + +void test_fib_lookup(void) +{ + struct bpf_fib_lookup *fib_params; + struct nstoken *nstoken = NULL; + struct __sk_buff skb = { }; + struct fib_lookup *skel; + int prog_fd, err, ret, i; + + /* The test does not use the skb->data, so + * use pkt_v6 for both v6 and v4 test. + */ + LIBBPF_OPTS(bpf_test_run_opts, run_opts, + .data_in = &pkt_v6, + .data_size_in = sizeof(pkt_v6), + .ctx_in = &skb, + .ctx_size_in = sizeof(skb), + ); + + skel = fib_lookup__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel open_and_load")) + return; + prog_fd = bpf_program__fd(skel->progs.fib_lookup); + + SYS("ip netns add %s", NS_TEST); + + nstoken = open_netns(NS_TEST); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto fail; + + if (setup_netns()) + goto fail; + + ifindex = if_nametoindex("veth1"); + skb.ifindex = ifindex; + fib_params = &skel->bss->fib_params; + + for (i = 0; i < ARRAY_SIZE(tests); i++) { + printf("Testing %s\n", tests[i].desc); + + if (set_lookup_params(fib_params, tests[i].daddr)) + continue; + skel->bss->fib_lookup_ret = -1; + skel->bss->lookup_flags = BPF_FIB_LOOKUP_OUTPUT | + tests[i].lookup_flags; + + err = bpf_prog_test_run_opts(prog_fd, &run_opts); + if (!ASSERT_OK(err, "bpf_prog_test_run_opts")) + continue; + + ASSERT_EQ(tests[i].expected_ret, skel->bss->fib_lookup_ret, + "fib_lookup_ret"); + + ret = memcmp(tests[i].dmac, fib_params->dmac, sizeof(tests[i].dmac)); + if (!ASSERT_EQ(ret, 0, "dmac not match")) { + char expected[18], actual[18]; + + mac_str(expected, tests[i].dmac); + mac_str(actual, fib_params->dmac); + printf("dmac expected %s actual %s\n", expected, actual); + } + } + +fail: + if (nstoken) + close_netns(nstoken); + system("ip netns del " NS_TEST " &> /dev/null"); + fib_lookup__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c index 7c79462d2702..9333f7346d15 100644 --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector_reattach.c @@ -60,9 +60,9 @@ static __u32 query_prog_id(int prog) __u32 info_len = sizeof(info); int err; - err = bpf_obj_get_info_by_fd(prog, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog, &info, &info_len); if (CHECK_FAIL(err || info_len != sizeof(info))) { - perror("bpf_obj_get_info_by_fd"); + perror("bpf_prog_get_info_by_fd"); return 0; } @@ -497,7 +497,7 @@ static void test_link_get_info(int netns, int prog1, int prog2) } info_len = sizeof(info); - err = bpf_obj_get_info_by_fd(link, &info, &info_len); + err = bpf_link_get_info_by_fd(link, &info, &info_len); if (CHECK_FAIL(err)) { perror("bpf_obj_get_info"); goto out_unlink; @@ -521,7 +521,7 @@ static void test_link_get_info(int netns, int prog1, int prog2) link_id = info.id; info_len = sizeof(info); - err = bpf_obj_get_info_by_fd(link, &info, &info_len); + err = bpf_link_get_info_by_fd(link, &info, &info_len); if (CHECK_FAIL(err)) { perror("bpf_obj_get_info"); goto out_unlink; @@ -546,7 +546,7 @@ static void test_link_get_info(int netns, int prog1, int prog2) netns = -1; info_len = sizeof(info); - err = bpf_obj_get_info_by_fd(link, &info, &info_len); + err = bpf_link_get_info_by_fd(link, &info, &info_len); if (CHECK_FAIL(err)) { perror("bpf_obj_get_info"); goto out_unlink; diff --git a/tools/testing/selftests/bpf/prog_tests/htab_reuse.c b/tools/testing/selftests/bpf/prog_tests/htab_reuse.c new file mode 100644 index 000000000000..a742dd994d60 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/htab_reuse.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2023. Huawei Technologies Co., Ltd */ +#define _GNU_SOURCE +#include <sched.h> +#include <stdbool.h> +#include <test_progs.h> +#include "htab_reuse.skel.h" + +struct htab_op_ctx { + int fd; + int loop; + bool stop; +}; + +struct htab_val { + unsigned int lock; + unsigned int data; +}; + +static void *htab_lookup_fn(void *arg) +{ + struct htab_op_ctx *ctx = arg; + int i = 0; + + while (i++ < ctx->loop && !ctx->stop) { + struct htab_val value; + unsigned int key; + + /* Use BPF_F_LOCK to use spin-lock in map value. */ + key = 7; + bpf_map_lookup_elem_flags(ctx->fd, &key, &value, BPF_F_LOCK); + } + + return NULL; +} + +static void *htab_update_fn(void *arg) +{ + struct htab_op_ctx *ctx = arg; + int i = 0; + + while (i++ < ctx->loop && !ctx->stop) { + struct htab_val value; + unsigned int key; + + key = 7; + value.lock = 0; + value.data = key; + bpf_map_update_elem(ctx->fd, &key, &value, BPF_F_LOCK); + bpf_map_delete_elem(ctx->fd, &key); + + key = 24; + value.lock = 0; + value.data = key; + bpf_map_update_elem(ctx->fd, &key, &value, BPF_F_LOCK); + bpf_map_delete_elem(ctx->fd, &key); + } + + return NULL; +} + +void test_htab_reuse(void) +{ + unsigned int i, wr_nr = 1, rd_nr = 4; + pthread_t tids[wr_nr + rd_nr]; + struct htab_reuse *skel; + struct htab_op_ctx ctx; + int err; + + skel = htab_reuse__open_and_load(); + if (!ASSERT_OK_PTR(skel, "htab_reuse__open_and_load")) + return; + + ctx.fd = bpf_map__fd(skel->maps.htab); + ctx.loop = 500; + ctx.stop = false; + + memset(tids, 0, sizeof(tids)); + for (i = 0; i < wr_nr; i++) { + err = pthread_create(&tids[i], NULL, htab_update_fn, &ctx); + if (!ASSERT_OK(err, "pthread_create")) { + ctx.stop = true; + goto reap; + } + } + for (i = 0; i < rd_nr; i++) { + err = pthread_create(&tids[i + wr_nr], NULL, htab_lookup_fn, &ctx); + if (!ASSERT_OK(err, "pthread_create")) { + ctx.stop = true; + goto reap; + } + } + +reap: + for (i = 0; i < wr_nr + rd_nr; i++) { + if (!tids[i]) + continue; + pthread_join(tids[i], NULL); + } + htab_reuse__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/prog_tests/jit_probe_mem.c b/tools/testing/selftests/bpf/prog_tests/jit_probe_mem.c new file mode 100644 index 000000000000..5639428607e6 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/jit_probe_mem.c @@ -0,0 +1,28 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ +#include <test_progs.h> +#include <network_helpers.h> + +#include "jit_probe_mem.skel.h" + +void test_jit_probe_mem(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .repeat = 1, + ); + struct jit_probe_mem *skel; + int ret; + + skel = jit_probe_mem__open_and_load(); + if (!ASSERT_OK_PTR(skel, "jit_probe_mem__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.test_jit_probe_mem), &opts); + ASSERT_OK(ret, "jit_probe_mem ret"); + ASSERT_OK(opts.retval, "jit_probe_mem opts.retval"); + ASSERT_EQ(skel->data->total_sum, 192, "jit_probe_mem total_sum"); + + jit_probe_mem__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c index 73579370bfbd..c07991544a78 100644 --- a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c +++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c @@ -36,7 +36,7 @@ static void on_sample(void *ctx, int cpu, void *data, __u32 size) "cb32_0 %x != %x\n", meta->cb32_0, cb.cb32[0])) return; - if (CHECK(pkt_v6->eth.h_proto != 0xdd86, "check_eth", + if (CHECK(pkt_v6->eth.h_proto != htons(ETH_P_IPV6), "check_eth", "h_proto %x\n", pkt_v6->eth.h_proto)) return; if (CHECK(pkt_v6->iph.nexthdr != 6, "check_ip", diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c index 5af1ee8f0e6e..a543742cd7bd 100644 --- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c @@ -72,10 +72,12 @@ static struct kfunc_test_params kfunc_tests[] = { /* success cases */ TC_TEST(kfunc_call_test1, 12), TC_TEST(kfunc_call_test2, 3), + TC_TEST(kfunc_call_test4, -1234), TC_TEST(kfunc_call_test_ref_btf_id, 0), TC_TEST(kfunc_call_test_get_mem, 42), SYSCALL_TEST(kfunc_syscall_test, 0), SYSCALL_NULL_CTX_TEST(kfunc_syscall_test_null, 0), + TC_TEST(kfunc_call_test_static_unused_arg, 0), }; struct syscall_test_args { diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c index a9229260a6ce..8cd298b78e44 100644 --- a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c @@ -10,17 +10,11 @@ #include <test_progs.h> #include "test_kfunc_dynptr_param.skel.h" -static size_t log_buf_sz = 1048576; /* 1 MB */ -static char obj_log_buf[1048576]; - static struct { const char *prog_name; - const char *expected_verifier_err_msg; int expected_runtime_err; } kfunc_dynptr_tests[] = { - {"not_valid_dynptr", "Expected an initialized dynptr as arg #1", 0}, - {"not_ptr_to_stack", "arg#0 expected pointer to stack or dynptr_ptr", 0}, - {"dynptr_data_null", NULL, -EBADMSG}, + {"dynptr_data_null", -EBADMSG}, }; static bool kfunc_not_supported; @@ -38,29 +32,15 @@ static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt, return 0; } -static void verify_fail(const char *prog_name, const char *expected_err_msg) +static bool has_pkcs7_kfunc_support(void) { struct test_kfunc_dynptr_param *skel; - LIBBPF_OPTS(bpf_object_open_opts, opts); libbpf_print_fn_t old_print_cb; - struct bpf_program *prog; int err; - opts.kernel_log_buf = obj_log_buf; - opts.kernel_log_size = log_buf_sz; - opts.kernel_log_level = 1; - - skel = test_kfunc_dynptr_param__open_opts(&opts); - if (!ASSERT_OK_PTR(skel, "test_kfunc_dynptr_param__open_opts")) - goto cleanup; - - prog = bpf_object__find_program_by_name(skel->obj, prog_name); - if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) - goto cleanup; - - bpf_program__set_autoload(prog, true); - - bpf_map__set_max_entries(skel->maps.ringbuf, getpagesize()); + skel = test_kfunc_dynptr_param__open(); + if (!ASSERT_OK_PTR(skel, "test_kfunc_dynptr_param__open")) + return false; kfunc_not_supported = false; @@ -72,26 +52,18 @@ static void verify_fail(const char *prog_name, const char *expected_err_msg) fprintf(stderr, "%s:SKIP:bpf_verify_pkcs7_signature() kfunc not supported\n", __func__); - test__skip(); - goto cleanup; - } - - if (!ASSERT_ERR(err, "unexpected load success")) - goto cleanup; - - if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) { - fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg); - fprintf(stderr, "Verifier output: %s\n", obj_log_buf); + test_kfunc_dynptr_param__destroy(skel); + return false; } -cleanup: test_kfunc_dynptr_param__destroy(skel); + + return true; } static void verify_success(const char *prog_name, int expected_runtime_err) { struct test_kfunc_dynptr_param *skel; - libbpf_print_fn_t old_print_cb; struct bpf_program *prog; struct bpf_link *link; __u32 next_id; @@ -103,21 +75,7 @@ static void verify_success(const char *prog_name, int expected_runtime_err) skel->bss->pid = getpid(); - bpf_map__set_max_entries(skel->maps.ringbuf, getpagesize()); - - kfunc_not_supported = false; - - old_print_cb = libbpf_set_print(libbpf_print_cb); err = test_kfunc_dynptr_param__load(skel); - libbpf_set_print(old_print_cb); - - if (err < 0 && kfunc_not_supported) { - fprintf(stderr, - "%s:SKIP:bpf_verify_pkcs7_signature() kfunc not supported\n", - __func__); - test__skip(); - goto cleanup; - } if (!ASSERT_OK(err, "test_kfunc_dynptr_param__load")) goto cleanup; @@ -147,15 +105,15 @@ void test_kfunc_dynptr_param(void) { int i; + if (!has_pkcs7_kfunc_support()) + return; + for (i = 0; i < ARRAY_SIZE(kfunc_dynptr_tests); i++) { if (!test__start_subtest(kfunc_dynptr_tests[i].prog_name)) continue; - if (kfunc_dynptr_tests[i].expected_verifier_err_msg) - verify_fail(kfunc_dynptr_tests[i].prog_name, - kfunc_dynptr_tests[i].expected_verifier_err_msg); - else - verify_success(kfunc_dynptr_tests[i].prog_name, - kfunc_dynptr_tests[i].expected_runtime_err); + verify_success(kfunc_dynptr_tests[i].prog_name, + kfunc_dynptr_tests[i].expected_runtime_err); } + RUN_TESTS(test_kfunc_dynptr_param); } diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c index c6f37e825f11..113dba349a57 100644 --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c @@ -322,7 +322,7 @@ static bool symbol_equal(long key1, long key2, void *ctx __maybe_unused) return strcmp((const char *) key1, (const char *) key2) == 0; } -static int get_syms(char ***symsp, size_t *cntp) +static int get_syms(char ***symsp, size_t *cntp, bool kernel) { size_t cap = 0, cnt = 0, i; char *name = NULL, **syms = NULL; @@ -349,8 +349,9 @@ static int get_syms(char ***symsp, size_t *cntp) } while (fgets(buf, sizeof(buf), f)) { - /* skip modules */ - if (strchr(buf, '[')) + if (kernel && strchr(buf, '[')) + continue; + if (!kernel && !strchr(buf, '[')) continue; free(name); @@ -404,7 +405,7 @@ error: return err; } -void serial_test_kprobe_multi_bench_attach(void) +static void test_kprobe_multi_bench_attach(bool kernel) { LIBBPF_OPTS(bpf_kprobe_multi_opts, opts); struct kprobe_multi_empty *skel = NULL; @@ -415,7 +416,7 @@ void serial_test_kprobe_multi_bench_attach(void) char **syms = NULL; size_t cnt = 0, i; - if (!ASSERT_OK(get_syms(&syms, &cnt), "get_syms")) + if (!ASSERT_OK(get_syms(&syms, &cnt, kernel), "get_syms")) return; skel = kprobe_multi_empty__open_and_load(); @@ -453,6 +454,14 @@ cleanup: } } +void serial_test_kprobe_multi_bench_attach(void) +{ + if (test__start_subtest("kernel")) + test_kprobe_multi_bench_attach(true); + if (test__start_subtest("modules")) + test_kprobe_multi_bench_attach(false); +} + void test_kprobe_multi_test(void) { if (!ASSERT_OK(load_kallsyms(), "load_kallsyms")) diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_get_fd_by_id_opts.c b/tools/testing/selftests/bpf/prog_tests/libbpf_get_fd_by_id_opts.c index 25e5dfa9c315..a3f238f51d05 100644 --- a/tools/testing/selftests/bpf/prog_tests/libbpf_get_fd_by_id_opts.c +++ b/tools/testing/selftests/bpf/prog_tests/libbpf_get_fd_by_id_opts.c @@ -29,9 +29,9 @@ void test_libbpf_get_fd_by_id_opts(void) if (!ASSERT_OK(ret, "test_libbpf_get_fd_by_id_opts__attach")) goto close_prog; - ret = bpf_obj_get_info_by_fd(bpf_map__fd(skel->maps.data_input), + ret = bpf_map_get_info_by_fd(bpf_map__fd(skel->maps.data_input), &info_m, &len); - if (!ASSERT_OK(ret, "bpf_obj_get_info_by_fd")) + if (!ASSERT_OK(ret, "bpf_map_get_info_by_fd")) goto close_prog; fd = bpf_map_get_fd_by_id(info_m.id); diff --git a/tools/testing/selftests/bpf/prog_tests/linked_list.c b/tools/testing/selftests/bpf/prog_tests/linked_list.c index 9a7d4c47af63..0ed8132ce1c3 100644 --- a/tools/testing/selftests/bpf/prog_tests/linked_list.c +++ b/tools/testing/selftests/bpf/prog_tests/linked_list.c @@ -58,12 +58,12 @@ static struct { TEST(inner_map, pop_front) TEST(inner_map, pop_back) #undef TEST - { "map_compat_kprobe", "tracing progs cannot use bpf_list_head yet" }, - { "map_compat_kretprobe", "tracing progs cannot use bpf_list_head yet" }, - { "map_compat_tp", "tracing progs cannot use bpf_list_head yet" }, - { "map_compat_perf", "tracing progs cannot use bpf_list_head yet" }, - { "map_compat_raw_tp", "tracing progs cannot use bpf_list_head yet" }, - { "map_compat_raw_tp_w", "tracing progs cannot use bpf_list_head yet" }, + { "map_compat_kprobe", "tracing progs cannot use bpf_{list_head,rb_root} yet" }, + { "map_compat_kretprobe", "tracing progs cannot use bpf_{list_head,rb_root} yet" }, + { "map_compat_tp", "tracing progs cannot use bpf_{list_head,rb_root} yet" }, + { "map_compat_perf", "tracing progs cannot use bpf_{list_head,rb_root} yet" }, + { "map_compat_raw_tp", "tracing progs cannot use bpf_{list_head,rb_root} yet" }, + { "map_compat_raw_tp_w", "tracing progs cannot use bpf_{list_head,rb_root} yet" }, { "obj_type_id_oor", "local type ID argument must be in range [0, U32_MAX]" }, { "obj_new_no_composite", "bpf_obj_new type ID argument must be of a struct" }, { "obj_new_no_struct", "bpf_obj_new type ID argument must be of a struct" }, @@ -78,8 +78,6 @@ static struct { { "direct_write_head", "direct access to bpf_list_head is disallowed" }, { "direct_read_node", "direct access to bpf_list_node is disallowed" }, { "direct_write_node", "direct access to bpf_list_node is disallowed" }, - { "write_after_push_front", "only read is supported" }, - { "write_after_push_back", "only read is supported" }, { "use_after_unlock_push_front", "invalid mem access 'scalar'" }, { "use_after_unlock_push_back", "invalid mem access 'scalar'" }, { "double_push_front", "arg#1 expected pointer to allocated object" }, @@ -717,6 +715,43 @@ static void test_btf(void) btf__free(btf); break; } + + while (test__start_subtest("btf: list_node and rb_node in same struct")) { + btf = init_btf(); + if (!ASSERT_OK_PTR(btf, "init_btf")) + break; + + id = btf__add_struct(btf, "bpf_rb_node", 24); + if (!ASSERT_EQ(id, 5, "btf__add_struct bpf_rb_node")) + break; + id = btf__add_struct(btf, "bar", 40); + if (!ASSERT_EQ(id, 6, "btf__add_struct bar")) + break; + err = btf__add_field(btf, "a", LIST_NODE, 0, 0); + if (!ASSERT_OK(err, "btf__add_field bar::a")) + break; + err = btf__add_field(btf, "c", 5, 128, 0); + if (!ASSERT_OK(err, "btf__add_field bar::c")) + break; + + id = btf__add_struct(btf, "foo", 20); + if (!ASSERT_EQ(id, 7, "btf__add_struct foo")) + break; + err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); + if (!ASSERT_OK(err, "btf__add_field foo::a")) + break; + err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); + if (!ASSERT_OK(err, "btf__add_field foo::b")) + break; + id = btf__add_decl_tag(btf, "contains:bar:a", 7, 0); + if (!ASSERT_EQ(id, 8, "btf__add_decl_tag contains:bar:a")) + break; + + err = btf__load_into_kernel(btf); + ASSERT_EQ(err, -EINVAL, "check btf"); + btf__free(btf); + break; + } } void test_linked_list(void) diff --git a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c index f117bfef68a1..130a3b21e467 100644 --- a/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c +++ b/tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c @@ -47,7 +47,8 @@ static __u32 query_prog_cnt(int cgroup_fd, const char *attach_func) fd = bpf_prog_get_fd_by_id(p.prog_ids[i]); ASSERT_GE(fd, 0, "prog_get_fd_by_id"); - ASSERT_OK(bpf_obj_get_info_by_fd(fd, &info, &info_len), "prog_info_by_fd"); + ASSERT_OK(bpf_prog_get_info_by_fd(fd, &info, &info_len), + "prog_info_by_fd"); close(fd); if (info.attach_btf_id == diff --git a/tools/testing/selftests/bpf/prog_tests/metadata.c b/tools/testing/selftests/bpf/prog_tests/metadata.c index 2c53eade88e3..8b67dfc10f5c 100644 --- a/tools/testing/selftests/bpf/prog_tests/metadata.c +++ b/tools/testing/selftests/bpf/prog_tests/metadata.c @@ -16,7 +16,7 @@ static int duration; static int prog_holds_map(int prog_fd, int map_fd) { struct bpf_prog_info prog_info = {}; - struct bpf_prog_info map_info = {}; + struct bpf_map_info map_info = {}; __u32 prog_info_len; __u32 map_info_len; __u32 *map_ids; @@ -25,12 +25,12 @@ static int prog_holds_map(int prog_fd, int map_fd) int i; map_info_len = sizeof(map_info); - ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len); + ret = bpf_map_get_info_by_fd(map_fd, &map_info, &map_info_len); if (ret) return -errno; prog_info_len = sizeof(prog_info); - ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); + ret = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); if (ret) return -errno; @@ -44,7 +44,7 @@ static int prog_holds_map(int prog_fd, int map_fd) prog_info.map_ids = ptr_to_u64(map_ids); prog_info_len = sizeof(prog_info); - ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); + ret = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &prog_info_len); if (ret) { ret = -errno; goto free_map_ids; diff --git a/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c b/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c index eb2feaac81fe..653b0a20fab9 100644 --- a/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c +++ b/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c @@ -488,7 +488,7 @@ static void run_test(struct migrate_reuseport_test_case *test_case, goto close_servers; } - /* Tie requests to the first four listners */ + /* Tie requests to the first four listeners */ err = start_clients(test_case); if (!ASSERT_OK(err, "start_clients")) goto close_clients; diff --git a/tools/testing/selftests/bpf/prog_tests/mmap.c b/tools/testing/selftests/bpf/prog_tests/mmap.c index 37b002ca1167..a271d5a0f7ab 100644 --- a/tools/testing/selftests/bpf/prog_tests/mmap.c +++ b/tools/testing/selftests/bpf/prog_tests/mmap.c @@ -64,7 +64,7 @@ void test_mmap(void) /* get map's ID */ memset(&map_info, 0, map_info_sz); - err = bpf_obj_get_info_by_fd(data_map_fd, &map_info, &map_info_sz); + err = bpf_map_get_info_by_fd(data_map_fd, &map_info, &map_info_sz); if (CHECK(err, "map_get_info", "failed %d\n", errno)) goto cleanup; data_map_id = map_info.id; diff --git a/tools/testing/selftests/bpf/prog_tests/nested_trust.c b/tools/testing/selftests/bpf/prog_tests/nested_trust.c new file mode 100644 index 000000000000..39886f58924e --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/nested_trust.c @@ -0,0 +1,12 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <test_progs.h> +#include "nested_trust_failure.skel.h" +#include "nested_trust_success.skel.h" + +void test_nested_trust(void) +{ + RUN_TESTS(nested_trust_success); + RUN_TESTS(nested_trust_failure); +} diff --git a/tools/testing/selftests/bpf/prog_tests/perf_link.c b/tools/testing/selftests/bpf/prog_tests/perf_link.c index 224eba6fef2e..3a25f1c743a1 100644 --- a/tools/testing/selftests/bpf/prog_tests/perf_link.c +++ b/tools/testing/selftests/bpf/prog_tests/perf_link.c @@ -54,7 +54,7 @@ void serial_test_perf_link(void) goto cleanup; memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(link_fd, &info, &info_len); + err = bpf_link_get_info_by_fd(link_fd, &info, &info_len); if (!ASSERT_OK(err, "link_get_info")) goto cleanup; diff --git a/tools/testing/selftests/bpf/prog_tests/pinning.c b/tools/testing/selftests/bpf/prog_tests/pinning.c index d95cee5867b7..c799a3c5ad1f 100644 --- a/tools/testing/selftests/bpf/prog_tests/pinning.c +++ b/tools/testing/selftests/bpf/prog_tests/pinning.c @@ -18,7 +18,7 @@ __u32 get_map_id(struct bpf_object *obj, const char *name) if (CHECK(!map, "find map", "NULL map")) return 0; - err = bpf_obj_get_info_by_fd(bpf_map__fd(map), + err = bpf_map_get_info_by_fd(bpf_map__fd(map), &map_info, &map_info_len); CHECK(err, "get map info", "err %d errno %d", err, errno); return map_info.id; diff --git a/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c b/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c index 1ccd2bdf8fa8..01f1d1b6715a 100644 --- a/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c +++ b/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c @@ -12,7 +12,7 @@ static void check_run_cnt(int prog_fd, __u64 run_cnt) __u32 info_len = sizeof(info); int err; - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (CHECK(err, "get_prog_info", "failed to get bpf_prog_info for fd %d\n", prog_fd)) return; diff --git a/tools/testing/selftests/bpf/prog_tests/rbtree.c b/tools/testing/selftests/bpf/prog_tests/rbtree.c new file mode 100644 index 000000000000..156fa95c42f6 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/rbtree.c @@ -0,0 +1,117 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ + +#include <test_progs.h> +#include <network_helpers.h> + +#include "rbtree.skel.h" +#include "rbtree_fail.skel.h" +#include "rbtree_btf_fail__wrong_node_type.skel.h" +#include "rbtree_btf_fail__add_wrong_type.skel.h" + +static void test_rbtree_add_nodes(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .repeat = 1, + ); + struct rbtree *skel; + int ret; + + skel = rbtree__open_and_load(); + if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_add_nodes), &opts); + ASSERT_OK(ret, "rbtree_add_nodes run"); + ASSERT_OK(opts.retval, "rbtree_add_nodes retval"); + ASSERT_EQ(skel->data->less_callback_ran, 1, "rbtree_add_nodes less_callback_ran"); + + rbtree__destroy(skel); +} + +static void test_rbtree_add_and_remove(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .repeat = 1, + ); + struct rbtree *skel; + int ret; + + skel = rbtree__open_and_load(); + if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_add_and_remove), &opts); + ASSERT_OK(ret, "rbtree_add_and_remove"); + ASSERT_OK(opts.retval, "rbtree_add_and_remove retval"); + ASSERT_EQ(skel->data->removed_key, 5, "rbtree_add_and_remove first removed key"); + + rbtree__destroy(skel); +} + +static void test_rbtree_first_and_remove(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .repeat = 1, + ); + struct rbtree *skel; + int ret; + + skel = rbtree__open_and_load(); + if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_first_and_remove), &opts); + ASSERT_OK(ret, "rbtree_first_and_remove"); + ASSERT_OK(opts.retval, "rbtree_first_and_remove retval"); + ASSERT_EQ(skel->data->first_data[0], 2, "rbtree_first_and_remove first rbtree_first()"); + ASSERT_EQ(skel->data->removed_key, 1, "rbtree_first_and_remove first removed key"); + ASSERT_EQ(skel->data->first_data[1], 4, "rbtree_first_and_remove second rbtree_first()"); + + rbtree__destroy(skel); +} + +void test_rbtree_success(void) +{ + if (test__start_subtest("rbtree_add_nodes")) + test_rbtree_add_nodes(); + if (test__start_subtest("rbtree_add_and_remove")) + test_rbtree_add_and_remove(); + if (test__start_subtest("rbtree_first_and_remove")) + test_rbtree_first_and_remove(); +} + +#define BTF_FAIL_TEST(suffix) \ +void test_rbtree_btf_fail__##suffix(void) \ +{ \ + struct rbtree_btf_fail__##suffix *skel; \ + \ + skel = rbtree_btf_fail__##suffix##__open_and_load(); \ + if (!ASSERT_ERR_PTR(skel, \ + "rbtree_btf_fail__" #suffix "__open_and_load unexpected success")) \ + rbtree_btf_fail__##suffix##__destroy(skel); \ +} + +#define RUN_BTF_FAIL_TEST(suffix) \ + if (test__start_subtest("rbtree_btf_fail__" #suffix)) \ + test_rbtree_btf_fail__##suffix(); + +BTF_FAIL_TEST(wrong_node_type); +BTF_FAIL_TEST(add_wrong_type); + +void test_rbtree_btf_fail(void) +{ + RUN_BTF_FAIL_TEST(wrong_node_type); + RUN_BTF_FAIL_TEST(add_wrong_type); +} + +void test_rbtree_fail(void) +{ + RUN_TESTS(rbtree_fail); +} diff --git a/tools/testing/selftests/bpf/prog_tests/recursion.c b/tools/testing/selftests/bpf/prog_tests/recursion.c index f3af2627b599..23552d3e3365 100644 --- a/tools/testing/selftests/bpf/prog_tests/recursion.c +++ b/tools/testing/selftests/bpf/prog_tests/recursion.c @@ -31,8 +31,8 @@ void test_recursion(void) bpf_map_delete_elem(bpf_map__fd(skel->maps.hash2), &key); ASSERT_EQ(skel->bss->pass2, 2, "pass2 == 2"); - err = bpf_obj_get_info_by_fd(bpf_program__fd(skel->progs.on_delete), - &prog_info, &prog_info_len); + err = bpf_prog_get_info_by_fd(bpf_program__fd(skel->progs.on_delete), + &prog_info, &prog_info_len); if (!ASSERT_OK(err, "get_prog_info")) goto out; ASSERT_EQ(prog_info.recursion_misses, 2, "recursion_misses"); diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c index 018611e6b248..7d4a9b3d3722 100644 --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c @@ -4,6 +4,7 @@ #define _GNU_SOURCE #include <sched.h> #include <linux/socket.h> +#include <linux/tls.h> #include <net/if.h> #include "test_progs.h" @@ -83,6 +84,76 @@ static void test_udp(int family) ASSERT_EQ(bss->nr_binddev, 1, "nr_bind"); } +static void test_ktls(int family) +{ + struct tls12_crypto_info_aes_gcm_128 aes128; + struct setget_sockopt__bss *bss = skel->bss; + int cfd = -1, sfd = -1, fd = -1, ret; + char buf; + + memset(bss, 0, sizeof(*bss)); + + sfd = start_server(family, SOCK_STREAM, + family == AF_INET6 ? addr6_str : addr4_str, 0, 0); + if (!ASSERT_GE(sfd, 0, "start_server")) + return; + fd = connect_to_fd(sfd, 0); + if (!ASSERT_GE(fd, 0, "connect_to_fd")) + goto err_out; + + cfd = accept(sfd, NULL, 0); + if (!ASSERT_GE(cfd, 0, "accept")) + goto err_out; + + close(sfd); + sfd = -1; + + /* Setup KTLS */ + ret = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")); + if (!ASSERT_OK(ret, "setsockopt")) + goto err_out; + ret = setsockopt(cfd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")); + if (!ASSERT_OK(ret, "setsockopt")) + goto err_out; + + memset(&aes128, 0, sizeof(aes128)); + aes128.info.version = TLS_1_2_VERSION; + aes128.info.cipher_type = TLS_CIPHER_AES_GCM_128; + + ret = setsockopt(fd, SOL_TLS, TLS_TX, &aes128, sizeof(aes128)); + if (!ASSERT_OK(ret, "setsockopt")) + goto err_out; + + ret = setsockopt(cfd, SOL_TLS, TLS_RX, &aes128, sizeof(aes128)); + if (!ASSERT_OK(ret, "setsockopt")) + goto err_out; + + /* KTLS is enabled */ + + close(fd); + /* At this point, the cfd socket is at the CLOSE_WAIT state + * and still run TLS protocol. The test for + * BPF_TCP_CLOSE_WAIT should be run at this point. + */ + ret = read(cfd, &buf, sizeof(buf)); + ASSERT_EQ(ret, 0, "read"); + close(cfd); + + ASSERT_EQ(bss->nr_listen, 1, "nr_listen"); + ASSERT_EQ(bss->nr_connect, 1, "nr_connect"); + ASSERT_EQ(bss->nr_active, 1, "nr_active"); + ASSERT_EQ(bss->nr_passive, 1, "nr_passive"); + ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create"); + ASSERT_EQ(bss->nr_binddev, 2, "nr_bind"); + ASSERT_EQ(bss->nr_fin_wait1, 1, "nr_fin_wait1"); + return; + +err_out: + close(fd); + close(cfd); + close(sfd); +} + void test_setget_sockopt(void) { cg_fd = test__join_cgroup(CG_NAME); @@ -118,6 +189,8 @@ void test_setget_sockopt(void) test_tcp(AF_INET); test_udp(AF_INET6); test_udp(AF_INET); + test_ktls(AF_INET6); + test_ktls(AF_INET); done: setget_sockopt__destroy(skel); diff --git a/tools/testing/selftests/bpf/prog_tests/sk_assign.c b/tools/testing/selftests/bpf/prog_tests/sk_assign.c index 3e190ed63976..1374b626a985 100644 --- a/tools/testing/selftests/bpf/prog_tests/sk_assign.c +++ b/tools/testing/selftests/bpf/prog_tests/sk_assign.c @@ -29,7 +29,23 @@ static int stop, duration; static bool configure_stack(void) { + char tc_version[128]; char tc_cmd[BUFSIZ]; + char *prog; + FILE *tc; + + /* Check whether tc is built with libbpf. */ + tc = popen("tc -V", "r"); + if (CHECK_FAIL(!tc)) + return false; + if (CHECK_FAIL(!fgets(tc_version, sizeof(tc_version), tc))) + return false; + if (strstr(tc_version, ", libbpf ")) + prog = "test_sk_assign_libbpf.bpf.o"; + else + prog = "test_sk_assign.bpf.o"; + if (CHECK_FAIL(pclose(tc))) + return false; /* Move to a new networking namespace */ if (CHECK_FAIL(unshare(CLONE_NEWNET))) @@ -46,8 +62,8 @@ configure_stack(void) /* Load qdisc, BPF program */ if (CHECK_FAIL(system("tc qdisc add dev lo clsact"))) return false; - sprintf(tc_cmd, "%s %s %s %s", "tc filter add dev lo ingress bpf", - "direct-action object-file ./test_sk_assign.bpf.o", + sprintf(tc_cmd, "%s %s %s %s %s", "tc filter add dev lo ingress bpf", + "direct-action object-file", prog, "section tc", (env.verbosity < VERBOSE_VERY) ? " 2>/dev/null" : "verbose"); if (CHECK(system(tc_cmd), "BPF load failed;", @@ -129,15 +145,12 @@ get_port(int fd) static ssize_t rcv_msg(int srv_client, int type) { - struct sockaddr_storage ss; char buf[BUFSIZ]; - socklen_t slen; if (type == SOCK_STREAM) return read(srv_client, &buf, sizeof(buf)); else - return recvfrom(srv_client, &buf, sizeof(buf), 0, - (struct sockaddr *)&ss, &slen); + return recvfrom(srv_client, &buf, sizeof(buf), 0, NULL, NULL); } static int diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 0aa088900699..0ce25a967481 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -299,9 +299,9 @@ static __u32 query_prog_id(int prog_fd) __u32 info_len = sizeof(info); int err; - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (!ASSERT_OK(err, "bpf_obj_get_info_by_fd") || - !ASSERT_EQ(info_len, sizeof(info), "bpf_obj_get_info_by_fd")) + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd") || + !ASSERT_EQ(info_len, sizeof(info), "bpf_prog_get_info_by_fd")) return 0; return info.id; diff --git a/tools/testing/selftests/bpf/prog_tests/task_kfunc.c b/tools/testing/selftests/bpf/prog_tests/task_kfunc.c index 18848c31e36f..f79fa5bc9a8d 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_kfunc.c +++ b/tools/testing/selftests/bpf/prog_tests/task_kfunc.c @@ -9,9 +9,6 @@ #include "task_kfunc_failure.skel.h" #include "task_kfunc_success.skel.h" -static size_t log_buf_sz = 1 << 20; /* 1 MB */ -static char obj_log_buf[1048576]; - static struct task_kfunc_success *open_load_task_kfunc_skel(void) { struct task_kfunc_success *skel; @@ -83,67 +80,6 @@ static const char * const success_tests[] = { "test_task_from_pid_invalid", }; -static struct { - const char *prog_name; - const char *expected_err_msg; -} failure_tests[] = { - {"task_kfunc_acquire_untrusted", "R1 must be referenced or trusted"}, - {"task_kfunc_acquire_fp", "arg#0 pointer type STRUCT task_struct must point"}, - {"task_kfunc_acquire_unsafe_kretprobe", "reg type unsupported for arg#0 function"}, - {"task_kfunc_acquire_trusted_walked", "R1 must be referenced or trusted"}, - {"task_kfunc_acquire_null", "arg#0 pointer type STRUCT task_struct must point"}, - {"task_kfunc_acquire_unreleased", "Unreleased reference"}, - {"task_kfunc_get_non_kptr_param", "arg#0 expected pointer to map value"}, - {"task_kfunc_get_non_kptr_acquired", "arg#0 expected pointer to map value"}, - {"task_kfunc_get_null", "arg#0 expected pointer to map value"}, - {"task_kfunc_xchg_unreleased", "Unreleased reference"}, - {"task_kfunc_get_unreleased", "Unreleased reference"}, - {"task_kfunc_release_untrusted", "arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket"}, - {"task_kfunc_release_fp", "arg#0 pointer type STRUCT task_struct must point"}, - {"task_kfunc_release_null", "arg#0 is ptr_or_null_ expected ptr_ or socket"}, - {"task_kfunc_release_unacquired", "release kernel function bpf_task_release expects"}, - {"task_kfunc_from_pid_no_null_check", "arg#0 is ptr_or_null_ expected ptr_ or socket"}, - {"task_kfunc_from_lsm_task_free", "reg type unsupported for arg#0 function"}, -}; - -static void verify_fail(const char *prog_name, const char *expected_err_msg) -{ - LIBBPF_OPTS(bpf_object_open_opts, opts); - struct task_kfunc_failure *skel; - int err, i; - - opts.kernel_log_buf = obj_log_buf; - opts.kernel_log_size = log_buf_sz; - opts.kernel_log_level = 1; - - skel = task_kfunc_failure__open_opts(&opts); - if (!ASSERT_OK_PTR(skel, "task_kfunc_failure__open_opts")) - goto cleanup; - - for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { - struct bpf_program *prog; - const char *curr_name = failure_tests[i].prog_name; - - prog = bpf_object__find_program_by_name(skel->obj, curr_name); - if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) - goto cleanup; - - bpf_program__set_autoload(prog, !strcmp(curr_name, prog_name)); - } - - err = task_kfunc_failure__load(skel); - if (!ASSERT_ERR(err, "unexpected load success")) - goto cleanup; - - if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) { - fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg); - fprintf(stderr, "Verifier output: %s\n", obj_log_buf); - } - -cleanup: - task_kfunc_failure__destroy(skel); -} - void test_task_kfunc(void) { int i; @@ -155,10 +91,5 @@ void test_task_kfunc(void) run_success_test(success_tests[i]); } - for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { - if (!test__start_subtest(failure_tests[i].prog_name)) - continue; - - verify_fail(failure_tests[i].prog_name, failure_tests[i].expected_err_msg); - } + RUN_TESTS(task_kfunc_failure); } diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index a176bd75a748..ea8537c54413 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -119,19 +119,19 @@ static void test_recursion(void) prog_fd = bpf_program__fd(skel->progs.on_lookup); memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); ASSERT_OK(err, "get prog info"); ASSERT_GT(info.recursion_misses, 0, "on_lookup prog recursion"); prog_fd = bpf_program__fd(skel->progs.on_update); memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); ASSERT_OK(err, "get prog info"); ASSERT_EQ(info.recursion_misses, 0, "on_update prog recursion"); prog_fd = bpf_program__fd(skel->progs.on_enter); memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); ASSERT_OK(err, "get prog info"); ASSERT_EQ(info.recursion_misses, 0, "on_enter prog recursion"); @@ -221,7 +221,7 @@ static void test_nodeadlock(void) info_len = sizeof(info); prog_fd = bpf_program__fd(skel->progs.socket_post_create); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); ASSERT_OK(err, "get prog info"); ASSERT_EQ(info.recursion_misses, 0, "prog recursion"); diff --git a/tools/testing/selftests/bpf/prog_tests/tc_bpf.c b/tools/testing/selftests/bpf/prog_tests/tc_bpf.c index 4a505a5adf4d..e873766276d1 100644 --- a/tools/testing/selftests/bpf/prog_tests/tc_bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/tc_bpf.c @@ -29,8 +29,8 @@ static int test_tc_bpf_basic(const struct bpf_tc_hook *hook, int fd) __u32 info_len = sizeof(info); int ret; - ret = bpf_obj_get_info_by_fd(fd, &info, &info_len); - if (!ASSERT_OK(ret, "bpf_obj_get_info_by_fd")) + ret = bpf_prog_get_info_by_fd(fd, &info, &info_len); + if (!ASSERT_OK(ret, "bpf_prog_get_info_by_fd")) return ret; ret = bpf_tc_attach(hook, &opts); diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c b/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c index c381faaae741..2900c5e9a016 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c +++ b/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright 2022 Sony Group Corporation */ +#define _GNU_SOURCE +#include <fcntl.h> #include <sys/prctl.h> #include <test_progs.h> #include "bpf_syscall_macro.skel.h" @@ -13,6 +15,8 @@ void test_bpf_syscall_macro(void) unsigned long exp_arg3 = 13; unsigned long exp_arg4 = 14; unsigned long exp_arg5 = 15; + loff_t off_in, off_out; + ssize_t r; /* check whether it can open program */ skel = bpf_syscall_macro__open(); @@ -33,6 +37,7 @@ void test_bpf_syscall_macro(void) /* check whether args of syscall are copied correctly */ prctl(exp_arg1, exp_arg2, exp_arg3, exp_arg4, exp_arg5); + #if defined(__aarch64__) || defined(__s390__) ASSERT_NEQ(skel->bss->arg1, exp_arg1, "syscall_arg1"); #else @@ -68,6 +73,18 @@ void test_bpf_syscall_macro(void) ASSERT_EQ(skel->bss->arg4_syscall, exp_arg4, "BPF_KPROBE_SYSCALL_arg4"); ASSERT_EQ(skel->bss->arg5_syscall, exp_arg5, "BPF_KPROBE_SYSCALL_arg5"); + r = splice(-42, &off_in, 42, &off_out, 0x12340000, SPLICE_F_NONBLOCK); + err = -errno; + ASSERT_EQ(r, -1, "splice_res"); + ASSERT_EQ(err, -EBADF, "splice_err"); + + ASSERT_EQ(skel->bss->splice_fd_in, -42, "splice_arg1"); + ASSERT_EQ(skel->bss->splice_off_in, (__u64)&off_in, "splice_arg2"); + ASSERT_EQ(skel->bss->splice_fd_out, 42, "splice_arg3"); + ASSERT_EQ(skel->bss->splice_off_out, (__u64)&off_out, "splice_arg4"); + ASSERT_EQ(skel->bss->splice_len, 0x12340000, "splice_arg5"); + ASSERT_EQ(skel->bss->splice_flags, SPLICE_F_NONBLOCK, "splice_arg6"); + cleanup: bpf_syscall_macro__destroy(skel); } diff --git a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c index 7295cc60f724..e0879df38639 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c +++ b/tools/testing/selftests/bpf/prog_tests/test_global_funcs.c @@ -1,104 +1,43 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2020 Facebook */ #include <test_progs.h> - -const char *err_str; -bool found; - -static int libbpf_debug_print(enum libbpf_print_level level, - const char *format, va_list args) -{ - char *log_buf; - - if (level != LIBBPF_WARN || - strcmp(format, "libbpf: \n%s\n")) { - vprintf(format, args); - return 0; - } - - log_buf = va_arg(args, char *); - if (!log_buf) - goto out; - if (err_str && strstr(log_buf, err_str) == 0) - found = true; -out: - printf(format, log_buf); - return 0; -} - -extern int extra_prog_load_log_flags; - -static int check_load(const char *file) -{ - struct bpf_object *obj = NULL; - struct bpf_program *prog; - int err; - - found = false; - - obj = bpf_object__open_file(file, NULL); - err = libbpf_get_error(obj); - if (err) - return err; - - prog = bpf_object__next_program(obj, NULL); - if (!prog) { - err = -ENOENT; - goto err_out; - } - - bpf_program__set_flags(prog, BPF_F_TEST_RND_HI32); - bpf_program__set_log_level(prog, extra_prog_load_log_flags); - - err = bpf_object__load(obj); - -err_out: - bpf_object__close(obj); - return err; -} - -struct test_def { - const char *file; - const char *err_str; -}; +#include "test_global_func1.skel.h" +#include "test_global_func2.skel.h" +#include "test_global_func3.skel.h" +#include "test_global_func4.skel.h" +#include "test_global_func5.skel.h" +#include "test_global_func6.skel.h" +#include "test_global_func7.skel.h" +#include "test_global_func8.skel.h" +#include "test_global_func9.skel.h" +#include "test_global_func10.skel.h" +#include "test_global_func11.skel.h" +#include "test_global_func12.skel.h" +#include "test_global_func13.skel.h" +#include "test_global_func14.skel.h" +#include "test_global_func15.skel.h" +#include "test_global_func16.skel.h" +#include "test_global_func17.skel.h" +#include "test_global_func_ctx_args.skel.h" void test_test_global_funcs(void) { - struct test_def tests[] = { - { "test_global_func1.bpf.o", "combined stack size of 4 calls is 544" }, - { "test_global_func2.bpf.o" }, - { "test_global_func3.bpf.o", "the call stack of 8 frames" }, - { "test_global_func4.bpf.o" }, - { "test_global_func5.bpf.o", "expected pointer to ctx, but got PTR" }, - { "test_global_func6.bpf.o", "modified ctx ptr R2" }, - { "test_global_func7.bpf.o", "foo() doesn't return scalar" }, - { "test_global_func8.bpf.o" }, - { "test_global_func9.bpf.o" }, - { "test_global_func10.bpf.o", "invalid indirect read from stack" }, - { "test_global_func11.bpf.o", "Caller passes invalid args into func#1" }, - { "test_global_func12.bpf.o", "invalid mem access 'mem_or_null'" }, - { "test_global_func13.bpf.o", "Caller passes invalid args into func#1" }, - { "test_global_func14.bpf.o", "reference type('FWD S') size cannot be determined" }, - { "test_global_func15.bpf.o", "At program exit the register R0 has value" }, - { "test_global_func16.bpf.o", "invalid indirect read from stack" }, - { "test_global_func17.bpf.o", "Caller passes invalid args into func#1" }, - }; - libbpf_print_fn_t old_print_fn = NULL; - int err, i, duration = 0; - - old_print_fn = libbpf_set_print(libbpf_debug_print); - - for (i = 0; i < ARRAY_SIZE(tests); i++) { - const struct test_def *test = &tests[i]; - - if (!test__start_subtest(test->file)) - continue; - - err_str = test->err_str; - err = check_load(test->file); - CHECK_FAIL(!!err ^ !!err_str); - if (err_str) - CHECK(found, "", "expected string '%s'", err_str); - } - libbpf_set_print(old_print_fn); + RUN_TESTS(test_global_func1); + RUN_TESTS(test_global_func2); + RUN_TESTS(test_global_func3); + RUN_TESTS(test_global_func4); + RUN_TESTS(test_global_func5); + RUN_TESTS(test_global_func6); + RUN_TESTS(test_global_func7); + RUN_TESTS(test_global_func8); + RUN_TESTS(test_global_func9); + RUN_TESTS(test_global_func10); + RUN_TESTS(test_global_func11); + RUN_TESTS(test_global_func12); + RUN_TESTS(test_global_func13); + RUN_TESTS(test_global_func14); + RUN_TESTS(test_global_func15); + RUN_TESTS(test_global_func16); + RUN_TESTS(test_global_func17); + RUN_TESTS(test_global_func_ctx_args); } diff --git a/tools/testing/selftests/bpf/prog_tests/test_lsm.c b/tools/testing/selftests/bpf/prog_tests/test_lsm.c index 244c01125126..16175d579bc7 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_lsm.c +++ b/tools/testing/selftests/bpf/prog_tests/test_lsm.c @@ -75,7 +75,8 @@ static int test_lsm(struct lsm *skel) skel->bss->monitored_pid = getpid(); err = stack_mprotect(); - if (!ASSERT_EQ(errno, EPERM, "stack_mprotect")) + if (!ASSERT_EQ(err, -1, "stack_mprotect") || + !ASSERT_EQ(errno, EPERM, "stack_mprotect")) return err; ASSERT_EQ(skel->bss->mprotect_count, 1, "mprotect_count"); diff --git a/tools/testing/selftests/bpf/prog_tests/tp_attach_query.c b/tools/testing/selftests/bpf/prog_tests/tp_attach_query.c index a479080533db..770fcc3bb1ba 100644 --- a/tools/testing/selftests/bpf/prog_tests/tp_attach_query.c +++ b/tools/testing/selftests/bpf/prog_tests/tp_attach_query.c @@ -45,8 +45,9 @@ void serial_test_tp_attach_query(void) prog_info.xlated_prog_len = 0; prog_info.nr_map_ids = 0; info_len = sizeof(prog_info); - err = bpf_obj_get_info_by_fd(prog_fd[i], &prog_info, &info_len); - if (CHECK(err, "bpf_obj_get_info_by_fd", "err %d errno %d\n", + err = bpf_prog_get_info_by_fd(prog_fd[i], &prog_info, + &info_len); + if (CHECK(err, "bpf_prog_get_info_by_fd", "err %d errno %d\n", err, errno)) goto cleanup1; saved_prog_ids[i] = prog_info.id; diff --git a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c index 564b75bc087f..e91d0d1769f1 100644 --- a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c +++ b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c @@ -2,8 +2,6 @@ #define _GNU_SOURCE #include <test_progs.h> -#define MAX_TRAMP_PROGS 38 - struct inst { struct bpf_object *obj; struct bpf_link *link; @@ -37,14 +35,21 @@ void serial_test_trampoline_count(void) { char *file = "test_trampoline_count.bpf.o"; char *const progs[] = { "fentry_test", "fmod_ret_test", "fexit_test" }; - struct inst inst[MAX_TRAMP_PROGS + 1] = {}; + int bpf_max_tramp_links, err, i, prog_fd; struct bpf_program *prog; struct bpf_link *link; - int prog_fd, err, i; + struct inst *inst; LIBBPF_OPTS(bpf_test_run_opts, opts); + bpf_max_tramp_links = get_bpf_max_tramp_links(); + if (!ASSERT_GE(bpf_max_tramp_links, 1, "bpf_max_tramp_links")) + return; + inst = calloc(bpf_max_tramp_links + 1, sizeof(*inst)); + if (!ASSERT_OK_PTR(inst, "inst")) + return; + /* attach 'allowed' trampoline programs */ - for (i = 0; i < MAX_TRAMP_PROGS; i++) { + for (i = 0; i < bpf_max_tramp_links; i++) { prog = load_prog(file, progs[i % ARRAY_SIZE(progs)], &inst[i]); if (!prog) goto cleanup; @@ -74,7 +79,7 @@ void serial_test_trampoline_count(void) if (!ASSERT_EQ(link, NULL, "ptr_is_null")) goto cleanup; - /* and finaly execute the probe */ + /* and finally execute the probe */ prog_fd = bpf_program__fd(prog); if (!ASSERT_GE(prog_fd, 0, "bpf_program__fd")) goto cleanup; @@ -91,4 +96,5 @@ cleanup: bpf_link__destroy(inst[i].link); bpf_object__close(inst[i].obj); } + free(inst); } diff --git a/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c b/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c index 1ed3cc2092db..8383a99f610f 100644 --- a/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c +++ b/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c @@ -179,7 +179,7 @@ static void test_unpriv_bpf_disabled_negative(struct test_unpriv_bpf_disabled *s ASSERT_EQ(bpf_prog_get_next_id(prog_id, &next), -EPERM, "prog_get_next_id_fails"); ASSERT_EQ(bpf_prog_get_next_id(0, &next), -EPERM, "prog_get_next_id_fails"); - if (ASSERT_OK(bpf_obj_get_info_by_fd(map_fds[0], &map_info, &map_info_len), + if (ASSERT_OK(bpf_map_get_info_by_fd(map_fds[0], &map_info, &map_info_len), "obj_get_info_by_fd")) { ASSERT_EQ(bpf_map_get_fd_by_id(map_info.id), -EPERM, "map_get_fd_by_id_fails"); ASSERT_EQ(bpf_map_get_next_id(map_info.id, &next), -EPERM, @@ -187,8 +187,8 @@ static void test_unpriv_bpf_disabled_negative(struct test_unpriv_bpf_disabled *s } ASSERT_EQ(bpf_map_get_next_id(0, &next), -EPERM, "map_get_next_id_fails"); - if (ASSERT_OK(bpf_obj_get_info_by_fd(bpf_link__fd(skel->links.sys_nanosleep_enter), - &link_info, &link_info_len), + if (ASSERT_OK(bpf_link_get_info_by_fd(bpf_link__fd(skel->links.sys_nanosleep_enter), + &link_info, &link_info_len), "obj_get_info_by_fd")) { ASSERT_EQ(bpf_link_get_fd_by_id(link_info.id), -EPERM, "link_get_fd_by_id_fails"); ASSERT_EQ(bpf_link_get_next_id(link_info.id, &next), -EPERM, @@ -269,7 +269,7 @@ void test_unpriv_bpf_disabled(void) } prog_fd = bpf_program__fd(skel->progs.sys_nanosleep_enter); - ASSERT_OK(bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len), + ASSERT_OK(bpf_prog_get_info_by_fd(prog_fd, &prog_info, &prog_info_len), "obj_get_info_by_fd"); prog_id = prog_info.id; ASSERT_GT(prog_id, 0, "valid_prog_id"); diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c index 35b87c7ba5be..6558c857e620 100644 --- a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c @@ -3,20 +3,23 @@ #include <test_progs.h> #include "test_uprobe_autoattach.skel.h" +#include "progs/bpf_misc.h" /* uprobe attach point */ -static noinline int autoattach_trigger_func(int arg) +static noinline int autoattach_trigger_func(int arg1, int arg2, int arg3, + int arg4, int arg5, int arg6, + int arg7, int arg8) { asm volatile (""); - return arg + 1; + return arg1 + arg2 + arg3 + arg4 + arg5 + arg6 + arg7 + arg8 + 1; } void test_uprobe_autoattach(void) { + const char *devnull_str = "/dev/null"; struct test_uprobe_autoattach *skel; - int trigger_val = 100, trigger_ret; - size_t malloc_sz = 1; - char *mem; + int trigger_ret; + FILE *devnull; skel = test_uprobe_autoattach__open_and_load(); if (!ASSERT_OK_PTR(skel, "skel_open")) @@ -28,23 +31,45 @@ void test_uprobe_autoattach(void) skel->bss->test_pid = getpid(); /* trigger & validate uprobe & uretprobe */ - trigger_ret = autoattach_trigger_func(trigger_val); + trigger_ret = autoattach_trigger_func(1, 2, 3, 4, 5, 6, 7, 8); skel->bss->test_pid = getpid(); /* trigger & validate shared library u[ret]probes attached by name */ - mem = malloc(malloc_sz); + devnull = fopen(devnull_str, "r"); - ASSERT_EQ(skel->bss->uprobe_byname_parm1, trigger_val, "check_uprobe_byname_parm1"); + ASSERT_EQ(skel->bss->uprobe_byname_parm1, 1, "check_uprobe_byname_parm1"); ASSERT_EQ(skel->bss->uprobe_byname_ran, 1, "check_uprobe_byname_ran"); ASSERT_EQ(skel->bss->uretprobe_byname_rc, trigger_ret, "check_uretprobe_byname_rc"); + ASSERT_EQ(skel->bss->uretprobe_byname_ret, trigger_ret, "check_uretprobe_byname_ret"); ASSERT_EQ(skel->bss->uretprobe_byname_ran, 2, "check_uretprobe_byname_ran"); - ASSERT_EQ(skel->bss->uprobe_byname2_parm1, malloc_sz, "check_uprobe_byname2_parm1"); + ASSERT_EQ(skel->bss->uprobe_byname2_parm1, (__u64)(long)devnull_str, + "check_uprobe_byname2_parm1"); ASSERT_EQ(skel->bss->uprobe_byname2_ran, 3, "check_uprobe_byname2_ran"); - ASSERT_EQ(skel->bss->uretprobe_byname2_rc, mem, "check_uretprobe_byname2_rc"); + ASSERT_EQ(skel->bss->uretprobe_byname2_rc, (__u64)(long)devnull, + "check_uretprobe_byname2_rc"); ASSERT_EQ(skel->bss->uretprobe_byname2_ran, 4, "check_uretprobe_byname2_ran"); - free(mem); + ASSERT_EQ(skel->bss->a[0], 1, "arg1"); + ASSERT_EQ(skel->bss->a[1], 2, "arg2"); + ASSERT_EQ(skel->bss->a[2], 3, "arg3"); +#if FUNC_REG_ARG_CNT > 3 + ASSERT_EQ(skel->bss->a[3], 4, "arg4"); +#endif +#if FUNC_REG_ARG_CNT > 4 + ASSERT_EQ(skel->bss->a[4], 5, "arg5"); +#endif +#if FUNC_REG_ARG_CNT > 5 + ASSERT_EQ(skel->bss->a[5], 6, "arg6"); +#endif +#if FUNC_REG_ARG_CNT > 6 + ASSERT_EQ(skel->bss->a[6], 7, "arg7"); +#endif +#if FUNC_REG_ARG_CNT > 7 + ASSERT_EQ(skel->bss->a[7], 8, "arg8"); +#endif + + fclose(devnull); cleanup: test_uprobe_autoattach__destroy(skel); } diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c index 9ad9da0f215e..56ed1eb9b527 100644 --- a/tools/testing/selftests/bpf/prog_tests/usdt.c +++ b/tools/testing/selftests/bpf/prog_tests/usdt.c @@ -314,6 +314,7 @@ static FILE *urand_spawn(int *pid) if (fscanf(f, "%d", pid) != 1) { pclose(f); + errno = EINVAL; return NULL; } diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c index dae68de285b9..3a13e102c149 100644 --- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c +++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c @@ -19,8 +19,6 @@ #include "../progs/test_user_ringbuf.h" -static size_t log_buf_sz = 1 << 20; /* 1 MB */ -static char obj_log_buf[1048576]; static const long c_sample_size = sizeof(struct sample) + BPF_RINGBUF_HDR_SZ; static const long c_ringbuf_size = 1 << 12; /* 1 small page */ static const long c_max_entries = c_ringbuf_size / c_sample_size; @@ -663,23 +661,6 @@ cleanup: user_ringbuf_success__destroy(skel); } -static struct { - const char *prog_name; - const char *expected_err_msg; -} failure_tests[] = { - /* failure cases */ - {"user_ringbuf_callback_bad_access1", "negative offset dynptr_ptr ptr"}, - {"user_ringbuf_callback_bad_access2", "dereference of modified dynptr_ptr ptr"}, - {"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr_ptr'"}, - {"user_ringbuf_callback_null_context_write", "invalid mem access 'scalar'"}, - {"user_ringbuf_callback_null_context_read", "invalid mem access 'scalar'"}, - {"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"}, - {"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"}, - {"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"}, - {"user_ringbuf_callback_reinit_dynptr_mem", "Dynptr has to be an uninitialized dynptr"}, - {"user_ringbuf_callback_reinit_dynptr_ringbuf", "Dynptr has to be an uninitialized dynptr"}, -}; - #define SUCCESS_TEST(_func) { _func, #_func } static struct { @@ -700,42 +681,6 @@ static struct { SUCCESS_TEST(test_user_ringbuf_blocking_reserve), }; -static void verify_fail(const char *prog_name, const char *expected_err_msg) -{ - LIBBPF_OPTS(bpf_object_open_opts, opts); - struct bpf_program *prog; - struct user_ringbuf_fail *skel; - int err; - - opts.kernel_log_buf = obj_log_buf; - opts.kernel_log_size = log_buf_sz; - opts.kernel_log_level = 1; - - skel = user_ringbuf_fail__open_opts(&opts); - if (!ASSERT_OK_PTR(skel, "dynptr_fail__open_opts")) - goto cleanup; - - prog = bpf_object__find_program_by_name(skel->obj, prog_name); - if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) - goto cleanup; - - bpf_program__set_autoload(prog, true); - - bpf_map__set_max_entries(skel->maps.user_ringbuf, getpagesize()); - - err = user_ringbuf_fail__load(skel); - if (!ASSERT_ERR(err, "unexpected load success")) - goto cleanup; - - if (!ASSERT_OK_PTR(strstr(obj_log_buf, expected_err_msg), "expected_err_msg")) { - fprintf(stderr, "Expected err_msg: %s\n", expected_err_msg); - fprintf(stderr, "Verifier output: %s\n", obj_log_buf); - } - -cleanup: - user_ringbuf_fail__destroy(skel); -} - void test_user_ringbuf(void) { int i; @@ -747,10 +692,5 @@ void test_user_ringbuf(void) success_tests[i].test_callback(); } - for (i = 0; i < ARRAY_SIZE(failure_tests); i++) { - if (!test__start_subtest(failure_tests[i].prog_name)) - continue; - - verify_fail(failure_tests[i].prog_name, failure_tests[i].expected_err_msg); - } + RUN_TESTS(user_ringbuf_fail); } diff --git a/tools/testing/selftests/bpf/prog_tests/verif_stats.c b/tools/testing/selftests/bpf/prog_tests/verif_stats.c index a47e7c0e1ffd..af4b95f57ac1 100644 --- a/tools/testing/selftests/bpf/prog_tests/verif_stats.c +++ b/tools/testing/selftests/bpf/prog_tests/verif_stats.c @@ -16,8 +16,9 @@ void test_verif_stats(void) if (!ASSERT_OK_PTR(skel, "trace_vprintk__open_and_load")) goto cleanup; - err = bpf_obj_get_info_by_fd(skel->progs.sys_enter.prog_fd, &info, &len); - if (!ASSERT_OK(err, "bpf_obj_get_info_by_fd")) + err = bpf_prog_get_info_by_fd(skel->progs.sys_enter.prog_fd, + &info, &len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) goto cleanup; if (!ASSERT_GT(info.verified_insns, 0, "verified_insns")) diff --git a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c index 579d6ee83ce0..dd7f2bc70048 100644 --- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c +++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c @@ -61,6 +61,9 @@ static bool kfunc_not_supported; static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt, va_list args) { + if (level == LIBBPF_WARN) + vprintf(fmt, args); + if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n")) return 0; diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c index 39973ea1ce43..f09505f8b038 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c @@ -76,10 +76,15 @@ static void test_xdp_adjust_tail_grow2(void) { const char *file = "./test_xdp_adjust_tail_grow.bpf.o"; char buf[4096]; /* avoid segfault: large buf to hold grow results */ - int tailroom = 320; /* SKB_DATA_ALIGN(sizeof(struct skb_shared_info))*/; struct bpf_object *obj; int err, cnt, i; int max_grow, prog_fd; + /* SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */ +#if defined(__s390x__) + int tailroom = 512; +#else + int tailroom = 320; +#endif LIBBPF_OPTS(bpf_test_run_opts, tattr, .repeat = 1, diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_attach.c index 062fbc8c8e5e..d4cd9f873c14 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_attach.c @@ -18,7 +18,7 @@ void serial_test_xdp_attach(void) err = bpf_prog_test_load(file, BPF_PROG_TYPE_XDP, &obj1, &fd1); if (CHECK_FAIL(err)) return; - err = bpf_obj_get_info_by_fd(fd1, &info, &len); + err = bpf_prog_get_info_by_fd(fd1, &info, &len); if (CHECK_FAIL(err)) goto out_1; id1 = info.id; @@ -28,7 +28,7 @@ void serial_test_xdp_attach(void) goto out_1; memset(&info, 0, sizeof(info)); - err = bpf_obj_get_info_by_fd(fd2, &info, &len); + err = bpf_prog_get_info_by_fd(fd2, &info, &len); if (CHECK_FAIL(err)) goto out_2; id2 = info.id; diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c index f775a1613833..481626a875d1 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_attach.c @@ -33,8 +33,8 @@ static void test_xdp_with_cpumap_helpers(void) prog_fd = bpf_program__fd(skel->progs.xdp_dummy_cm); map_fd = bpf_map__fd(skel->maps.cpu_map); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &len); - if (!ASSERT_OK(err, "bpf_obj_get_info_by_fd")) + err = bpf_prog_get_info_by_fd(prog_fd, &info, &len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) goto out_close; val.bpf_prog.fd = prog_fd; @@ -85,8 +85,8 @@ static void test_xdp_with_cpumap_frags_helpers(void) frags_prog_fd = bpf_program__fd(skel->progs.xdp_dummy_cm_frags); map_fd = bpf_map__fd(skel->maps.cpu_map); - err = bpf_obj_get_info_by_fd(frags_prog_fd, &info, &len); - if (!ASSERT_OK(err, "bpf_obj_get_info_by_fd")) + err = bpf_prog_get_info_by_fd(frags_prog_fd, &info, &len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) goto out_close; val.bpf_prog.fd = frags_prog_fd; diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c b/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c index ead40016c324..ce6812558287 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c @@ -35,8 +35,8 @@ static void test_xdp_with_devmap_helpers(void) dm_fd = bpf_program__fd(skel->progs.xdp_dummy_dm); map_fd = bpf_map__fd(skel->maps.dm_ports); - err = bpf_obj_get_info_by_fd(dm_fd, &info, &len); - if (!ASSERT_OK(err, "bpf_obj_get_info_by_fd")) + err = bpf_prog_get_info_by_fd(dm_fd, &info, &len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) goto out_close; val.bpf_prog.fd = dm_fd; @@ -98,8 +98,8 @@ static void test_xdp_with_devmap_frags_helpers(void) dm_fd_frags = bpf_program__fd(skel->progs.xdp_dummy_dm_frags); map_fd = bpf_map__fd(skel->maps.dm_ports); - err = bpf_obj_get_info_by_fd(dm_fd_frags, &info, &len); - if (!ASSERT_OK(err, "bpf_obj_get_info_by_fd")) + err = bpf_prog_get_info_by_fd(dm_fd_frags, &info, &len); + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd")) goto out_close; val.bpf_prog.fd = dm_fd_frags; diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c index a50971c6cf4a..2666c84dbd01 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c @@ -4,10 +4,12 @@ #include <net/if.h> #include <linux/if_ether.h> #include <linux/if_packet.h> +#include <linux/if_link.h> #include <linux/ipv6.h> #include <linux/in6.h> #include <linux/udp.h> #include <bpf/bpf_endian.h> +#include <uapi/linux/netdev.h> #include "test_xdp_do_redirect.skel.h" #define SYS(fmt, ...) \ @@ -65,7 +67,11 @@ static int attach_tc_prog(struct bpf_tc_hook *hook, int fd) /* The maximum permissible size is: PAGE_SIZE - sizeof(struct xdp_page_head) - * sizeof(struct skb_shared_info) - XDP_PACKET_HEADROOM = 3368 bytes */ +#if defined(__s390x__) +#define MAX_PKT_SIZE 3176 +#else #define MAX_PKT_SIZE 3368 +#endif static void test_max_pkt_size(int fd) { char data[MAX_PKT_SIZE + 1] = {}; @@ -92,7 +98,7 @@ void test_xdp_do_redirect(void) struct test_xdp_do_redirect *skel = NULL; struct nstoken *nstoken = NULL; struct bpf_link *link; - + LIBBPF_OPTS(bpf_xdp_query_opts, query_opts); struct xdp_md ctx_in = { .data = sizeof(__u32), .data_end = sizeof(data) }; DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, @@ -153,6 +159,29 @@ void test_xdp_do_redirect(void) !ASSERT_NEQ(ifindex_dst, 0, "ifindex_dst")) goto out; + /* Check xdp features supported by veth driver */ + err = bpf_xdp_query(ifindex_src, XDP_FLAGS_DRV_MODE, &query_opts); + if (!ASSERT_OK(err, "veth_src bpf_xdp_query")) + goto out; + + if (!ASSERT_EQ(query_opts.feature_flags, + NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT | + NETDEV_XDP_ACT_NDO_XMIT | NETDEV_XDP_ACT_RX_SG | + NETDEV_XDP_ACT_NDO_XMIT_SG, + "veth_src query_opts.feature_flags")) + goto out; + + err = bpf_xdp_query(ifindex_dst, XDP_FLAGS_DRV_MODE, &query_opts); + if (!ASSERT_OK(err, "veth_dst bpf_xdp_query")) + goto out; + + if (!ASSERT_EQ(query_opts.feature_flags, + NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT | + NETDEV_XDP_ACT_NDO_XMIT | NETDEV_XDP_ACT_RX_SG | + NETDEV_XDP_ACT_NDO_XMIT_SG, + "veth_dst query_opts.feature_flags")) + goto out; + memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN); skel->rodata->ifindex_out = ifindex_src; /* redirect back to the same iface */ skel->rodata->ifindex_in = ifindex_src; diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_info.c b/tools/testing/selftests/bpf/prog_tests/xdp_info.c index cd3aa340e65e..1dbddcab87a8 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_info.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_info.c @@ -8,6 +8,7 @@ void serial_test_xdp_info(void) { __u32 len = sizeof(struct bpf_prog_info), duration = 0, prog_id; const char *file = "./xdp_dummy.bpf.o"; + LIBBPF_OPTS(bpf_xdp_query_opts, opts); struct bpf_prog_info info = {}; struct bpf_object *obj; int err, prog_fd; @@ -33,7 +34,7 @@ void serial_test_xdp_info(void) if (CHECK_FAIL(err)) return; - err = bpf_obj_get_info_by_fd(prog_fd, &info, &len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &len); if (CHECK(err, "get_prog_info", "errno=%d\n", errno)) goto out_close; @@ -61,6 +62,13 @@ void serial_test_xdp_info(void) if (CHECK(prog_id, "prog_id_drv", "unexpected prog_id=%u\n", prog_id)) goto out; + /* Check xdp features supported by lo device */ + opts.feature_flags = ~0; + err = bpf_xdp_query(IFINDEX_LO, XDP_FLAGS_DRV_MODE, &opts); + if (!ASSERT_OK(err, "bpf_xdp_query")) + goto out; + + ASSERT_EQ(opts.feature_flags, 0, "opts.feature_flags"); out: bpf_xdp_detach(IFINDEX_LO, 0, NULL); out_close: diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_link.c b/tools/testing/selftests/bpf/prog_tests/xdp_link.c index 3e9d5c5521f0..e7e9f3c22edf 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_link.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_link.c @@ -29,13 +29,13 @@ void serial_test_xdp_link(void) prog_fd2 = bpf_program__fd(skel2->progs.xdp_handler); memset(&prog_info, 0, sizeof(prog_info)); - err = bpf_obj_get_info_by_fd(prog_fd1, &prog_info, &prog_info_len); + err = bpf_prog_get_info_by_fd(prog_fd1, &prog_info, &prog_info_len); if (!ASSERT_OK(err, "fd_info1")) goto cleanup; id1 = prog_info.id; memset(&prog_info, 0, sizeof(prog_info)); - err = bpf_obj_get_info_by_fd(prog_fd2, &prog_info, &prog_info_len); + err = bpf_prog_get_info_by_fd(prog_fd2, &prog_info, &prog_info_len); if (!ASSERT_OK(err, "fd_info2")) goto cleanup; id2 = prog_info.id; @@ -119,7 +119,8 @@ void serial_test_xdp_link(void) goto cleanup; memset(&link_info, 0, sizeof(link_info)); - err = bpf_obj_get_info_by_fd(bpf_link__fd(link), &link_info, &link_info_len); + err = bpf_link_get_info_by_fd(bpf_link__fd(link), + &link_info, &link_info_len); if (!ASSERT_OK(err, "link_info")) goto cleanup; @@ -137,7 +138,8 @@ void serial_test_xdp_link(void) goto cleanup; memset(&link_info, 0, sizeof(link_info)); - err = bpf_obj_get_info_by_fd(bpf_link__fd(link), &link_info, &link_info_len); + err = bpf_link_get_info_by_fd(bpf_link__fd(link), + &link_info, &link_info_len); ASSERT_OK(err, "link_info"); ASSERT_EQ(link_info.prog_id, id1, "link_prog_id"); diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c new file mode 100644 index 000000000000..aa4beae99f4f --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -0,0 +1,409 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> +#include <network_helpers.h> +#include "xdp_metadata.skel.h" +#include "xdp_metadata2.skel.h" +#include "xdp_metadata.h" +#include "xsk.h" + +#include <bpf/btf.h> +#include <linux/errqueue.h> +#include <linux/if_link.h> +#include <linux/net_tstamp.h> +#include <linux/udp.h> +#include <sys/mman.h> +#include <net/if.h> +#include <poll.h> + +#define TX_NAME "veTX" +#define RX_NAME "veRX" + +#define UDP_PAYLOAD_BYTES 4 + +#define AF_XDP_SOURCE_PORT 1234 +#define AF_XDP_CONSUMER_PORT 8080 + +#define UMEM_NUM 16 +#define UMEM_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE +#define UMEM_SIZE (UMEM_FRAME_SIZE * UMEM_NUM) +#define XDP_FLAGS XDP_FLAGS_DRV_MODE +#define QUEUE_ID 0 + +#define TX_ADDR "10.0.0.1" +#define RX_ADDR "10.0.0.2" +#define PREFIX_LEN "8" +#define FAMILY AF_INET + +#define SYS(cmd) ({ \ + if (!ASSERT_OK(system(cmd), (cmd))) \ + goto out; \ +}) + +struct xsk { + void *umem_area; + struct xsk_umem *umem; + struct xsk_ring_prod fill; + struct xsk_ring_cons comp; + struct xsk_ring_prod tx; + struct xsk_ring_cons rx; + struct xsk_socket *socket; +}; + +static int open_xsk(int ifindex, struct xsk *xsk) +{ + int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE; + const struct xsk_socket_config socket_config = { + .rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .bind_flags = XDP_COPY, + }; + const struct xsk_umem_config umem_config = { + .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, + .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, + .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + }; + __u32 idx; + u64 addr; + int ret; + int i; + + xsk->umem_area = mmap(NULL, UMEM_SIZE, PROT_READ | PROT_WRITE, mmap_flags, -1, 0); + if (!ASSERT_NEQ(xsk->umem_area, MAP_FAILED, "mmap")) + return -1; + + ret = xsk_umem__create(&xsk->umem, + xsk->umem_area, UMEM_SIZE, + &xsk->fill, + &xsk->comp, + &umem_config); + if (!ASSERT_OK(ret, "xsk_umem__create")) + return ret; + + ret = xsk_socket__create(&xsk->socket, ifindex, QUEUE_ID, + xsk->umem, + &xsk->rx, + &xsk->tx, + &socket_config); + if (!ASSERT_OK(ret, "xsk_socket__create")) + return ret; + + /* First half of umem is for TX. This way address matches 1-to-1 + * to the completion queue index. + */ + + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = i * UMEM_FRAME_SIZE; + printf("%p: tx_desc[%d] -> %lx\n", xsk, i, addr); + } + + /* Second half of umem is for RX. */ + + ret = xsk_ring_prod__reserve(&xsk->fill, UMEM_NUM / 2, &idx); + if (!ASSERT_EQ(UMEM_NUM / 2, ret, "xsk_ring_prod__reserve")) + return ret; + if (!ASSERT_EQ(idx, 0, "fill idx != 0")) + return -1; + + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = (UMEM_NUM / 2 + i) * UMEM_FRAME_SIZE; + printf("%p: rx_desc[%d] -> %lx\n", xsk, i, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, i) = addr; + } + xsk_ring_prod__submit(&xsk->fill, ret); + + return 0; +} + +static void close_xsk(struct xsk *xsk) +{ + if (xsk->umem) + xsk_umem__delete(xsk->umem); + if (xsk->socket) + xsk_socket__delete(xsk->socket); + munmap(xsk->umem_area, UMEM_SIZE); +} + +static void ip_csum(struct iphdr *iph) +{ + __u32 sum = 0; + __u16 *p; + int i; + + iph->check = 0; + p = (void *)iph; + for (i = 0; i < sizeof(*iph) / sizeof(*p); i++) + sum += p[i]; + + while (sum >> 16) + sum = (sum & 0xffff) + (sum >> 16); + + iph->check = ~sum; +} + +static int generate_packet(struct xsk *xsk, __u16 dst_port) +{ + struct xdp_desc *tx_desc; + struct udphdr *udph; + struct ethhdr *eth; + struct iphdr *iph; + void *data; + __u32 idx; + int ret; + + ret = xsk_ring_prod__reserve(&xsk->tx, 1, &idx); + if (!ASSERT_EQ(ret, 1, "xsk_ring_prod__reserve")) + return -1; + + tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx); + tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE; + printf("%p: tx_desc[%u]->addr=%llx\n", xsk, idx, tx_desc->addr); + data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr); + + eth = data; + iph = (void *)(eth + 1); + udph = (void *)(iph + 1); + + memcpy(eth->h_dest, "\x00\x00\x00\x00\x00\x02", ETH_ALEN); + memcpy(eth->h_source, "\x00\x00\x00\x00\x00\x01", ETH_ALEN); + eth->h_proto = htons(ETH_P_IP); + + iph->version = 0x4; + iph->ihl = 0x5; + iph->tos = 0x9; + iph->tot_len = htons(sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES); + iph->id = 0; + iph->frag_off = 0; + iph->ttl = 0; + iph->protocol = IPPROTO_UDP; + ASSERT_EQ(inet_pton(FAMILY, TX_ADDR, &iph->saddr), 1, "inet_pton(TX_ADDR)"); + ASSERT_EQ(inet_pton(FAMILY, RX_ADDR, &iph->daddr), 1, "inet_pton(RX_ADDR)"); + ip_csum(iph); + + udph->source = htons(AF_XDP_SOURCE_PORT); + udph->dest = htons(dst_port); + udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES); + udph->check = 0; + + memset(udph + 1, 0xAA, UDP_PAYLOAD_BYTES); + + tx_desc->len = sizeof(*eth) + sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES; + xsk_ring_prod__submit(&xsk->tx, 1); + + ret = sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0); + if (!ASSERT_GE(ret, 0, "sendto")) + return ret; + + return 0; +} + +static void complete_tx(struct xsk *xsk) +{ + __u32 idx; + __u64 addr; + + if (ASSERT_EQ(xsk_ring_cons__peek(&xsk->comp, 1, &idx), 1, "xsk_ring_cons__peek")) { + addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx); + + printf("%p: complete tx idx=%u addr=%llx\n", xsk, idx, addr); + xsk_ring_cons__release(&xsk->comp, 1); + } +} + +static void refill_rx(struct xsk *xsk, __u64 addr) +{ + __u32 idx; + + if (ASSERT_EQ(xsk_ring_prod__reserve(&xsk->fill, 1, &idx), 1, "xsk_ring_prod__reserve")) { + printf("%p: complete idx=%u addr=%llx\n", xsk, idx, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; + xsk_ring_prod__submit(&xsk->fill, 1); + } +} + +static int verify_xsk_metadata(struct xsk *xsk) +{ + const struct xdp_desc *rx_desc; + struct pollfd fds = {}; + struct xdp_meta *meta; + struct ethhdr *eth; + struct iphdr *iph; + __u64 comp_addr; + void *data; + __u64 addr; + __u32 idx; + int ret; + + ret = recvfrom(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, NULL); + if (!ASSERT_EQ(ret, 0, "recvfrom")) + return -1; + + fds.fd = xsk_socket__fd(xsk->socket); + fds.events = POLLIN; + + ret = poll(&fds, 1, 1000); + if (!ASSERT_GT(ret, 0, "poll")) + return -1; + + ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx); + if (!ASSERT_EQ(ret, 1, "xsk_ring_cons__peek")) + return -2; + + rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx); + comp_addr = xsk_umem__extract_addr(rx_desc->addr); + addr = xsk_umem__add_offset_to_addr(rx_desc->addr); + printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n", + xsk, idx, rx_desc->addr, addr, comp_addr); + data = xsk_umem__get_data(xsk->umem_area, addr); + + /* Make sure we got the packet offset correctly. */ + + eth = data; + ASSERT_EQ(eth->h_proto, htons(ETH_P_IP), "eth->h_proto"); + iph = (void *)(eth + 1); + ASSERT_EQ((int)iph->version, 4, "iph->version"); + + /* custom metadata */ + + meta = data - sizeof(struct xdp_meta); + + if (!ASSERT_NEQ(meta->rx_timestamp, 0, "rx_timestamp")) + return -1; + + if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash")) + return -1; + + xsk_ring_cons__release(&xsk->rx, 1); + refill_rx(xsk, comp_addr); + + return 0; +} + +void test_xdp_metadata(void) +{ + struct xdp_metadata2 *bpf_obj2 = NULL; + struct xdp_metadata *bpf_obj = NULL; + struct bpf_program *new_prog, *prog; + struct nstoken *tok = NULL; + __u32 queue_id = QUEUE_ID; + struct bpf_map *prog_arr; + struct xsk tx_xsk = {}; + struct xsk rx_xsk = {}; + __u32 val, key = 0; + int retries = 10; + int rx_ifindex; + int tx_ifindex; + int sock_fd; + int ret; + + /* Setup new networking namespace, with a veth pair. */ + + SYS("ip netns add xdp_metadata"); + tok = open_netns("xdp_metadata"); + SYS("ip link add numtxqueues 1 numrxqueues 1 " TX_NAME + " type veth peer " RX_NAME " numtxqueues 1 numrxqueues 1"); + SYS("ip link set dev " TX_NAME " address 00:00:00:00:00:01"); + SYS("ip link set dev " RX_NAME " address 00:00:00:00:00:02"); + SYS("ip link set dev " TX_NAME " up"); + SYS("ip link set dev " RX_NAME " up"); + SYS("ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME); + SYS("ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME); + + rx_ifindex = if_nametoindex(RX_NAME); + tx_ifindex = if_nametoindex(TX_NAME); + + /* Setup separate AF_XDP for TX and RX interfaces. */ + + ret = open_xsk(tx_ifindex, &tx_xsk); + if (!ASSERT_OK(ret, "open_xsk(TX_NAME)")) + goto out; + + ret = open_xsk(rx_ifindex, &rx_xsk); + if (!ASSERT_OK(ret, "open_xsk(RX_NAME)")) + goto out; + + bpf_obj = xdp_metadata__open(); + if (!ASSERT_OK_PTR(bpf_obj, "open skeleton")) + goto out; + + prog = bpf_object__find_program_by_name(bpf_obj->obj, "rx"); + bpf_program__set_ifindex(prog, rx_ifindex); + bpf_program__set_flags(prog, BPF_F_XDP_DEV_BOUND_ONLY); + + if (!ASSERT_OK(xdp_metadata__load(bpf_obj), "load skeleton")) + goto out; + + /* Make sure we can't add dev-bound programs to prog maps. */ + prog_arr = bpf_object__find_map_by_name(bpf_obj->obj, "prog_arr"); + if (!ASSERT_OK_PTR(prog_arr, "no prog_arr map")) + goto out; + + val = bpf_program__fd(prog); + if (!ASSERT_ERR(bpf_map__update_elem(prog_arr, &key, sizeof(key), + &val, sizeof(val), BPF_ANY), + "update prog_arr")) + goto out; + + /* Attach BPF program to RX interface. */ + + ret = bpf_xdp_attach(rx_ifindex, + bpf_program__fd(bpf_obj->progs.rx), + XDP_FLAGS, NULL); + if (!ASSERT_GE(ret, 0, "bpf_xdp_attach")) + goto out; + + sock_fd = xsk_socket__fd(rx_xsk.socket); + ret = bpf_map_update_elem(bpf_map__fd(bpf_obj->maps.xsk), &queue_id, &sock_fd, 0); + if (!ASSERT_GE(ret, 0, "bpf_map_update_elem")) + goto out; + + /* Send packet destined to RX AF_XDP socket. */ + if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0, + "generate AF_XDP_CONSUMER_PORT")) + goto out; + + /* Verify AF_XDP RX packet has proper metadata. */ + if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk), 0, + "verify_xsk_metadata")) + goto out; + + complete_tx(&tx_xsk); + + /* Make sure freplace correctly picks up original bound device + * and doesn't crash. + */ + + bpf_obj2 = xdp_metadata2__open(); + if (!ASSERT_OK_PTR(bpf_obj2, "open skeleton")) + goto out; + + new_prog = bpf_object__find_program_by_name(bpf_obj2->obj, "freplace_rx"); + bpf_program__set_attach_target(new_prog, bpf_program__fd(prog), "rx"); + + if (!ASSERT_OK(xdp_metadata2__load(bpf_obj2), "load freplace skeleton")) + goto out; + + if (!ASSERT_OK(xdp_metadata2__attach(bpf_obj2), "attach freplace")) + goto out; + + /* Send packet to trigger . */ + if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0, + "generate freplace packet")) + goto out; + + while (!retries--) { + if (bpf_obj2->bss->called) + break; + usleep(10); + } + ASSERT_GT(bpf_obj2->bss->called, 0, "not called"); + +out: + close_xsk(&rx_xsk); + close_xsk(&tx_xsk); + xdp_metadata2__destroy(bpf_obj2); + xdp_metadata__destroy(bpf_obj); + if (tok) + close_netns(tok); + system("ip netns del xdp_metadata"); +} diff --git a/tools/testing/selftests/bpf/progs/bpf_hashmap_lookup.c b/tools/testing/selftests/bpf/progs/bpf_hashmap_lookup.c new file mode 100644 index 000000000000..1eb74ddca414 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_hashmap_lookup.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ + +#include "vmlinux.h" + +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +char _license[] SEC("license") = "GPL"; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); +} hash_map_bench SEC(".maps"); + +/* The number of slots to store times */ +#define NR_SLOTS 32 +#define NR_CPUS 256 +#define CPU_MASK (NR_CPUS-1) + +/* Configured by userspace */ +u64 nr_entries; +u64 nr_loops; +u32 __attribute__((__aligned__(8))) key[NR_CPUS]; + +/* Filled by us */ +u64 __attribute__((__aligned__(256))) percpu_times_index[NR_CPUS]; +u64 __attribute__((__aligned__(256))) percpu_times[NR_CPUS][NR_SLOTS]; + +static inline void patch_key(u32 i) +{ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + key[0] = i + 1; +#else + key[0] = __builtin_bswap32(i + 1); +#endif + /* the rest of key is random and is configured by userspace */ +} + +static int lookup_callback(__u32 index, u32 *unused) +{ + patch_key(index); + return bpf_map_lookup_elem(&hash_map_bench, key) ? 0 : 1; +} + +static int loop_lookup_callback(__u32 index, u32 *unused) +{ + return bpf_loop(nr_entries, lookup_callback, NULL, 0) ? 0 : 1; +} + +SEC("fentry/" SYS_PREFIX "sys_getpgid") +int benchmark(void *ctx) +{ + u32 cpu = bpf_get_smp_processor_id(); + u32 times_index; + u64 start_time; + + times_index = percpu_times_index[cpu & CPU_MASK] % NR_SLOTS; + start_time = bpf_ktime_get_ns(); + bpf_loop(nr_loops, loop_lookup_callback, NULL, 0); + percpu_times[cpu & CPU_MASK][times_index] = bpf_ktime_get_ns() - start_time; + percpu_times_index[cpu & CPU_MASK] += 1; + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h index 4a01ea9113bf..14e28f991451 100644 --- a/tools/testing/selftests/bpf/progs/bpf_misc.h +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h @@ -7,6 +7,13 @@ #define __success __attribute__((btf_decl_tag("comment:test_expect_success"))) #define __log_level(lvl) __attribute__((btf_decl_tag("comment:test_log_level="#lvl))) +/* Convenience macro for use with 'asm volatile' blocks */ +#define __naked __attribute__((naked)) +#define __clobber_all "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "memory" +#define __clobber_common "r0", "r1", "r2", "r3", "r4", "r5", "memory" +#define __imm(name) [name]"i"(name) +#define __imm_addr(name) [name]"i"(&name) + #if defined(__TARGET_ARCH_x86) #define SYSCALL_WRAPPER 1 #define SYS_PREFIX "__x64_" @@ -21,4 +28,29 @@ #define SYS_PREFIX "__se_" #endif +/* How many arguments are passed to function in register */ +#if defined(__TARGET_ARCH_x86) || defined(__x86_64__) +#define FUNC_REG_ARG_CNT 6 +#elif defined(__i386__) +#define FUNC_REG_ARG_CNT 3 +#elif defined(__TARGET_ARCH_s390) || defined(__s390x__) +#define FUNC_REG_ARG_CNT 5 +#elif defined(__TARGET_ARCH_arm) || defined(__arm__) +#define FUNC_REG_ARG_CNT 4 +#elif defined(__TARGET_ARCH_arm64) || defined(__aarch64__) +#define FUNC_REG_ARG_CNT 8 +#elif defined(__TARGET_ARCH_mips) || defined(__mips__) +#define FUNC_REG_ARG_CNT 8 +#elif defined(__TARGET_ARCH_powerpc) || defined(__powerpc__) || defined(__powerpc64__) +#define FUNC_REG_ARG_CNT 8 +#elif defined(__TARGET_ARCH_sparc) || defined(__sparc__) +#define FUNC_REG_ARG_CNT 6 +#elif defined(__TARGET_ARCH_riscv) || defined(__riscv__) +#define FUNC_REG_ARG_CNT 8 +#else +/* default to 5 for others */ +#define FUNC_REG_ARG_CNT 5 +#endif + + #endif diff --git a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c index e1e11897e99b..1a476d8ed354 100644 --- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c +++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c @@ -81,4 +81,30 @@ int BPF_KSYSCALL(prctl_enter, int option, unsigned long arg2, return 0; } +__u64 splice_fd_in; +__u64 splice_off_in; +__u64 splice_fd_out; +__u64 splice_off_out; +__u64 splice_len; +__u64 splice_flags; + +SEC("ksyscall/splice") +int BPF_KSYSCALL(splice_enter, int fd_in, loff_t *off_in, int fd_out, + loff_t *off_out, size_t len, unsigned int flags) +{ + pid_t pid = bpf_get_current_pid_tgid() >> 32; + + if (pid != filter_pid) + return 0; + + splice_fd_in = fd_in; + splice_off_in = (__u64)off_in; + splice_fd_out = fd_out; + splice_off_out = (__u64)off_out; + splice_len = len; + splice_flags = flags; + + return 0; +} + char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c index e5560a656030..e01690618e1e 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c @@ -53,7 +53,7 @@ struct bitfields_only_mixed_types { */ /* ------ END-EXPECTED-OUTPUT ------ */ struct bitfield_mixed_with_others { - long: 4; /* char is enough as a backing field */ + char: 4; /* char is enough as a backing field */ int a: 4; /* 8-bit implicit padding */ short b; /* combined with previous bitfield */ diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c index e304b6204bd9..7998f27df7dd 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c @@ -58,7 +58,81 @@ union jump_code_union { } __attribute__((packed)); }; -/*------ END-EXPECTED-OUTPUT ------ */ +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *struct nested_packed_but_aligned_struct { + * int x1; + * int x2; + *}; + * + *struct outer_implicitly_packed_struct { + * char y1; + * struct nested_packed_but_aligned_struct y2; + *} __attribute__((packed)); + * + */ +/* ------ END-EXPECTED-OUTPUT ------ */ + +struct nested_packed_but_aligned_struct { + int x1; + int x2; +} __attribute__((packed)); + +struct outer_implicitly_packed_struct { + char y1; + struct nested_packed_but_aligned_struct y2; +}; +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *struct usb_ss_ep_comp_descriptor { + * char: 8; + * char bDescriptorType; + * char bMaxBurst; + * short wBytesPerInterval; + *}; + * + *struct usb_host_endpoint { + * long: 64; + * char: 8; + * struct usb_ss_ep_comp_descriptor ss_ep_comp; + * long: 0; + *} __attribute__((packed)); + * + */ +/* ------ END-EXPECTED-OUTPUT ------ */ + +struct usb_ss_ep_comp_descriptor { + char: 8; + char bDescriptorType; + char bMaxBurst; + int: 0; + short wBytesPerInterval; +} __attribute__((packed)); + +struct usb_host_endpoint { + long: 64; + char: 8; + struct usb_ss_ep_comp_descriptor ss_ep_comp; + long: 0; +}; + +/* ----- START-EXPECTED-OUTPUT ----- */ +struct nested_packed_struct { + int a; + char b; +} __attribute__((packed)); + +struct outer_nonpacked_struct { + short a; + struct nested_packed_struct b; +}; + +struct outer_packed_struct { + short a; + struct nested_packed_struct b; +} __attribute__((packed)); + +/* ------ END-EXPECTED-OUTPUT ------ */ int f(struct { struct packed_trailing_space _1; @@ -69,6 +143,10 @@ int f(struct { union union_is_never_packed _6; union union_does_not_need_packing _7; union jump_code_union _8; + struct outer_implicitly_packed_struct _9; + struct usb_host_endpoint _10; + struct outer_nonpacked_struct _11; + struct outer_packed_struct _12; } *_) { return 0; diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c index 7cb522d22a66..79276fbe454a 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c @@ -19,7 +19,7 @@ struct padded_implicitly { /* *struct padded_explicitly { * int a; - * int: 32; + * long: 0; * int b; *}; * @@ -28,41 +28,28 @@ struct padded_implicitly { struct padded_explicitly { int a; - int: 1; /* algo will explicitly pad with full 32 bits here */ + int: 1; /* algo will emit aligning `long: 0;` here */ int b; }; /* ----- START-EXPECTED-OUTPUT ----- */ -/* - *struct padded_a_lot { - * int a; - * long: 32; - * long: 64; - * long: 64; - * int b; - *}; - * - */ -/* ------ END-EXPECTED-OUTPUT ------ */ - struct padded_a_lot { int a; - /* 32 bit of implicit padding here, which algo will make explicit */ long: 64; long: 64; int b; }; +/* ------ END-EXPECTED-OUTPUT ------ */ + /* ----- START-EXPECTED-OUTPUT ----- */ /* *struct padded_cache_line { * int a; - * long: 32; * long: 64; * long: 64; * long: 64; * int b; - * long: 32; * long: 64; * long: 64; * long: 64; @@ -85,7 +72,7 @@ struct padded_cache_line { *struct zone { * int a; * short b; - * short: 16; + * long: 0; * struct zone_padding __pad__; *}; * @@ -108,6 +95,131 @@ struct padding_wo_named_members { long: 64; }; +struct padding_weird_1 { + int a; + long: 64; + short: 16; + short b; +}; + +/* ------ END-EXPECTED-OUTPUT ------ */ + +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *struct padding_weird_2 { + * long: 56; + * char a; + * long: 56; + * char b; + * char: 8; + *}; + * + */ +/* ------ END-EXPECTED-OUTPUT ------ */ +struct padding_weird_2 { + int: 32; /* these paddings will be collapsed into `long: 56;` */ + short: 16; + char: 8; + char a; + int: 32; /* these paddings will be collapsed into `long: 56;` */ + short: 16; + char: 8; + char b; + char: 8; +}; + +/* ----- START-EXPECTED-OUTPUT ----- */ +struct exact_1byte { + char x; +}; + +struct padded_1byte { + char: 8; +}; + +struct exact_2bytes { + short x; +}; + +struct padded_2bytes { + short: 16; +}; + +struct exact_4bytes { + int x; +}; + +struct padded_4bytes { + int: 32; +}; + +struct exact_8bytes { + long x; +}; + +struct padded_8bytes { + long: 64; +}; + +struct ff_periodic_effect { + int: 32; + short magnitude; + long: 0; + short phase; + long: 0; + int: 32; + int custom_len; + short *custom_data; +}; + +struct ib_wc { + long: 64; + long: 64; + int: 32; + int byte_len; + void *qp; + union {} ex; + long: 64; + int slid; + int wc_flags; + long: 64; + char smac[6]; + long: 0; + char network_hdr_type; +}; + +struct acpi_object_method { + long: 64; + char: 8; + char type; + short reference_count; + char flags; + short: 0; + char: 8; + char sync_level; + long: 64; + void *node; + void *aml_start; + union {} dispatch; + long: 64; + int aml_length; +}; + +struct nested_unpacked { + int x; +}; + +struct nested_packed { + struct nested_unpacked a; + char c; +} __attribute__((packed)); + +struct outer_mixed_but_unpacked { + struct nested_packed b1; + short a1; + struct nested_packed b2; +}; + /* ------ END-EXPECTED-OUTPUT ------ */ int f(struct { @@ -117,6 +229,20 @@ int f(struct { struct padded_cache_line _4; struct zone _5; struct padding_wo_named_members _6; + struct padding_weird_1 _7; + struct padding_weird_2 _8; + struct exact_1byte _100; + struct padded_1byte _101; + struct exact_2bytes _102; + struct padded_2bytes _103; + struct exact_4bytes _104; + struct padded_4bytes _105; + struct exact_8bytes _106; + struct padded_8bytes _107; + struct ff_periodic_effect _200; + struct ib_wc _201; + struct acpi_object_method _202; + struct outer_mixed_but_unpacked _203; } *_) { return 0; diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c index 4ee4748133fe..ad21ee8c7e23 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c @@ -25,6 +25,39 @@ typedef enum { H = 2, } e3_t; +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *enum e_byte { + * EBYTE_1 = 0, + * EBYTE_2 = 1, + *} __attribute__((mode(byte))); + * + */ +/* ----- END-EXPECTED-OUTPUT ----- */ +enum e_byte { + EBYTE_1, + EBYTE_2, +} __attribute__((mode(byte))); + +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *enum e_word { + * EWORD_1 = 0LL, + * EWORD_2 = 1LL, + *} __attribute__((mode(word))); + * + */ +/* ----- END-EXPECTED-OUTPUT ----- */ +enum e_word { + EWORD_1, + EWORD_2, +} __attribute__((mode(word))); /* force to use 8-byte backing for this enum */ + +/* ----- START-EXPECTED-OUTPUT ----- */ +enum e_big { + EBIG_1 = 1000000000000ULL, +}; + typedef int int_t; typedef volatile const int * volatile const crazy_ptr_t; @@ -51,7 +84,7 @@ typedef void (*printf_fn_t)(const char *, ...); * typedef int (*fn_t)(int); * typedef char * const * (*fn_ptr2_t)(s_t, fn_t); * - * - `fn_complext_t`: pointer to a function returning struct and accepting + * - `fn_complex_t`: pointer to a function returning struct and accepting * union and struct. All structs and enum are anonymous and defined inline. * * - `signal_t: pointer to a function accepting a pointer to a function as an @@ -224,6 +257,9 @@ struct root_struct { enum e2 _2; e2_t _2_1; e3_t _2_2; + enum e_byte _100; + enum e_word _101; + enum e_big _102; struct struct_w_typedefs _3; anon_struct_t _7; struct struct_fwd *_8; diff --git a/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c b/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c index a1369b5ebcf8..4ad7fe24966d 100644 --- a/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c +++ b/tools/testing/selftests/bpf/progs/cgrp_kfunc_failure.c @@ -5,6 +5,7 @@ #include <bpf/bpf_tracing.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" #include "cgrp_kfunc_common.h" char _license[] SEC("license") = "GPL"; @@ -28,6 +29,7 @@ static struct __cgrps_kfunc_map_value *insert_lookup_cgrp(struct cgroup *cgrp) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(cgrp_kfunc_acquire_untrusted, struct cgroup *cgrp, const char *path) { struct cgroup *acquired; @@ -45,6 +47,7 @@ int BPF_PROG(cgrp_kfunc_acquire_untrusted, struct cgroup *cgrp, const char *path } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 pointer type STRUCT cgroup must point") int BPF_PROG(cgrp_kfunc_acquire_fp, struct cgroup *cgrp, const char *path) { struct cgroup *acquired, *stack_cgrp = (struct cgroup *)&path; @@ -57,6 +60,7 @@ int BPF_PROG(cgrp_kfunc_acquire_fp, struct cgroup *cgrp, const char *path) } SEC("kretprobe/cgroup_destroy_locked") +__failure __msg("reg type unsupported for arg#0 function") int BPF_PROG(cgrp_kfunc_acquire_unsafe_kretprobe, struct cgroup *cgrp) { struct cgroup *acquired; @@ -69,6 +73,7 @@ int BPF_PROG(cgrp_kfunc_acquire_unsafe_kretprobe, struct cgroup *cgrp) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("cgrp_kfunc_acquire_trusted_walked") int BPF_PROG(cgrp_kfunc_acquire_trusted_walked, struct cgroup *cgrp, const char *path) { struct cgroup *acquired; @@ -80,8 +85,8 @@ int BPF_PROG(cgrp_kfunc_acquire_trusted_walked, struct cgroup *cgrp, const char return 0; } - SEC("tp_btf/cgroup_mkdir") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(cgrp_kfunc_acquire_null, struct cgroup *cgrp, const char *path) { struct cgroup *acquired; @@ -96,6 +101,7 @@ int BPF_PROG(cgrp_kfunc_acquire_null, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("Unreleased reference") int BPF_PROG(cgrp_kfunc_acquire_unreleased, struct cgroup *cgrp, const char *path) { struct cgroup *acquired; @@ -108,6 +114,7 @@ int BPF_PROG(cgrp_kfunc_acquire_unreleased, struct cgroup *cgrp, const char *pat } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 expected pointer to map value") int BPF_PROG(cgrp_kfunc_get_non_kptr_param, struct cgroup *cgrp, const char *path) { struct cgroup *kptr; @@ -123,6 +130,7 @@ int BPF_PROG(cgrp_kfunc_get_non_kptr_param, struct cgroup *cgrp, const char *pat } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 expected pointer to map value") int BPF_PROG(cgrp_kfunc_get_non_kptr_acquired, struct cgroup *cgrp, const char *path) { struct cgroup *kptr, *acquired; @@ -141,6 +149,7 @@ int BPF_PROG(cgrp_kfunc_get_non_kptr_acquired, struct cgroup *cgrp, const char * } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 expected pointer to map value") int BPF_PROG(cgrp_kfunc_get_null, struct cgroup *cgrp, const char *path) { struct cgroup *kptr; @@ -156,6 +165,7 @@ int BPF_PROG(cgrp_kfunc_get_null, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("Unreleased reference") int BPF_PROG(cgrp_kfunc_xchg_unreleased, struct cgroup *cgrp, const char *path) { struct cgroup *kptr; @@ -175,6 +185,7 @@ int BPF_PROG(cgrp_kfunc_xchg_unreleased, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("Unreleased reference") int BPF_PROG(cgrp_kfunc_get_unreleased, struct cgroup *cgrp, const char *path) { struct cgroup *kptr; @@ -194,6 +205,7 @@ int BPF_PROG(cgrp_kfunc_get_unreleased, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket") int BPF_PROG(cgrp_kfunc_release_untrusted, struct cgroup *cgrp, const char *path) { struct __cgrps_kfunc_map_value *v; @@ -209,6 +221,7 @@ int BPF_PROG(cgrp_kfunc_release_untrusted, struct cgroup *cgrp, const char *path } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 pointer type STRUCT cgroup must point") int BPF_PROG(cgrp_kfunc_release_fp, struct cgroup *cgrp, const char *path) { struct cgroup *acquired = (struct cgroup *)&path; @@ -220,6 +233,7 @@ int BPF_PROG(cgrp_kfunc_release_fp, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("arg#0 is ptr_or_null_ expected ptr_ or socket") int BPF_PROG(cgrp_kfunc_release_null, struct cgroup *cgrp, const char *path) { struct __cgrps_kfunc_map_value local, *v; @@ -251,6 +265,7 @@ int BPF_PROG(cgrp_kfunc_release_null, struct cgroup *cgrp, const char *path) } SEC("tp_btf/cgroup_mkdir") +__failure __msg("release kernel function bpf_cgroup_release expects") int BPF_PROG(cgrp_kfunc_release_unacquired, struct cgroup *cgrp, const char *path) { /* Cannot release trusted cgroup pointer which was not acquired. */ diff --git a/tools/testing/selftests/bpf/progs/cpumask_common.h b/tools/testing/selftests/bpf/progs/cpumask_common.h new file mode 100644 index 000000000000..ad34f3b602be --- /dev/null +++ b/tools/testing/selftests/bpf/progs/cpumask_common.h @@ -0,0 +1,114 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#ifndef _CPUMASK_COMMON_H +#define _CPUMASK_COMMON_H + +#include "errno.h" +#include <stdbool.h> + +int err; + +struct __cpumask_map_value { + struct bpf_cpumask __kptr_ref * cpumask; +}; + +struct array_map { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, int); + __type(value, struct __cpumask_map_value); + __uint(max_entries, 1); +} __cpumask_map SEC(".maps"); + +struct bpf_cpumask *bpf_cpumask_create(void) __ksym; +void bpf_cpumask_release(struct bpf_cpumask *cpumask) __ksym; +struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask) __ksym; +struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumask) __ksym; +u32 bpf_cpumask_first(const struct cpumask *cpumask) __ksym; +u32 bpf_cpumask_first_zero(const struct cpumask *cpumask) __ksym; +void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym; +void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym; +bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask) __ksym; +bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym; +bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym; +void bpf_cpumask_setall(struct bpf_cpumask *cpumask) __ksym; +void bpf_cpumask_clear(struct bpf_cpumask *cpumask) __ksym; +bool bpf_cpumask_and(struct bpf_cpumask *cpumask, + const struct cpumask *src1, + const struct cpumask *src2) __ksym; +void bpf_cpumask_or(struct bpf_cpumask *cpumask, + const struct cpumask *src1, + const struct cpumask *src2) __ksym; +void bpf_cpumask_xor(struct bpf_cpumask *cpumask, + const struct cpumask *src1, + const struct cpumask *src2) __ksym; +bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2) __ksym; +bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2) __ksym; +bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2) __ksym; +bool bpf_cpumask_empty(const struct cpumask *cpumask) __ksym; +bool bpf_cpumask_full(const struct cpumask *cpumask) __ksym; +void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src) __ksym; +u32 bpf_cpumask_any(const struct cpumask *src) __ksym; +u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2) __ksym; + +static inline const struct cpumask *cast(struct bpf_cpumask *cpumask) +{ + return (const struct cpumask *)cpumask; +} + +static inline struct bpf_cpumask *create_cpumask(void) +{ + struct bpf_cpumask *cpumask; + + cpumask = bpf_cpumask_create(); + if (!cpumask) { + err = 1; + return NULL; + } + + if (!bpf_cpumask_empty(cast(cpumask))) { + err = 2; + bpf_cpumask_release(cpumask); + return NULL; + } + + return cpumask; +} + +static inline struct __cpumask_map_value *cpumask_map_value_lookup(void) +{ + u32 key = 0; + + return bpf_map_lookup_elem(&__cpumask_map, &key); +} + +static inline int cpumask_map_insert(struct bpf_cpumask *mask) +{ + struct __cpumask_map_value local, *v; + long status; + struct bpf_cpumask *old; + u32 key = 0; + + local.cpumask = NULL; + status = bpf_map_update_elem(&__cpumask_map, &key, &local, 0); + if (status) { + bpf_cpumask_release(mask); + return status; + } + + v = bpf_map_lookup_elem(&__cpumask_map, &key); + if (!v) { + bpf_cpumask_release(mask); + return -ENOENT; + } + + old = bpf_kptr_xchg(&v->cpumask, mask); + if (old) { + bpf_cpumask_release(old); + return -EEXIST; + } + + return 0; +} + +#endif /* _CPUMASK_COMMON_H */ diff --git a/tools/testing/selftests/bpf/progs/cpumask_failure.c b/tools/testing/selftests/bpf/progs/cpumask_failure.c new file mode 100644 index 000000000000..33e8e86dd090 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/cpumask_failure.c @@ -0,0 +1,126 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +#include "cpumask_common.h" + +char _license[] SEC("license") = "GPL"; + +/* Prototype for all of the program trace events below: + * + * TRACE_EVENT(task_newtask, + * TP_PROTO(struct task_struct *p, u64 clone_flags) + */ + +SEC("tp_btf/task_newtask") +__failure __msg("Unreleased reference") +int BPF_PROG(test_alloc_no_release, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + cpumask = create_cpumask(); + + /* cpumask is never released. */ + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("NULL pointer passed to trusted arg0") +int BPF_PROG(test_alloc_double_release, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + cpumask = create_cpumask(); + + /* cpumask is released twice. */ + bpf_cpumask_release(cpumask); + bpf_cpumask_release(cpumask); + + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("bpf_cpumask_acquire args#0 expected pointer to STRUCT bpf_cpumask") +int BPF_PROG(test_acquire_wrong_cpumask, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + /* Can't acquire a non-struct bpf_cpumask. */ + cpumask = bpf_cpumask_acquire((struct bpf_cpumask *)task->cpus_ptr); + + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("bpf_cpumask_set_cpu args#1 expected pointer to STRUCT bpf_cpumask") +int BPF_PROG(test_mutate_cpumask, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + /* Can't set the CPU of a non-struct bpf_cpumask. */ + bpf_cpumask_set_cpu(0, (struct bpf_cpumask *)task->cpus_ptr); + + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("Unreleased reference") +int BPF_PROG(test_insert_remove_no_release, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + struct __cpumask_map_value *v; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (cpumask_map_insert(cpumask)) + return 0; + + v = cpumask_map_value_lookup(); + if (!v) + return 0; + + cpumask = bpf_kptr_xchg(&v->cpumask, NULL); + + /* cpumask is never released. */ + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("Unreleased reference") +int BPF_PROG(test_kptr_get_no_release, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + struct __cpumask_map_value *v; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (cpumask_map_insert(cpumask)) + return 0; + + v = cpumask_map_value_lookup(); + if (!v) + return 0; + + cpumask = bpf_cpumask_kptr_get(&v->cpumask); + + /* cpumask is never released. */ + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("NULL pointer passed to trusted arg0") +int BPF_PROG(test_cpumask_null, struct task_struct *task, u64 clone_flags) +{ + /* NULL passed to KF_TRUSTED_ARGS kfunc. */ + bpf_cpumask_empty(NULL); + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/cpumask_success.c b/tools/testing/selftests/bpf/progs/cpumask_success.c new file mode 100644 index 000000000000..1d38bc65d4b0 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/cpumask_success.c @@ -0,0 +1,426 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> + +#include "cpumask_common.h" + +char _license[] SEC("license") = "GPL"; + +int pid, nr_cpus; + +static bool is_test_task(void) +{ + int cur_pid = bpf_get_current_pid_tgid() >> 32; + + return pid == cur_pid; +} + +static bool create_cpumask_set(struct bpf_cpumask **out1, + struct bpf_cpumask **out2, + struct bpf_cpumask **out3, + struct bpf_cpumask **out4) +{ + struct bpf_cpumask *mask1, *mask2, *mask3, *mask4; + + mask1 = create_cpumask(); + if (!mask1) + return false; + + mask2 = create_cpumask(); + if (!mask2) { + bpf_cpumask_release(mask1); + err = 3; + return false; + } + + mask3 = create_cpumask(); + if (!mask3) { + bpf_cpumask_release(mask1); + bpf_cpumask_release(mask2); + err = 4; + return false; + } + + mask4 = create_cpumask(); + if (!mask4) { + bpf_cpumask_release(mask1); + bpf_cpumask_release(mask2); + bpf_cpumask_release(mask3); + err = 5; + return false; + } + + *out1 = mask1; + *out2 = mask2; + *out3 = mask3; + *out4 = mask4; + + return true; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_alloc_free_cpumask, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + if (!is_test_task()) + return 0; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + bpf_cpumask_release(cpumask); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_set_clear_cpu, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + if (!is_test_task()) + return 0; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + bpf_cpumask_set_cpu(0, cpumask); + if (!bpf_cpumask_test_cpu(0, cast(cpumask))) { + err = 3; + goto release_exit; + } + + bpf_cpumask_clear_cpu(0, cpumask); + if (bpf_cpumask_test_cpu(0, cast(cpumask))) { + err = 4; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(cpumask); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_setall_clear_cpu, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + if (!is_test_task()) + return 0; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + bpf_cpumask_setall(cpumask); + if (!bpf_cpumask_full(cast(cpumask))) { + err = 3; + goto release_exit; + } + + bpf_cpumask_clear(cpumask); + if (!bpf_cpumask_empty(cast(cpumask))) { + err = 4; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(cpumask); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_first_firstzero_cpu, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + if (!is_test_task()) + return 0; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (bpf_cpumask_first(cast(cpumask)) < nr_cpus) { + err = 3; + goto release_exit; + } + + if (bpf_cpumask_first_zero(cast(cpumask)) != 0) { + bpf_printk("first zero: %d", bpf_cpumask_first_zero(cast(cpumask))); + err = 4; + goto release_exit; + } + + bpf_cpumask_set_cpu(0, cpumask); + if (bpf_cpumask_first(cast(cpumask)) != 0) { + err = 5; + goto release_exit; + } + + if (bpf_cpumask_first_zero(cast(cpumask)) != 1) { + err = 6; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(cpumask); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_test_and_set_clear, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + + if (!is_test_task()) + return 0; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (bpf_cpumask_test_and_set_cpu(0, cpumask)) { + err = 3; + goto release_exit; + } + + if (!bpf_cpumask_test_and_set_cpu(0, cpumask)) { + err = 4; + goto release_exit; + } + + if (!bpf_cpumask_test_and_clear_cpu(0, cpumask)) { + err = 5; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(cpumask); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_and_or_xor, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *mask1, *mask2, *dst1, *dst2; + + if (!is_test_task()) + return 0; + + if (!create_cpumask_set(&mask1, &mask2, &dst1, &dst2)) + return 0; + + bpf_cpumask_set_cpu(0, mask1); + bpf_cpumask_set_cpu(1, mask2); + + if (bpf_cpumask_and(dst1, cast(mask1), cast(mask2))) { + err = 6; + goto release_exit; + } + if (!bpf_cpumask_empty(cast(dst1))) { + err = 7; + goto release_exit; + } + + bpf_cpumask_or(dst1, cast(mask1), cast(mask2)); + if (!bpf_cpumask_test_cpu(0, cast(dst1))) { + err = 8; + goto release_exit; + } + if (!bpf_cpumask_test_cpu(1, cast(dst1))) { + err = 9; + goto release_exit; + } + + bpf_cpumask_xor(dst2, cast(mask1), cast(mask2)); + if (!bpf_cpumask_equal(cast(dst1), cast(dst2))) { + err = 10; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(mask1); + bpf_cpumask_release(mask2); + bpf_cpumask_release(dst1); + bpf_cpumask_release(dst2); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_intersects_subset, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *mask1, *mask2, *dst1, *dst2; + + if (!is_test_task()) + return 0; + + if (!create_cpumask_set(&mask1, &mask2, &dst1, &dst2)) + return 0; + + bpf_cpumask_set_cpu(0, mask1); + bpf_cpumask_set_cpu(1, mask2); + if (bpf_cpumask_intersects(cast(mask1), cast(mask2))) { + err = 6; + goto release_exit; + } + + bpf_cpumask_or(dst1, cast(mask1), cast(mask2)); + if (!bpf_cpumask_subset(cast(mask1), cast(dst1))) { + err = 7; + goto release_exit; + } + + if (!bpf_cpumask_subset(cast(mask2), cast(dst1))) { + err = 8; + goto release_exit; + } + + if (bpf_cpumask_subset(cast(dst1), cast(mask1))) { + err = 9; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(mask1); + bpf_cpumask_release(mask2); + bpf_cpumask_release(dst1); + bpf_cpumask_release(dst2); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_copy_any_anyand, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *mask1, *mask2, *dst1, *dst2; + u32 cpu; + + if (!is_test_task()) + return 0; + + if (!create_cpumask_set(&mask1, &mask2, &dst1, &dst2)) + return 0; + + bpf_cpumask_set_cpu(0, mask1); + bpf_cpumask_set_cpu(1, mask2); + bpf_cpumask_or(dst1, cast(mask1), cast(mask2)); + + cpu = bpf_cpumask_any(cast(mask1)); + if (cpu != 0) { + err = 6; + goto release_exit; + } + + cpu = bpf_cpumask_any(cast(dst2)); + if (cpu < nr_cpus) { + err = 7; + goto release_exit; + } + + bpf_cpumask_copy(dst2, cast(dst1)); + if (!bpf_cpumask_equal(cast(dst1), cast(dst2))) { + err = 8; + goto release_exit; + } + + cpu = bpf_cpumask_any(cast(dst2)); + if (cpu > 1) { + err = 9; + goto release_exit; + } + + cpu = bpf_cpumask_any_and(cast(mask1), cast(mask2)); + if (cpu < nr_cpus) { + err = 10; + goto release_exit; + } + +release_exit: + bpf_cpumask_release(mask1); + bpf_cpumask_release(mask2); + bpf_cpumask_release(dst1); + bpf_cpumask_release(dst2); + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_insert_leave, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + struct __cpumask_map_value *v; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (cpumask_map_insert(cpumask)) + err = 3; + + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_insert_remove_release, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + struct __cpumask_map_value *v; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (cpumask_map_insert(cpumask)) { + err = 3; + return 0; + } + + v = cpumask_map_value_lookup(); + if (!v) { + err = 4; + return 0; + } + + cpumask = bpf_kptr_xchg(&v->cpumask, NULL); + if (cpumask) + bpf_cpumask_release(cpumask); + else + err = 5; + + return 0; +} + +SEC("tp_btf/task_newtask") +int BPF_PROG(test_insert_kptr_get_release, struct task_struct *task, u64 clone_flags) +{ + struct bpf_cpumask *cpumask; + struct __cpumask_map_value *v; + + cpumask = create_cpumask(); + if (!cpumask) + return 0; + + if (cpumask_map_insert(cpumask)) { + err = 3; + return 0; + } + + v = cpumask_map_value_lookup(); + if (!v) { + err = 4; + return 0; + } + + cpumask = bpf_cpumask_kptr_get(&v->cpumask); + if (cpumask) + bpf_cpumask_release(cpumask); + else + err = 5; + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/dummy_st_ops_fail.c b/tools/testing/selftests/bpf/progs/dummy_st_ops_fail.c new file mode 100644 index 000000000000..0bf969a0b5ed --- /dev/null +++ b/tools/testing/selftests/bpf/progs/dummy_st_ops_fail.c @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> + +#include "bpf_misc.h" + +char _license[] SEC("license") = "GPL"; + +SEC("struct_ops.s/test_2") +__failure __msg("attach to unsupported member test_2 of struct bpf_dummy_ops") +int BPF_PROG(test_unsupported_field_sleepable, + struct bpf_dummy_ops_state *state, int a1, unsigned short a2, + char a3, unsigned long a4) +{ + /* Tries to mark an unsleepable field in struct bpf_dummy_ops as sleepable. */ + return 0; +} + +SEC(".struct_ops") +struct bpf_dummy_ops dummy_1 = { + .test_1 = NULL, + .test_2 = (void *)test_unsupported_field_sleepable, + .test_sleepable = (void *)NULL, +}; diff --git a/tools/testing/selftests/bpf/progs/dummy_st_ops.c b/tools/testing/selftests/bpf/progs/dummy_st_ops_success.c index ead87edb75e2..1efa746c25dc 100644 --- a/tools/testing/selftests/bpf/progs/dummy_st_ops.c +++ b/tools/testing/selftests/bpf/progs/dummy_st_ops_success.c @@ -1,19 +1,9 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (C) 2021. Huawei Technologies Co., Ltd */ -#include <linux/bpf.h> +#include "vmlinux.h" #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> -struct bpf_dummy_ops_state { - int val; -} __attribute__((preserve_access_index)); - -struct bpf_dummy_ops { - int (*test_1)(struct bpf_dummy_ops_state *state); - int (*test_2)(struct bpf_dummy_ops_state *state, int a1, unsigned short a2, - char a3, unsigned long a4); -}; - char _license[] SEC("license") = "GPL"; SEC("struct_ops/test_1") @@ -43,8 +33,15 @@ int BPF_PROG(test_2, struct bpf_dummy_ops_state *state, int a1, unsigned short a return 0; } +SEC("struct_ops.s/test_sleepable") +int BPF_PROG(test_sleepable, struct bpf_dummy_ops_state *state) +{ + return 0; +} + SEC(".struct_ops") struct bpf_dummy_ops dummy_1 = { .test_1 = (void *)test_1, .test_2 = (void *)test_2, + .test_sleepable = (void *)test_sleepable, }; diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c index 78debc1b3820..aa5b69354b91 100644 --- a/tools/testing/selftests/bpf/progs/dynptr_fail.c +++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c @@ -35,6 +35,13 @@ struct { __type(value, __u32); } array_map3 SEC(".maps"); +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} array_map4 SEC(".maps"); + struct sample { int pid; long value; @@ -67,7 +74,7 @@ static int get_map_val_dynptr(struct bpf_dynptr *ptr) * bpf_ringbuf_submit/discard_dynptr call */ SEC("?raw_tp") -__failure __msg("Unreleased reference id=1") +__failure __msg("Unreleased reference id=2") int ringbuf_missing_release1(void *ctx) { struct bpf_dynptr ptr; @@ -80,7 +87,7 @@ int ringbuf_missing_release1(void *ctx) } SEC("?raw_tp") -__failure __msg("Unreleased reference id=2") +__failure __msg("Unreleased reference id=4") int ringbuf_missing_release2(void *ctx) { struct bpf_dynptr ptr1, ptr2; @@ -382,7 +389,7 @@ int invalid_helper1(void *ctx) /* A dynptr can't be passed into a helper function at a non-zero offset */ SEC("?raw_tp") -__failure __msg("Expected an initialized dynptr as arg #3") +__failure __msg("cannot pass in dynptr at an offset=-8") int invalid_helper2(void *ctx) { struct bpf_dynptr ptr; @@ -420,7 +427,7 @@ int invalid_write1(void *ctx) * offset */ SEC("?raw_tp") -__failure __msg("Expected an initialized dynptr as arg #3") +__failure __msg("cannot overwrite referenced dynptr") int invalid_write2(void *ctx) { struct bpf_dynptr ptr; @@ -444,7 +451,7 @@ int invalid_write2(void *ctx) * non-const offset */ SEC("?raw_tp") -__failure __msg("Expected an initialized dynptr as arg #1") +__failure __msg("cannot overwrite referenced dynptr") int invalid_write3(void *ctx) { struct bpf_dynptr ptr; @@ -476,7 +483,7 @@ static int invalid_write4_callback(__u32 index, void *data) * be invalidated as a dynptr */ SEC("?raw_tp") -__failure __msg("arg 1 is an unacquired reference") +__failure __msg("cannot overwrite referenced dynptr") int invalid_write4(void *ctx) { struct bpf_dynptr ptr; @@ -584,7 +591,7 @@ int invalid_read4(void *ctx) /* Initializing a dynptr on an offset should fail */ SEC("?raw_tp") -__failure __msg("invalid write to stack") +__failure __msg("cannot pass in dynptr at an offset=0") int invalid_offset(void *ctx) { struct bpf_dynptr ptr; @@ -623,7 +630,7 @@ static int release_twice_callback_fn(__u32 index, void *data) } /* Test that releasing a dynptr twice, where one of the releases happens - * within a calback function, fails + * within a callback function, fails */ SEC("?raw_tp") __failure __msg("arg 1 is an unacquired reference") @@ -653,3 +660,435 @@ int dynptr_from_mem_invalid_api(void *ctx) return 0; } + +SEC("?tc") +__failure __msg("cannot overwrite referenced dynptr") __log_level(2) +int dynptr_pruning_overwrite(struct __sk_buff *ctx) +{ + asm volatile ( + "r9 = 0xeB9F; \ + r6 = %[ringbuf] ll; \ + r1 = r6; \ + r2 = 8; \ + r3 = 0; \ + r4 = r10; \ + r4 += -16; \ + call %[bpf_ringbuf_reserve_dynptr]; \ + if r0 == 0 goto pjmp1; \ + goto pjmp2; \ + pjmp1: \ + *(u64 *)(r10 - 16) = r9; \ + pjmp2: \ + r1 = r10; \ + r1 += -16; \ + r2 = 0; \ + call %[bpf_ringbuf_discard_dynptr]; " + : + : __imm(bpf_ringbuf_reserve_dynptr), + __imm(bpf_ringbuf_discard_dynptr), + __imm_addr(ringbuf) + : __clobber_all + ); + return 0; +} + +SEC("?tc") +__success __msg("12: safe") __log_level(2) +int dynptr_pruning_stacksafe(struct __sk_buff *ctx) +{ + asm volatile ( + "r9 = 0xeB9F; \ + r6 = %[ringbuf] ll; \ + r1 = r6; \ + r2 = 8; \ + r3 = 0; \ + r4 = r10; \ + r4 += -16; \ + call %[bpf_ringbuf_reserve_dynptr]; \ + if r0 == 0 goto stjmp1; \ + goto stjmp2; \ + stjmp1: \ + r9 = r9; \ + stjmp2: \ + r1 = r10; \ + r1 += -16; \ + r2 = 0; \ + call %[bpf_ringbuf_discard_dynptr]; " + : + : __imm(bpf_ringbuf_reserve_dynptr), + __imm(bpf_ringbuf_discard_dynptr), + __imm_addr(ringbuf) + : __clobber_all + ); + return 0; +} + +SEC("?tc") +__failure __msg("cannot overwrite referenced dynptr") __log_level(2) +int dynptr_pruning_type_confusion(struct __sk_buff *ctx) +{ + asm volatile ( + "r6 = %[array_map4] ll; \ + r7 = %[ringbuf] ll; \ + r1 = r6; \ + r2 = r10; \ + r2 += -8; \ + r9 = 0; \ + *(u64 *)(r2 + 0) = r9; \ + r3 = r10; \ + r3 += -24; \ + r9 = 0xeB9FeB9F; \ + *(u64 *)(r10 - 16) = r9; \ + *(u64 *)(r10 - 24) = r9; \ + r9 = 0; \ + r4 = 0; \ + r8 = r2; \ + call %[bpf_map_update_elem]; \ + r1 = r6; \ + r2 = r8; \ + call %[bpf_map_lookup_elem]; \ + if r0 != 0 goto tjmp1; \ + exit; \ + tjmp1: \ + r8 = r0; \ + r1 = r7; \ + r2 = 8; \ + r3 = 0; \ + r4 = r10; \ + r4 += -16; \ + r0 = *(u64 *)(r0 + 0); \ + call %[bpf_ringbuf_reserve_dynptr]; \ + if r0 == 0 goto tjmp2; \ + r8 = r8; \ + r8 = r8; \ + r8 = r8; \ + r8 = r8; \ + r8 = r8; \ + r8 = r8; \ + r8 = r8; \ + goto tjmp3; \ + tjmp2: \ + *(u64 *)(r10 - 8) = r9; \ + *(u64 *)(r10 - 16) = r9; \ + r1 = r8; \ + r1 += 8; \ + r2 = 0; \ + r3 = 0; \ + r4 = r10; \ + r4 += -16; \ + call %[bpf_dynptr_from_mem]; \ + tjmp3: \ + r1 = r10; \ + r1 += -16; \ + r2 = 0; \ + call %[bpf_ringbuf_discard_dynptr]; " + : + : __imm(bpf_map_update_elem), + __imm(bpf_map_lookup_elem), + __imm(bpf_ringbuf_reserve_dynptr), + __imm(bpf_dynptr_from_mem), + __imm(bpf_ringbuf_discard_dynptr), + __imm_addr(array_map4), + __imm_addr(ringbuf) + : __clobber_all + ); + return 0; +} + +SEC("?tc") +__failure __msg("dynptr has to be at a constant offset") __log_level(2) +int dynptr_var_off_overwrite(struct __sk_buff *ctx) +{ + asm volatile ( + "r9 = 16; \ + *(u32 *)(r10 - 4) = r9; \ + r8 = *(u32 *)(r10 - 4); \ + if r8 >= 0 goto vjmp1; \ + r0 = 1; \ + exit; \ + vjmp1: \ + if r8 <= 16 goto vjmp2; \ + r0 = 1; \ + exit; \ + vjmp2: \ + r8 &= 16; \ + r1 = %[ringbuf] ll; \ + r2 = 8; \ + r3 = 0; \ + r4 = r10; \ + r4 += -32; \ + r4 += r8; \ + call %[bpf_ringbuf_reserve_dynptr]; \ + r9 = 0xeB9F; \ + *(u64 *)(r10 - 16) = r9; \ + r1 = r10; \ + r1 += -32; \ + r1 += r8; \ + r2 = 0; \ + call %[bpf_ringbuf_discard_dynptr]; " + : + : __imm(bpf_ringbuf_reserve_dynptr), + __imm(bpf_ringbuf_discard_dynptr), + __imm_addr(ringbuf) + : __clobber_all + ); + return 0; +} + +SEC("?tc") +__failure __msg("cannot overwrite referenced dynptr") __log_level(2) +int dynptr_partial_slot_invalidate(struct __sk_buff *ctx) +{ + asm volatile ( + "r6 = %[ringbuf] ll; \ + r7 = %[array_map4] ll; \ + r1 = r7; \ + r2 = r10; \ + r2 += -8; \ + r9 = 0; \ + *(u64 *)(r2 + 0) = r9; \ + r3 = r2; \ + r4 = 0; \ + r8 = r2; \ + call %[bpf_map_update_elem]; \ + r1 = r7; \ + r2 = r8; \ + call %[bpf_map_lookup_elem]; \ + if r0 != 0 goto sjmp1; \ + exit; \ + sjmp1: \ + r7 = r0; \ + r1 = r6; \ + r2 = 8; \ + r3 = 0; \ + r4 = r10; \ + r4 += -24; \ + call %[bpf_ringbuf_reserve_dynptr]; \ + *(u64 *)(r10 - 16) = r9; \ + r1 = r7; \ + r2 = 8; \ + r3 = 0; \ + r4 = r10; \ + r4 += -16; \ + call %[bpf_dynptr_from_mem]; \ + r1 = r10; \ + r1 += -512; \ + r2 = 488; \ + r3 = r10; \ + r3 += -24; \ + r4 = 0; \ + r5 = 0; \ + call %[bpf_dynptr_read]; \ + r8 = 1; \ + if r0 != 0 goto sjmp2; \ + r8 = 0; \ + sjmp2: \ + r1 = r10; \ + r1 += -24; \ + r2 = 0; \ + call %[bpf_ringbuf_discard_dynptr]; " + : + : __imm(bpf_map_update_elem), + __imm(bpf_map_lookup_elem), + __imm(bpf_ringbuf_reserve_dynptr), + __imm(bpf_ringbuf_discard_dynptr), + __imm(bpf_dynptr_from_mem), + __imm(bpf_dynptr_read), + __imm_addr(ringbuf), + __imm_addr(array_map4) + : __clobber_all + ); + return 0; +} + +/* Test that it is allowed to overwrite unreferenced dynptr. */ +SEC("?raw_tp") +__success +int dynptr_overwrite_unref(void *ctx) +{ + struct bpf_dynptr ptr; + + if (get_map_val_dynptr(&ptr)) + return 0; + if (get_map_val_dynptr(&ptr)) + return 0; + if (get_map_val_dynptr(&ptr)) + return 0; + + return 0; +} + +/* Test that slices are invalidated on reinitializing a dynptr. */ +SEC("?raw_tp") +__failure __msg("invalid mem access 'scalar'") +int dynptr_invalidate_slice_reinit(void *ctx) +{ + struct bpf_dynptr ptr; + __u8 *p; + + if (get_map_val_dynptr(&ptr)) + return 0; + p = bpf_dynptr_data(&ptr, 0, 1); + if (!p) + return 0; + if (get_map_val_dynptr(&ptr)) + return 0; + /* this should fail */ + return *p; +} + +/* Invalidation of dynptr slices on destruction of dynptr should not miss + * mem_or_null pointers. + */ +SEC("?raw_tp") +__failure __msg("R1 type=scalar expected=percpu_ptr_") +int dynptr_invalidate_slice_or_null(void *ctx) +{ + struct bpf_dynptr ptr; + __u8 *p; + + if (get_map_val_dynptr(&ptr)) + return 0; + + p = bpf_dynptr_data(&ptr, 0, 1); + *(__u8 *)&ptr = 0; + /* this should fail */ + bpf_this_cpu_ptr(p); + return 0; +} + +/* Destruction of dynptr should also any slices obtained from it */ +SEC("?raw_tp") +__failure __msg("R7 invalid mem access 'scalar'") +int dynptr_invalidate_slice_failure(void *ctx) +{ + struct bpf_dynptr ptr1; + struct bpf_dynptr ptr2; + __u8 *p1, *p2; + + if (get_map_val_dynptr(&ptr1)) + return 0; + if (get_map_val_dynptr(&ptr2)) + return 0; + + p1 = bpf_dynptr_data(&ptr1, 0, 1); + if (!p1) + return 0; + p2 = bpf_dynptr_data(&ptr2, 0, 1); + if (!p2) + return 0; + + *(__u8 *)&ptr1 = 0; + /* this should fail */ + return *p1; +} + +/* Invalidation of slices should be scoped and should not prevent dereferencing + * slices of another dynptr after destroying unrelated dynptr + */ +SEC("?raw_tp") +__success +int dynptr_invalidate_slice_success(void *ctx) +{ + struct bpf_dynptr ptr1; + struct bpf_dynptr ptr2; + __u8 *p1, *p2; + + if (get_map_val_dynptr(&ptr1)) + return 1; + if (get_map_val_dynptr(&ptr2)) + return 1; + + p1 = bpf_dynptr_data(&ptr1, 0, 1); + if (!p1) + return 1; + p2 = bpf_dynptr_data(&ptr2, 0, 1); + if (!p2) + return 1; + + *(__u8 *)&ptr1 = 0; + return *p2; +} + +/* Overwriting referenced dynptr should be rejected */ +SEC("?raw_tp") +__failure __msg("cannot overwrite referenced dynptr") +int dynptr_overwrite_ref(void *ctx) +{ + struct bpf_dynptr ptr; + + bpf_ringbuf_reserve_dynptr(&ringbuf, 64, 0, &ptr); + /* this should fail */ + if (get_map_val_dynptr(&ptr)) + bpf_ringbuf_discard_dynptr(&ptr, 0); + return 0; +} + +/* Reject writes to dynptr slot from bpf_dynptr_read */ +SEC("?raw_tp") +__failure __msg("potential write to dynptr at off=-16") +int dynptr_read_into_slot(void *ctx) +{ + union { + struct { + char _pad[48]; + struct bpf_dynptr ptr; + }; + char buf[64]; + } data; + + bpf_ringbuf_reserve_dynptr(&ringbuf, 64, 0, &data.ptr); + /* this should fail */ + bpf_dynptr_read(data.buf, sizeof(data.buf), &data.ptr, 0, 0); + + return 0; +} + +/* Reject writes to dynptr slot for uninit arg */ +SEC("?raw_tp") +__failure __msg("potential write to dynptr at off=-16") +int uninit_write_into_slot(void *ctx) +{ + struct { + char buf[64]; + struct bpf_dynptr ptr; + } data; + + bpf_ringbuf_reserve_dynptr(&ringbuf, 80, 0, &data.ptr); + /* this should fail */ + bpf_get_current_comm(data.buf, 80); + + return 0; +} + +static int callback(__u32 index, void *data) +{ + *(__u32 *)data = 123; + + return 0; +} + +/* If the dynptr is written into in a callback function, its data + * slices should be invalidated as well. + */ +SEC("?raw_tp") +__failure __msg("invalid mem access 'scalar'") +int invalid_data_slices(void *ctx) +{ + struct bpf_dynptr ptr; + __u32 *slice; + + if (get_map_val_dynptr(&ptr)) + return 0; + + slice = bpf_dynptr_data(&ptr, 0, sizeof(__u32)); + if (!slice) + return 0; + + bpf_loop(10, callback, &ptr, 0); + + /* this should fail */ + *slice = 1; + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/fib_lookup.c b/tools/testing/selftests/bpf/progs/fib_lookup.c new file mode 100644 index 000000000000..c4514dd58c62 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/fib_lookup.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <linux/types.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_tracing_net.h" + +struct bpf_fib_lookup fib_params = {}; +int fib_lookup_ret = 0; +int lookup_flags = 0; + +SEC("tc") +int fib_lookup(struct __sk_buff *skb) +{ + fib_lookup_ret = bpf_fib_lookup(skb, &fib_params, sizeof(fib_params), + lookup_flags); + + return TC_ACT_SHOT; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/htab_reuse.c b/tools/testing/selftests/bpf/progs/htab_reuse.c new file mode 100644 index 000000000000..7f7368cb3095 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/htab_reuse.c @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2023. Huawei Technologies Co., Ltd */ +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +char _license[] SEC("license") = "GPL"; + +struct htab_val { + struct bpf_spin_lock lock; + unsigned int data; +}; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 64); + __type(key, unsigned int); + __type(value, struct htab_val); + __uint(map_flags, BPF_F_NO_PREALLOC); +} htab SEC(".maps"); diff --git a/tools/testing/selftests/bpf/progs/jit_probe_mem.c b/tools/testing/selftests/bpf/progs/jit_probe_mem.c new file mode 100644 index 000000000000..2d2e61470794 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/jit_probe_mem.c @@ -0,0 +1,61 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> + +static struct prog_test_ref_kfunc __kptr_ref *v; +long total_sum = -1; + +extern struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(unsigned long *sp) __ksym; +extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym; + +SEC("tc") +int test_jit_probe_mem(struct __sk_buff *ctx) +{ + struct prog_test_ref_kfunc *p; + unsigned long zero = 0, sum; + + p = bpf_kfunc_call_test_acquire(&zero); + if (!p) + return 1; + + p = bpf_kptr_xchg(&v, p); + if (p) + goto release_out; + + /* Direct map value access of kptr, should be PTR_UNTRUSTED */ + p = v; + if (!p) + return 1; + + asm volatile ( + "r9 = %[p];" + "%[sum] = 0;" + + /* r8 = p->a */ + "r8 = *(u32 *)(r9 + 0);" + "%[sum] += r8;" + + /* r8 = p->b */ + "r8 = *(u32 *)(r9 + 4);" + "%[sum] += r8;" + + "r9 += 8;" + /* r9 = p->a */ + "r9 = *(u32 *)(r9 - 8);" + "%[sum] += r9;" + + : [sum] "=r"(sum) + : [p] "r"(p) + : "r8", "r9" + ); + + total_sum = sum; + return 0; +release_out: + bpf_kfunc_call_test_release(p); + return 1; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/kfunc_call_test.c b/tools/testing/selftests/bpf/progs/kfunc_call_test.c index f636e50be259..7daa8f5720b9 100644 --- a/tools/testing/selftests/bpf/progs/kfunc_call_test.c +++ b/tools/testing/selftests/bpf/progs/kfunc_call_test.c @@ -3,6 +3,7 @@ #include <vmlinux.h> #include <bpf/bpf_helpers.h> +extern long bpf_kfunc_call_test4(signed char a, short b, int c, long d) __ksym; extern int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym; extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b, __u32 c, __u64 d) __ksym; @@ -16,6 +17,24 @@ extern void bpf_kfunc_call_test_mem_len_pass1(void *mem, int len) __ksym; extern void bpf_kfunc_call_test_mem_len_fail2(__u64 *mem, int len) __ksym; extern int *bpf_kfunc_call_test_get_rdwr_mem(struct prog_test_ref_kfunc *p, const int rdwr_buf_size) __ksym; extern int *bpf_kfunc_call_test_get_rdonly_mem(struct prog_test_ref_kfunc *p, const int rdonly_buf_size) __ksym; +extern u32 bpf_kfunc_call_test_static_unused_arg(u32 arg, u32 unused) __ksym; + +SEC("tc") +int kfunc_call_test4(struct __sk_buff *skb) +{ + struct bpf_sock *sk = skb->sk; + long tmp; + + if (!sk) + return -1; + + sk = bpf_sk_fullsock(sk); + if (!sk) + return -1; + + tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000); + return (tmp >> 32) + tmp; +} SEC("tc") int kfunc_call_test2(struct __sk_buff *skb) @@ -163,4 +182,14 @@ int kfunc_call_test_get_mem(struct __sk_buff *skb) return ret; } +SEC("tc") +int kfunc_call_test_static_unused_arg(struct __sk_buff *skb) +{ + + u32 expected = 5, actual; + + actual = bpf_kfunc_call_test_static_unused_arg(expected, 0xdeadbeef); + return actual != expected ? -1 : 0; +} + char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/linked_list.c b/tools/testing/selftests/bpf/progs/linked_list.c index 4ad88da5cda2..4fa4a9b01bde 100644 --- a/tools/testing/selftests/bpf/progs/linked_list.c +++ b/tools/testing/selftests/bpf/progs/linked_list.c @@ -260,7 +260,7 @@ int test_list_push_pop_multiple(struct bpf_spin_lock *lock, struct bpf_list_head { int ret; - ret = list_push_pop_multiple(lock ,head, false); + ret = list_push_pop_multiple(lock, head, false); if (ret) return ret; return list_push_pop_multiple(lock, head, true); diff --git a/tools/testing/selftests/bpf/progs/linked_list_fail.c b/tools/testing/selftests/bpf/progs/linked_list_fail.c index 1d9017240e19..69cdc07cba13 100644 --- a/tools/testing/selftests/bpf/progs/linked_list_fail.c +++ b/tools/testing/selftests/bpf/progs/linked_list_fail.c @@ -54,28 +54,44 @@ return 0; \ } -CHECK(kptr, push_front, &f->head); -CHECK(kptr, push_back, &f->head); CHECK(kptr, pop_front, &f->head); CHECK(kptr, pop_back, &f->head); -CHECK(global, push_front, &ghead); -CHECK(global, push_back, &ghead); CHECK(global, pop_front, &ghead); CHECK(global, pop_back, &ghead); -CHECK(map, push_front, &v->head); -CHECK(map, push_back, &v->head); CHECK(map, pop_front, &v->head); CHECK(map, pop_back, &v->head); -CHECK(inner_map, push_front, &iv->head); -CHECK(inner_map, push_back, &iv->head); CHECK(inner_map, pop_front, &iv->head); CHECK(inner_map, pop_back, &iv->head); #undef CHECK +#define CHECK(test, op, hexpr, nexpr) \ + SEC("?tc") \ + int test##_missing_lock_##op(void *ctx) \ + { \ + INIT; \ + void (*p)(void *, void *) = (void *)&bpf_list_##op; \ + p(hexpr, nexpr); \ + return 0; \ + } + +CHECK(kptr, push_front, &f->head, b); +CHECK(kptr, push_back, &f->head, b); + +CHECK(global, push_front, &ghead, f); +CHECK(global, push_back, &ghead, f); + +CHECK(map, push_front, &v->head, f); +CHECK(map, push_back, &v->head, f); + +CHECK(inner_map, push_front, &iv->head, f); +CHECK(inner_map, push_back, &iv->head, f); + +#undef CHECK + #define CHECK(test, op, lexpr, hexpr) \ SEC("?tc") \ int test##_incorrect_lock_##op(void *ctx) \ @@ -108,13 +124,49 @@ CHECK(inner_map, pop_back, &iv->head); CHECK(inner_map_global, op, &iv->lock, &ghead); \ CHECK(inner_map_map, op, &iv->lock, &v->head); -CHECK_OP(push_front); -CHECK_OP(push_back); CHECK_OP(pop_front); CHECK_OP(pop_back); #undef CHECK #undef CHECK_OP + +#define CHECK(test, op, lexpr, hexpr, nexpr) \ + SEC("?tc") \ + int test##_incorrect_lock_##op(void *ctx) \ + { \ + INIT; \ + void (*p)(void *, void*) = (void *)&bpf_list_##op; \ + bpf_spin_lock(lexpr); \ + p(hexpr, nexpr); \ + return 0; \ + } + +#define CHECK_OP(op) \ + CHECK(kptr_kptr, op, &f1->lock, &f2->head, b); \ + CHECK(kptr_global, op, &f1->lock, &ghead, f); \ + CHECK(kptr_map, op, &f1->lock, &v->head, f); \ + CHECK(kptr_inner_map, op, &f1->lock, &iv->head, f); \ + \ + CHECK(global_global, op, &glock2, &ghead, f); \ + CHECK(global_kptr, op, &glock, &f1->head, b); \ + CHECK(global_map, op, &glock, &v->head, f); \ + CHECK(global_inner_map, op, &glock, &iv->head, f); \ + \ + CHECK(map_map, op, &v->lock, &v2->head, f); \ + CHECK(map_kptr, op, &v->lock, &f2->head, b); \ + CHECK(map_global, op, &v->lock, &ghead, f); \ + CHECK(map_inner_map, op, &v->lock, &iv->head, f); \ + \ + CHECK(inner_map_inner_map, op, &iv->lock, &iv2->head, f); \ + CHECK(inner_map_kptr, op, &iv->lock, &f2->head, b); \ + CHECK(inner_map_global, op, &iv->lock, &ghead, f); \ + CHECK(inner_map_map, op, &iv->lock, &v->head, f); + +CHECK_OP(push_front); +CHECK_OP(push_back); + +#undef CHECK +#undef CHECK_OP #undef INIT SEC("?kprobe/xyz") @@ -304,34 +356,6 @@ int direct_write_node(void *ctx) } static __always_inline -int write_after_op(void (*push_op)(void *head, void *node)) -{ - struct foo *f; - - f = bpf_obj_new(typeof(*f)); - if (!f) - return 0; - bpf_spin_lock(&glock); - push_op(&ghead, &f->node); - f->data = 42; - bpf_spin_unlock(&glock); - - return 0; -} - -SEC("?tc") -int write_after_push_front(void *ctx) -{ - return write_after_op((void *)bpf_list_push_front); -} - -SEC("?tc") -int write_after_push_back(void *ctx) -{ - return write_after_op((void *)bpf_list_push_back); -} - -static __always_inline int use_after_unlock(void (*op)(void *head, void *node)) { struct foo *f; diff --git a/tools/testing/selftests/bpf/progs/lsm.c b/tools/testing/selftests/bpf/progs/lsm.c index d8d8af623bc2..dc93887ed34c 100644 --- a/tools/testing/selftests/bpf/progs/lsm.c +++ b/tools/testing/selftests/bpf/progs/lsm.c @@ -6,9 +6,10 @@ #include "bpf_misc.h" #include "vmlinux.h" +#include <bpf/bpf_core_read.h> #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> -#include <errno.h> +#include <errno.h> struct { __uint(type, BPF_MAP_TYPE_ARRAY); @@ -164,8 +165,8 @@ int copy_test = 0; SEC("fentry.s/" SYS_PREFIX "sys_setdomainname") int BPF_PROG(test_sys_setdomainname, struct pt_regs *regs) { - void *ptr = (void *)PT_REGS_PARM1(regs); - int len = PT_REGS_PARM2(regs); + void *ptr = (void *)PT_REGS_PARM1_SYSCALL(regs); + int len = PT_REGS_PARM2_SYSCALL(regs); int buf = 0; long ret; diff --git a/tools/testing/selftests/bpf/progs/map_kptr.c b/tools/testing/selftests/bpf/progs/map_kptr.c index eb8217803493..228ec45365a8 100644 --- a/tools/testing/selftests/bpf/progs/map_kptr.c +++ b/tools/testing/selftests/bpf/progs/map_kptr.c @@ -62,21 +62,23 @@ extern struct prog_test_ref_kfunc * bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **p, int a, int b) __ksym; extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym; +#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *) &(x)) = (val)) + static void test_kptr_unref(struct map_value *v) { struct prog_test_ref_kfunc *p; p = v->unref_ptr; /* store untrusted_ptr_or_null_ */ - v->unref_ptr = p; + WRITE_ONCE(v->unref_ptr, p); if (!p) return; if (p->a + p->b > 100) return; /* store untrusted_ptr_ */ - v->unref_ptr = p; + WRITE_ONCE(v->unref_ptr, p); /* store NULL */ - v->unref_ptr = NULL; + WRITE_ONCE(v->unref_ptr, NULL); } static void test_kptr_ref(struct map_value *v) @@ -85,7 +87,7 @@ static void test_kptr_ref(struct map_value *v) p = v->ref_ptr; /* store ptr_or_null_ */ - v->unref_ptr = p; + WRITE_ONCE(v->unref_ptr, p); if (!p) return; if (p->a + p->b > 100) @@ -99,7 +101,7 @@ static void test_kptr_ref(struct map_value *v) return; } /* store ptr_ */ - v->unref_ptr = p; + WRITE_ONCE(v->unref_ptr, p); bpf_kfunc_call_test_release(p); p = bpf_kfunc_call_test_acquire(&(unsigned long){0}); diff --git a/tools/testing/selftests/bpf/progs/nested_trust_common.h b/tools/testing/selftests/bpf/progs/nested_trust_common.h new file mode 100644 index 000000000000..83d33931136e --- /dev/null +++ b/tools/testing/selftests/bpf/progs/nested_trust_common.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#ifndef _NESTED_TRUST_COMMON_H +#define _NESTED_TRUST_COMMON_H + +#include <stdbool.h> + +bool bpf_cpumask_test_cpu(unsigned int cpu, const struct cpumask *cpumask) __ksym; +bool bpf_cpumask_first_zero(const struct cpumask *cpumask) __ksym; + +#endif /* _NESTED_TRUST_COMMON_H */ diff --git a/tools/testing/selftests/bpf/progs/nested_trust_failure.c b/tools/testing/selftests/bpf/progs/nested_trust_failure.c new file mode 100644 index 000000000000..14aff7676436 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/nested_trust_failure.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +#include "nested_trust_common.h" + +char _license[] SEC("license") = "GPL"; + +/* Prototype for all of the program trace events below: + * + * TRACE_EVENT(task_newtask, + * TP_PROTO(struct task_struct *p, u64 clone_flags) + */ + +SEC("tp_btf/task_newtask") +__failure __msg("R2 must be referenced or trusted") +int BPF_PROG(test_invalid_nested_user_cpus, struct task_struct *task, u64 clone_flags) +{ + bpf_cpumask_test_cpu(0, task->user_cpus_ptr); + return 0; +} + +SEC("tp_btf/task_newtask") +__failure __msg("R1 must have zero offset when passed to release func or trusted arg to kfunc") +int BPF_PROG(test_invalid_nested_offset, struct task_struct *task, u64 clone_flags) +{ + bpf_cpumask_first_zero(&task->cpus_mask); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/nested_trust_success.c b/tools/testing/selftests/bpf/progs/nested_trust_success.c new file mode 100644 index 000000000000..886ade4aa99d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/nested_trust_success.c @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +#include "nested_trust_common.h" + +char _license[] SEC("license") = "GPL"; + +SEC("tp_btf/task_newtask") +__success +int BPF_PROG(test_read_cpumask, struct task_struct *task, u64 clone_flags) +{ + bpf_cpumask_test_cpu(0, task->cpus_ptr); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/profiler.inc.h b/tools/testing/selftests/bpf/progs/profiler.inc.h index 7bd76b9e0f98..875513866032 100644 --- a/tools/testing/selftests/bpf/progs/profiler.inc.h +++ b/tools/testing/selftests/bpf/progs/profiler.inc.h @@ -156,10 +156,10 @@ probe_read_lim(void* dst, void* src, unsigned long len, unsigned long max) { len = len < max ? len : max; if (len > 1) { - if (bpf_probe_read(dst, len, src)) + if (bpf_probe_read_kernel(dst, len, src)) return 0; } else if (len == 1) { - if (bpf_probe_read(dst, 1, src)) + if (bpf_probe_read_kernel(dst, 1, src)) return 0; } return len; @@ -216,7 +216,8 @@ static INLINE void* read_full_cgroup_path(struct kernfs_node* cgroup_node, #endif for (int i = 0; i < MAX_CGROUPS_PATH_DEPTH; i++) { filepart_length = - bpf_probe_read_str(payload, MAX_PATH, BPF_CORE_READ(cgroup_node, name)); + bpf_probe_read_kernel_str(payload, MAX_PATH, + BPF_CORE_READ(cgroup_node, name)); if (!cgroup_node) return payload; if (cgroup_node == cgroup_root_node) @@ -303,7 +304,8 @@ static INLINE void* populate_cgroup_info(struct cgroup_data_t* cgroup_data, cgroup_data->cgroup_full_length = 0; size_t cgroup_root_length = - bpf_probe_read_str(payload, MAX_PATH, BPF_CORE_READ(root_kernfs, name)); + bpf_probe_read_kernel_str(payload, MAX_PATH, + BPF_CORE_READ(root_kernfs, name)); barrier_var(cgroup_root_length); if (cgroup_root_length <= MAX_PATH) { barrier_var(cgroup_root_length); @@ -312,7 +314,8 @@ static INLINE void* populate_cgroup_info(struct cgroup_data_t* cgroup_data, } size_t cgroup_proc_length = - bpf_probe_read_str(payload, MAX_PATH, BPF_CORE_READ(proc_kernfs, name)); + bpf_probe_read_kernel_str(payload, MAX_PATH, + BPF_CORE_READ(proc_kernfs, name)); barrier_var(cgroup_proc_length); if (cgroup_proc_length <= MAX_PATH) { barrier_var(cgroup_proc_length); @@ -395,7 +398,8 @@ static INLINE int trace_var_sys_kill(void* ctx, int tpid, int sig) arr_struct = bpf_map_lookup_elem(&data_heap, &zero); if (arr_struct == NULL) return 0; - bpf_probe_read(&arr_struct->array[0], sizeof(arr_struct->array[0]), kill_data); + bpf_probe_read_kernel(&arr_struct->array[0], + sizeof(arr_struct->array[0]), kill_data); } else { int index = get_var_spid_index(arr_struct, spid); @@ -409,8 +413,9 @@ static INLINE int trace_var_sys_kill(void* ctx, int tpid, int sig) #endif for (int i = 0; i < ARRAY_SIZE(arr_struct->array); i++) if (arr_struct->array[i].meta.pid == 0) { - bpf_probe_read(&arr_struct->array[i], - sizeof(arr_struct->array[i]), kill_data); + bpf_probe_read_kernel(&arr_struct->array[i], + sizeof(arr_struct->array[i]), + kill_data); bpf_map_update_elem(&var_tpid_to_data, &tpid, arr_struct, 0); @@ -427,17 +432,17 @@ static INLINE int trace_var_sys_kill(void* ctx, int tpid, int sig) if (delta_sec < STALE_INFO) { kill_data->kill_count++; kill_data->last_kill_time = bpf_ktime_get_ns(); - bpf_probe_read(&arr_struct->array[index], - sizeof(arr_struct->array[index]), - kill_data); + bpf_probe_read_kernel(&arr_struct->array[index], + sizeof(arr_struct->array[index]), + kill_data); } else { struct var_kill_data_t* kill_data = get_var_kill_data(ctx, spid, tpid, sig); if (kill_data == NULL) return 0; - bpf_probe_read(&arr_struct->array[index], - sizeof(arr_struct->array[index]), - kill_data); + bpf_probe_read_kernel(&arr_struct->array[index], + sizeof(arr_struct->array[index]), + kill_data); } } bpf_map_update_elem(&var_tpid_to_data, &tpid, arr_struct, 0); @@ -487,8 +492,9 @@ read_absolute_file_path_from_dentry(struct dentry* filp_dentry, void* payload) #pragma unroll #endif for (int i = 0; i < MAX_PATH_DEPTH; i++) { - filepart_length = bpf_probe_read_str(payload, MAX_PATH, - BPF_CORE_READ(filp_dentry, d_name.name)); + filepart_length = + bpf_probe_read_kernel_str(payload, MAX_PATH, + BPF_CORE_READ(filp_dentry, d_name.name)); barrier_var(filepart_length); if (filepart_length > MAX_PATH) break; @@ -572,7 +578,8 @@ ssize_t BPF_KPROBE(kprobe__proc_sys_write, sysctl_data->sysctl_val_length = 0; sysctl_data->sysctl_path_length = 0; - size_t sysctl_val_length = bpf_probe_read_str(payload, CTL_MAXNAME, buf); + size_t sysctl_val_length = bpf_probe_read_kernel_str(payload, + CTL_MAXNAME, buf); barrier_var(sysctl_val_length); if (sysctl_val_length <= CTL_MAXNAME) { barrier_var(sysctl_val_length); @@ -580,8 +587,10 @@ ssize_t BPF_KPROBE(kprobe__proc_sys_write, payload += sysctl_val_length; } - size_t sysctl_path_length = bpf_probe_read_str(payload, MAX_PATH, - BPF_CORE_READ(filp, f_path.dentry, d_name.name)); + size_t sysctl_path_length = + bpf_probe_read_kernel_str(payload, MAX_PATH, + BPF_CORE_READ(filp, f_path.dentry, + d_name.name)); barrier_var(sysctl_path_length); if (sysctl_path_length <= MAX_PATH) { barrier_var(sysctl_path_length); @@ -638,7 +647,8 @@ int raw_tracepoint__sched_process_exit(void* ctx) struct var_kill_data_t* past_kill_data = &arr_struct->array[i]; if (past_kill_data != NULL && past_kill_data->kill_target_pid == tpid) { - bpf_probe_read(kill_data, sizeof(*past_kill_data), past_kill_data); + bpf_probe_read_kernel(kill_data, sizeof(*past_kill_data), + past_kill_data); void* payload = kill_data->payload; size_t offset = kill_data->payload_length; if (offset >= MAX_METADATA_PAYLOAD_LEN + MAX_CGROUP_PAYLOAD_LEN) @@ -656,8 +666,10 @@ int raw_tracepoint__sched_process_exit(void* ctx) payload += comm_length; } - size_t cgroup_proc_length = bpf_probe_read_str(payload, KILL_TARGET_LEN, - BPF_CORE_READ(proc_kernfs, name)); + size_t cgroup_proc_length = + bpf_probe_read_kernel_str(payload, + KILL_TARGET_LEN, + BPF_CORE_READ(proc_kernfs, name)); barrier_var(cgroup_proc_length); if (cgroup_proc_length <= KILL_TARGET_LEN) { barrier_var(cgroup_proc_length); @@ -718,7 +730,8 @@ int raw_tracepoint__sched_process_exec(struct bpf_raw_tracepoint_args* ctx) proc_exec_data->parent_start_time = BPF_CORE_READ(parent_task, start_time); const char* filename = BPF_CORE_READ(bprm, filename); - size_t bin_path_length = bpf_probe_read_str(payload, MAX_FILENAME_LEN, filename); + size_t bin_path_length = + bpf_probe_read_kernel_str(payload, MAX_FILENAME_LEN, filename); barrier_var(bin_path_length); if (bin_path_length <= MAX_FILENAME_LEN) { barrier_var(bin_path_length); @@ -922,7 +935,8 @@ int BPF_KPROBE(kprobe__vfs_symlink, struct inode* dir, struct dentry* dentry, filemod_data->payload); payload = populate_cgroup_info(&filemod_data->cgroup_data, task, payload); - size_t len = bpf_probe_read_str(payload, MAX_FILEPATH_LENGTH, oldname); + size_t len = bpf_probe_read_kernel_str(payload, MAX_FILEPATH_LENGTH, + oldname); barrier_var(len); if (len <= MAX_FILEPATH_LENGTH) { barrier_var(len); diff --git a/tools/testing/selftests/bpf/progs/rbtree.c b/tools/testing/selftests/bpf/progs/rbtree.c new file mode 100644 index 000000000000..e5db1a4287e5 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/rbtree.c @@ -0,0 +1,176 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> +#include "bpf_experimental.h" + +struct node_data { + long key; + long data; + struct bpf_rb_node node; +}; + +long less_callback_ran = -1; +long removed_key = -1; +long first_data[2] = {-1, -1}; + +#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8))) +private(A) struct bpf_spin_lock glock; +private(A) struct bpf_rb_root groot __contains(node_data, node); + +static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data *node_a; + struct node_data *node_b; + + node_a = container_of(a, struct node_data, node); + node_b = container_of(b, struct node_data, node); + less_callback_ran = 1; + + return node_a->key < node_b->key; +} + +static long __add_three(struct bpf_rb_root *root, struct bpf_spin_lock *lock) +{ + struct node_data *n, *m; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + n->key = 5; + + m = bpf_obj_new(typeof(*m)); + if (!m) { + bpf_obj_drop(n); + return 2; + } + m->key = 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_rbtree_add(&groot, &m->node, less); + bpf_spin_unlock(&glock); + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 3; + n->key = 3; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_spin_unlock(&glock); + return 0; +} + +SEC("tc") +long rbtree_add_nodes(void *ctx) +{ + return __add_three(&groot, &glock); +} + +SEC("tc") +long rbtree_add_and_remove(void *ctx) +{ + struct bpf_rb_node *res = NULL; + struct node_data *n, *m; + + n = bpf_obj_new(typeof(*n)); + if (!n) + goto err_out; + n->key = 5; + + m = bpf_obj_new(typeof(*m)); + if (!m) + goto err_out; + m->key = 3; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_rbtree_add(&groot, &m->node, less); + res = bpf_rbtree_remove(&groot, &n->node); + bpf_spin_unlock(&glock); + + n = container_of(res, struct node_data, node); + removed_key = n->key; + + bpf_obj_drop(n); + + return 0; +err_out: + if (n) + bpf_obj_drop(n); + if (m) + bpf_obj_drop(m); + return 1; +} + +SEC("tc") +long rbtree_first_and_remove(void *ctx) +{ + struct bpf_rb_node *res = NULL; + struct node_data *n, *m, *o; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + n->key = 3; + n->data = 4; + + m = bpf_obj_new(typeof(*m)); + if (!m) + goto err_out; + m->key = 5; + m->data = 6; + + o = bpf_obj_new(typeof(*o)); + if (!o) + goto err_out; + o->key = 1; + o->data = 2; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_rbtree_add(&groot, &m->node, less); + bpf_rbtree_add(&groot, &o->node, less); + + res = bpf_rbtree_first(&groot); + if (!res) { + bpf_spin_unlock(&glock); + return 2; + } + + o = container_of(res, struct node_data, node); + first_data[0] = o->data; + + res = bpf_rbtree_remove(&groot, &o->node); + bpf_spin_unlock(&glock); + + o = container_of(res, struct node_data, node); + removed_key = o->key; + + bpf_obj_drop(o); + + bpf_spin_lock(&glock); + res = bpf_rbtree_first(&groot); + if (!res) { + bpf_spin_unlock(&glock); + return 3; + } + + o = container_of(res, struct node_data, node); + first_data[1] = o->data; + bpf_spin_unlock(&glock); + + return 0; +err_out: + if (n) + bpf_obj_drop(n); + if (m) + bpf_obj_drop(m); + return 1; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c new file mode 100644 index 000000000000..60079b202c07 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> +#include "bpf_experimental.h" + +struct node_data { + int key; + int data; + struct bpf_rb_node node; +}; + +struct node_data2 { + int key; + struct bpf_rb_node node; + int data; +}; + +static bool less2(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data2 *node_a; + struct node_data2 *node_b; + + node_a = container_of(a, struct node_data2, node); + node_b = container_of(b, struct node_data2, node); + + return node_a->key < node_b->key; +} + +#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8))) +private(A) struct bpf_spin_lock glock; +private(A) struct bpf_rb_root groot __contains(node_data, node); + +SEC("tc") +long rbtree_api_add__add_wrong_type(void *ctx) +{ + struct node_data2 *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less2); + bpf_spin_unlock(&glock); + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c new file mode 100644 index 000000000000..340f97da1084 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ + +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> +#include "bpf_experimental.h" + +/* BTF load should fail as bpf_rb_root __contains this type and points to + * 'node', but 'node' is not a bpf_rb_node + */ +struct node_data { + int key; + int data; + struct bpf_list_node node; +}; + +static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data *node_a; + struct node_data *node_b; + + node_a = container_of(a, struct node_data, node); + node_b = container_of(b, struct node_data, node); + + return node_a->key < node_b->key; +} + +#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8))) +private(A) struct bpf_spin_lock glock; +private(A) struct bpf_rb_root groot __contains(node_data, node); + +SEC("tc") +long rbtree_api_add__wrong_node_type(void *ctx) +{ + struct node_data *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_first(&groot); + bpf_spin_unlock(&glock); + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/rbtree_fail.c b/tools/testing/selftests/bpf/progs/rbtree_fail.c new file mode 100644 index 000000000000..bf3cba115897 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/rbtree_fail.c @@ -0,0 +1,322 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_core_read.h> +#include "bpf_experimental.h" +#include "bpf_misc.h" + +struct node_data { + long key; + long data; + struct bpf_rb_node node; +}; + +#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8))) +private(A) struct bpf_spin_lock glock; +private(A) struct bpf_rb_root groot __contains(node_data, node); +private(A) struct bpf_rb_root groot2 __contains(node_data, node); + +static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data *node_a; + struct node_data *node_b; + + node_a = container_of(a, struct node_data, node); + node_b = container_of(b, struct node_data, node); + + return node_a->key < node_b->key; +} + +SEC("?tc") +__failure __msg("bpf_spin_lock at off=16 must be held for bpf_rb_root") +long rbtree_api_nolock_add(void *ctx) +{ + struct node_data *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_rbtree_add(&groot, &n->node, less); + return 0; +} + +SEC("?tc") +__failure __msg("bpf_spin_lock at off=16 must be held for bpf_rb_root") +long rbtree_api_nolock_remove(void *ctx) +{ + struct node_data *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_spin_unlock(&glock); + + bpf_rbtree_remove(&groot, &n->node); + return 0; +} + +SEC("?tc") +__failure __msg("bpf_spin_lock at off=16 must be held for bpf_rb_root") +long rbtree_api_nolock_first(void *ctx) +{ + bpf_rbtree_first(&groot); + return 0; +} + +SEC("?tc") +__failure __msg("rbtree_remove node input must be non-owning ref") +long rbtree_api_remove_unadded_node(void *ctx) +{ + struct node_data *n, *m; + struct bpf_rb_node *res; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + m = bpf_obj_new(typeof(*m)); + if (!m) { + bpf_obj_drop(n); + return 1; + } + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + + /* This remove should pass verifier */ + res = bpf_rbtree_remove(&groot, &n->node); + n = container_of(res, struct node_data, node); + + /* This remove shouldn't, m isn't in an rbtree */ + res = bpf_rbtree_remove(&groot, &m->node); + m = container_of(res, struct node_data, node); + bpf_spin_unlock(&glock); + + if (n) + bpf_obj_drop(n); + if (m) + bpf_obj_drop(m); + return 0; +} + +SEC("?tc") +__failure __msg("Unreleased reference id=2 alloc_insn=11") +long rbtree_api_remove_no_drop(void *ctx) +{ + struct bpf_rb_node *res; + struct node_data *n; + + bpf_spin_lock(&glock); + res = bpf_rbtree_first(&groot); + if (!res) + goto unlock_err; + + res = bpf_rbtree_remove(&groot, res); + + n = container_of(res, struct node_data, node); + bpf_spin_unlock(&glock); + + /* bpf_obj_drop(n) is missing here */ + return 0; + +unlock_err: + bpf_spin_unlock(&glock); + return 1; +} + +SEC("?tc") +__failure __msg("arg#1 expected pointer to allocated object") +long rbtree_api_add_to_multiple_trees(void *ctx) +{ + struct node_data *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + + /* This add should fail since n already in groot's tree */ + bpf_rbtree_add(&groot2, &n->node, less); + bpf_spin_unlock(&glock); + return 0; +} + +SEC("?tc") +__failure __msg("rbtree_remove node input must be non-owning ref") +long rbtree_api_add_release_unlock_escape(void *ctx) +{ + struct node_data *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_spin_unlock(&glock); + + bpf_spin_lock(&glock); + /* After add() in previous critical section, n should be + * release_on_unlock and released after previous spin_unlock, + * so should not be possible to use it here + */ + bpf_rbtree_remove(&groot, &n->node); + bpf_spin_unlock(&glock); + return 0; +} + +SEC("?tc") +__failure __msg("rbtree_remove node input must be non-owning ref") +long rbtree_api_release_aliasing(void *ctx) +{ + struct node_data *n, *m, *o; + struct bpf_rb_node *res; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, less); + bpf_spin_unlock(&glock); + + bpf_spin_lock(&glock); + + /* m and o point to the same node, + * but verifier doesn't know this + */ + res = bpf_rbtree_first(&groot); + if (!res) + return 1; + o = container_of(res, struct node_data, node); + + res = bpf_rbtree_first(&groot); + if (!res) + return 1; + m = container_of(res, struct node_data, node); + + bpf_rbtree_remove(&groot, &m->node); + /* This second remove shouldn't be possible. Retval of previous + * remove returns owning reference to m, which is the same + * node o's non-owning ref is pointing at + * + * In order to preserve property + * * owning ref must not be in rbtree + * * non-owning ref must be in rbtree + * + * o's ref must be invalidated after previous remove. Otherwise + * we'd have non-owning ref to node that isn't in rbtree, and + * verifier wouldn't be able to use type system to prevent remove + * of ref that already isn't in any tree. Would have to do runtime + * checks in that case. + */ + bpf_rbtree_remove(&groot, &o->node); + + bpf_spin_unlock(&glock); + return 0; +} + +SEC("?tc") +__failure __msg("rbtree_remove node input must be non-owning ref") +long rbtree_api_first_release_unlock_escape(void *ctx) +{ + struct bpf_rb_node *res; + struct node_data *n; + + bpf_spin_lock(&glock); + res = bpf_rbtree_first(&groot); + if (res) + n = container_of(res, struct node_data, node); + bpf_spin_unlock(&glock); + + bpf_spin_lock(&glock); + /* After first() in previous critical section, n should be + * release_on_unlock and released after previous spin_unlock, + * so should not be possible to use it here + */ + bpf_rbtree_remove(&groot, &n->node); + bpf_spin_unlock(&glock); + return 0; +} + +static bool less__bad_fn_call_add(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data *node_a; + struct node_data *node_b; + + node_a = container_of(a, struct node_data, node); + node_b = container_of(b, struct node_data, node); + bpf_rbtree_add(&groot, &node_a->node, less); + + return node_a->key < node_b->key; +} + +static bool less__bad_fn_call_remove(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data *node_a; + struct node_data *node_b; + + node_a = container_of(a, struct node_data, node); + node_b = container_of(b, struct node_data, node); + bpf_rbtree_remove(&groot, &node_a->node); + + return node_a->key < node_b->key; +} + +static bool less__bad_fn_call_first_unlock_after(struct bpf_rb_node *a, const struct bpf_rb_node *b) +{ + struct node_data *node_a; + struct node_data *node_b; + + node_a = container_of(a, struct node_data, node); + node_b = container_of(b, struct node_data, node); + bpf_rbtree_first(&groot); + bpf_spin_unlock(&glock); + + return node_a->key < node_b->key; +} + +static __always_inline +long add_with_cb(bool (cb)(struct bpf_rb_node *a, const struct bpf_rb_node *b)) +{ + struct node_data *n; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + + bpf_spin_lock(&glock); + bpf_rbtree_add(&groot, &n->node, cb); + bpf_spin_unlock(&glock); + return 0; +} + +SEC("?tc") +__failure __msg("arg#1 expected pointer to allocated object") +long rbtree_api_add_bad_cb_bad_fn_call_add(void *ctx) +{ + return add_with_cb(less__bad_fn_call_add); +} + +SEC("?tc") +__failure __msg("rbtree_remove not allowed in rbtree cb") +long rbtree_api_add_bad_cb_bad_fn_call_remove(void *ctx) +{ + return add_with_cb(less__bad_fn_call_remove); +} + +SEC("?tc") +__failure __msg("can't spin_{lock,unlock} in rbtree cb") +long rbtree_api_add_bad_cb_bad_fn_call_first_unlock_after(void *ctx) +{ + return add_with_cb(less__bad_fn_call_first_unlock_after); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c index 9523333b8905..7a438600ae98 100644 --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c @@ -22,6 +22,7 @@ int nr_active; int nr_connect; int nr_binddev; int nr_socket_post_create; +int nr_fin_wait1; struct sockopt_test { int opt; @@ -386,6 +387,13 @@ int skops_sockopt(struct bpf_sock_ops *skops) nr_passive += !(bpf_test_sockopt(skops, sk) || test_tcp_maxseg(skops, sk) || test_tcp_saved_syn(skops, sk)); + bpf_sock_ops_cb_flags_set(skops, + skops->bpf_sock_ops_cb_flags | + BPF_SOCK_OPS_STATE_CB_FLAG); + break; + case BPF_SOCK_OPS_STATE_CB: + if (skops->args[1] == BPF_TCP_CLOSE_WAIT) + nr_fin_wait1 += !bpf_test_sockopt(skops, sk); break; } diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h index 753718595c26..e562be6356f3 100644 --- a/tools/testing/selftests/bpf/progs/strobemeta.h +++ b/tools/testing/selftests/bpf/progs/strobemeta.h @@ -135,7 +135,7 @@ struct strobe_value_loc { * tpidr_el0 for aarch64). * TLS_IMM_EXEC: absolute address of GOT entry containing offset * from thread pointer; - * TLS_GENERAL_DYN: absolute addres of double GOT entry + * TLS_GENERAL_DYN: absolute address of double GOT entry * containing tls_index_t struct; */ int64_t offset; diff --git a/tools/testing/selftests/bpf/progs/task_kfunc_failure.c b/tools/testing/selftests/bpf/progs/task_kfunc_failure.c index 1b47b94dbca0..f19d54eda4f1 100644 --- a/tools/testing/selftests/bpf/progs/task_kfunc_failure.c +++ b/tools/testing/selftests/bpf/progs/task_kfunc_failure.c @@ -5,6 +5,7 @@ #include <bpf/bpf_tracing.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" #include "task_kfunc_common.h" char _license[] SEC("license") = "GPL"; @@ -27,6 +28,7 @@ static struct __tasks_kfunc_map_value *insert_lookup_task(struct task_struct *ta } SEC("tp_btf/task_newtask") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(task_kfunc_acquire_untrusted, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired; @@ -44,6 +46,7 @@ int BPF_PROG(task_kfunc_acquire_untrusted, struct task_struct *task, u64 clone_f } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 pointer type STRUCT task_struct must point") int BPF_PROG(task_kfunc_acquire_fp, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired, *stack_task = (struct task_struct *)&clone_flags; @@ -56,6 +59,7 @@ int BPF_PROG(task_kfunc_acquire_fp, struct task_struct *task, u64 clone_flags) } SEC("kretprobe/free_task") +__failure __msg("reg type unsupported for arg#0 function") int BPF_PROG(task_kfunc_acquire_unsafe_kretprobe, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired; @@ -68,6 +72,7 @@ int BPF_PROG(task_kfunc_acquire_unsafe_kretprobe, struct task_struct *task, u64 } SEC("tp_btf/task_newtask") +__failure __msg("R1 must be referenced or trusted") int BPF_PROG(task_kfunc_acquire_trusted_walked, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired; @@ -81,6 +86,7 @@ int BPF_PROG(task_kfunc_acquire_trusted_walked, struct task_struct *task, u64 cl SEC("tp_btf/task_newtask") +__failure __msg("Possibly NULL pointer passed to trusted arg0") int BPF_PROG(task_kfunc_acquire_null, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired; @@ -95,6 +101,7 @@ int BPF_PROG(task_kfunc_acquire_null, struct task_struct *task, u64 clone_flags) } SEC("tp_btf/task_newtask") +__failure __msg("Unreleased reference") int BPF_PROG(task_kfunc_acquire_unreleased, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired; @@ -107,6 +114,7 @@ int BPF_PROG(task_kfunc_acquire_unreleased, struct task_struct *task, u64 clone_ } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 expected pointer to map value") int BPF_PROG(task_kfunc_get_non_kptr_param, struct task_struct *task, u64 clone_flags) { struct task_struct *kptr; @@ -122,6 +130,7 @@ int BPF_PROG(task_kfunc_get_non_kptr_param, struct task_struct *task, u64 clone_ } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 expected pointer to map value") int BPF_PROG(task_kfunc_get_non_kptr_acquired, struct task_struct *task, u64 clone_flags) { struct task_struct *kptr, *acquired; @@ -140,6 +149,7 @@ int BPF_PROG(task_kfunc_get_non_kptr_acquired, struct task_struct *task, u64 clo } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 expected pointer to map value") int BPF_PROG(task_kfunc_get_null, struct task_struct *task, u64 clone_flags) { struct task_struct *kptr; @@ -155,6 +165,7 @@ int BPF_PROG(task_kfunc_get_null, struct task_struct *task, u64 clone_flags) } SEC("tp_btf/task_newtask") +__failure __msg("Unreleased reference") int BPF_PROG(task_kfunc_xchg_unreleased, struct task_struct *task, u64 clone_flags) { struct task_struct *kptr; @@ -174,6 +185,7 @@ int BPF_PROG(task_kfunc_xchg_unreleased, struct task_struct *task, u64 clone_fla } SEC("tp_btf/task_newtask") +__failure __msg("Unreleased reference") int BPF_PROG(task_kfunc_get_unreleased, struct task_struct *task, u64 clone_flags) { struct task_struct *kptr; @@ -193,6 +205,7 @@ int BPF_PROG(task_kfunc_get_unreleased, struct task_struct *task, u64 clone_flag } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 is untrusted_ptr_or_null_ expected ptr_ or socket") int BPF_PROG(task_kfunc_release_untrusted, struct task_struct *task, u64 clone_flags) { struct __tasks_kfunc_map_value *v; @@ -208,6 +221,7 @@ int BPF_PROG(task_kfunc_release_untrusted, struct task_struct *task, u64 clone_f } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 pointer type STRUCT task_struct must point") int BPF_PROG(task_kfunc_release_fp, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired = (struct task_struct *)&clone_flags; @@ -219,6 +233,7 @@ int BPF_PROG(task_kfunc_release_fp, struct task_struct *task, u64 clone_flags) } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 is ptr_or_null_ expected ptr_ or socket") int BPF_PROG(task_kfunc_release_null, struct task_struct *task, u64 clone_flags) { struct __tasks_kfunc_map_value local, *v; @@ -251,6 +266,7 @@ int BPF_PROG(task_kfunc_release_null, struct task_struct *task, u64 clone_flags) } SEC("tp_btf/task_newtask") +__failure __msg("release kernel function bpf_task_release expects") int BPF_PROG(task_kfunc_release_unacquired, struct task_struct *task, u64 clone_flags) { /* Cannot release trusted task pointer which was not acquired. */ @@ -260,6 +276,7 @@ int BPF_PROG(task_kfunc_release_unacquired, struct task_struct *task, u64 clone_ } SEC("tp_btf/task_newtask") +__failure __msg("arg#0 is ptr_or_null_ expected ptr_ or socket") int BPF_PROG(task_kfunc_from_pid_no_null_check, struct task_struct *task, u64 clone_flags) { struct task_struct *acquired; @@ -273,6 +290,7 @@ int BPF_PROG(task_kfunc_from_pid_no_null_check, struct task_struct *task, u64 cl } SEC("lsm/task_free") +__failure __msg("reg type unsupported for arg#0 function") int BPF_PROG(task_kfunc_from_lsm_task_free, struct task_struct *task) { struct task_struct *acquired; diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c index a1e45fec8938..3b5dc34d23e9 100644 --- a/tools/testing/selftests/bpf/progs/test_attach_probe.c +++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c @@ -92,18 +92,19 @@ int handle_uretprobe_byname(struct pt_regs *ctx) } SEC("uprobe") -int handle_uprobe_byname2(struct pt_regs *ctx) +int BPF_UPROBE(handle_uprobe_byname2, const char *pathname, const char *mode) { - unsigned int size = PT_REGS_PARM1(ctx); + char mode_buf[2] = {}; - /* verify malloc size */ - if (size == 1) + /* verify fopen mode */ + bpf_probe_read_user(mode_buf, sizeof(mode_buf), mode); + if (mode_buf[0] == 'r' && mode_buf[1] == 0) uprobe_byname2_res = 7; return 0; } SEC("uretprobe") -int handle_uretprobe_byname2(struct pt_regs *ctx) +int BPF_URETPROBE(handle_uretprobe_byname2, void *ret) { uretprobe_byname2_res = 8; return 0; diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf.c b/tools/testing/selftests/bpf/progs/test_bpf_nf.c index 227e85e85dda..9fc603c9d673 100644 --- a/tools/testing/selftests/bpf/progs/test_bpf_nf.c +++ b/tools/testing/selftests/bpf/progs/test_bpf_nf.c @@ -34,6 +34,11 @@ __be16 dport = 0; int test_exist_lookup = -ENOENT; u32 test_exist_lookup_mark = 0; +enum nf_nat_manip_type___local { + NF_NAT_MANIP_SRC___local, + NF_NAT_MANIP_DST___local +}; + struct nf_conn; struct bpf_ct_opts___local { @@ -58,7 +63,7 @@ int bpf_ct_change_timeout(struct nf_conn *, u32) __ksym; int bpf_ct_set_status(struct nf_conn *, u32) __ksym; int bpf_ct_change_status(struct nf_conn *, u32) __ksym; int bpf_ct_set_nat_info(struct nf_conn *, union nf_inet_addr *, - int port, enum nf_nat_manip_type) __ksym; + int port, enum nf_nat_manip_type___local) __ksym; static __always_inline void nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32, @@ -157,10 +162,10 @@ nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32, /* snat */ saddr.ip = bpf_get_prandom_u32(); - bpf_ct_set_nat_info(ct, &saddr, sport, NF_NAT_MANIP_SRC); + bpf_ct_set_nat_info(ct, &saddr, sport, NF_NAT_MANIP_SRC___local); /* dnat */ daddr.ip = bpf_get_prandom_u32(); - bpf_ct_set_nat_info(ct, &daddr, dport, NF_NAT_MANIP_DST); + bpf_ct_set_nat_info(ct, &daddr, dport, NF_NAT_MANIP_DST___local); ct_ins = bpf_ct_insert_entry(ct); if (ct_ins) { diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect.c b/tools/testing/selftests/bpf/progs/test_cls_redirect.c index 2833ad722cb7..66b304982245 100644 --- a/tools/testing/selftests/bpf/progs/test_cls_redirect.c +++ b/tools/testing/selftests/bpf/progs/test_cls_redirect.c @@ -600,7 +600,7 @@ static INLINING ret_t get_next_hop(buf_t *pkt, encap_headers_t *encap, return TC_ACT_SHOT; } - /* Skip the remainig next hops (may be zero). */ + /* Skip the remaining next hops (may be zero). */ return skip_next_hops(pkt, encap->unigue.hop_count - encap->unigue.next_hop - 1); } @@ -610,8 +610,8 @@ static INLINING ret_t get_next_hop(buf_t *pkt, encap_headers_t *encap, * * fill_tuple(&t, foo, sizeof(struct iphdr), 123, 321) * - * clang will substitue a costant for sizeof, which allows the verifier - * to track it's value. Based on this, it can figure out the constant + * clang will substitute a constant for sizeof, which allows the verifier + * to track its value. Based on this, it can figure out the constant * return value, and calling code works while still being "generic" to * IPv4 and IPv6. */ diff --git a/tools/testing/selftests/bpf/progs/test_global_func1.c b/tools/testing/selftests/bpf/progs/test_global_func1.c index 7b42dad187b8..23970a20b324 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func1.c +++ b/tools/testing/selftests/bpf/progs/test_global_func1.c @@ -3,10 +3,9 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" -#ifndef MAX_STACK #define MAX_STACK (512 - 3 * 32 + 8) -#endif static __attribute__ ((noinline)) int f0(int var, struct __sk_buff *skb) @@ -39,7 +38,8 @@ int f3(int val, struct __sk_buff *skb, int var) } SEC("tc") -int test_cls(struct __sk_buff *skb) +__failure __msg("combined stack size of 4 calls is 544") +int global_func1(struct __sk_buff *skb) { return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4); } diff --git a/tools/testing/selftests/bpf/progs/test_global_func10.c b/tools/testing/selftests/bpf/progs/test_global_func10.c index 97b7031d0e22..98327bdbbfd2 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func10.c +++ b/tools/testing/selftests/bpf/progs/test_global_func10.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" struct Small { int x; @@ -21,7 +22,8 @@ __noinline int foo(const struct Big *big) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("invalid indirect read from stack") +int global_func10(struct __sk_buff *skb) { const struct Small small = {.x = skb->len }; diff --git a/tools/testing/selftests/bpf/progs/test_global_func11.c b/tools/testing/selftests/bpf/progs/test_global_func11.c index ef5277d982d9..283e036dc401 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func11.c +++ b/tools/testing/selftests/bpf/progs/test_global_func11.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" struct S { int x; @@ -13,7 +14,8 @@ __noinline int foo(const struct S *s) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("Caller passes invalid args into func#1") +int global_func11(struct __sk_buff *skb) { return foo((const void *)skb); } diff --git a/tools/testing/selftests/bpf/progs/test_global_func12.c b/tools/testing/selftests/bpf/progs/test_global_func12.c index 62343527cc59..7f159d83c6f6 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func12.c +++ b/tools/testing/selftests/bpf/progs/test_global_func12.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" struct S { int x; @@ -13,7 +14,8 @@ __noinline int foo(const struct S *s) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("invalid mem access 'mem_or_null'") +int global_func12(struct __sk_buff *skb) { const struct S s = {.x = skb->len }; diff --git a/tools/testing/selftests/bpf/progs/test_global_func13.c b/tools/testing/selftests/bpf/progs/test_global_func13.c index ff8897c1ac22..02ea80da75b5 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func13.c +++ b/tools/testing/selftests/bpf/progs/test_global_func13.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" struct S { int x; @@ -16,7 +17,8 @@ __noinline int foo(const struct S *s) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("Caller passes invalid args into func#1") +int global_func13(struct __sk_buff *skb) { const struct S *s = (const struct S *)(0xbedabeda); diff --git a/tools/testing/selftests/bpf/progs/test_global_func14.c b/tools/testing/selftests/bpf/progs/test_global_func14.c index 698c77199ebf..33b7d5efd7b2 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func14.c +++ b/tools/testing/selftests/bpf/progs/test_global_func14.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" struct S; @@ -14,7 +15,8 @@ __noinline int foo(const struct S *s) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("reference type('FWD S') size cannot be determined") +int global_func14(struct __sk_buff *skb) { return foo(NULL); diff --git a/tools/testing/selftests/bpf/progs/test_global_func15.c b/tools/testing/selftests/bpf/progs/test_global_func15.c index c19c435988d5..b512d6a6c75e 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func15.c +++ b/tools/testing/selftests/bpf/progs/test_global_func15.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __noinline int foo(unsigned int *v) { @@ -12,7 +13,8 @@ __noinline int foo(unsigned int *v) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("At program exit the register R0 has value") +int global_func15(struct __sk_buff *skb) { unsigned int v = 1; diff --git a/tools/testing/selftests/bpf/progs/test_global_func16.c b/tools/testing/selftests/bpf/progs/test_global_func16.c index 0312d1e8d8c0..e7206304632e 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func16.c +++ b/tools/testing/selftests/bpf/progs/test_global_func16.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __noinline int foo(int (*arr)[10]) { @@ -12,7 +13,8 @@ __noinline int foo(int (*arr)[10]) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__failure __msg("invalid indirect read from stack") +int global_func16(struct __sk_buff *skb) { int array[10]; diff --git a/tools/testing/selftests/bpf/progs/test_global_func17.c b/tools/testing/selftests/bpf/progs/test_global_func17.c index 2b8b9b8ba018..a32e11c7d933 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func17.c +++ b/tools/testing/selftests/bpf/progs/test_global_func17.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only #include <vmlinux.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __noinline int foo(int *p) { @@ -10,7 +11,8 @@ __noinline int foo(int *p) const volatile int i; SEC("tc") -int test_cls(struct __sk_buff *skb) +__failure __msg("Caller passes invalid args into func#1") +int global_func17(struct __sk_buff *skb) { return foo((int *)&i); } diff --git a/tools/testing/selftests/bpf/progs/test_global_func2.c b/tools/testing/selftests/bpf/progs/test_global_func2.c index 2c18d82923a2..3dce97fb52a4 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func2.c +++ b/tools/testing/selftests/bpf/progs/test_global_func2.c @@ -1,4 +1,45 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright (c) 2020 Facebook */ +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + #define MAX_STACK (512 - 3 * 32) -#include "test_global_func1.c" + +static __attribute__ ((noinline)) +int f0(int var, struct __sk_buff *skb) +{ + return skb->len; +} + +__attribute__ ((noinline)) +int f1(struct __sk_buff *skb) +{ + volatile char buf[MAX_STACK] = {}; + + return f0(0, skb) + skb->len; +} + +int f3(int, struct __sk_buff *skb, int); + +__attribute__ ((noinline)) +int f2(int val, struct __sk_buff *skb) +{ + return f1(skb) + f3(val, skb, 1); +} + +__attribute__ ((noinline)) +int f3(int val, struct __sk_buff *skb, int var) +{ + volatile char buf[MAX_STACK] = {}; + + return skb->ifindex * val * var; +} + +SEC("tc") +__success +int global_func2(struct __sk_buff *skb) +{ + return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func3.c b/tools/testing/selftests/bpf/progs/test_global_func3.c index 01bf8275dfd6..142b682d3c2f 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func3.c +++ b/tools/testing/selftests/bpf/progs/test_global_func3.c @@ -3,6 +3,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __attribute__ ((noinline)) int f1(struct __sk_buff *skb) @@ -46,20 +47,15 @@ int f7(struct __sk_buff *skb) return f6(skb); } -#ifndef NO_FN8 __attribute__ ((noinline)) int f8(struct __sk_buff *skb) { return f7(skb); } -#endif SEC("tc") -int test_cls(struct __sk_buff *skb) +__failure __msg("the call stack of 8 frames") +int global_func3(struct __sk_buff *skb) { -#ifndef NO_FN8 return f8(skb); -#else - return f7(skb); -#endif } diff --git a/tools/testing/selftests/bpf/progs/test_global_func4.c b/tools/testing/selftests/bpf/progs/test_global_func4.c index 610f75edf276..1733d87ad3f3 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func4.c +++ b/tools/testing/selftests/bpf/progs/test_global_func4.c @@ -1,4 +1,55 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright (c) 2020 Facebook */ -#define NO_FN8 -#include "test_global_func3.c" +#include <stddef.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +__attribute__ ((noinline)) +int f1(struct __sk_buff *skb) +{ + return skb->len; +} + +__attribute__ ((noinline)) +int f2(int val, struct __sk_buff *skb) +{ + return f1(skb) + val; +} + +__attribute__ ((noinline)) +int f3(int val, struct __sk_buff *skb, int var) +{ + return f2(var, skb) + val; +} + +__attribute__ ((noinline)) +int f4(struct __sk_buff *skb) +{ + return f3(1, skb, 2); +} + +__attribute__ ((noinline)) +int f5(struct __sk_buff *skb) +{ + return f4(skb); +} + +__attribute__ ((noinline)) +int f6(struct __sk_buff *skb) +{ + return f5(skb); +} + +__attribute__ ((noinline)) +int f7(struct __sk_buff *skb) +{ + return f6(skb); +} + +SEC("tc") +__success +int global_func4(struct __sk_buff *skb) +{ + return f7(skb); +} diff --git a/tools/testing/selftests/bpf/progs/test_global_func5.c b/tools/testing/selftests/bpf/progs/test_global_func5.c index 9248d03e0d06..cc55aedaf82d 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func5.c +++ b/tools/testing/selftests/bpf/progs/test_global_func5.c @@ -3,6 +3,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __attribute__ ((noinline)) int f1(struct __sk_buff *skb) @@ -25,7 +26,8 @@ int f3(int val, struct __sk_buff *skb) } SEC("tc") -int test_cls(struct __sk_buff *skb) +__failure __msg("expected pointer to ctx, but got PTR") +int global_func5(struct __sk_buff *skb) { return f1(skb) + f2(2, skb) + f3(3, skb); } diff --git a/tools/testing/selftests/bpf/progs/test_global_func6.c b/tools/testing/selftests/bpf/progs/test_global_func6.c index af8c78bdfb25..46c38c8f2cf0 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func6.c +++ b/tools/testing/selftests/bpf/progs/test_global_func6.c @@ -3,6 +3,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __attribute__ ((noinline)) int f1(struct __sk_buff *skb) @@ -25,7 +26,8 @@ int f3(int val, struct __sk_buff *skb) } SEC("tc") -int test_cls(struct __sk_buff *skb) +__failure __msg("modified ctx ptr R2") +int global_func6(struct __sk_buff *skb) { return f1(skb) + f2(2, skb) + f3(3, skb); } diff --git a/tools/testing/selftests/bpf/progs/test_global_func7.c b/tools/testing/selftests/bpf/progs/test_global_func7.c index 6cb8e2f5254c..f182febfde3c 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func7.c +++ b/tools/testing/selftests/bpf/progs/test_global_func7.c @@ -3,6 +3,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __attribute__ ((noinline)) void foo(struct __sk_buff *skb) @@ -11,7 +12,8 @@ void foo(struct __sk_buff *skb) } SEC("tc") -int test_cls(struct __sk_buff *skb) +__failure __msg("foo() doesn't return scalar") +int global_func7(struct __sk_buff *skb) { foo(skb); return 0; diff --git a/tools/testing/selftests/bpf/progs/test_global_func8.c b/tools/testing/selftests/bpf/progs/test_global_func8.c index d55a6544b1ab..9b9c57fa2dd3 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func8.c +++ b/tools/testing/selftests/bpf/progs/test_global_func8.c @@ -3,6 +3,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" __noinline int foo(struct __sk_buff *skb) { @@ -10,7 +11,8 @@ __noinline int foo(struct __sk_buff *skb) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__success +int global_func8(struct __sk_buff *skb) { if (!foo(skb)) return 0; diff --git a/tools/testing/selftests/bpf/progs/test_global_func9.c b/tools/testing/selftests/bpf/progs/test_global_func9.c index bd233ddede98..1f2cb0159b8d 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func9.c +++ b/tools/testing/selftests/bpf/progs/test_global_func9.c @@ -2,6 +2,7 @@ #include <stddef.h> #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +#include "bpf_misc.h" struct S { int x; @@ -74,7 +75,8 @@ __noinline int quuz(int **p) } SEC("cgroup_skb/ingress") -int test_cls(struct __sk_buff *skb) +__success +int global_func9(struct __sk_buff *skb) { int result = 0; diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c new file mode 100644 index 000000000000..7faa8eef0598 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_core_read.h> +#include "bpf_misc.h" + +char _license[] SEC("license") = "GPL"; + +static long stack[256]; + +/* + * KPROBE contexts + */ + +__weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx) +{ + return bpf_get_stack(ctx, &stack, sizeof(stack), 0); +} + +SEC("?kprobe") +__success +int kprobe_typedef_ctx(void *ctx) +{ + return kprobe_typedef_ctx_subprog(ctx); +} + +#define pt_regs_struct_t typeof(*(__PT_REGS_CAST((struct pt_regs *)NULL))) + +__weak int kprobe_struct_ctx_subprog(pt_regs_struct_t *ctx) +{ + return bpf_get_stack((void *)ctx, &stack, sizeof(stack), 0); +} + +SEC("?kprobe") +__success +int kprobe_resolved_ctx(void *ctx) +{ + return kprobe_struct_ctx_subprog(ctx); +} + +/* this is current hack to make this work on old kernels */ +struct bpf_user_pt_regs_t {}; + +__weak int kprobe_workaround_ctx_subprog(struct bpf_user_pt_regs_t *ctx) +{ + return bpf_get_stack(ctx, &stack, sizeof(stack), 0); +} + +SEC("?kprobe") +__success +int kprobe_workaround_ctx(void *ctx) +{ + return kprobe_workaround_ctx_subprog(ctx); +} + +/* + * RAW_TRACEPOINT contexts + */ + +__weak int raw_tp_ctx_subprog(struct bpf_raw_tracepoint_args *ctx) +{ + return bpf_get_stack(ctx, &stack, sizeof(stack), 0); +} + +SEC("?raw_tp") +__success +int raw_tp_ctx(void *ctx) +{ + return raw_tp_ctx_subprog(ctx); +} + +/* + * RAW_TRACEPOINT_WRITABLE contexts + */ + +__weak int raw_tp_writable_ctx_subprog(struct bpf_raw_tracepoint_args *ctx) +{ + return bpf_get_stack(ctx, &stack, sizeof(stack), 0); +} + +SEC("?raw_tp") +__success +int raw_tp_writable_ctx(void *ctx) +{ + return raw_tp_writable_ctx_subprog(ctx); +} + +/* + * PERF_EVENT contexts + */ + +__weak int perf_event_ctx_subprog(struct bpf_perf_event_data *ctx) +{ + return bpf_get_stack(ctx, &stack, sizeof(stack), 0); +} + +SEC("?perf_event") +__success +int perf_event_ctx(void *ctx) +{ + return perf_event_ctx_subprog(ctx); +} diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c index f4a8250329b2..2fbef3cc7ad8 100644 --- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c +++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c @@ -10,6 +10,7 @@ #include <errno.h> #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> +#include "bpf_misc.h" extern struct bpf_key *bpf_lookup_system_key(__u64 id) __ksym; extern void bpf_key_put(struct bpf_key *key) __ksym; @@ -19,6 +20,7 @@ extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_ptr, struct { __uint(type, BPF_MAP_TYPE_RINGBUF); + __uint(max_entries, 4096); } ringbuf SEC(".maps"); struct { @@ -33,6 +35,7 @@ int err, pid; char _license[] SEC("license") = "GPL"; SEC("?lsm.s/bpf") +__failure __msg("cannot pass in dynptr at an offset=-8") int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size) { unsigned long val; @@ -42,6 +45,7 @@ int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size) } SEC("?lsm.s/bpf") +__failure __msg("arg#0 expected pointer to stack or dynptr_ptr") int BPF_PROG(not_ptr_to_stack, int cmd, union bpf_attr *attr, unsigned int size) { unsigned long val; diff --git a/tools/testing/selftests/bpf/progs/test_sk_assign.c b/tools/testing/selftests/bpf/progs/test_sk_assign.c index 98c6493d9b91..21b19b758c4e 100644 --- a/tools/testing/selftests/bpf/progs/test_sk_assign.c +++ b/tools/testing/selftests/bpf/progs/test_sk_assign.c @@ -16,6 +16,16 @@ #include <bpf/bpf_helpers.h> #include <bpf/bpf_endian.h> +#if defined(IPROUTE2_HAVE_LIBBPF) +/* Use a new-style map definition. */ +struct { + __uint(type, BPF_MAP_TYPE_SOCKMAP); + __type(key, int); + __type(value, __u64); + __uint(pinning, LIBBPF_PIN_BY_NAME); + __uint(max_entries, 1); +} server_map SEC(".maps"); +#else /* Pin map under /sys/fs/bpf/tc/globals/<map name> */ #define PIN_GLOBAL_NS 2 @@ -35,6 +45,7 @@ struct { .max_elem = 1, .pinning = PIN_GLOBAL_NS, }; +#endif char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/test_sk_assign_libbpf.c b/tools/testing/selftests/bpf/progs/test_sk_assign_libbpf.c new file mode 100644 index 000000000000..dcf46adfda04 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_sk_assign_libbpf.c @@ -0,0 +1,3 @@ +// SPDX-License-Identifier: GPL-2.0 +#define IPROUTE2_HAVE_LIBBPF +#include "test_sk_assign.c" diff --git a/tools/testing/selftests/bpf/progs/test_subprogs.c b/tools/testing/selftests/bpf/progs/test_subprogs.c index f8e9256cf18d..a8d602d7c88a 100644 --- a/tools/testing/selftests/bpf/progs/test_subprogs.c +++ b/tools/testing/selftests/bpf/progs/test_subprogs.c @@ -47,7 +47,7 @@ static __noinline int sub5(int v) return sub1(v) - 1; /* compensates sub1()'s + 1 */ } -/* unfortunately verifier rejects `struct task_struct *t` as an unkown pointer +/* unfortunately verifier rejects `struct task_struct *t` as an unknown pointer * type, so we need to accept pointer as integer and then cast it inside the * function */ diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c index a0e7762b1e5a..e6e678aa9874 100644 --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c @@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000; #define VXLAN_FLAGS 0x8 #define VXLAN_VNI 1 +#ifndef NEXTHDR_DEST +#define NEXTHDR_DEST 60 +#endif + /* MPLS label 1000 with S bit (last label) set and ttl of 255. */ static const __u32 mpls_label = __bpf_constant_htonl(1000 << 12 | MPLS_LS_S_MASK | 0xff); @@ -363,6 +367,61 @@ static __always_inline int __encap_ipv6(struct __sk_buff *skb, __u8 encap_proto, return TC_ACT_OK; } +static int encap_ipv6_ipip6(struct __sk_buff *skb) +{ + struct iphdr iph_inner; + struct v6hdr h_outer; + struct tcphdr tcph; + struct ethhdr eth; + __u64 flags; + int olen; + + if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner, + sizeof(iph_inner)) < 0) + return TC_ACT_OK; + + /* filter only packets we want */ + if (bpf_skb_load_bytes(skb, ETH_HLEN + (iph_inner.ihl << 2), + &tcph, sizeof(tcph)) < 0) + return TC_ACT_OK; + + if (tcph.dest != __bpf_constant_htons(cfg_port)) + return TC_ACT_OK; + + olen = sizeof(h_outer.ip); + + flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV6; + + /* add room between mac and network header */ + if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags)) + return TC_ACT_SHOT; + + /* prepare new outer network header */ + memset(&h_outer.ip, 0, sizeof(h_outer.ip)); + h_outer.ip.version = 6; + h_outer.ip.hop_limit = iph_inner.ttl; + h_outer.ip.saddr.s6_addr[1] = 0xfd; + h_outer.ip.saddr.s6_addr[15] = 1; + h_outer.ip.daddr.s6_addr[1] = 0xfd; + h_outer.ip.daddr.s6_addr[15] = 2; + h_outer.ip.payload_len = iph_inner.tot_len; + h_outer.ip.nexthdr = IPPROTO_IPIP; + + /* store new outer network header */ + if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen, + BPF_F_INVALIDATE_HASH) < 0) + return TC_ACT_SHOT; + + /* update eth->h_proto */ + if (bpf_skb_load_bytes(skb, 0, ð, sizeof(eth)) < 0) + return TC_ACT_SHOT; + eth.h_proto = bpf_htons(ETH_P_IPV6); + if (bpf_skb_store_bytes(skb, 0, ð, sizeof(eth), 0) < 0) + return TC_ACT_SHOT; + + return TC_ACT_OK; +} + static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto, __u16 l2_proto) { @@ -461,6 +520,15 @@ int __encap_ip6tnl_none(struct __sk_buff *skb) return TC_ACT_OK; } +SEC("encap_ipip6_none") +int __encap_ipip6_none(struct __sk_buff *skb) +{ + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) + return encap_ipv6_ipip6(skb); + else + return TC_ACT_OK; +} + SEC("encap_ip6gre_none") int __encap_ip6gre_none(struct __sk_buff *skb) { @@ -528,13 +596,33 @@ int __encap_ip6vxlan_eth(struct __sk_buff *skb) static int decap_internal(struct __sk_buff *skb, int off, int len, char proto) { + __u64 flags = BPF_F_ADJ_ROOM_FIXED_GSO; + struct ipv6_opt_hdr ip6_opt_hdr; struct gre_hdr greh; struct udphdr udph; int olen = len; switch (proto) { case IPPROTO_IPIP: + flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4; + break; case IPPROTO_IPV6: + flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6; + break; + case NEXTHDR_DEST: + if (bpf_skb_load_bytes(skb, off + len, &ip6_opt_hdr, + sizeof(ip6_opt_hdr)) < 0) + return TC_ACT_OK; + switch (ip6_opt_hdr.nexthdr) { + case IPPROTO_IPIP: + flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4; + break; + case IPPROTO_IPV6: + flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6; + break; + default: + return TC_ACT_OK; + } break; case IPPROTO_GRE: olen += sizeof(struct gre_hdr); @@ -569,8 +657,7 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto) return TC_ACT_OK; } - if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, - BPF_F_ADJ_ROOM_FIXED_GSO)) + if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, flags)) return TC_ACT_SHOT; return TC_ACT_OK; diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 98af55f0bcd3..508da4a23c4f 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -82,6 +82,27 @@ int gre_set_tunnel(struct __sk_buff *skb) } SEC("tc") +int gre_set_tunnel_no_key(struct __sk_buff *skb) +{ + int ret; + struct bpf_tunnel_key key; + + __builtin_memset(&key, 0x0, sizeof(key)); + key.remote_ipv4 = 0xac100164; /* 172.16.1.100 */ + key.tunnel_ttl = 64; + + ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key), + BPF_F_ZERO_CSUM_TX | BPF_F_SEQ_NUMBER | + BPF_F_NO_TUNNEL_KEY); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + return TC_ACT_OK; +} + +SEC("tc") int gre_get_tunnel(struct __sk_buff *skb) { int ret; diff --git a/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c b/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c index ab75522e2eeb..da4bf89d004c 100644 --- a/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c +++ b/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c @@ -6,18 +6,22 @@ #include <bpf/bpf_core_read.h> #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> +#include "bpf_misc.h" int uprobe_byname_parm1 = 0; int uprobe_byname_ran = 0; int uretprobe_byname_rc = 0; +int uretprobe_byname_ret = 0; int uretprobe_byname_ran = 0; -size_t uprobe_byname2_parm1 = 0; +u64 uprobe_byname2_parm1 = 0; int uprobe_byname2_ran = 0; -char *uretprobe_byname2_rc = NULL; +u64 uretprobe_byname2_rc = 0; int uretprobe_byname2_ran = 0; int test_pid; +int a[8]; + /* This program cannot auto-attach, but that should not stop other * programs from attaching. */ @@ -28,44 +32,84 @@ int handle_uprobe_noautoattach(struct pt_regs *ctx) } SEC("uprobe//proc/self/exe:autoattach_trigger_func") -int handle_uprobe_byname(struct pt_regs *ctx) +int BPF_UPROBE(handle_uprobe_byname + , int arg1 + , int arg2 + , int arg3 +#if FUNC_REG_ARG_CNT > 3 + , int arg4 +#endif +#if FUNC_REG_ARG_CNT > 4 + , int arg5 +#endif +#if FUNC_REG_ARG_CNT > 5 + , int arg6 +#endif +#if FUNC_REG_ARG_CNT > 6 + , int arg7 +#endif +#if FUNC_REG_ARG_CNT > 7 + , int arg8 +#endif +) { uprobe_byname_parm1 = PT_REGS_PARM1_CORE(ctx); uprobe_byname_ran = 1; + + a[0] = arg1; + a[1] = arg2; + a[2] = arg3; +#if FUNC_REG_ARG_CNT > 3 + a[3] = arg4; +#endif +#if FUNC_REG_ARG_CNT > 4 + a[4] = arg5; +#endif +#if FUNC_REG_ARG_CNT > 5 + a[5] = arg6; +#endif +#if FUNC_REG_ARG_CNT > 6 + a[6] = arg7; +#endif +#if FUNC_REG_ARG_CNT > 7 + a[7] = arg8; +#endif return 0; } SEC("uretprobe//proc/self/exe:autoattach_trigger_func") -int handle_uretprobe_byname(struct pt_regs *ctx) +int BPF_URETPROBE(handle_uretprobe_byname, int ret) { uretprobe_byname_rc = PT_REGS_RC_CORE(ctx); + uretprobe_byname_ret = ret; uretprobe_byname_ran = 2; + return 0; } -SEC("uprobe/libc.so.6:malloc") -int handle_uprobe_byname2(struct pt_regs *ctx) +SEC("uprobe/libc.so.6:fopen") +int BPF_UPROBE(handle_uprobe_byname2, const char *pathname, const char *mode) { int pid = bpf_get_current_pid_tgid() >> 32; /* ignore irrelevant invocations */ if (test_pid != pid) return 0; - uprobe_byname2_parm1 = PT_REGS_PARM1_CORE(ctx); + uprobe_byname2_parm1 = (u64)(long)pathname; uprobe_byname2_ran = 3; return 0; } -SEC("uretprobe/libc.so.6:malloc") -int handle_uretprobe_byname2(struct pt_regs *ctx) +SEC("uretprobe/libc.so.6:fopen") +int BPF_URETPROBE(handle_uretprobe_byname2, void *ret) { int pid = bpf_get_current_pid_tgid() >> 32; /* ignore irrelevant invocations */ if (test_pid != pid) return 0; - uretprobe_byname2_rc = (char *)PT_REGS_RC_CORE(ctx); + uretprobe_byname2_rc = (u64)(long)ret; uretprobe_byname2_ran = 4; return 0; } diff --git a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c index ce419304ff1f..7748cc23de8a 100644 --- a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c +++ b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c @@ -59,10 +59,14 @@ int BPF_PROG(bpf, int cmd, union bpf_attr *attr, unsigned int size) if (!data_val) return 0; - bpf_probe_read(&value, sizeof(value), &attr->value); - - bpf_copy_from_user(data_val, sizeof(struct data), - (void *)(unsigned long)value); + ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value); + if (ret) + return ret; + + ret = bpf_copy_from_user(data_val, sizeof(struct data), + (void *)(unsigned long)value); + if (ret) + return ret; if (data_val->data_len > sizeof(data_val->data)) return -EINVAL; diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c b/tools/testing/selftests/bpf/progs/test_vmlinux.c index e9dfa0313d1b..4b8e37f7fd06 100644 --- a/tools/testing/selftests/bpf/progs/test_vmlinux.c +++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c @@ -42,7 +42,7 @@ int BPF_PROG(handle__raw_tp, struct pt_regs *regs, long id) if (id != __NR_nanosleep) return 0; - ts = (void *)PT_REGS_PARM1_CORE(regs); + ts = (void *)PT_REGS_PARM1_CORE_SYSCALL(regs); if (bpf_probe_read_user(&tv_nsec, sizeof(ts->tv_nsec), &ts->tv_nsec) || tv_nsec != MY_TV_NSEC) return 0; @@ -60,7 +60,7 @@ int BPF_PROG(handle__tp_btf, struct pt_regs *regs, long id) if (id != __NR_nanosleep) return 0; - ts = (void *)PT_REGS_PARM1_CORE(regs); + ts = (void *)PT_REGS_PARM1_CORE_SYSCALL(regs); if (bpf_probe_read_user(&tv_nsec, sizeof(ts->tv_nsec), &ts->tv_nsec) || tv_nsec != MY_TV_NSEC) return 0; diff --git a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c index 53b64c999450..297c260fc364 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c @@ -9,6 +9,12 @@ int _xdp_adjust_tail_grow(struct xdp_md *xdp) void *data = (void *)(long)xdp->data; int data_len = bpf_xdp_get_buff_len(xdp); int offset = 0; + /* SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */ +#if defined(__TARGET_ARCH_s390) + int tailroom = 512; +#else + int tailroom = 320; +#endif /* Data length determine test case */ @@ -20,7 +26,7 @@ int _xdp_adjust_tail_grow(struct xdp_md *xdp) offset = 128; } else if (data_len == 128) { /* Max tail grow 3520 */ - offset = 4096 - 256 - 320 - data_len; + offset = 4096 - 256 - tailroom - data_len; } else if (data_len == 9000) { offset = 10; } else if (data_len == 9001) { diff --git a/tools/testing/selftests/bpf/progs/test_xdp_vlan.c b/tools/testing/selftests/bpf/progs/test_xdp_vlan.c index 134768f6b788..4ddcb6dfe500 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_vlan.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_vlan.c @@ -98,7 +98,7 @@ bool parse_eth_frame(struct ethhdr *eth, void *data_end, struct parse_pkt *pkt) return true; } -/* Hint, VLANs are choosen to hit network-byte-order issues */ +/* Hint, VLANs are chosen to hit network-byte-order issues */ #define TESTVLAN 4011 /* 0xFAB */ // #define TO_VLAN 4000 /* 0xFA0 (hint 0xOA0 = 160) */ @@ -195,7 +195,7 @@ int xdp_prognum2(struct xdp_md *ctx) /* Moving Ethernet header, dest overlap with src, memmove handle this */ dest = data; - dest+= VLAN_HDR_SZ; + dest += VLAN_HDR_SZ; /* * Notice: Taking over vlan_hdr->h_vlan_encapsulated_proto, by * only moving two MAC addrs (12 bytes), not overwriting last 2 bytes diff --git a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c index f3201dc69a60..03ee946c6bf7 100644 --- a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c +++ b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c @@ -16,6 +16,7 @@ struct sample { struct { __uint(type, BPF_MAP_TYPE_USER_RINGBUF); + __uint(max_entries, 4096); } user_ringbuf SEC(".maps"); struct { @@ -39,7 +40,8 @@ bad_access1(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to read before the pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("negative offset dynptr_ptr ptr") int user_ringbuf_callback_bad_access1(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, bad_access1, NULL, 0); @@ -61,7 +63,8 @@ bad_access2(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to read past the end of the pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("dereference of modified dynptr_ptr ptr") int user_ringbuf_callback_bad_access2(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, bad_access2, NULL, 0); @@ -80,7 +83,8 @@ write_forbidden(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to write to that pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("invalid mem access 'dynptr_ptr'") int user_ringbuf_callback_write_forbidden(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, write_forbidden, NULL, 0); @@ -99,7 +103,8 @@ null_context_write(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to write to that pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("invalid mem access 'scalar'") int user_ringbuf_callback_null_context_write(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, null_context_write, NULL, 0); @@ -120,7 +125,8 @@ null_context_read(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to write to that pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("invalid mem access 'scalar'") int user_ringbuf_callback_null_context_read(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, null_context_read, NULL, 0); @@ -139,7 +145,8 @@ try_discard_dynptr(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to read past the end of the pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("cannot release unowned const bpf_dynptr") int user_ringbuf_callback_discard_dynptr(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, try_discard_dynptr, NULL, 0); @@ -158,7 +165,8 @@ try_submit_dynptr(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to read past the end of the pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("cannot release unowned const bpf_dynptr") int user_ringbuf_callback_submit_dynptr(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, try_submit_dynptr, NULL, 0); @@ -175,7 +183,8 @@ invalid_drain_callback_return(struct bpf_dynptr *dynptr, void *context) /* A callback that accesses a dynptr in a bpf_user_ringbuf_drain callback should * not be able to write to that pointer. */ -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("At callback return the register R0 has value") int user_ringbuf_callback_invalid_return(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, invalid_drain_callback_return, NULL, 0); @@ -197,14 +206,16 @@ try_reinit_dynptr_ringbuf(struct bpf_dynptr *dynptr, void *context) return 0; } -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("Dynptr has to be an uninitialized dynptr") int user_ringbuf_callback_reinit_dynptr_mem(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_mem, NULL, 0); return 0; } -SEC("?raw_tp/") +SEC("?raw_tp") +__failure __msg("Dynptr has to be an uninitialized dynptr") int user_ringbuf_callback_reinit_dynptr_ringbuf(void *ctx) { bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_ringbuf, NULL, 0); diff --git a/tools/testing/selftests/bpf/progs/xdp_features.c b/tools/testing/selftests/bpf/progs/xdp_features.c new file mode 100644 index 000000000000..87c247d56f72 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_features.c @@ -0,0 +1,269 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <stdbool.h> +#include <linux/bpf.h> +#include <linux/netdev.h> +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_endian.h> +#include <bpf/bpf_tracing.h> +#include <linux/if_ether.h> +#include <linux/ip.h> +#include <linux/ipv6.h> +#include <linux/in.h> +#include <linux/in6.h> +#include <linux/udp.h> +#include <asm-generic/errno-base.h> + +#include "xdp_features.h" + +#define ipv6_addr_equal(a, b) ((a).s6_addr32[0] == (b).s6_addr32[0] && \ + (a).s6_addr32[1] == (b).s6_addr32[1] && \ + (a).s6_addr32[2] == (b).s6_addr32[2] && \ + (a).s6_addr32[3] == (b).s6_addr32[3]) + +struct net_device; +struct bpf_prog; + +struct xdp_cpumap_stats { + unsigned int redirect; + unsigned int pass; + unsigned int drop; +}; + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, __u32); + __type(value, __u32); + __uint(max_entries, 1); +} stats SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, __u32); + __type(value, __u32); + __uint(max_entries, 1); +} dut_stats SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_CPUMAP); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(struct bpf_cpumap_val)); + __uint(max_entries, 1); +} cpu_map SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(struct bpf_devmap_val)); + __uint(max_entries, 1); +} dev_map SEC(".maps"); + +const volatile struct in6_addr tester_addr; +const volatile struct in6_addr dut_addr; + +static __always_inline int +xdp_process_echo_packet(struct xdp_md *xdp, bool dut) +{ + void *data_end = (void *)(long)xdp->data_end; + void *data = (void *)(long)xdp->data; + struct ethhdr *eh = data; + struct tlv_hdr *tlv; + struct udphdr *uh; + __be16 port; + __u8 *cmd; + + if (eh + 1 > (struct ethhdr *)data_end) + return -EINVAL; + + if (eh->h_proto == bpf_htons(ETH_P_IP)) { + struct iphdr *ih = (struct iphdr *)(eh + 1); + __be32 saddr = dut ? tester_addr.s6_addr32[3] + : dut_addr.s6_addr32[3]; + __be32 daddr = dut ? dut_addr.s6_addr32[3] + : tester_addr.s6_addr32[3]; + + ih = (struct iphdr *)(eh + 1); + if (ih + 1 > (struct iphdr *)data_end) + return -EINVAL; + + if (saddr != ih->saddr) + return -EINVAL; + + if (daddr != ih->daddr) + return -EINVAL; + + if (ih->protocol != IPPROTO_UDP) + return -EINVAL; + + uh = (struct udphdr *)(ih + 1); + } else if (eh->h_proto == bpf_htons(ETH_P_IPV6)) { + struct in6_addr saddr = dut ? tester_addr : dut_addr; + struct in6_addr daddr = dut ? dut_addr : tester_addr; + struct ipv6hdr *ih6 = (struct ipv6hdr *)(eh + 1); + + if (ih6 + 1 > (struct ipv6hdr *)data_end) + return -EINVAL; + + if (!ipv6_addr_equal(saddr, ih6->saddr)) + return -EINVAL; + + if (!ipv6_addr_equal(daddr, ih6->daddr)) + return -EINVAL; + + if (ih6->nexthdr != IPPROTO_UDP) + return -EINVAL; + + uh = (struct udphdr *)(ih6 + 1); + } else { + return -EINVAL; + } + + if (uh + 1 > (struct udphdr *)data_end) + return -EINVAL; + + port = dut ? uh->dest : uh->source; + if (port != bpf_htons(DUT_ECHO_PORT)) + return -EINVAL; + + tlv = (struct tlv_hdr *)(uh + 1); + if (tlv + 1 > data_end) + return -EINVAL; + + return bpf_htons(tlv->type) == CMD_ECHO ? 0 : -EINVAL; +} + +static __always_inline int +xdp_update_stats(struct xdp_md *xdp, bool tx, bool dut) +{ + __u32 *val, key = 0; + + if (xdp_process_echo_packet(xdp, tx)) + return -EINVAL; + + if (dut) + val = bpf_map_lookup_elem(&dut_stats, &key); + else + val = bpf_map_lookup_elem(&stats, &key); + + if (val) + __sync_add_and_fetch(val, 1); + + return 0; +} + +/* Tester */ + +SEC("xdp") +int xdp_tester_check_tx(struct xdp_md *xdp) +{ + xdp_update_stats(xdp, true, false); + + return XDP_PASS; +} + +SEC("xdp") +int xdp_tester_check_rx(struct xdp_md *xdp) +{ + xdp_update_stats(xdp, false, false); + + return XDP_PASS; +} + +/* DUT */ + +SEC("xdp") +int xdp_do_pass(struct xdp_md *xdp) +{ + xdp_update_stats(xdp, true, true); + + return XDP_PASS; +} + +SEC("xdp") +int xdp_do_drop(struct xdp_md *xdp) +{ + if (xdp_update_stats(xdp, true, true)) + return XDP_PASS; + + return XDP_DROP; +} + +SEC("xdp") +int xdp_do_aborted(struct xdp_md *xdp) +{ + if (xdp_process_echo_packet(xdp, true)) + return XDP_PASS; + + return XDP_ABORTED; +} + +SEC("xdp") +int xdp_do_tx(struct xdp_md *xdp) +{ + void *data = (void *)(long)xdp->data; + struct ethhdr *eh = data; + __u8 tmp_mac[ETH_ALEN]; + + if (xdp_update_stats(xdp, true, true)) + return XDP_PASS; + + __builtin_memcpy(tmp_mac, eh->h_source, ETH_ALEN); + __builtin_memcpy(eh->h_source, eh->h_dest, ETH_ALEN); + __builtin_memcpy(eh->h_dest, tmp_mac, ETH_ALEN); + + return XDP_TX; +} + +SEC("xdp") +int xdp_do_redirect(struct xdp_md *xdp) +{ + if (xdp_process_echo_packet(xdp, true)) + return XDP_PASS; + + return bpf_redirect_map(&cpu_map, 0, 0); +} + +SEC("tp_btf/xdp_exception") +int BPF_PROG(xdp_exception, const struct net_device *dev, + const struct bpf_prog *xdp, __u32 act) +{ + __u32 *val, key = 0; + + val = bpf_map_lookup_elem(&dut_stats, &key); + if (val) + __sync_add_and_fetch(val, 1); + + return 0; +} + +SEC("tp_btf/xdp_cpumap_kthread") +int BPF_PROG(tp_xdp_cpumap_kthread, int map_id, unsigned int processed, + unsigned int drops, int sched, struct xdp_cpumap_stats *xdp_stats) +{ + __u32 *val, key = 0; + + val = bpf_map_lookup_elem(&dut_stats, &key); + if (val) + __sync_add_and_fetch(val, 1); + + return 0; +} + +SEC("xdp/cpumap") +int xdp_do_redirect_cpumap(struct xdp_md *xdp) +{ + void *data = (void *)(long)xdp->data; + struct ethhdr *eh = data; + __u8 tmp_mac[ETH_ALEN]; + + if (xdp_process_echo_packet(xdp, true)) + return XDP_PASS; + + __builtin_memcpy(tmp_mac, eh->h_source, ETH_ALEN); + __builtin_memcpy(eh->h_source, eh->h_dest, ETH_ALEN); + __builtin_memcpy(eh->h_dest, tmp_mac, ETH_ALEN); + + return bpf_redirect_map(&dev_map, 0, 0); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c new file mode 100644 index 000000000000..4c55b4d79d3d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c @@ -0,0 +1,85 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <vmlinux.h> +#include "xdp_metadata.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_endian.h> + +struct { + __uint(type, BPF_MAP_TYPE_XSKMAP); + __uint(max_entries, 256); + __type(key, __u32); + __type(value, __u32); +} xsk SEC(".maps"); + +extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, + __u64 *timestamp) __ksym; +extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, + __u32 *hash) __ksym; + +SEC("xdp") +int rx(struct xdp_md *ctx) +{ + void *data, *data_meta, *data_end; + struct ipv6hdr *ip6h = NULL; + struct ethhdr *eth = NULL; + struct udphdr *udp = NULL; + struct iphdr *iph = NULL; + struct xdp_meta *meta; + int ret; + + data = (void *)(long)ctx->data; + data_end = (void *)(long)ctx->data_end; + eth = data; + if (eth + 1 < data_end) { + if (eth->h_proto == bpf_htons(ETH_P_IP)) { + iph = (void *)(eth + 1); + if (iph + 1 < data_end && iph->protocol == IPPROTO_UDP) + udp = (void *)(iph + 1); + } + if (eth->h_proto == bpf_htons(ETH_P_IPV6)) { + ip6h = (void *)(eth + 1); + if (ip6h + 1 < data_end && ip6h->nexthdr == IPPROTO_UDP) + udp = (void *)(ip6h + 1); + } + if (udp && udp + 1 > data_end) + udp = NULL; + } + + if (!udp) + return XDP_PASS; + + if (udp->dest != bpf_htons(9091)) + return XDP_PASS; + + bpf_printk("forwarding UDP:9091 to AF_XDP"); + + ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta)); + if (ret != 0) { + bpf_printk("bpf_xdp_adjust_meta returned %d", ret); + return XDP_PASS; + } + + data = (void *)(long)ctx->data; + data_meta = (void *)(long)ctx->data_meta; + meta = data_meta; + + if (meta + 1 > data) { + bpf_printk("bpf_xdp_adjust_meta doesn't appear to work"); + return XDP_PASS; + } + + if (!bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp)) + bpf_printk("populated rx_timestamp with %llu", meta->rx_timestamp); + else + meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */ + + if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash)) + bpf_printk("populated rx_hash with %u", meta->rx_hash); + else + meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */ + + return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c new file mode 100644 index 000000000000..77678b034389 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <vmlinux.h> +#include "xdp_metadata.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_endian.h> + +struct { + __uint(type, BPF_MAP_TYPE_XSKMAP); + __uint(max_entries, 4); + __type(key, __u32); + __type(value, __u32); +} xsk SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u32); +} prog_arr SEC(".maps"); + +extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, + __u64 *timestamp) __ksym; +extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, + __u32 *hash) __ksym; + +SEC("xdp") +int rx(struct xdp_md *ctx) +{ + void *data, *data_meta; + struct xdp_meta *meta; + u64 timestamp = -1; + int ret; + + /* Reserve enough for all custom metadata. */ + + ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta)); + if (ret != 0) + return XDP_DROP; + + data = (void *)(long)ctx->data; + data_meta = (void *)(long)ctx->data_meta; + + if (data_meta + sizeof(struct xdp_meta) > data) + return XDP_DROP; + + meta = data_meta; + + /* Export metadata. */ + + /* We expect veth bpf_xdp_metadata_rx_timestamp to return 0 HW + * timestamp, so put some non-zero value into AF_XDP frame for + * the userspace. + */ + bpf_xdp_metadata_rx_timestamp(ctx, ×tamp); + if (timestamp == 0) + meta->rx_timestamp = 1; + + bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash); + + return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata2.c b/tools/testing/selftests/bpf/progs/xdp_metadata2.c new file mode 100644 index 000000000000..cf69d05451c3 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_metadata2.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <vmlinux.h> +#include "xdp_metadata.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_endian.h> + +extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, + __u32 *hash) __ksym; + +int called; + +SEC("freplace/rx") +int freplace_rx(struct xdp_md *ctx) +{ + u32 hash = 0; + /* Call _any_ metadata function to make sure we don't crash. */ + bpf_xdp_metadata_rx_hash(ctx, &hash); + called++; + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c index 736686e903f6..07d786329105 100644 --- a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c +++ b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c @@ -310,7 +310,7 @@ static __always_inline void values_get_tcpipopts(__u16 *mss, __u8 *wscale, static __always_inline void values_inc_synacks(void) { __u32 key = 1; - __u32 *value; + __u64 *value; value = bpf_map_lookup_elem(&values, &key); if (value) diff --git a/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c b/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c new file mode 100644 index 000000000000..744a01d0e57d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2022 Intel */ + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct { + __uint(type, BPF_MAP_TYPE_XSKMAP); + __uint(max_entries, 1); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); +} xsk SEC(".maps"); + +static unsigned int idx; + +SEC("xdp") int xsk_def_prog(struct xdp_md *xdp) +{ + return bpf_redirect_map(&xsk, 0, XDP_DROP); +} + +SEC("xdp") int xsk_xdp_drop(struct xdp_md *xdp) +{ + /* Drop every other packet */ + if (idx++ % 2) + return XDP_DROP; + + return bpf_redirect_map(&xsk, 0, XDP_DROP); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_cpp.cpp b/tools/testing/selftests/bpf/test_cpp.cpp index 0bd9990e83fa..f4936834f76f 100644 --- a/tools/testing/selftests/bpf/test_cpp.cpp +++ b/tools/testing/selftests/bpf/test_cpp.cpp @@ -91,7 +91,7 @@ static void try_skeleton_template() skel.detach(); - /* destructor will destory underlying skeleton */ + /* destructor will destroy underlying skeleton */ } int main(int argc, char *argv[]) diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c index b73152822aa2..7fc00e423e4d 100644 --- a/tools/testing/selftests/bpf/test_maps.c +++ b/tools/testing/selftests/bpf/test_maps.c @@ -1275,7 +1275,7 @@ static void test_map_in_map(void) goto out_map_in_map; } - err = bpf_obj_get_info_by_fd(fd, &info, &len); + err = bpf_map_get_info_by_fd(fd, &info, &len); if (err) { printf("Failed to get map info by fd %d: %d", fd, errno); diff --git a/tools/testing/selftests/bpf/test_offload.py b/tools/testing/selftests/bpf/test_offload.py index 7cb1bc05e5cf..40cba8d368d9 100755 --- a/tools/testing/selftests/bpf/test_offload.py +++ b/tools/testing/selftests/bpf/test_offload.py @@ -1039,7 +1039,7 @@ try: offload = bpf_pinned("/sys/fs/bpf/offload") ret, _, err = sim.set_xdp(offload, "drv", fail=False, include_stderr=True) fail(ret == 0, "attached offloaded XDP program to drv") - check_extack(err, "Using device-bound program without HW_MODE flag is not supported.", args) + check_extack(err, "Using offloaded program without HW_MODE flag is not supported.", args) rm("/sys/fs/bpf/offload") sim.wait_for_flush() @@ -1088,12 +1088,12 @@ try: ret, _, err = sim.set_xdp(pinned, "offload", fail=False, include_stderr=True) fail(ret == 0, "Pinned program loaded for a different device accepted") - check_extack_nsim(err, "program bound to different dev.", args) + check_extack(err, "Program bound to different device.", args) simdev2.remove() ret, _, err = sim.set_xdp(pinned, "offload", fail=False, include_stderr=True) fail(ret == 0, "Pinned program loaded for a removed device accepted") - check_extack_nsim(err, "xdpoffload of non-bound program.", args) + check_extack(err, "Program bound to different device.", args) rm(pin_file) bpftool_prog_list_wait(expected=0) @@ -1334,12 +1334,12 @@ try: ret, _, err = simA.set_xdp(progB, "offload", force=True, JSON=False, fail=False, include_stderr=True) fail(ret == 0, "cross-ASIC program allowed") - check_extack_nsim(err, "program bound to different dev.", args) + check_extack(err, "Program bound to different device.", args) for d in simdevB.nsims: ret, _, err = d.set_xdp(progA, "offload", force=True, JSON=False, fail=False, include_stderr=True) fail(ret == 0, "cross-ASIC program allowed") - check_extack_nsim(err, "program bound to different dev.", args) + check_extack(err, "Program bound to different device.", args) start_test("Test multi-dev ASIC cross-dev map reuse...") diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c index 4716e38e153a..6d5e3022c75f 100644 --- a/tools/testing/selftests/bpf/test_progs.c +++ b/tools/testing/selftests/bpf/test_progs.c @@ -17,6 +17,7 @@ #include <sys/select.h> #include <sys/socket.h> #include <sys/un.h> +#include <bpf/btf.h> static bool verbose(void) { @@ -967,6 +968,43 @@ int write_sysctl(const char *sysctl, const char *value) return 0; } +int get_bpf_max_tramp_links_from(struct btf *btf) +{ + const struct btf_enum *e; + const struct btf_type *t; + __u32 i, type_cnt; + const char *name; + __u16 j, vlen; + + for (i = 1, type_cnt = btf__type_cnt(btf); i < type_cnt; i++) { + t = btf__type_by_id(btf, i); + if (!t || !btf_is_enum(t) || t->name_off) + continue; + e = btf_enum(t); + for (j = 0, vlen = btf_vlen(t); j < vlen; j++, e++) { + name = btf__str_by_offset(btf, e->name_off); + if (name && !strcmp(name, "BPF_MAX_TRAMP_LINKS")) + return e->val; + } + } + + return -1; +} + +int get_bpf_max_tramp_links(void) +{ + struct btf *vmlinux_btf; + int ret; + + vmlinux_btf = btf__load_vmlinux_btf(); + if (!ASSERT_OK_PTR(vmlinux_btf, "vmlinux btf")) + return -1; + ret = get_bpf_max_tramp_links_from(vmlinux_btf); + btf__free(vmlinux_btf); + + return ret; +} + #define MAX_BACKTRACE_SZ 128 void crash_handler(int signum) { @@ -975,12 +1013,12 @@ void crash_handler(int signum) sz = backtrace(bt, ARRAY_SIZE(bt)); + if (env.stdout) + stdio_restore(); if (env.test) { env.test_state->error_cnt++; dump_test_log(env.test, env.test_state, true, false); } - if (env.stdout) - stdio_restore(); if (env.worker_id != -1) fprintf(stderr, "[%d]: ", env.worker_id); fprintf(stderr, "Caught signal #%d!\nStack trace:\n", signum); diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index 3f058dfadbaf..d5d51ec97ec8 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -394,6 +394,8 @@ int kern_sync_rcu(void); int trigger_module_test_read(int read_sz); int trigger_module_test_write(int write_sz); int write_sysctl(const char *sysctl, const char *value); +int get_bpf_max_tramp_links_from(struct btf *btf); +int get_bpf_max_tramp_links(void); #ifdef __x86_64__ #define SYS_NANOSLEEP_KPROBE_NAME "__x64_sys_nanosleep" diff --git a/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c b/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c index 3256de30f563..ed518d075d1d 100644 --- a/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c +++ b/tools/testing/selftests/bpf/test_skb_cgroup_id_user.c @@ -93,7 +93,7 @@ int get_map_fd_by_prog_id(int prog_id) info.nr_map_ids = 1; info.map_ids = (__u64) (unsigned long) map_ids; - if (bpf_obj_get_info_by_fd(prog_fd, &info, &info_len)) { + if (bpf_prog_get_info_by_fd(prog_fd, &info, &info_len)) { log_err("Failed to get info by prog fd %d", prog_fd); goto err; } diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh index 334bdfeab940..910044f08908 100755 --- a/tools/testing/selftests/bpf/test_tc_tunnel.sh +++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh @@ -100,6 +100,9 @@ if [[ "$#" -eq "0" ]]; then echo "ipip" $0 ipv4 ipip none 100 + echo "ipip6" + $0 ipv4 ipip6 none 100 + echo "ip6ip6" $0 ipv6 ip6tnl none 100 @@ -224,6 +227,9 @@ elif [[ "$tuntype" =~ "gre" && "$mac" == "eth" ]]; then elif [[ "$tuntype" =~ "vxlan" && "$mac" == "eth" ]]; then ttype="vxlan" targs="id 1 dstport 8472 udp6zerocsumrx" +elif [[ "$tuntype" == "ipip6" ]]; then + ttype="ip6tnl" + targs="" else ttype=$tuntype targs="" @@ -233,6 +239,9 @@ fi if [[ "${tuntype}" == "sit" ]]; then link_addr1="${ns1_v4}" link_addr2="${ns2_v4}" +elif [[ "${tuntype}" == "ipip6" ]]; then + link_addr1="${ns1_v6}" + link_addr2="${ns2_v6}" else link_addr1="${addr1}" link_addr2="${addr2}" @@ -287,12 +296,6 @@ else server_listen fi -# bpf_skb_net_shrink does not take tunnel flags yet, cannot update L3. -if [[ "${tuntype}" == "sit" ]]; then - echo OK - exit 0 -fi - # serverside, use BPF for decap ip netns exec "${ns2}" ip link del dev testtun0 ip netns exec "${ns2}" tc qdisc add dev veth2 clsact diff --git a/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c b/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c index 5c8ef062f760..32df93747095 100644 --- a/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c +++ b/tools/testing/selftests/bpf/test_tcp_check_syncookie_user.c @@ -96,7 +96,7 @@ static int get_map_fd_by_prog_id(int prog_id, bool *xdp) info.nr_map_ids = 1; info.map_ids = (__u64)(unsigned long)map_ids; - if (bpf_obj_get_info_by_fd(prog_fd, &info, &info_len)) { + if (bpf_prog_get_info_by_fd(prog_fd, &info, &info_len)) { log_err("Failed to get info by prog fd %d", prog_fd); goto err; } diff --git a/tools/testing/selftests/bpf/test_tunnel.sh b/tools/testing/selftests/bpf/test_tunnel.sh index 2eaedc1d9ed3..06857b689c11 100755 --- a/tools/testing/selftests/bpf/test_tunnel.sh +++ b/tools/testing/selftests/bpf/test_tunnel.sh @@ -66,15 +66,20 @@ config_device() add_gre_tunnel() { + tun_key= + if [ -n "$1" ]; then + tun_key="key $1" + fi + # at_ns0 namespace ip netns exec at_ns0 \ - ip link add dev $DEV_NS type $TYPE seq key 2 \ + ip link add dev $DEV_NS type $TYPE seq $tun_key \ local 172.16.1.100 remote 172.16.1.200 ip netns exec at_ns0 ip link set dev $DEV_NS up ip netns exec at_ns0 ip addr add dev $DEV_NS 10.1.1.100/24 # root namespace - ip link add dev $DEV type $TYPE key 2 external + ip link add dev $DEV type $TYPE $tun_key external ip link set dev $DEV up ip addr add dev $DEV 10.1.1.200/24 } @@ -238,7 +243,7 @@ test_gre() check $TYPE config_device - add_gre_tunnel + add_gre_tunnel 2 attach_bpf $DEV gre_set_tunnel gre_get_tunnel ping $PING_ARG 10.1.1.100 check_err $? @@ -253,6 +258,30 @@ test_gre() echo -e ${GREEN}"PASS: $TYPE"${NC} } +test_gre_no_tunnel_key() +{ + TYPE=gre + DEV_NS=gre00 + DEV=gre11 + ret=0 + + check $TYPE + config_device + add_gre_tunnel + attach_bpf $DEV gre_set_tunnel_no_key gre_get_tunnel + ping $PING_ARG 10.1.1.100 + check_err $? + ip netns exec at_ns0 ping $PING_ARG 10.1.1.200 + check_err $? + cleanup + + if [ $ret -ne 0 ]; then + echo -e ${RED}"FAIL: $TYPE"${NC} + return 1 + fi + echo -e ${GREEN}"PASS: $TYPE"${NC} +} + test_ip6gre() { TYPE=ip6gre @@ -589,6 +618,7 @@ cleanup() ip link del ipip6tnl11 2> /dev/null ip link del ip6ip6tnl11 2> /dev/null ip link del gretap11 2> /dev/null + ip link del gre11 2> /dev/null ip link del ip6gre11 2> /dev/null ip link del ip6gretap11 2> /dev/null ip link del geneve11 2> /dev/null @@ -641,6 +671,10 @@ bpf_tunnel_test() test_gre errors=$(( $errors + $? )) + echo "Testing GRE tunnel (without tunnel keys)..." + test_gre_no_tunnel_key + errors=$(( $errors + $? )) + echo "Testing IP6GRE tunnel..." test_ip6gre errors=$(( $errors + $? )) diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 8c808551dfd7..8b9949bb833d 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -209,7 +209,7 @@ loop: insn[i++] = BPF_MOV64_IMM(BPF_REG_2, 1); insn[i++] = BPF_MOV64_IMM(BPF_REG_3, 2); insn[i++] = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, - BPF_FUNC_skb_vlan_push), + BPF_FUNC_skb_vlan_push); insn[i] = BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, len - i - 3); i++; } @@ -220,7 +220,7 @@ loop: i++; insn[i++] = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6); insn[i++] = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, - BPF_FUNC_skb_vlan_pop), + BPF_FUNC_skb_vlan_pop); insn[i] = BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, len - i - 3); i++; } @@ -1239,8 +1239,8 @@ static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt) __u32 xlated_prog_len; __u32 buf_element_size = sizeof(struct bpf_insn); - if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) { - perror("bpf_obj_get_info_by_fd failed"); + if (bpf_prog_get_info_by_fd(fd_prog, &info, &info_len)) { + perror("bpf_prog_get_info_by_fd failed"); return -1; } @@ -1261,8 +1261,8 @@ static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt) bzero(&info, sizeof(info)); info.xlated_prog_len = xlated_prog_len; info.xlated_prog_insns = (__u64)(unsigned long)*buf; - if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) { - perror("second bpf_obj_get_info_by_fd failed"); + if (bpf_prog_get_info_by_fd(fd_prog, &info, &info_len)) { + perror("second bpf_prog_get_info_by_fd failed"); goto out_free_buf; } diff --git a/tools/testing/selftests/bpf/test_xdp_features.sh b/tools/testing/selftests/bpf/test_xdp_features.sh new file mode 100755 index 000000000000..0aa71c4455c0 --- /dev/null +++ b/tools/testing/selftests/bpf/test_xdp_features.sh @@ -0,0 +1,107 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +readonly NS="ns1-$(mktemp -u XXXXXX)" +readonly V0_IP4=10.10.0.11 +readonly V1_IP4=10.10.0.1 +readonly V0_IP6=2001:db8::11 +readonly V1_IP6=2001:db8::1 + +ret=1 + +setup() { + { + ip netns add ${NS} + + ip link add v1 type veth peer name v0 netns ${NS} + + ip link set v1 up + ip addr add $V1_IP4/24 dev v1 + ip addr add $V1_IP6/64 nodad dev v1 + ip -n ${NS} link set dev v0 up + ip -n ${NS} addr add $V0_IP4/24 dev v0 + ip -n ${NS} addr add $V0_IP6/64 nodad dev v0 + + # Enable XDP mode and disable checksum offload + ethtool -K v1 gro on + ethtool -K v1 tx-checksumming off + ip netns exec ${NS} ethtool -K v0 gro on + ip netns exec ${NS} ethtool -K v0 tx-checksumming off + } > /dev/null 2>&1 +} + +cleanup() { + ip link del v1 2> /dev/null + ip netns del ${NS} 2> /dev/null + [ "$(pidof xdp_features)" = "" ] || kill $(pidof xdp_features) 2> /dev/null +} + +wait_for_dut_server() { + while sleep 1; do + ss -tlp | grep -q xdp_features + [ $? -eq 0 ] && break + done +} + +test_xdp_features() { + setup + + ## XDP_PASS + ./xdp_features -f XDP_PASS -D $V1_IP6 -T $V0_IP6 v1 & + wait_for_dut_server + ip netns exec ${NS} ./xdp_features -t -f XDP_PASS \ + -D $V1_IP6 -C $V1_IP6 \ + -T $V0_IP6 v0 + [ $? -ne 0 ] && exit + + ## XDP_DROP + ./xdp_features -f XDP_DROP -D ::ffff:$V1_IP4 -T ::ffff:$V0_IP4 v1 & + wait_for_dut_server + ip netns exec ${NS} ./xdp_features -t -f XDP_DROP \ + -D ::ffff:$V1_IP4 \ + -C ::ffff:$V1_IP4 \ + -T ::ffff:$V0_IP4 v0 + [ $? -ne 0 ] && exit + + ## XDP_ABORTED + ./xdp_features -f XDP_ABORTED -D $V1_IP6 -T $V0_IP6 v1 & + wait_for_dut_server + ip netns exec ${NS} ./xdp_features -t -f XDP_ABORTED \ + -D $V1_IP6 -C $V1_IP6 \ + -T $V0_IP6 v0 + [ $? -ne 0 ] && exit + + ## XDP_TX + ./xdp_features -f XDP_TX -D ::ffff:$V1_IP4 -T ::ffff:$V0_IP4 v1 & + wait_for_dut_server + ip netns exec ${NS} ./xdp_features -t -f XDP_TX \ + -D ::ffff:$V1_IP4 \ + -C ::ffff:$V1_IP4 \ + -T ::ffff:$V0_IP4 v0 + [ $? -ne 0 ] && exit + + ## XDP_REDIRECT + ./xdp_features -f XDP_REDIRECT -D $V1_IP6 -T $V0_IP6 v1 & + wait_for_dut_server + ip netns exec ${NS} ./xdp_features -t -f XDP_REDIRECT \ + -D $V1_IP6 -C $V1_IP6 \ + -T $V0_IP6 v0 + [ $? -ne 0 ] && exit + + ## XDP_NDO_XMIT + ./xdp_features -f XDP_NDO_XMIT -D ::ffff:$V1_IP4 -T ::ffff:$V0_IP4 v1 & + wait_for_dut_server + ip netns exec ${NS} ./xdp_features -t -f XDP_NDO_XMIT \ + -D ::ffff:$V1_IP4 \ + -C ::ffff:$V1_IP4 \ + -T ::ffff:$V0_IP4 v0 + ret=$? + cleanup +} + +set -e +trap cleanup 2 3 6 9 + +test_xdp_features + +exit $ret diff --git a/tools/testing/selftests/bpf/test_xsk.sh b/tools/testing/selftests/bpf/test_xsk.sh index d821fd098504..b077cf58f825 100755 --- a/tools/testing/selftests/bpf/test_xsk.sh +++ b/tools/testing/selftests/bpf/test_xsk.sh @@ -24,8 +24,6 @@ # ----------- | ---------- # | vethX | --------- | vethY | # ----------- peer ---------- -# | | | -# namespaceX | namespaceY # # AF_XDP is an address family optimized for high performance packet processing, # it is XDP’s user-space interface. @@ -39,10 +37,9 @@ # Prerequisites setup by script: # # Set up veth interfaces as per the topology shown ^^: -# * setup two veth interfaces and one namespace -# ** veth<xxxx> in root namespace -# ** veth<yyyy> in af_xdp<xxxx> namespace -# ** namespace af_xdp<xxxx> +# * setup two veth interfaces +# ** veth<xxxx> +# ** veth<yyyy> # *** xxxx and yyyy are randomly generated 4 digit numbers used to avoid # conflict with any existing interface # * tests the veth and xsk layers of the topology @@ -74,6 +71,9 @@ # Run and dump packet contents: # sudo ./test_xsk.sh -D # +# Set up veth interfaces and leave them up so xskxceiver can be launched in a debugger: +# sudo ./test_xsk.sh -d +# # Run test suite for physical device in loopback mode # sudo ./test_xsk.sh -i IFACE @@ -81,11 +81,12 @@ ETH="" -while getopts "vDi:" flag +while getopts "vDi:d" flag do case "${flag}" in v) verbose=1;; D) dump_pkts=1;; + d) debug=1;; i) ETH=${OPTARG};; esac done @@ -99,28 +100,25 @@ VETH0_POSTFIX=$(cat ${URANDOM} | tr -dc '0-9' | fold -w 256 | head -n 1 | head - VETH0=ve${VETH0_POSTFIX} VETH1_POSTFIX=$(cat ${URANDOM} | tr -dc '0-9' | fold -w 256 | head -n 1 | head --bytes 4) VETH1=ve${VETH1_POSTFIX} -NS0=root -NS1=af_xdp${VETH1_POSTFIX} MTU=1500 trap ctrl_c INT function ctrl_c() { - cleanup_exit ${VETH0} ${VETH1} ${NS1} + cleanup_exit ${VETH0} ${VETH1} exit 1 } setup_vethPairs() { if [[ $verbose -eq 1 ]]; then - echo "setting up ${VETH0}: namespace: ${NS0}" + echo "setting up ${VETH0}" fi - ip netns add ${NS1} ip link add ${VETH0} numtxqueues 4 numrxqueues 4 type veth peer name ${VETH1} numtxqueues 4 numrxqueues 4 if [ -f /proc/net/if_inet6 ]; then echo 1 > /proc/sys/net/ipv6/conf/${VETH0}/disable_ipv6 fi if [[ $verbose -eq 1 ]]; then - echo "setting up ${VETH1}: namespace: ${NS1}" + echo "setting up ${VETH1}" fi if [[ $busy_poll -eq 1 ]]; then @@ -130,18 +128,15 @@ setup_vethPairs() { echo 200000 > /sys/class/net/${VETH1}/gro_flush_timeout fi - ip link set ${VETH1} netns ${NS1} - ip netns exec ${NS1} ip link set ${VETH1} mtu ${MTU} + ip link set ${VETH1} mtu ${MTU} ip link set ${VETH0} mtu ${MTU} - ip netns exec ${NS1} ip link set ${VETH1} up - ip netns exec ${NS1} ip link set dev lo up + ip link set ${VETH1} up ip link set ${VETH0} up } if [ ! -z $ETH ]; then VETH0=${ETH} VETH1=${ETH} - NS1="" else validate_root_exec validate_veth_support ${VETH0} @@ -151,7 +146,7 @@ else retval=$? if [ $retval -ne 0 ]; then test_status $retval "${TEST_NAME}" - cleanup_exit ${VETH0} ${VETH1} ${NS1} + cleanup_exit ${VETH0} ${VETH1} exit $retval fi fi @@ -174,10 +169,15 @@ statusList=() TEST_NAME="XSK_SELFTESTS_${VETH0}_SOFTIRQ" +if [[ $debug -eq 1 ]]; then + echo "-i" ${VETH0} "-i" ${VETH1} + exit +fi + exec_xskxceiver if [ -z $ETH ]; then - cleanup_exit ${VETH0} ${VETH1} ${NS1} + cleanup_exit ${VETH0} ${VETH1} fi TEST_NAME="XSK_SELFTESTS_${VETH0}_BUSY_POLL" busy_poll=1 @@ -190,7 +190,7 @@ exec_xskxceiver ## END TESTS if [ -z $ETH ]; then - cleanup_exit ${VETH0} ${VETH1} ${NS1} + cleanup_exit ${VETH0} ${VETH1} fi failures=0 diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c index 9695318e8132..6c44153755e6 100644 --- a/tools/testing/selftests/bpf/testing_helpers.c +++ b/tools/testing/selftests/bpf/testing_helpers.c @@ -164,7 +164,7 @@ __u32 link_info_prog_id(const struct bpf_link *link, struct bpf_link_info *info) int err; memset(info, 0, sizeof(*info)); - err = bpf_obj_get_info_by_fd(bpf_link__fd(link), info, &info_len); + err = bpf_link_get_info_by_fd(bpf_link__fd(link), info, &info_len); if (err) { printf("failed to get link info: %d\n", -errno); return 0; diff --git a/tools/testing/selftests/bpf/verifier/bounds_mix_sign_unsign.c b/tools/testing/selftests/bpf/verifier/bounds_mix_sign_unsign.c index c2aa6f26738b..bf82b923c5fe 100644 --- a/tools/testing/selftests/bpf/verifier/bounds_mix_sign_unsign.c +++ b/tools/testing/selftests/bpf/verifier/bounds_mix_sign_unsign.c @@ -1,13 +1,14 @@ { "bounds checks mixing signed and unsigned, positive bounds", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, 2), BPF_JMP_REG(BPF_JGE, BPF_REG_2, BPF_REG_1, 3), @@ -17,20 +18,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 3), @@ -40,20 +42,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 2", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 5), @@ -65,20 +68,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 3", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 4), @@ -89,20 +93,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 4", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, 1), BPF_ALU64_REG(BPF_AND, BPF_REG_1, BPF_REG_2), @@ -112,19 +117,20 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .result = ACCEPT, }, { "bounds checks mixing signed and unsigned, variant 5", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 5), @@ -135,17 +141,20 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 6", .insns = { + BPF_MOV64_REG(BPF_REG_9, BPF_REG_1), + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), + BPF_MOV64_REG(BPF_REG_1, BPF_REG_9), BPF_MOV64_IMM(BPF_REG_2, 0), BPF_MOV64_REG(BPF_REG_3, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -512), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_6, -1), BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_6, 5), @@ -163,13 +172,14 @@ { "bounds checks mixing signed and unsigned, variant 7", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, 1024 * 1024 * 1024), BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, 3), @@ -179,19 +189,20 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .result = ACCEPT, }, { "bounds checks mixing signed and unsigned, variant 8", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_JMP_REG(BPF_JGT, BPF_REG_2, BPF_REG_1, 2), @@ -203,20 +214,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 9", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 10), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_LD_IMM64(BPF_REG_2, -9223372036854775808ULL), BPF_JMP_REG(BPF_JGT, BPF_REG_2, BPF_REG_1, 2), @@ -228,19 +240,20 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .result = ACCEPT, }, { "bounds checks mixing signed and unsigned, variant 10", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, 0), BPF_JMP_REG(BPF_JGT, BPF_REG_2, BPF_REG_1, 2), @@ -252,20 +265,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 11", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_JMP_REG(BPF_JGE, BPF_REG_2, BPF_REG_1, 2), @@ -278,20 +292,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 12", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -6), BPF_JMP_REG(BPF_JGE, BPF_REG_2, BPF_REG_1, 2), @@ -303,20 +318,21 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 13", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, 2), BPF_JMP_REG(BPF_JGE, BPF_REG_2, BPF_REG_1, 2), @@ -331,7 +347,7 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, @@ -340,13 +356,14 @@ .insns = { BPF_LDX_MEM(BPF_W, BPF_REG_9, BPF_REG_1, offsetof(struct __sk_buff, mark)), + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -1), BPF_MOV64_IMM(BPF_REG_8, 2), @@ -360,20 +377,21 @@ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_2, -3), BPF_JMP_IMM(BPF_JA, 0, 0, -7), }, - .fixup_map_hash_8b = { 4 }, + .fixup_map_hash_8b = { 6 }, .errstr = "unbounded min value", .result = REJECT, }, { "bounds checks mixing signed and unsigned, variant 15", .insns = { + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16), BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4), - BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, -8), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3), BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16), BPF_MOV64_IMM(BPF_REG_2, -6), BPF_JMP_REG(BPF_JGE, BPF_REG_2, BPF_REG_1, 2), @@ -387,7 +405,7 @@ BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .fixup_map_hash_8b = { 3 }, + .fixup_map_hash_8b = { 5 }, .errstr = "unbounded min value", .result = REJECT, }, diff --git a/tools/testing/selftests/bpf/verifier/bpf_st_mem.c b/tools/testing/selftests/bpf/verifier/bpf_st_mem.c new file mode 100644 index 000000000000..3af2501082b2 --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/bpf_st_mem.c @@ -0,0 +1,67 @@ +{ + "BPF_ST_MEM stack imm non-zero", + .insns = { + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 42), + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -8), + BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, -42), + /* if value is tracked correctly R0 is zero */ + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + /* Use prog type that requires return value in range [0, 1] */ + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, + .runs = -1, +}, +{ + "BPF_ST_MEM stack imm zero", + .insns = { + /* mark stack 0000 0000 */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), + /* read and sum a few bytes */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_10, -8), + BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), + BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_10, -4), + BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), + BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_10, -1), + BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1), + /* if value is tracked correctly R0 is zero */ + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + /* Use prog type that requires return value in range [0, 1] */ + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, + .runs = -1, +}, +{ + "BPF_ST_MEM stack imm zero, variable offset", + .insns = { + /* set fp[-16], fp[-24] to zeros */ + BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), + BPF_ST_MEM(BPF_DW, BPF_REG_10, -24, 0), + /* r0 = random value in range [-32, -15] */ + BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32), + BPF_JMP_IMM(BPF_JLE, BPF_REG_0, 16, 2), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 32), + /* fp[r0] = 0, make a variable offset write of zero, + * this should preserve zero marks on stack. + */ + BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_10), + BPF_ST_MEM(BPF_B, BPF_REG_0, 0, 0), + /* r0 = fp[-20], if variable offset write was tracked correctly + * r0 would be a known zero. + */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_10, -20), + /* Would fail return code verification if r0 range is not tracked correctly. */ + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + /* Use prog type that requires return value in range [0, 1] */ + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, + .runs = -1, +}, diff --git a/tools/testing/selftests/bpf/verifier/sleepable.c b/tools/testing/selftests/bpf/verifier/sleepable.c new file mode 100644 index 000000000000..1f0d2bdc673f --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/sleepable.c @@ -0,0 +1,91 @@ +{ + "sleepable fentry accept", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_TRACE_FENTRY, + .kfunc = "bpf_fentry_test1", + .result = ACCEPT, + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, +{ + "sleepable fexit accept", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_TRACE_FENTRY, + .kfunc = "bpf_fentry_test1", + .result = ACCEPT, + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, +{ + "sleepable fmod_ret accept", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_MODIFY_RETURN, + .kfunc = "bpf_fentry_test1", + .result = ACCEPT, + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, +{ + "sleepable iter accept", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_TRACE_ITER, + .kfunc = "task", + .result = ACCEPT, + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, +{ + "sleepable lsm accept", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_LSM, + .kfunc = "bpf", + .expected_attach_type = BPF_LSM_MAC, + .result = ACCEPT, + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, +{ + "sleepable uprobe accept", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_KPROBE, + .kfunc = "bpf_fentry_test1", + .result = ACCEPT, + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, +{ + "sleepable raw tracepoint reject", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_TRACING, + .expected_attach_type = BPF_TRACE_RAW_TP, + .kfunc = "sched_switch", + .result = REJECT, + .errstr = "Only fentry/fexit/fmod_ret, lsm, iter, uprobe, and struct_ops programs can be sleepable", + .flags = BPF_F_SLEEPABLE, + .runs = -1, +}, diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c index f961b49b8ef4..83231456d3c5 100644 --- a/tools/testing/selftests/bpf/veristat.c +++ b/tools/testing/selftests/bpf/veristat.c @@ -144,7 +144,7 @@ static struct env { struct verif_stats *prog_stats; int prog_stat_cnt; - /* baseline_stats is allocated and used only in comparsion mode */ + /* baseline_stats is allocated and used only in comparison mode */ struct verif_stats *baseline_stats; int baseline_stat_cnt; @@ -882,7 +882,7 @@ static int process_obj(const char *filename) * that BPF object file is incomplete and has to be statically * linked into a final BPF object file; instead of bailing * out, report it into stderr, mark it as skipped, and - * proceeed + * proceed */ fprintf(stderr, "Failed to open '%s': %d\n", filename, -errno); env.files_skipped++; diff --git a/tools/testing/selftests/bpf/vmtest.sh b/tools/testing/selftests/bpf/vmtest.sh index 316a56d680f2..685034528018 100755 --- a/tools/testing/selftests/bpf/vmtest.sh +++ b/tools/testing/selftests/bpf/vmtest.sh @@ -13,7 +13,7 @@ s390x) QEMU_BINARY=qemu-system-s390x QEMU_CONSOLE="ttyS1" QEMU_FLAGS=(-smp 2) - BZIMAGE="arch/s390/boot/compressed/vmlinux" + BZIMAGE="arch/s390/boot/vmlinux" ;; x86_64) QEMU_BINARY=qemu-system-x86_64 diff --git a/tools/testing/selftests/bpf/xdp_features.c b/tools/testing/selftests/bpf/xdp_features.c new file mode 100644 index 000000000000..fce12165213b --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_features.c @@ -0,0 +1,699 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <uapi/linux/bpf.h> +#include <uapi/linux/netdev.h> +#include <linux/if_link.h> +#include <signal.h> +#include <argp.h> +#include <net/if.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <netinet/tcp.h> +#include <unistd.h> +#include <arpa/inet.h> +#include <bpf/bpf.h> +#include <bpf/libbpf.h> +#include <pthread.h> + +#include <network_helpers.h> + +#include "xdp_features.skel.h" +#include "xdp_features.h" + +#define RED(str) "\033[0;31m" str "\033[0m" +#define GREEN(str) "\033[0;32m" str "\033[0m" +#define YELLOW(str) "\033[0;33m" str "\033[0m" + +static struct env { + bool verbosity; + int ifindex; + bool is_tester; + struct { + enum netdev_xdp_act drv_feature; + enum xdp_action action; + } feature; + struct sockaddr_storage dut_ctrl_addr; + struct sockaddr_storage dut_addr; + struct sockaddr_storage tester_addr; +} env; + +#define BUFSIZE 128 + +void test__fail(void) { /* for network_helpers.c */ } + +static int libbpf_print_fn(enum libbpf_print_level level, + const char *format, va_list args) +{ + if (level == LIBBPF_DEBUG && !env.verbosity) + return 0; + return vfprintf(stderr, format, args); +} + +static volatile bool exiting; + +static void sig_handler(int sig) +{ + exiting = true; +} + +const char *argp_program_version = "xdp-features 0.0"; +const char argp_program_doc[] = +"XDP features detection application.\n" +"\n" +"XDP features application checks the XDP advertised features match detected ones.\n" +"\n" +"USAGE: ./xdp-features [-vt] [-f <xdp-feature>] [-D <dut-data-ip>] [-T <tester-data-ip>] [-C <dut-ctrl-ip>] <iface-name>\n" +"\n" +"dut-data-ip, tester-data-ip, dut-ctrl-ip: IPv6 or IPv4-mapped-IPv6 addresses;\n" +"\n" +"XDP features\n:" +"- XDP_PASS\n" +"- XDP_DROP\n" +"- XDP_ABORTED\n" +"- XDP_REDIRECT\n" +"- XDP_NDO_XMIT\n" +"- XDP_TX\n"; + +static const struct argp_option opts[] = { + { "verbose", 'v', NULL, 0, "Verbose debug output" }, + { "tester", 't', NULL, 0, "Tester mode" }, + { "feature", 'f', "XDP-FEATURE", 0, "XDP feature to test" }, + { "dut_data_ip", 'D', "DUT-DATA-IP", 0, "DUT IP data channel" }, + { "dut_ctrl_ip", 'C', "DUT-CTRL-IP", 0, "DUT IP control channel" }, + { "tester_data_ip", 'T', "TESTER-DATA-IP", 0, "Tester IP data channel" }, + {}, +}; + +static int get_xdp_feature(const char *arg) +{ + if (!strcmp(arg, "XDP_PASS")) { + env.feature.action = XDP_PASS; + env.feature.drv_feature = NETDEV_XDP_ACT_BASIC; + } else if (!strcmp(arg, "XDP_DROP")) { + env.feature.drv_feature = NETDEV_XDP_ACT_BASIC; + env.feature.action = XDP_DROP; + } else if (!strcmp(arg, "XDP_ABORTED")) { + env.feature.drv_feature = NETDEV_XDP_ACT_BASIC; + env.feature.action = XDP_ABORTED; + } else if (!strcmp(arg, "XDP_TX")) { + env.feature.drv_feature = NETDEV_XDP_ACT_BASIC; + env.feature.action = XDP_TX; + } else if (!strcmp(arg, "XDP_REDIRECT")) { + env.feature.drv_feature = NETDEV_XDP_ACT_REDIRECT; + env.feature.action = XDP_REDIRECT; + } else if (!strcmp(arg, "XDP_NDO_XMIT")) { + env.feature.drv_feature = NETDEV_XDP_ACT_NDO_XMIT; + } else { + return -EINVAL; + } + + return 0; +} + +static char *get_xdp_feature_str(void) +{ + switch (env.feature.action) { + case XDP_PASS: + return YELLOW("XDP_PASS"); + case XDP_DROP: + return YELLOW("XDP_DROP"); + case XDP_ABORTED: + return YELLOW("XDP_ABORTED"); + case XDP_TX: + return YELLOW("XDP_TX"); + case XDP_REDIRECT: + return YELLOW("XDP_REDIRECT"); + default: + break; + } + + if (env.feature.drv_feature == NETDEV_XDP_ACT_NDO_XMIT) + return YELLOW("XDP_NDO_XMIT"); + + return ""; +} + +static error_t parse_arg(int key, char *arg, struct argp_state *state) +{ + switch (key) { + case 'v': + env.verbosity = true; + break; + case 't': + env.is_tester = true; + break; + case 'f': + if (get_xdp_feature(arg) < 0) { + fprintf(stderr, "Invalid xdp feature: %s\n", arg); + argp_usage(state); + return ARGP_ERR_UNKNOWN; + } + break; + case 'D': + if (make_sockaddr(AF_INET6, arg, DUT_ECHO_PORT, + &env.dut_addr, NULL)) { + fprintf(stderr, "Invalid DUT address: %s\n", arg); + return ARGP_ERR_UNKNOWN; + } + break; + case 'C': + if (make_sockaddr(AF_INET6, arg, DUT_CTRL_PORT, + &env.dut_ctrl_addr, NULL)) { + fprintf(stderr, "Invalid DUT CTRL address: %s\n", arg); + return ARGP_ERR_UNKNOWN; + } + break; + case 'T': + if (make_sockaddr(AF_INET6, arg, 0, &env.tester_addr, NULL)) { + fprintf(stderr, "Invalid Tester address: %s\n", arg); + return ARGP_ERR_UNKNOWN; + } + break; + case ARGP_KEY_ARG: + errno = 0; + if (strlen(arg) >= IF_NAMESIZE) { + fprintf(stderr, "Invalid device name: %s\n", arg); + argp_usage(state); + return ARGP_ERR_UNKNOWN; + } + + env.ifindex = if_nametoindex(arg); + if (!env.ifindex) + env.ifindex = strtoul(arg, NULL, 0); + if (!env.ifindex) { + fprintf(stderr, + "Bad interface index or name (%d): %s\n", + errno, strerror(errno)); + argp_usage(state); + return ARGP_ERR_UNKNOWN; + } + break; + default: + return ARGP_ERR_UNKNOWN; + } + + return 0; +} + +static const struct argp argp = { + .options = opts, + .parser = parse_arg, + .doc = argp_program_doc, +}; + +static void set_env_default(void) +{ + env.feature.drv_feature = NETDEV_XDP_ACT_NDO_XMIT; + env.feature.action = -EINVAL; + env.ifindex = -ENODEV; + make_sockaddr(AF_INET6, "::ffff:127.0.0.1", DUT_CTRL_PORT, + &env.dut_ctrl_addr, NULL); + make_sockaddr(AF_INET6, "::ffff:127.0.0.1", DUT_ECHO_PORT, + &env.dut_addr, NULL); + make_sockaddr(AF_INET6, "::ffff:127.0.0.1", 0, &env.tester_addr, NULL); +} + +static void *dut_echo_thread(void *arg) +{ + unsigned char buf[sizeof(struct tlv_hdr)]; + int sockfd = *(int *)arg; + + while (!exiting) { + struct tlv_hdr *tlv = (struct tlv_hdr *)buf; + struct sockaddr_storage addr; + socklen_t addrlen; + size_t n; + + n = recvfrom(sockfd, buf, sizeof(buf), MSG_WAITALL, + (struct sockaddr *)&addr, &addrlen); + if (n != ntohs(tlv->len)) + continue; + + if (ntohs(tlv->type) != CMD_ECHO) + continue; + + sendto(sockfd, buf, sizeof(buf), MSG_NOSIGNAL | MSG_CONFIRM, + (struct sockaddr *)&addr, addrlen); + } + + pthread_exit((void *)0); + close(sockfd); + + return NULL; +} + +static int dut_run_echo_thread(pthread_t *t, int *sockfd) +{ + int err; + + sockfd = start_reuseport_server(AF_INET6, SOCK_DGRAM, NULL, + DUT_ECHO_PORT, 0, 1); + if (!sockfd) { + fprintf(stderr, "Failed to create echo socket\n"); + return -errno; + } + + /* start echo channel */ + err = pthread_create(t, NULL, dut_echo_thread, sockfd); + if (err) { + fprintf(stderr, "Failed creating dut_echo thread: %s\n", + strerror(-err)); + free_fds(sockfd, 1); + return -EINVAL; + } + + return 0; +} + +static int dut_attach_xdp_prog(struct xdp_features *skel, int flags) +{ + enum xdp_action action = env.feature.action; + struct bpf_program *prog; + unsigned int key = 0; + int err, fd = 0; + + if (env.feature.drv_feature == NETDEV_XDP_ACT_NDO_XMIT) { + struct bpf_devmap_val entry = { + .ifindex = env.ifindex, + }; + + err = bpf_map__update_elem(skel->maps.dev_map, + &key, sizeof(key), + &entry, sizeof(entry), 0); + if (err < 0) + return err; + + fd = bpf_program__fd(skel->progs.xdp_do_redirect_cpumap); + action = XDP_REDIRECT; + } + + switch (action) { + case XDP_TX: + prog = skel->progs.xdp_do_tx; + break; + case XDP_DROP: + prog = skel->progs.xdp_do_drop; + break; + case XDP_ABORTED: + prog = skel->progs.xdp_do_aborted; + break; + case XDP_PASS: + prog = skel->progs.xdp_do_pass; + break; + case XDP_REDIRECT: { + struct bpf_cpumap_val entry = { + .qsize = 2048, + .bpf_prog.fd = fd, + }; + + err = bpf_map__update_elem(skel->maps.cpu_map, + &key, sizeof(key), + &entry, sizeof(entry), 0); + if (err < 0) + return err; + + prog = skel->progs.xdp_do_redirect; + break; + } + default: + return -EINVAL; + } + + err = bpf_xdp_attach(env.ifindex, bpf_program__fd(prog), flags, NULL); + if (err) + fprintf(stderr, + "Failed to attach XDP program to ifindex %d\n", + env.ifindex); + return err; +} + +static int recv_msg(int sockfd, void *buf, size_t bufsize, void *val, + size_t val_size) +{ + struct tlv_hdr *tlv = (struct tlv_hdr *)buf; + size_t len; + + len = recv(sockfd, buf, bufsize, 0); + if (len != ntohs(tlv->len) || len < sizeof(*tlv)) + return -EINVAL; + + if (val) { + len -= sizeof(*tlv); + if (len > val_size) + return -ENOMEM; + + memcpy(val, tlv->data, len); + } + + return 0; +} + +static int dut_run(struct xdp_features *skel) +{ + int flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE; + int state, err, *sockfd, ctrl_sockfd, echo_sockfd; + struct sockaddr_storage ctrl_addr; + pthread_t dut_thread; + socklen_t addrlen; + + sockfd = start_reuseport_server(AF_INET6, SOCK_STREAM, NULL, + DUT_CTRL_PORT, 0, 1); + if (!sockfd) { + fprintf(stderr, "Failed to create DUT socket\n"); + return -errno; + } + + ctrl_sockfd = accept(*sockfd, (struct sockaddr *)&ctrl_addr, &addrlen); + if (ctrl_sockfd < 0) { + fprintf(stderr, "Failed to accept connection on DUT socket\n"); + free_fds(sockfd, 1); + return -errno; + } + + /* CTRL loop */ + while (!exiting) { + unsigned char buf[BUFSIZE] = {}; + struct tlv_hdr *tlv = (struct tlv_hdr *)buf; + + err = recv_msg(ctrl_sockfd, buf, BUFSIZE, NULL, 0); + if (err) + continue; + + switch (ntohs(tlv->type)) { + case CMD_START: { + if (state == CMD_START) + continue; + + state = CMD_START; + /* Load the XDP program on the DUT */ + err = dut_attach_xdp_prog(skel, flags); + if (err) + goto out; + + err = dut_run_echo_thread(&dut_thread, &echo_sockfd); + if (err < 0) + goto out; + + tlv->type = htons(CMD_ACK); + tlv->len = htons(sizeof(*tlv)); + err = send(ctrl_sockfd, buf, sizeof(*tlv), 0); + if (err < 0) + goto end_thread; + break; + } + case CMD_STOP: + if (state != CMD_START) + break; + + state = CMD_STOP; + + exiting = true; + bpf_xdp_detach(env.ifindex, flags, NULL); + + tlv->type = htons(CMD_ACK); + tlv->len = htons(sizeof(*tlv)); + err = send(ctrl_sockfd, buf, sizeof(*tlv), 0); + goto end_thread; + case CMD_GET_XDP_CAP: { + LIBBPF_OPTS(bpf_xdp_query_opts, opts); + unsigned long long val; + size_t n; + + err = bpf_xdp_query(env.ifindex, XDP_FLAGS_DRV_MODE, + &opts); + if (err) { + fprintf(stderr, + "Failed to query XDP cap for ifindex %d\n", + env.ifindex); + goto end_thread; + } + + tlv->type = htons(CMD_ACK); + n = sizeof(*tlv) + sizeof(opts.feature_flags); + tlv->len = htons(n); + + val = htobe64(opts.feature_flags); + memcpy(tlv->data, &val, sizeof(val)); + + err = send(ctrl_sockfd, buf, n, 0); + if (err < 0) + goto end_thread; + break; + } + case CMD_GET_STATS: { + unsigned int key = 0, val; + size_t n; + + err = bpf_map__lookup_elem(skel->maps.dut_stats, + &key, sizeof(key), + &val, sizeof(val), 0); + if (err) { + fprintf(stderr, "bpf_map_lookup_elem failed\n"); + goto end_thread; + } + + tlv->type = htons(CMD_ACK); + n = sizeof(*tlv) + sizeof(val); + tlv->len = htons(n); + + val = htonl(val); + memcpy(tlv->data, &val, sizeof(val)); + + err = send(ctrl_sockfd, buf, n, 0); + if (err < 0) + goto end_thread; + break; + } + default: + break; + } + } + +end_thread: + pthread_join(dut_thread, NULL); +out: + bpf_xdp_detach(env.ifindex, flags, NULL); + close(ctrl_sockfd); + free_fds(sockfd, 1); + + return err; +} + +static bool tester_collect_detected_cap(struct xdp_features *skel, + unsigned int dut_stats) +{ + unsigned int err, key = 0, val; + + if (!dut_stats) + return false; + + err = bpf_map__lookup_elem(skel->maps.stats, &key, sizeof(key), + &val, sizeof(val), 0); + if (err) { + fprintf(stderr, "bpf_map_lookup_elem failed\n"); + return false; + } + + switch (env.feature.action) { + case XDP_PASS: + case XDP_TX: + case XDP_REDIRECT: + return val > 0; + case XDP_DROP: + case XDP_ABORTED: + return val == 0; + default: + break; + } + + if (env.feature.drv_feature == NETDEV_XDP_ACT_NDO_XMIT) + return val > 0; + + return false; +} + +static int send_and_recv_msg(int sockfd, enum test_commands cmd, void *val, + size_t val_size) +{ + unsigned char buf[BUFSIZE] = {}; + struct tlv_hdr *tlv = (struct tlv_hdr *)buf; + int err; + + tlv->type = htons(cmd); + tlv->len = htons(sizeof(*tlv)); + + err = send(sockfd, buf, sizeof(*tlv), 0); + if (err < 0) + return err; + + err = recv_msg(sockfd, buf, BUFSIZE, val, val_size); + if (err < 0) + return err; + + return ntohs(tlv->type) == CMD_ACK ? 0 : -EINVAL; +} + +static int send_echo_msg(void) +{ + unsigned char buf[sizeof(struct tlv_hdr)]; + struct tlv_hdr *tlv = (struct tlv_hdr *)buf; + int sockfd, n; + + sockfd = socket(AF_INET6, SOCK_DGRAM, 0); + if (sockfd < 0) { + fprintf(stderr, "Failed to create echo socket\n"); + return -errno; + } + + tlv->type = htons(CMD_ECHO); + tlv->len = htons(sizeof(*tlv)); + + n = sendto(sockfd, buf, sizeof(*tlv), MSG_NOSIGNAL | MSG_CONFIRM, + (struct sockaddr *)&env.dut_addr, sizeof(env.dut_addr)); + close(sockfd); + + return n == ntohs(tlv->len) ? 0 : -EINVAL; +} + +static int tester_run(struct xdp_features *skel) +{ + int flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE; + unsigned long long advertised_feature; + struct bpf_program *prog; + unsigned int stats; + int i, err, sockfd; + bool detected_cap; + + sockfd = socket(AF_INET6, SOCK_STREAM, 0); + if (sockfd < 0) { + fprintf(stderr, "Failed to create tester socket\n"); + return -errno; + } + + if (settimeo(sockfd, 1000) < 0) + return -EINVAL; + + err = connect(sockfd, (struct sockaddr *)&env.dut_ctrl_addr, + sizeof(env.dut_ctrl_addr)); + if (err) { + fprintf(stderr, "Failed to connect to the DUT\n"); + return -errno; + } + + err = send_and_recv_msg(sockfd, CMD_GET_XDP_CAP, &advertised_feature, + sizeof(advertised_feature)); + if (err < 0) { + close(sockfd); + return err; + } + + advertised_feature = be64toh(advertised_feature); + + if (env.feature.drv_feature == NETDEV_XDP_ACT_NDO_XMIT || + env.feature.action == XDP_TX) + prog = skel->progs.xdp_tester_check_tx; + else + prog = skel->progs.xdp_tester_check_rx; + + err = bpf_xdp_attach(env.ifindex, bpf_program__fd(prog), flags, NULL); + if (err) { + fprintf(stderr, "Failed to attach XDP program to ifindex %d\n", + env.ifindex); + goto out; + } + + err = send_and_recv_msg(sockfd, CMD_START, NULL, 0); + if (err) + goto out; + + for (i = 0; i < 10 && !exiting; i++) { + err = send_echo_msg(); + if (err < 0) + goto out; + + sleep(1); + } + + err = send_and_recv_msg(sockfd, CMD_GET_STATS, &stats, sizeof(stats)); + if (err) + goto out; + + /* stop the test */ + err = send_and_recv_msg(sockfd, CMD_STOP, NULL, 0); + /* send a new echo message to wake echo thread of the dut */ + send_echo_msg(); + + detected_cap = tester_collect_detected_cap(skel, ntohl(stats)); + + fprintf(stdout, "Feature %s: [%s][%s]\n", get_xdp_feature_str(), + detected_cap ? GREEN("DETECTED") : RED("NOT DETECTED"), + env.feature.drv_feature & advertised_feature ? GREEN("ADVERTISED") + : RED("NOT ADVERTISED")); +out: + bpf_xdp_detach(env.ifindex, flags, NULL); + close(sockfd); + return err < 0 ? err : 0; +} + +int main(int argc, char **argv) +{ + struct xdp_features *skel; + int err; + + libbpf_set_strict_mode(LIBBPF_STRICT_ALL); + libbpf_set_print(libbpf_print_fn); + + signal(SIGINT, sig_handler); + signal(SIGTERM, sig_handler); + + set_env_default(); + + /* Parse command line arguments */ + err = argp_parse(&argp, argc, argv, 0, NULL, NULL); + if (err) + return err; + + if (env.ifindex < 0) { + fprintf(stderr, "Invalid ifindex\n"); + return -ENODEV; + } + + /* Load and verify BPF application */ + skel = xdp_features__open(); + if (!skel) { + fprintf(stderr, "Failed to open and load BPF skeleton\n"); + return -EINVAL; + } + + skel->rodata->tester_addr = + ((struct sockaddr_in6 *)&env.tester_addr)->sin6_addr; + skel->rodata->dut_addr = + ((struct sockaddr_in6 *)&env.dut_addr)->sin6_addr; + + /* Load & verify BPF programs */ + err = xdp_features__load(skel); + if (err) { + fprintf(stderr, "Failed to load and verify BPF skeleton\n"); + goto cleanup; + } + + err = xdp_features__attach(skel); + if (err) { + fprintf(stderr, "Failed to attach BPF skeleton\n"); + goto cleanup; + } + + if (env.is_tester) { + /* Tester */ + fprintf(stdout, "Starting tester on device %d\n", env.ifindex); + err = tester_run(skel); + } else { + /* DUT */ + fprintf(stdout, "Starting DUT on device %d\n", env.ifindex); + err = dut_run(skel); + } + +cleanup: + xdp_features__destroy(skel); + + return err < 0 ? -err : 0; +} diff --git a/tools/testing/selftests/bpf/xdp_features.h b/tools/testing/selftests/bpf/xdp_features.h new file mode 100644 index 000000000000..2670c541713b --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_features.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* test commands */ +enum test_commands { + CMD_STOP, /* CMD */ + CMD_START, /* CMD */ + CMD_ECHO, /* CMD */ + CMD_ACK, /* CMD + data */ + CMD_GET_XDP_CAP, /* CMD */ + CMD_GET_STATS, /* CMD */ +}; + +#define DUT_CTRL_PORT 12345 +#define DUT_ECHO_PORT 12346 + +struct tlv_hdr { + __be16 type; + __be16 len; + __u8 data[]; +}; diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c new file mode 100644 index 000000000000..1c8acb68b977 --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -0,0 +1,445 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* Reference program for verifying XDP metadata on real HW. Functional test + * only, doesn't test the performance. + * + * RX: + * - UDP 9091 packets are diverted into AF_XDP + * - Metadata verified: + * - rx_timestamp + * - rx_hash + * + * TX: + * - TBD + */ + +#include <test_progs.h> +#include <network_helpers.h> +#include "xdp_hw_metadata.skel.h" +#include "xsk.h" + +#include <error.h> +#include <linux/errqueue.h> +#include <linux/if_link.h> +#include <linux/net_tstamp.h> +#include <linux/udp.h> +#include <linux/sockios.h> +#include <sys/mman.h> +#include <net/if.h> +#include <poll.h> + +#include "xdp_metadata.h" + +#define UMEM_NUM 16 +#define UMEM_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE +#define UMEM_SIZE (UMEM_FRAME_SIZE * UMEM_NUM) +#define XDP_FLAGS (XDP_FLAGS_DRV_MODE | XDP_FLAGS_REPLACE) + +struct xsk { + void *umem_area; + struct xsk_umem *umem; + struct xsk_ring_prod fill; + struct xsk_ring_cons comp; + struct xsk_ring_prod tx; + struct xsk_ring_cons rx; + struct xsk_socket *socket; +}; + +struct xdp_hw_metadata *bpf_obj; +struct xsk *rx_xsk; +const char *ifname; +int ifindex; +int rxq; + +void test__fail(void) { /* for network_helpers.c */ } + +static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id) +{ + int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE; + const struct xsk_socket_config socket_config = { + .rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .bind_flags = XDP_COPY, + }; + const struct xsk_umem_config umem_config = { + .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, + .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, + .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + }; + __u32 idx; + u64 addr; + int ret; + int i; + + xsk->umem_area = mmap(NULL, UMEM_SIZE, PROT_READ | PROT_WRITE, mmap_flags, -1, 0); + if (xsk->umem_area == MAP_FAILED) + return -ENOMEM; + + ret = xsk_umem__create(&xsk->umem, + xsk->umem_area, UMEM_SIZE, + &xsk->fill, + &xsk->comp, + &umem_config); + if (ret) + return ret; + + ret = xsk_socket__create(&xsk->socket, ifindex, queue_id, + xsk->umem, + &xsk->rx, + &xsk->tx, + &socket_config); + if (ret) + return ret; + + /* First half of umem is for TX. This way address matches 1-to-1 + * to the completion queue index. + */ + + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = i * UMEM_FRAME_SIZE; + printf("%p: tx_desc[%d] -> %lx\n", xsk, i, addr); + } + + /* Second half of umem is for RX. */ + + ret = xsk_ring_prod__reserve(&xsk->fill, UMEM_NUM / 2, &idx); + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = (UMEM_NUM / 2 + i) * UMEM_FRAME_SIZE; + printf("%p: rx_desc[%d] -> %lx\n", xsk, i, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, i) = addr; + } + xsk_ring_prod__submit(&xsk->fill, ret); + + return 0; +} + +static void close_xsk(struct xsk *xsk) +{ + if (xsk->umem) + xsk_umem__delete(xsk->umem); + if (xsk->socket) + xsk_socket__delete(xsk->socket); + munmap(xsk->umem_area, UMEM_SIZE); +} + +static void refill_rx(struct xsk *xsk, __u64 addr) +{ + __u32 idx; + + if (xsk_ring_prod__reserve(&xsk->fill, 1, &idx) == 1) { + printf("%p: complete idx=%u addr=%llx\n", xsk, idx, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; + xsk_ring_prod__submit(&xsk->fill, 1); + } +} + +static void verify_xdp_metadata(void *data) +{ + struct xdp_meta *meta; + + meta = data - sizeof(*meta); + + printf("rx_timestamp: %llu\n", meta->rx_timestamp); + printf("rx_hash: %u\n", meta->rx_hash); +} + +static void verify_skb_metadata(int fd) +{ + char cmsg_buf[1024]; + char packet_buf[128]; + + struct scm_timestamping *ts; + struct iovec packet_iov; + struct cmsghdr *cmsg; + struct msghdr hdr; + + memset(&hdr, 0, sizeof(hdr)); + hdr.msg_iov = &packet_iov; + hdr.msg_iovlen = 1; + packet_iov.iov_base = packet_buf; + packet_iov.iov_len = sizeof(packet_buf); + + hdr.msg_control = cmsg_buf; + hdr.msg_controllen = sizeof(cmsg_buf); + + if (recvmsg(fd, &hdr, 0) < 0) + error(1, errno, "recvmsg"); + + for (cmsg = CMSG_FIRSTHDR(&hdr); cmsg != NULL; + cmsg = CMSG_NXTHDR(&hdr, cmsg)) { + + if (cmsg->cmsg_level != SOL_SOCKET) + continue; + + switch (cmsg->cmsg_type) { + case SCM_TIMESTAMPING: + ts = (struct scm_timestamping *)CMSG_DATA(cmsg); + if (ts->ts[2].tv_sec || ts->ts[2].tv_nsec) { + printf("found skb hwtstamp = %lu.%lu\n", + ts->ts[2].tv_sec, ts->ts[2].tv_nsec); + return; + } + break; + default: + break; + } + } + + printf("skb hwtstamp is not found!\n"); +} + +static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd) +{ + const struct xdp_desc *rx_desc; + struct pollfd fds[rxq + 1]; + __u64 comp_addr; + __u64 addr; + __u32 idx; + int ret; + int i; + + for (i = 0; i < rxq; i++) { + fds[i].fd = xsk_socket__fd(rx_xsk[i].socket); + fds[i].events = POLLIN; + fds[i].revents = 0; + } + + fds[rxq].fd = server_fd; + fds[rxq].events = POLLIN; + fds[rxq].revents = 0; + + while (true) { + errno = 0; + ret = poll(fds, rxq + 1, 1000); + printf("poll: %d (%d)\n", ret, errno); + if (ret < 0) + break; + if (ret == 0) + continue; + + if (fds[rxq].revents) + verify_skb_metadata(server_fd); + + for (i = 0; i < rxq; i++) { + if (fds[i].revents == 0) + continue; + + struct xsk *xsk = &rx_xsk[i]; + + ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx); + printf("xsk_ring_cons__peek: %d\n", ret); + if (ret != 1) + continue; + + rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx); + comp_addr = xsk_umem__extract_addr(rx_desc->addr); + addr = xsk_umem__add_offset_to_addr(rx_desc->addr); + printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n", + xsk, idx, rx_desc->addr, addr, comp_addr); + verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr)); + xsk_ring_cons__release(&xsk->rx, 1); + refill_rx(xsk, comp_addr); + } + } + + return 0; +} + +struct ethtool_channels { + __u32 cmd; + __u32 max_rx; + __u32 max_tx; + __u32 max_other; + __u32 max_combined; + __u32 rx_count; + __u32 tx_count; + __u32 other_count; + __u32 combined_count; +}; + +#define ETHTOOL_GCHANNELS 0x0000003c /* Get no of channels */ + +static int rxq_num(const char *ifname) +{ + struct ethtool_channels ch = { + .cmd = ETHTOOL_GCHANNELS, + }; + + struct ifreq ifr = { + .ifr_data = (void *)&ch, + }; + strncpy(ifr.ifr_name, ifname, IF_NAMESIZE - 1); + int fd, ret; + + fd = socket(AF_UNIX, SOCK_DGRAM, 0); + if (fd < 0) + error(1, errno, "socket"); + + ret = ioctl(fd, SIOCETHTOOL, &ifr); + if (ret < 0) + error(1, errno, "ioctl(SIOCETHTOOL)"); + + close(fd); + + return ch.rx_count + ch.combined_count; +} + +static void hwtstamp_ioctl(int op, const char *ifname, struct hwtstamp_config *cfg) +{ + struct ifreq ifr = { + .ifr_data = (void *)cfg, + }; + strncpy(ifr.ifr_name, ifname, IF_NAMESIZE - 1); + int fd, ret; + + fd = socket(AF_UNIX, SOCK_DGRAM, 0); + if (fd < 0) + error(1, errno, "socket"); + + ret = ioctl(fd, op, &ifr); + if (ret < 0) + error(1, errno, "ioctl(%d)", op); + + close(fd); +} + +static struct hwtstamp_config saved_hwtstamp_cfg; +static const char *saved_hwtstamp_ifname; + +static void hwtstamp_restore(void) +{ + hwtstamp_ioctl(SIOCSHWTSTAMP, saved_hwtstamp_ifname, &saved_hwtstamp_cfg); +} + +static void hwtstamp_enable(const char *ifname) +{ + struct hwtstamp_config cfg = { + .rx_filter = HWTSTAMP_FILTER_ALL, + }; + + hwtstamp_ioctl(SIOCGHWTSTAMP, ifname, &saved_hwtstamp_cfg); + saved_hwtstamp_ifname = strdup(ifname); + atexit(hwtstamp_restore); + + hwtstamp_ioctl(SIOCSHWTSTAMP, ifname, &cfg); +} + +static void cleanup(void) +{ + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); + int ret; + int i; + + if (bpf_obj) { + opts.old_prog_fd = bpf_program__fd(bpf_obj->progs.rx); + if (opts.old_prog_fd >= 0) { + printf("detaching bpf program....\n"); + ret = bpf_xdp_detach(ifindex, XDP_FLAGS, &opts); + if (ret) + printf("failed to detach XDP program: %d\n", ret); + } + } + + for (i = 0; i < rxq; i++) + close_xsk(&rx_xsk[i]); + + if (bpf_obj) + xdp_hw_metadata__destroy(bpf_obj); +} + +static void handle_signal(int sig) +{ + /* interrupting poll() is all we need */ +} + +static void timestamping_enable(int fd, int val) +{ + int ret; + + ret = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val)); + if (ret < 0) + error(1, errno, "setsockopt(SO_TIMESTAMPING)"); +} + +int main(int argc, char *argv[]) +{ + int server_fd = -1; + int ret; + int i; + + struct bpf_program *prog; + + if (argc != 2) { + fprintf(stderr, "pass device name\n"); + return -1; + } + + ifname = argv[1]; + ifindex = if_nametoindex(ifname); + rxq = rxq_num(ifname); + + printf("rxq: %d\n", rxq); + + hwtstamp_enable(ifname); + + rx_xsk = malloc(sizeof(struct xsk) * rxq); + if (!rx_xsk) + error(1, ENOMEM, "malloc"); + + for (i = 0; i < rxq; i++) { + printf("open_xsk(%s, %p, %d)\n", ifname, &rx_xsk[i], i); + ret = open_xsk(ifindex, &rx_xsk[i], i); + if (ret) + error(1, -ret, "open_xsk"); + + printf("xsk_socket__fd() -> %d\n", xsk_socket__fd(rx_xsk[i].socket)); + } + + printf("open bpf program...\n"); + bpf_obj = xdp_hw_metadata__open(); + if (libbpf_get_error(bpf_obj)) + error(1, libbpf_get_error(bpf_obj), "xdp_hw_metadata__open"); + + prog = bpf_object__find_program_by_name(bpf_obj->obj, "rx"); + bpf_program__set_ifindex(prog, ifindex); + bpf_program__set_flags(prog, BPF_F_XDP_DEV_BOUND_ONLY); + + printf("load bpf program...\n"); + ret = xdp_hw_metadata__load(bpf_obj); + if (ret) + error(1, -ret, "xdp_hw_metadata__load"); + + printf("prepare skb endpoint...\n"); + server_fd = start_server(AF_INET6, SOCK_DGRAM, NULL, 9092, 1000); + if (server_fd < 0) + error(1, errno, "start_server"); + timestamping_enable(server_fd, + SOF_TIMESTAMPING_SOFTWARE | + SOF_TIMESTAMPING_RAW_HARDWARE); + + printf("prepare xsk map...\n"); + for (i = 0; i < rxq; i++) { + int sock_fd = xsk_socket__fd(rx_xsk[i].socket); + __u32 queue_id = i; + + printf("map[%d] = %d\n", queue_id, sock_fd); + ret = bpf_map_update_elem(bpf_map__fd(bpf_obj->maps.xsk), &queue_id, &sock_fd, 0); + if (ret) + error(1, -ret, "bpf_map_update_elem"); + } + + printf("attach bpf program...\n"); + ret = bpf_xdp_attach(ifindex, + bpf_program__fd(bpf_obj->progs.rx), + XDP_FLAGS, NULL); + if (ret) + error(1, -ret, "bpf_xdp_attach"); + + signal(SIGINT, handle_signal); + ret = verify_metadata(rx_xsk, rxq, server_fd); + close(server_fd); + cleanup(); + if (ret) + error(1, -ret, "verify_metadata"); +} diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h new file mode 100644 index 000000000000..f6780fbb0a21 --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_metadata.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#pragma once + +#ifndef ETH_P_IP +#define ETH_P_IP 0x0800 +#endif + +#ifndef ETH_P_IPV6 +#define ETH_P_IPV6 0x86DD +#endif + +struct xdp_meta { + __u64 rx_timestamp; + __u32 rx_hash; +}; diff --git a/tools/testing/selftests/bpf/xdp_synproxy.c b/tools/testing/selftests/bpf/xdp_synproxy.c index 410a1385a01d..ce68c342b56f 100644 --- a/tools/testing/selftests/bpf/xdp_synproxy.c +++ b/tools/testing/selftests/bpf/xdp_synproxy.c @@ -116,6 +116,7 @@ static void parse_options(int argc, char *argv[], unsigned int *ifindex, __u32 * *tcpipopts = 0; *ports = NULL; *single = false; + *tc = false; while (true) { int opt; @@ -216,9 +217,10 @@ static int syncookie_attach(const char *argv0, unsigned int ifindex, bool tc) prog_fd = bpf_program__fd(prog); - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &info, &info_len); if (err < 0) { - fprintf(stderr, "Error: bpf_obj_get_info_by_fd: %s\n", strerror(-err)); + fprintf(stderr, "Error: bpf_prog_get_info_by_fd: %s\n", + strerror(-err)); goto out; } attached_tc = tc; @@ -291,9 +293,10 @@ static int syncookie_open_bpf_maps(__u32 prog_id, int *values_map_fd, int *ports }; info_len = sizeof(prog_info); - err = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &info_len); + err = bpf_prog_get_info_by_fd(prog_fd, &prog_info, &info_len); if (err != 0) { - fprintf(stderr, "Error: bpf_obj_get_info_by_fd: %s\n", strerror(-err)); + fprintf(stderr, "Error: bpf_prog_get_info_by_fd: %s\n", + strerror(-err)); goto out; } @@ -316,9 +319,10 @@ static int syncookie_open_bpf_maps(__u32 prog_id, int *values_map_fd, int *ports map_fd = err; info_len = sizeof(map_info); - err = bpf_obj_get_info_by_fd(map_fd, &map_info, &info_len); + err = bpf_map_get_info_by_fd(map_fd, &map_info, &info_len); if (err != 0) { - fprintf(stderr, "Error: bpf_obj_get_info_by_fd: %s\n", strerror(-err)); + fprintf(stderr, "Error: bpf_map_get_info_by_fd: %s\n", + strerror(-err)); close(map_fd); goto err_close_map_fds; } diff --git a/tools/testing/selftests/bpf/xsk.c b/tools/testing/selftests/bpf/xsk.c index 39d349509ba4..687d83e707f8 100644 --- a/tools/testing/selftests/bpf/xsk.c +++ b/tools/testing/selftests/bpf/xsk.c @@ -49,10 +49,7 @@ #define pr_warn(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__) -enum xsk_prog { - XSK_PROG_FALLBACK, - XSK_PROG_REDIRECT_FLAGS, -}; +#define XSKMAP_SIZE 1 struct xsk_umem { struct xsk_ring_prod *fill_save; @@ -74,43 +71,16 @@ struct xsk_ctx { int refcount; int ifindex; struct list_head list; - int prog_fd; - int link_fd; - int xsks_map_fd; - char ifname[IFNAMSIZ]; - bool has_bpf_link; }; struct xsk_socket { struct xsk_ring_cons *rx; struct xsk_ring_prod *tx; - __u64 outstanding_tx; struct xsk_ctx *ctx; struct xsk_socket_config config; int fd; }; -struct xsk_nl_info { - bool xdp_prog_attached; - int ifindex; - int fd; -}; - -/* Up until and including Linux 5.3 */ -struct xdp_ring_offset_v1 { - __u64 producer; - __u64 consumer; - __u64 desc; -}; - -/* Up until and including Linux 5.3 */ -struct xdp_mmap_offsets_v1 { - struct xdp_ring_offset_v1 rx; - struct xdp_ring_offset_v1 tx; - struct xdp_ring_offset_v1 fr; - struct xdp_ring_offset_v1 cr; -}; - int xsk_umem__fd(const struct xsk_umem *umem) { return umem ? umem->fd : -EINVAL; @@ -153,55 +123,17 @@ static int xsk_set_xdp_socket_config(struct xsk_socket_config *cfg, if (!usr_cfg) { cfg->rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS; cfg->tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS; - cfg->libbpf_flags = 0; - cfg->xdp_flags = 0; cfg->bind_flags = 0; return 0; } - if (usr_cfg->libbpf_flags & ~XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD) - return -EINVAL; - cfg->rx_size = usr_cfg->rx_size; cfg->tx_size = usr_cfg->tx_size; - cfg->libbpf_flags = usr_cfg->libbpf_flags; - cfg->xdp_flags = usr_cfg->xdp_flags; cfg->bind_flags = usr_cfg->bind_flags; return 0; } -static void xsk_mmap_offsets_v1(struct xdp_mmap_offsets *off) -{ - struct xdp_mmap_offsets_v1 off_v1; - - /* getsockopt on a kernel <= 5.3 has no flags fields. - * Copy over the offsets to the correct places in the >=5.4 format - * and put the flags where they would have been on that kernel. - */ - memcpy(&off_v1, off, sizeof(off_v1)); - - off->rx.producer = off_v1.rx.producer; - off->rx.consumer = off_v1.rx.consumer; - off->rx.desc = off_v1.rx.desc; - off->rx.flags = off_v1.rx.consumer + sizeof(__u32); - - off->tx.producer = off_v1.tx.producer; - off->tx.consumer = off_v1.tx.consumer; - off->tx.desc = off_v1.tx.desc; - off->tx.flags = off_v1.tx.consumer + sizeof(__u32); - - off->fr.producer = off_v1.fr.producer; - off->fr.consumer = off_v1.fr.consumer; - off->fr.desc = off_v1.fr.desc; - off->fr.flags = off_v1.fr.consumer + sizeof(__u32); - - off->cr.producer = off_v1.cr.producer; - off->cr.consumer = off_v1.cr.consumer; - off->cr.desc = off_v1.cr.desc; - off->cr.flags = off_v1.cr.consumer + sizeof(__u32); -} - static int xsk_get_mmap_offsets(int fd, struct xdp_mmap_offsets *off) { socklen_t optlen; @@ -215,11 +147,6 @@ static int xsk_get_mmap_offsets(int fd, struct xdp_mmap_offsets *off) if (optlen == sizeof(*off)) return 0; - if (optlen == sizeof(struct xdp_mmap_offsets_v1)) { - xsk_mmap_offsets_v1(off); - return 0; - } - return -EINVAL; } @@ -340,531 +267,56 @@ out_umem_alloc: return err; } -struct xsk_umem_config_v1 { - __u32 fill_size; - __u32 comp_size; - __u32 frame_size; - __u32 frame_headroom; -}; - -static enum xsk_prog get_xsk_prog(void) -{ - enum xsk_prog detected = XSK_PROG_FALLBACK; - char data_in = 0, data_out; - struct bpf_insn insns[] = { - BPF_LD_MAP_FD(BPF_REG_1, 0), - BPF_MOV64_IMM(BPF_REG_2, 0), - BPF_MOV64_IMM(BPF_REG_3, XDP_PASS), - BPF_EMIT_CALL(BPF_FUNC_redirect_map), - BPF_EXIT_INSN(), - }; - LIBBPF_OPTS(bpf_test_run_opts, opts, - .data_in = &data_in, - .data_size_in = 1, - .data_out = &data_out, - ); - - int prog_fd, map_fd, ret, insn_cnt = ARRAY_SIZE(insns); - - map_fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, NULL, sizeof(int), sizeof(int), 1, NULL); - if (map_fd < 0) - return detected; - - insns[0].imm = map_fd; - - prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, NULL); - if (prog_fd < 0) { - close(map_fd); - return detected; - } - - ret = bpf_prog_test_run_opts(prog_fd, &opts); - if (!ret && opts.retval == XDP_PASS) - detected = XSK_PROG_REDIRECT_FLAGS; - close(prog_fd); - close(map_fd); - return detected; -} - -static int xsk_load_xdp_prog(struct xsk_socket *xsk) -{ - static const int log_buf_size = 16 * 1024; - struct xsk_ctx *ctx = xsk->ctx; - char log_buf[log_buf_size]; - int prog_fd; - - /* This is the fallback C-program: - * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx) - * { - * int ret, index = ctx->rx_queue_index; - * - * // A set entry here means that the correspnding queue_id - * // has an active AF_XDP socket bound to it. - * ret = bpf_redirect_map(&xsks_map, index, XDP_PASS); - * if (ret > 0) - * return ret; - * - * // Fallback for pre-5.3 kernels, not supporting default - * // action in the flags parameter. - * if (bpf_map_lookup_elem(&xsks_map, &index)) - * return bpf_redirect_map(&xsks_map, index, 0); - * return XDP_PASS; - * } - */ - struct bpf_insn prog[] = { - /* r2 = *(u32 *)(r1 + 16) */ - BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16), - /* *(u32 *)(r10 - 4) = r2 */ - BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -4), - /* r1 = xskmap[] */ - BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd), - /* r3 = XDP_PASS */ - BPF_MOV64_IMM(BPF_REG_3, 2), - /* call bpf_redirect_map */ - BPF_EMIT_CALL(BPF_FUNC_redirect_map), - /* if w0 != 0 goto pc+13 */ - BPF_JMP32_IMM(BPF_JSGT, BPF_REG_0, 0, 13), - /* r2 = r10 */ - BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), - /* r2 += -4 */ - BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4), - /* r1 = xskmap[] */ - BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd), - /* call bpf_map_lookup_elem */ - BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem), - /* r1 = r0 */ - BPF_MOV64_REG(BPF_REG_1, BPF_REG_0), - /* r0 = XDP_PASS */ - BPF_MOV64_IMM(BPF_REG_0, 2), - /* if r1 == 0 goto pc+5 */ - BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 5), - /* r2 = *(u32 *)(r10 - 4) */ - BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_10, -4), - /* r1 = xskmap[] */ - BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd), - /* r3 = 0 */ - BPF_MOV64_IMM(BPF_REG_3, 0), - /* call bpf_redirect_map */ - BPF_EMIT_CALL(BPF_FUNC_redirect_map), - /* The jumps are to this instruction */ - BPF_EXIT_INSN(), - }; - - /* This is the post-5.3 kernel C-program: - * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx) - * { - * return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS); - * } - */ - struct bpf_insn prog_redirect_flags[] = { - /* r2 = *(u32 *)(r1 + 16) */ - BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16), - /* r1 = xskmap[] */ - BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd), - /* r3 = XDP_PASS */ - BPF_MOV64_IMM(BPF_REG_3, 2), - /* call bpf_redirect_map */ - BPF_EMIT_CALL(BPF_FUNC_redirect_map), - BPF_EXIT_INSN(), - }; - size_t insns_cnt[] = {ARRAY_SIZE(prog), - ARRAY_SIZE(prog_redirect_flags), - }; - struct bpf_insn *progs[] = {prog, prog_redirect_flags}; - enum xsk_prog option = get_xsk_prog(); - LIBBPF_OPTS(bpf_prog_load_opts, opts, - .log_buf = log_buf, - .log_size = log_buf_size, - ); - - prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "LGPL-2.1 or BSD-2-Clause", - progs[option], insns_cnt[option], &opts); - if (prog_fd < 0) { - pr_warn("BPF log buffer:\n%s", log_buf); - return prog_fd; - } - - ctx->prog_fd = prog_fd; - return 0; -} - -static int xsk_create_bpf_link(struct xsk_socket *xsk) -{ - DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts); - struct xsk_ctx *ctx = xsk->ctx; - __u32 prog_id = 0; - int link_fd; - int err; - - err = bpf_xdp_query_id(ctx->ifindex, xsk->config.xdp_flags, &prog_id); - if (err) { - pr_warn("getting XDP prog id failed\n"); - return err; - } - - /* if there's a netlink-based XDP prog loaded on interface, bail out - * and ask user to do the removal by himself - */ - if (prog_id) { - pr_warn("Netlink-based XDP prog detected, please unload it in order to launch AF_XDP prog\n"); - return -EINVAL; - } - - opts.flags = xsk->config.xdp_flags & ~(XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_REPLACE); - - link_fd = bpf_link_create(ctx->prog_fd, ctx->ifindex, BPF_XDP, &opts); - if (link_fd < 0) { - pr_warn("bpf_link_create failed: %s\n", strerror(errno)); - return link_fd; - } - - ctx->link_fd = link_fd; - return 0; -} - -static int xsk_get_max_queues(struct xsk_socket *xsk) -{ - struct ethtool_channels channels = { .cmd = ETHTOOL_GCHANNELS }; - struct xsk_ctx *ctx = xsk->ctx; - struct ifreq ifr = {}; - int fd, err, ret; - - fd = socket(AF_LOCAL, SOCK_DGRAM | SOCK_CLOEXEC, 0); - if (fd < 0) - return -errno; - - ifr.ifr_data = (void *)&channels; - bpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ); - err = ioctl(fd, SIOCETHTOOL, &ifr); - if (err && errno != EOPNOTSUPP) { - ret = -errno; - goto out; - } - - if (err) { - /* If the device says it has no channels, then all traffic - * is sent to a single stream, so max queues = 1. - */ - ret = 1; - } else { - /* Take the max of rx, tx, combined. Drivers return - * the number of channels in different ways. - */ - ret = max(channels.max_rx, channels.max_tx); - ret = max(ret, (int)channels.max_combined); - } - -out: - close(fd); - return ret; -} - -static int xsk_create_bpf_maps(struct xsk_socket *xsk) -{ - struct xsk_ctx *ctx = xsk->ctx; - int max_queues; - int fd; - - max_queues = xsk_get_max_queues(xsk); - if (max_queues < 0) - return max_queues; - - fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, "xsks_map", - sizeof(int), sizeof(int), max_queues, NULL); - if (fd < 0) - return fd; - - ctx->xsks_map_fd = fd; - - return 0; -} - -static void xsk_delete_bpf_maps(struct xsk_socket *xsk) -{ - struct xsk_ctx *ctx = xsk->ctx; - - bpf_map_delete_elem(ctx->xsks_map_fd, &ctx->queue_id); - close(ctx->xsks_map_fd); -} - -static int xsk_lookup_bpf_maps(struct xsk_socket *xsk) -{ - __u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info); - __u32 map_len = sizeof(struct bpf_map_info); - struct bpf_prog_info prog_info = {}; - struct xsk_ctx *ctx = xsk->ctx; - struct bpf_map_info map_info; - int fd, err; - - err = bpf_obj_get_info_by_fd(ctx->prog_fd, &prog_info, &prog_len); - if (err) - return err; - - num_maps = prog_info.nr_map_ids; - - map_ids = calloc(prog_info.nr_map_ids, sizeof(*map_ids)); - if (!map_ids) - return -ENOMEM; - - memset(&prog_info, 0, prog_len); - prog_info.nr_map_ids = num_maps; - prog_info.map_ids = (__u64)(unsigned long)map_ids; - - err = bpf_obj_get_info_by_fd(ctx->prog_fd, &prog_info, &prog_len); - if (err) - goto out_map_ids; - - ctx->xsks_map_fd = -1; - - for (i = 0; i < prog_info.nr_map_ids; i++) { - fd = bpf_map_get_fd_by_id(map_ids[i]); - if (fd < 0) - continue; - - memset(&map_info, 0, map_len); - err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len); - if (err) { - close(fd); - continue; - } - - if (!strncmp(map_info.name, "xsks_map", sizeof(map_info.name))) { - ctx->xsks_map_fd = fd; - break; - } - - close(fd); - } - - if (ctx->xsks_map_fd == -1) - err = -ENOENT; - -out_map_ids: - free(map_ids); - return err; -} - -static int xsk_set_bpf_maps(struct xsk_socket *xsk) -{ - struct xsk_ctx *ctx = xsk->ctx; - - return bpf_map_update_elem(ctx->xsks_map_fd, &ctx->queue_id, - &xsk->fd, 0); -} - -static int xsk_link_lookup(int ifindex, __u32 *prog_id, int *link_fd) +bool xsk_is_in_mode(u32 ifindex, int mode) { - struct bpf_link_info link_info; - __u32 link_len; - __u32 id = 0; - int err; - int fd; - - while (true) { - err = bpf_link_get_next_id(id, &id); - if (err) { - if (errno == ENOENT) { - err = 0; - break; - } - pr_warn("can't get next link: %s\n", strerror(errno)); - break; - } - - fd = bpf_link_get_fd_by_id(id); - if (fd < 0) { - if (errno == ENOENT) - continue; - pr_warn("can't get link by id (%u): %s\n", id, strerror(errno)); - err = -errno; - break; - } + LIBBPF_OPTS(bpf_xdp_query_opts, opts); + int ret; - link_len = sizeof(struct bpf_link_info); - memset(&link_info, 0, link_len); - err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len); - if (err) { - pr_warn("can't get link info: %s\n", strerror(errno)); - close(fd); - break; - } - if (link_info.type == BPF_LINK_TYPE_XDP) { - if (link_info.xdp.ifindex == ifindex) { - *link_fd = fd; - if (prog_id) - *prog_id = link_info.prog_id; - break; - } - } - close(fd); + ret = bpf_xdp_query(ifindex, mode, &opts); + if (ret) { + printf("XDP mode query returned error %s\n", strerror(errno)); + return false; } - return err; -} + if (mode == XDP_FLAGS_DRV_MODE) + return opts.attach_mode == XDP_ATTACHED_DRV; + else if (mode == XDP_FLAGS_SKB_MODE) + return opts.attach_mode == XDP_ATTACHED_SKB; -static bool xsk_probe_bpf_link(void) -{ - LIBBPF_OPTS(bpf_link_create_opts, opts, .flags = XDP_FLAGS_SKB_MODE); - struct bpf_insn insns[2] = { - BPF_MOV64_IMM(BPF_REG_0, XDP_PASS), - BPF_EXIT_INSN() - }; - int prog_fd, link_fd = -1, insn_cnt = ARRAY_SIZE(insns); - int ifindex_lo = 1; - bool ret = false; - int err; - - err = xsk_link_lookup(ifindex_lo, NULL, &link_fd); - if (err) - return ret; - - if (link_fd >= 0) - return true; - - prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, NULL); - if (prog_fd < 0) - return ret; - - link_fd = bpf_link_create(prog_fd, ifindex_lo, BPF_XDP, &opts); - close(prog_fd); - - if (link_fd >= 0) { - ret = true; - close(link_fd); - } - - return ret; + return false; } -static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk) +int xsk_attach_xdp_program(struct bpf_program *prog, int ifindex, u32 xdp_flags) { - char ifname[IFNAMSIZ]; - struct xsk_ctx *ctx; - char *interface; - - ctx = calloc(1, sizeof(*ctx)); - if (!ctx) - return -ENOMEM; - - interface = if_indextoname(ifindex, &ifname[0]); - if (!interface) { - free(ctx); - return -errno; - } - - ctx->ifindex = ifindex; - bpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ); - - xsk->ctx = ctx; - xsk->ctx->has_bpf_link = xsk_probe_bpf_link(); + int prog_fd; - return 0; + prog_fd = bpf_program__fd(prog); + return bpf_xdp_attach(ifindex, prog_fd, xdp_flags, NULL); } -static int xsk_init_xdp_res(struct xsk_socket *xsk, - int *xsks_map_fd) +void xsk_detach_xdp_program(int ifindex, u32 xdp_flags) { - struct xsk_ctx *ctx = xsk->ctx; - int err; - - err = xsk_create_bpf_maps(xsk); - if (err) - return err; - - err = xsk_load_xdp_prog(xsk); - if (err) - goto err_load_xdp_prog; - - if (ctx->has_bpf_link) - err = xsk_create_bpf_link(xsk); - else - err = bpf_xdp_attach(xsk->ctx->ifindex, ctx->prog_fd, - xsk->config.xdp_flags, NULL); - - if (err) - goto err_attach_xdp_prog; - - if (!xsk->rx) - return err; - - err = xsk_set_bpf_maps(xsk); - if (err) - goto err_set_bpf_maps; - - return err; - -err_set_bpf_maps: - if (ctx->has_bpf_link) - close(ctx->link_fd); - else - bpf_xdp_detach(ctx->ifindex, 0, NULL); -err_attach_xdp_prog: - close(ctx->prog_fd); -err_load_xdp_prog: - xsk_delete_bpf_maps(xsk); - return err; + bpf_xdp_detach(ifindex, xdp_flags, NULL); } -static int xsk_lookup_xdp_res(struct xsk_socket *xsk, int *xsks_map_fd, int prog_id) +void xsk_clear_xskmap(struct bpf_map *map) { - struct xsk_ctx *ctx = xsk->ctx; - int err; - - ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id); - if (ctx->prog_fd < 0) { - err = -errno; - goto err_prog_fd; - } - err = xsk_lookup_bpf_maps(xsk); - if (err) - goto err_lookup_maps; - - if (!xsk->rx) - return err; - - err = xsk_set_bpf_maps(xsk); - if (err) - goto err_set_maps; + u32 index = 0; + int map_fd; - return err; - -err_set_maps: - close(ctx->xsks_map_fd); -err_lookup_maps: - close(ctx->prog_fd); -err_prog_fd: - if (ctx->has_bpf_link) - close(ctx->link_fd); - return err; + map_fd = bpf_map__fd(map); + bpf_map_delete_elem(map_fd, &index); } -static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp, int *xsks_map_fd) +int xsk_update_xskmap(struct bpf_map *map, struct xsk_socket *xsk) { - struct xsk_socket *xsk = _xdp; - struct xsk_ctx *ctx = xsk->ctx; - __u32 prog_id = 0; - int err; + int map_fd, sock_fd; + u32 index = 0; - if (ctx->has_bpf_link) - err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd); - else - err = bpf_xdp_query_id(ctx->ifindex, xsk->config.xdp_flags, &prog_id); + map_fd = bpf_map__fd(map); + sock_fd = xsk_socket__fd(xsk); - if (err) - return err; - - err = !prog_id ? xsk_init_xdp_res(xsk, xsks_map_fd) : - xsk_lookup_xdp_res(xsk, xsks_map_fd, prog_id); - - if (!err && xsks_map_fd) - *xsks_map_fd = ctx->xsks_map_fd; - - return err; -} - -int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd) -{ - return __xsk_setup_xdp_prog(xsk, xsks_map_fd); + return bpf_map_update_elem(map_fd, &index, &sock_fd, 0); } static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex, @@ -913,7 +365,7 @@ out_free: static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk, struct xsk_umem *umem, int ifindex, - const char *ifname, __u32 queue_id, + __u32 queue_id, struct xsk_ring_prod *fill, struct xsk_ring_cons *comp) { @@ -940,51 +392,15 @@ static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk, ctx->refcount = 1; ctx->umem = umem; ctx->queue_id = queue_id; - bpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ); ctx->fill = fill; ctx->comp = comp; list_add(&ctx->list, &umem->ctx_list); - ctx->has_bpf_link = xsk_probe_bpf_link(); return ctx; } -static void xsk_destroy_xsk_struct(struct xsk_socket *xsk) -{ - free(xsk->ctx); - free(xsk); -} - -int xsk_socket__update_xskmap(struct xsk_socket *xsk, int fd) -{ - xsk->ctx->xsks_map_fd = fd; - return xsk_set_bpf_maps(xsk); -} - -int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd) -{ - struct xsk_socket *xsk; - int res; - - xsk = calloc(1, sizeof(*xsk)); - if (!xsk) - return -ENOMEM; - - res = xsk_create_xsk_struct(ifindex, xsk); - if (res) { - free(xsk); - return -EINVAL; - } - - res = __xsk_setup_xdp_prog(xsk, xsks_map_fd); - - xsk_destroy_xsk_struct(xsk); - - return res; -} - int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, - const char *ifname, + int ifindex, __u32 queue_id, struct xsk_umem *umem, struct xsk_ring_cons *rx, struct xsk_ring_prod *tx, @@ -998,7 +414,7 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, struct xdp_mmap_offsets off; struct xsk_socket *xsk; struct xsk_ctx *ctx; - int err, ifindex; + int err; if (!umem || !xsk_ptr || !(rx || tx)) return -EFAULT; @@ -1013,13 +429,6 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, if (err) goto out_xsk_alloc; - xsk->outstanding_tx = 0; - ifindex = if_nametoindex(ifname); - if (!ifindex) { - err = -errno; - goto out_xsk_alloc; - } - if (umem->refcount++ > 0) { xsk->fd = socket(AF_XDP, SOCK_RAW | SOCK_CLOEXEC, 0); if (xsk->fd < 0) { @@ -1039,8 +448,7 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, goto out_socket; } - ctx = xsk_create_ctx(xsk, umem, ifindex, ifname, queue_id, - fill, comp); + ctx = xsk_create_ctx(xsk, umem, ifindex, queue_id, fill, comp); if (!ctx) { err = -ENOMEM; goto out_socket; @@ -1138,12 +546,6 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, goto out_mmap_tx; } - if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) { - err = __xsk_setup_xdp_prog(xsk, NULL); - if (err) - goto out_mmap_tx; - } - *xsk_ptr = xsk; umem->fill_save = NULL; umem->comp_save = NULL; @@ -1167,7 +569,7 @@ out_xsk_alloc: return err; } -int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname, +int xsk_socket__create(struct xsk_socket **xsk_ptr, int ifindex, __u32 queue_id, struct xsk_umem *umem, struct xsk_ring_cons *rx, struct xsk_ring_prod *tx, const struct xsk_socket_config *usr_config) @@ -1175,7 +577,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname, if (!umem) return -EFAULT; - return xsk_socket__create_shared(xsk_ptr, ifname, queue_id, umem, + return xsk_socket__create_shared(xsk_ptr, ifindex, queue_id, umem, rx, tx, umem->fill_save, umem->comp_save, usr_config); } @@ -1219,13 +621,6 @@ void xsk_socket__delete(struct xsk_socket *xsk) ctx = xsk->ctx; umem = ctx->umem; - if (ctx->refcount == 1) { - xsk_delete_bpf_maps(xsk); - close(ctx->prog_fd); - if (ctx->has_bpf_link) - close(ctx->link_fd); - } - xsk_put_ctx(ctx, true); err = xsk_get_mmap_offsets(xsk->fd, &off); diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h index 997723b0bfb2..04ed8b544712 100644 --- a/tools/testing/selftests/bpf/xsk.h +++ b/tools/testing/selftests/bpf/xsk.h @@ -23,77 +23,6 @@ extern "C" { #endif -/* This whole API has been deprecated and moved to libxdp that can be found at - * https://github.com/xdp-project/xdp-tools. The APIs are exactly the same so - * it should just be linking with libxdp instead of libbpf for this set of - * functionality. If not, please submit a bug report on the aforementioned page. - */ - -/* Load-Acquire Store-Release barriers used by the XDP socket - * library. The following macros should *NOT* be considered part of - * the xsk.h API, and is subject to change anytime. - * - * LIBRARY INTERNAL - */ - -#define __XSK_READ_ONCE(x) (*(volatile typeof(x) *)&x) -#define __XSK_WRITE_ONCE(x, v) (*(volatile typeof(x) *)&x) = (v) - -#if defined(__i386__) || defined(__x86_64__) -# define libbpf_smp_store_release(p, v) \ - do { \ - asm volatile("" : : : "memory"); \ - __XSK_WRITE_ONCE(*p, v); \ - } while (0) -# define libbpf_smp_load_acquire(p) \ - ({ \ - typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \ - asm volatile("" : : : "memory"); \ - ___p1; \ - }) -#elif defined(__aarch64__) -# define libbpf_smp_store_release(p, v) \ - asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory") -# define libbpf_smp_load_acquire(p) \ - ({ \ - typeof(*p) ___p1; \ - asm volatile ("ldar %w0, %1" \ - : "=r" (___p1) : "Q" (*p) : "memory"); \ - ___p1; \ - }) -#elif defined(__riscv) -# define libbpf_smp_store_release(p, v) \ - do { \ - asm volatile ("fence rw,w" : : : "memory"); \ - __XSK_WRITE_ONCE(*p, v); \ - } while (0) -# define libbpf_smp_load_acquire(p) \ - ({ \ - typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \ - asm volatile ("fence r,rw" : : : "memory"); \ - ___p1; \ - }) -#endif - -#ifndef libbpf_smp_store_release -#define libbpf_smp_store_release(p, v) \ - do { \ - __sync_synchronize(); \ - __XSK_WRITE_ONCE(*p, v); \ - } while (0) -#endif - -#ifndef libbpf_smp_load_acquire -#define libbpf_smp_load_acquire(p) \ - ({ \ - typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \ - __sync_synchronize(); \ - ___p1; \ - }) -#endif - -/* LIBRARY INTERNAL -- END */ - /* Do not access these members directly. Use the functions below. */ #define DEFINE_XSK_RING(name) \ struct name { \ @@ -168,7 +97,7 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb) * this function. Without this optimization it whould have been * free_entries = r->cached_prod - r->cached_cons + r->size. */ - r->cached_cons = libbpf_smp_load_acquire(r->consumer); + r->cached_cons = __atomic_load_n(r->consumer, __ATOMIC_ACQUIRE); r->cached_cons += r->size; return r->cached_cons - r->cached_prod; @@ -179,7 +108,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb) __u32 entries = r->cached_prod - r->cached_cons; if (entries == 0) { - r->cached_prod = libbpf_smp_load_acquire(r->producer); + r->cached_prod = __atomic_load_n(r->producer, __ATOMIC_ACQUIRE); entries = r->cached_prod - r->cached_cons; } @@ -202,7 +131,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb) /* Make sure everything has been written to the ring before indicating * this to the kernel by writing the producer pointer. */ - libbpf_smp_store_release(prod->producer, *prod->producer + nb); + __atomic_store_n(prod->producer, *prod->producer + nb, __ATOMIC_RELEASE); } static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx) @@ -227,8 +156,7 @@ static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb) /* Make sure data has been read before indicating we are done * with the entries by updating the consumer pointer. */ - libbpf_smp_store_release(cons->consumer, *cons->consumer + nb); - + __atomic_store_n(cons->consumer, *cons->consumer + nb, __ATOMIC_RELEASE); } static inline void *xsk_umem__get_data(void *umem_area, __u64 addr) @@ -269,18 +197,15 @@ struct xsk_umem_config { __u32 flags; }; -int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd); -int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd); -int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd); - -/* Flags for the libbpf_flags field. */ -#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0) +int xsk_attach_xdp_program(struct bpf_program *prog, int ifindex, u32 xdp_flags); +void xsk_detach_xdp_program(int ifindex, u32 xdp_flags); +int xsk_update_xskmap(struct bpf_map *map, struct xsk_socket *xsk); +void xsk_clear_xskmap(struct bpf_map *map); +bool xsk_is_in_mode(u32 ifindex, int mode); struct xsk_socket_config { __u32 rx_size; __u32 tx_size; - __u32 libbpf_flags; - __u32 xdp_flags; __u16 bind_flags; }; @@ -291,13 +216,13 @@ int xsk_umem__create(struct xsk_umem **umem, struct xsk_ring_cons *comp, const struct xsk_umem_config *config); int xsk_socket__create(struct xsk_socket **xsk, - const char *ifname, __u32 queue_id, + int ifindex, __u32 queue_id, struct xsk_umem *umem, struct xsk_ring_cons *rx, struct xsk_ring_prod *tx, const struct xsk_socket_config *config); int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, - const char *ifname, + int ifindex, __u32 queue_id, struct xsk_umem *umem, struct xsk_ring_cons *rx, struct xsk_ring_prod *tx, diff --git a/tools/testing/selftests/bpf/xsk_prereqs.sh b/tools/testing/selftests/bpf/xsk_prereqs.sh index a0b71723a818..ae697a10a056 100755 --- a/tools/testing/selftests/bpf/xsk_prereqs.sh +++ b/tools/testing/selftests/bpf/xsk_prereqs.sh @@ -55,21 +55,13 @@ test_exit() clear_configs() { - if [ $(ip netns show | grep $3 &>/dev/null; echo $?;) == 0 ]; then - [ $(ip netns exec $3 ip link show $2 &>/dev/null; echo $?;) == 0 ] && - { ip netns exec $3 ip link del $2; } - ip netns del $3 - fi - #Once we delete a veth pair node, the entire veth pair is removed, - #this is just to be cautious just incase the NS does not exist then - #veth node inside NS won't get removed so we explicitly remove it [ $(ip link show $1 &>/dev/null; echo $?;) == 0 ] && { ip link del $1; } } cleanup_exit() { - clear_configs $1 $2 $3 + clear_configs $1 $2 } validate_ip_utility() @@ -83,7 +75,7 @@ exec_xskxceiver() ARGS+="-b " fi - ./${XSKOBJ} -i ${VETH0} -i ${VETH1},${NS1} ${ARGS} + ./${XSKOBJ} -i ${VETH0} -i ${VETH1} ${ARGS} retval=$? test_status $retval "${TEST_NAME}" diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c index 162d3a516f2c..a17655107a94 100644 --- a/tools/testing/selftests/bpf/xskxceiver.c +++ b/tools/testing/selftests/bpf/xskxceiver.c @@ -55,12 +55,11 @@ * Flow: * ----- * - Single process spawns two threads: Tx and Rx - * - Each of these two threads attach to a veth interface within their assigned - * namespaces - * - Each thread Creates one AF_XDP socket connected to a unique umem for each + * - Each of these two threads attach to a veth interface + * - Each thread creates one AF_XDP socket connected to a unique umem for each * veth interface - * - Tx thread Transmits 10k packets from veth<xxxx> to veth<yyyy> - * - Rx thread verifies if all 10k packets were received and delivered in-order, + * - Tx thread Transmits a number of packets from veth<xxxx> to veth<yyyy> + * - Rx thread verifies if all packets were received and delivered in-order, * and have the right content * * Enable/disable packet dump mode: @@ -97,18 +96,14 @@ #include <time.h> #include <unistd.h> #include <stdatomic.h> + +#include "xsk_xdp_progs.skel.h" #include "xsk.h" #include "xskxceiver.h" #include <bpf/bpf.h> #include <linux/filter.h> #include "../kselftest.h" -/* AF_XDP APIs were moved into libxdp and marked as deprecated in libbpf. - * Until xskxceiver is either moved or re-writed into libxdp, suppress - * deprecation warnings in this file - */ -#pragma GCC diagnostic ignored "-Wdeprecated-declarations" - static const char *MAC1 = "\x00\x0A\x56\x9E\xEE\x62"; static const char *MAC2 = "\x00\x0A\x56\x9E\xEE\x61"; static const char *IP1 = "192.168.100.162"; @@ -269,6 +264,11 @@ static void gen_udp_csum(struct udphdr *udp_hdr, struct iphdr *ip_hdr) udp_csum(ip_hdr->saddr, ip_hdr->daddr, UDP_PKT_SIZE, IPPROTO_UDP, (u16 *)udp_hdr); } +static u32 mode_to_xdp_flags(enum test_mode mode) +{ + return (mode == TEST_MODE_SKB) ? XDP_FLAGS_SKB_MODE : XDP_FLAGS_DRV_MODE; +} + static int xsk_configure_umem(struct xsk_umem_info *umem, void *buffer, u64 size) { struct xsk_umem_config cfg = { @@ -322,15 +322,13 @@ static int __xsk_configure_socket(struct xsk_socket_info *xsk, struct xsk_umem_i xsk->umem = umem; cfg.rx_size = xsk->rxqsize; cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS; - cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD; - cfg.xdp_flags = ifobject->xdp_flags; cfg.bind_flags = ifobject->bind_flags; if (shared) cfg.bind_flags |= XDP_SHARED_UMEM; txr = ifobject->tx_on ? &xsk->tx : NULL; rxr = ifobject->rx_on ? &xsk->rx : NULL; - return xsk_socket__create(&xsk->xsk, ifobject->ifname, 0, umem->umem, rxr, txr, &cfg); + return xsk_socket__create(&xsk->xsk, ifobject->ifindex, 0, umem->umem, rxr, txr, &cfg); } static bool ifobj_zc_avail(struct ifobject *ifobject) @@ -350,7 +348,7 @@ static bool ifobj_zc_avail(struct ifobject *ifobject) umem = calloc(1, sizeof(struct xsk_umem_info)); if (!umem) { munmap(bufs, umem_sz); - exit_with_error(-ENOMEM); + exit_with_error(ENOMEM); } umem->frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE; ret = xsk_configure_umem(umem, bufs, umem_sz); @@ -360,8 +358,6 @@ static bool ifobj_zc_avail(struct ifobject *ifobject) xsk = calloc(1, sizeof(struct xsk_socket_info)); if (!xsk) goto out; - ifobject->xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; - ifobject->xdp_flags |= XDP_FLAGS_DRV_MODE; ifobject->bind_flags = XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY; ifobject->rx_on = true; xsk->rxqsize = XSK_RING_CONS__DEFAULT_NUM_DESCS; @@ -399,28 +395,6 @@ static void usage(const char *prog) ksft_print_msg(str, prog); } -static int switch_namespace(const char *nsname) -{ - char fqns[26] = "/var/run/netns/"; - int nsfd; - - if (!nsname || strlen(nsname) == 0) - return -1; - - strncat(fqns, nsname, sizeof(fqns) - strlen(fqns) - 1); - nsfd = open(fqns, O_RDONLY); - - if (nsfd == -1) - exit_with_error(errno); - - if (setns(nsfd, 0) == -1) - exit_with_error(errno); - - print_verbose("NS switched: %s\n", nsname); - - return nsfd; -} - static bool validate_interface(struct ifobject *ifobj) { if (!strcmp(ifobj->ifname, "")) @@ -438,8 +412,6 @@ static void parse_command_line(struct ifobject *ifobj_tx, struct ifobject *ifobj opterr = 0; for (;;) { - char *sptr, *token; - c = getopt_long(argc, argv, "i:Dvb", long_options, &option_index); if (c == -1) break; @@ -453,11 +425,13 @@ static void parse_command_line(struct ifobject *ifobj_tx, struct ifobject *ifobj else break; - sptr = strndupa(optarg, strlen(optarg)); - memcpy(ifobj->ifname, strsep(&sptr, ","), MAX_INTERFACE_NAME_CHARS); - token = strsep(&sptr, ","); - if (token) - memcpy(ifobj->nsname, token, MAX_INTERFACES_NAMESPACE_CHARS); + memcpy(ifobj->ifname, optarg, + min_t(size_t, MAX_INTERFACE_NAME_CHARS, strlen(optarg))); + + ifobj->ifindex = if_nametoindex(ifobj->ifname); + if (!ifobj->ifindex) + exit_with_error(errno); + interface_nb++; break; case 'D': @@ -520,6 +494,10 @@ static void __test_spec_init(struct test_spec *test, struct ifobject *ifobj_tx, test->total_steps = 1; test->nb_sockets = 1; test->fail = false; + test->xdp_prog_rx = ifobj_rx->xdp_progs->progs.xsk_def_prog; + test->xskmap_rx = ifobj_rx->xdp_progs->maps.xsk; + test->xdp_prog_tx = ifobj_tx->xdp_progs->progs.xsk_def_prog; + test->xskmap_tx = ifobj_tx->xdp_progs->maps.xsk; } static void test_spec_init(struct test_spec *test, struct ifobject *ifobj_tx, @@ -538,12 +516,6 @@ static void test_spec_init(struct test_spec *test, struct ifobject *ifobj_tx, for (i = 0; i < MAX_INTERFACES; i++) { struct ifobject *ifobj = i ? ifobj_rx : ifobj_tx; - ifobj->xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; - if (mode == TEST_MODE_SKB) - ifobj->xdp_flags |= XDP_FLAGS_SKB_MODE; - else - ifobj->xdp_flags |= XDP_FLAGS_DRV_MODE; - ifobj->bind_flags = XDP_USE_NEED_WAKEUP; if (mode == TEST_MODE_ZC) ifobj->bind_flags |= XDP_ZEROCOPY; @@ -565,6 +537,16 @@ static void test_spec_set_name(struct test_spec *test, const char *name) strncpy(test->name, name, MAX_TEST_NAME_SIZE); } +static void test_spec_set_xdp_prog(struct test_spec *test, struct bpf_program *xdp_prog_rx, + struct bpf_program *xdp_prog_tx, struct bpf_map *xskmap_rx, + struct bpf_map *xskmap_tx) +{ + test->xdp_prog_rx = xdp_prog_rx; + test->xdp_prog_tx = xdp_prog_tx; + test->xskmap_rx = xskmap_rx; + test->xskmap_tx = xskmap_tx; +} + static void pkt_stream_reset(struct pkt_stream *pkt_stream) { if (pkt_stream) @@ -767,7 +749,7 @@ static void pkt_dump(void *pkt, u32 len) struct ethhdr *ethhdr; struct udphdr *udphdr; struct iphdr *iphdr; - int payload, i; + u32 payload, i; ethhdr = pkt; iphdr = pkt + sizeof(*ethhdr); @@ -792,7 +774,7 @@ static void pkt_dump(void *pkt, u32 len) fprintf(stdout, "DEBUG>> L4: udp_hdr->src: %d\n", ntohs(udphdr->source)); fprintf(stdout, "DEBUG>> L4: udp_hdr->dst: %d\n", ntohs(udphdr->dest)); /*extract L5 frame */ - payload = *((uint32_t *)(pkt + PKT_HDR_SIZE)); + payload = ntohl(*((u32 *)(pkt + PKT_HDR_SIZE))); fprintf(stdout, "DEBUG>> L5: payload: %d\n", payload); fprintf(stdout, "---------------------------------------\n"); @@ -936,7 +918,7 @@ static int receive_pkts(struct test_spec *test, struct pollfd *fds) if (ifobj->use_poll) { ret = poll(fds, 1, POLL_TMOUT); if (ret < 0) - exit_with_error(-ret); + exit_with_error(errno); if (!ret) { if (!is_umem_valid(test->ifobj_tx)) @@ -963,7 +945,7 @@ static int receive_pkts(struct test_spec *test, struct pollfd *fds) if (xsk_ring_prod__needs_wakeup(&umem->fq)) { ret = poll(fds, 1, POLL_TMOUT); if (ret < 0) - exit_with_error(-ret); + exit_with_error(errno); } ret = xsk_ring_prod__reserve(&umem->fq, rcvd, &idx_fq); } @@ -1015,7 +997,7 @@ static int __send_pkts(struct ifobject *ifobject, u32 *pkt_nb, struct pollfd *fd if (timeout) { if (ret < 0) { ksft_print_msg("ERROR: [%s] Poll error %d\n", - __func__, ret); + __func__, errno); return TEST_FAILURE; } if (ret == 0) @@ -1024,7 +1006,7 @@ static int __send_pkts(struct ifobject *ifobject, u32 *pkt_nb, struct pollfd *fd } if (ret <= 0) { ksft_print_msg("ERROR: [%s] Poll error %d\n", - __func__, ret); + __func__, errno); return TEST_FAILURE; } } @@ -1240,7 +1222,7 @@ static void thread_common_ops_tx(struct test_spec *test, struct ifobject *ifobje { xsk_configure_socket(test, ifobject, test->ifobj_rx->umem, true); ifobject->xsk = &ifobject->xsk_arr[0]; - ifobject->xsk_map_fd = test->ifobj_rx->xsk_map_fd; + ifobject->xskmap = test->ifobj_rx->xskmap; memcpy(ifobject->umem, test->ifobj_rx->umem, sizeof(struct xsk_umem_info)); } @@ -1272,7 +1254,7 @@ static void xsk_populate_fill_ring(struct xsk_umem_info *umem, struct pkt_stream *xsk_ring_prod__fill_addr(&umem->fq, idx++) = addr; } - xsk_ring_prod__submit(&umem->fq, buffers_to_fill); + xsk_ring_prod__submit(&umem->fq, i); } static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject) @@ -1280,10 +1262,8 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject) u64 umem_sz = ifobject->umem->num_frames * ifobject->umem->frame_size; int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE; LIBBPF_OPTS(bpf_xdp_query_opts, opts); - int ret, ifindex; void *bufs; - - ifobject->ns_fd = switch_namespace(ifobject->nsname); + int ret; if (ifobject->umem->unaligned_mode) mmap_flags |= MAP_HUGETLB; @@ -1308,33 +1288,9 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject) if (!ifobject->rx_on) return; - ifindex = if_nametoindex(ifobject->ifname); - if (!ifindex) - exit_with_error(errno); - - ret = xsk_setup_xdp_prog_xsk(ifobject->xsk->xsk, &ifobject->xsk_map_fd); - if (ret) - exit_with_error(-ret); - - ret = bpf_xdp_query(ifindex, ifobject->xdp_flags, &opts); - if (ret) - exit_with_error(-ret); - - if (ifobject->xdp_flags & XDP_FLAGS_SKB_MODE) { - if (opts.attach_mode != XDP_ATTACHED_SKB) { - ksft_print_msg("ERROR: [%s] XDP prog not in SKB mode\n"); - exit_with_error(-EINVAL); - } - } else if (ifobject->xdp_flags & XDP_FLAGS_DRV_MODE) { - if (opts.attach_mode != XDP_ATTACHED_DRV) { - ksft_print_msg("ERROR: [%s] XDP prog not in DRV mode\n"); - exit_with_error(-EINVAL); - } - } - - ret = xsk_socket__update_xskmap(ifobject->xsk->xsk, ifobject->xsk_map_fd); + ret = xsk_update_xskmap(ifobject->xskmap, ifobject->xsk->xsk); if (ret) - exit_with_error(-ret); + exit_with_error(errno); } static void *worker_testapp_validate_tx(void *arg) @@ -1367,14 +1323,17 @@ static void *worker_testapp_validate_rx(void *arg) struct test_spec *test = (struct test_spec *)arg; struct ifobject *ifobject = test->ifobj_rx; struct pollfd fds = { }; - int id = 0; int err; if (test->current_step == 1) { thread_common_ops(test, ifobject); } else { - bpf_map_delete_elem(ifobject->xsk_map_fd, &id); - xsk_socket__update_xskmap(ifobject->xsk->xsk, ifobject->xsk_map_fd); + xsk_clear_xskmap(ifobject->xskmap); + err = xsk_update_xskmap(ifobject->xskmap, ifobject->xsk->xsk); + if (err) { + printf("Error: Failed to update xskmap, error %s\n", strerror(-err)); + exit_with_error(-err); + } } fds.fd = xsk_socket__fd(ifobject->xsk->xsk); @@ -1412,84 +1371,106 @@ static void handler(int signum) pthread_exit(NULL); } -static int testapp_validate_traffic_single_thread(struct test_spec *test, struct ifobject *ifobj, - enum test_type type) +static bool xdp_prog_changed(struct test_spec *test, struct ifobject *ifobj) { - bool old_shared_umem = ifobj->shared_umem; - pthread_t t0; - - if (pthread_barrier_init(&barr, NULL, 2)) - exit_with_error(errno); - - test->current_step++; - if (type == TEST_TYPE_POLL_RXQ_TMOUT) - pkt_stream_reset(ifobj->pkt_stream); - pkts_in_flight = 0; - - test->ifobj_rx->shared_umem = false; - test->ifobj_tx->shared_umem = false; + return ifobj->xdp_prog != test->xdp_prog_rx || ifobj->mode != test->mode; +} - signal(SIGUSR1, handler); - /* Spawn thread */ - pthread_create(&t0, NULL, ifobj->func_ptr, test); +static void xsk_reattach_xdp(struct ifobject *ifobj, struct bpf_program *xdp_prog, + struct bpf_map *xskmap, enum test_mode mode) +{ + int err; - if (type != TEST_TYPE_POLL_TXQ_TMOUT) - pthread_barrier_wait(&barr); + xsk_detach_xdp_program(ifobj->ifindex, mode_to_xdp_flags(ifobj->mode)); + err = xsk_attach_xdp_program(xdp_prog, ifobj->ifindex, mode_to_xdp_flags(mode)); + if (err) { + printf("Error attaching XDP program\n"); + exit_with_error(-err); + } - if (pthread_barrier_destroy(&barr)) - exit_with_error(errno); + if (ifobj->mode != mode && (mode == TEST_MODE_DRV || mode == TEST_MODE_ZC)) + if (!xsk_is_in_mode(ifobj->ifindex, XDP_FLAGS_DRV_MODE)) { + ksft_print_msg("ERROR: XDP prog not in DRV mode\n"); + exit_with_error(EINVAL); + } - pthread_kill(t0, SIGUSR1); - pthread_join(t0, NULL); + ifobj->xdp_prog = xdp_prog; + ifobj->xskmap = xskmap; + ifobj->mode = mode; +} - if (test->total_steps == test->current_step || test->fail) { - xsk_socket__delete(ifobj->xsk->xsk); - testapp_clean_xsk_umem(ifobj); - } +static void xsk_attach_xdp_progs(struct test_spec *test, struct ifobject *ifobj_rx, + struct ifobject *ifobj_tx) +{ + if (xdp_prog_changed(test, ifobj_rx)) + xsk_reattach_xdp(ifobj_rx, test->xdp_prog_rx, test->xskmap_rx, test->mode); - test->ifobj_rx->shared_umem = old_shared_umem; - test->ifobj_tx->shared_umem = old_shared_umem; + if (!ifobj_tx || ifobj_tx->shared_umem) + return; - return !!test->fail; + if (xdp_prog_changed(test, ifobj_tx)) + xsk_reattach_xdp(ifobj_tx, test->xdp_prog_tx, test->xskmap_tx, test->mode); } -static int testapp_validate_traffic(struct test_spec *test) +static int __testapp_validate_traffic(struct test_spec *test, struct ifobject *ifobj1, + struct ifobject *ifobj2) { - struct ifobject *ifobj_tx = test->ifobj_tx; - struct ifobject *ifobj_rx = test->ifobj_rx; pthread_t t0, t1; - if (pthread_barrier_init(&barr, NULL, 2)) - exit_with_error(errno); + if (ifobj2) + if (pthread_barrier_init(&barr, NULL, 2)) + exit_with_error(errno); test->current_step++; - pkt_stream_reset(ifobj_rx->pkt_stream); + pkt_stream_reset(ifobj1->pkt_stream); pkts_in_flight = 0; + signal(SIGUSR1, handler); /*Spawn RX thread */ - pthread_create(&t0, NULL, ifobj_rx->func_ptr, test); + pthread_create(&t0, NULL, ifobj1->func_ptr, test); - pthread_barrier_wait(&barr); - if (pthread_barrier_destroy(&barr)) - exit_with_error(errno); + if (ifobj2) { + pthread_barrier_wait(&barr); + if (pthread_barrier_destroy(&barr)) + exit_with_error(errno); - /*Spawn TX thread */ - pthread_create(&t1, NULL, ifobj_tx->func_ptr, test); + /*Spawn TX thread */ + pthread_create(&t1, NULL, ifobj2->func_ptr, test); - pthread_join(t1, NULL); - pthread_join(t0, NULL); + pthread_join(t1, NULL); + } + + if (!ifobj2) + pthread_kill(t0, SIGUSR1); + else + pthread_join(t0, NULL); if (test->total_steps == test->current_step || test->fail) { - xsk_socket__delete(ifobj_tx->xsk->xsk); - xsk_socket__delete(ifobj_rx->xsk->xsk); - testapp_clean_xsk_umem(ifobj_rx); - if (!ifobj_tx->shared_umem) - testapp_clean_xsk_umem(ifobj_tx); + if (ifobj2) + xsk_socket__delete(ifobj2->xsk->xsk); + xsk_socket__delete(ifobj1->xsk->xsk); + testapp_clean_xsk_umem(ifobj1); + if (ifobj2 && !ifobj2->shared_umem) + testapp_clean_xsk_umem(ifobj2); } return !!test->fail; } +static int testapp_validate_traffic(struct test_spec *test) +{ + struct ifobject *ifobj_rx = test->ifobj_rx; + struct ifobject *ifobj_tx = test->ifobj_tx; + + xsk_attach_xdp_progs(test, ifobj_rx, ifobj_tx); + return __testapp_validate_traffic(test, ifobj_rx, ifobj_tx); +} + +static int testapp_validate_traffic_single_thread(struct test_spec *test, struct ifobject *ifobj) +{ + return __testapp_validate_traffic(test, ifobj, NULL); +} + static void testapp_teardown(struct test_spec *test) { int i; @@ -1525,7 +1506,7 @@ static void testapp_bidi(struct test_spec *test) print_verbose("Switching Tx/Rx vectors\n"); swap_directions(&test->ifobj_rx, &test->ifobj_tx); - testapp_validate_traffic(test); + __testapp_validate_traffic(test, test->ifobj_rx, test->ifobj_tx); swap_directions(&test->ifobj_rx, &test->ifobj_tx); } @@ -1539,9 +1520,9 @@ static void swap_xsk_resources(struct ifobject *ifobj_tx, struct ifobject *ifobj ifobj_tx->xsk = &ifobj_tx->xsk_arr[1]; ifobj_rx->xsk = &ifobj_rx->xsk_arr[1]; - ret = xsk_socket__update_xskmap(ifobj_rx->xsk->xsk, ifobj_rx->xsk_map_fd); + ret = xsk_update_xskmap(ifobj_rx->xskmap, ifobj_rx->xsk->xsk); if (ret) - exit_with_error(-ret); + exit_with_error(errno); } static void testapp_bpf_res(struct test_spec *test) @@ -1580,8 +1561,6 @@ static void testapp_stats_tx_invalid_descs(struct test_spec *test) pkt_stream_replace_half(test, XSK_UMEM__INVALID_FRAME_SIZE, 0); test->ifobj_tx->validation_func = validate_tx_invalid_descs; testapp_validate_traffic(test); - - pkt_stream_restore_default(test); } static void testapp_stats_rx_full(struct test_spec *test) @@ -1597,8 +1576,6 @@ static void testapp_stats_rx_full(struct test_spec *test) test->ifobj_rx->release_rx = false; test->ifobj_rx->validation_func = validate_rx_full; testapp_validate_traffic(test); - - pkt_stream_restore_default(test); } static void testapp_stats_fill_empty(struct test_spec *test) @@ -1613,8 +1590,6 @@ static void testapp_stats_fill_empty(struct test_spec *test) test->ifobj_rx->use_fill_ring = false; test->ifobj_rx->validation_func = validate_fill_empty; testapp_validate_traffic(test); - - pkt_stream_restore_default(test); } /* Simple test */ @@ -1647,7 +1622,6 @@ static bool testapp_unaligned(struct test_spec *test) test->ifobj_rx->pkt_stream->use_addr_for_fill = true; testapp_validate_traffic(test); - pkt_stream_restore_default(test); return true; } @@ -1657,7 +1631,6 @@ static void testapp_single_pkt(struct test_spec *test) pkt_stream_generate_custom(test, pkts, ARRAY_SIZE(pkts)); testapp_validate_traffic(test); - pkt_stream_restore_default(test); } static void testapp_invalid_desc(struct test_spec *test) @@ -1698,7 +1671,51 @@ static void testapp_invalid_desc(struct test_spec *test) pkt_stream_generate_custom(test, pkts, ARRAY_SIZE(pkts)); testapp_validate_traffic(test); - pkt_stream_restore_default(test); +} + +static void testapp_xdp_drop(struct test_spec *test) +{ + struct xsk_xdp_progs *skel_rx = test->ifobj_rx->xdp_progs; + struct xsk_xdp_progs *skel_tx = test->ifobj_tx->xdp_progs; + + test_spec_set_name(test, "XDP_DROP_HALF"); + test_spec_set_xdp_prog(test, skel_rx->progs.xsk_xdp_drop, skel_tx->progs.xsk_xdp_drop, + skel_rx->maps.xsk, skel_tx->maps.xsk); + + pkt_stream_receive_half(test); + testapp_validate_traffic(test); +} + +static void testapp_poll_txq_tmout(struct test_spec *test) +{ + test_spec_set_name(test, "POLL_TXQ_FULL"); + + test->ifobj_tx->use_poll = true; + /* create invalid frame by set umem frame_size and pkt length equal to 2048 */ + test->ifobj_tx->umem->frame_size = 2048; + pkt_stream_replace(test, 2 * DEFAULT_PKT_CNT, 2048); + testapp_validate_traffic_single_thread(test, test->ifobj_tx); +} + +static void testapp_poll_rxq_tmout(struct test_spec *test) +{ + test_spec_set_name(test, "POLL_RXQ_EMPTY"); + test->ifobj_rx->use_poll = true; + testapp_validate_traffic_single_thread(test, test->ifobj_rx); +} + +static int xsk_load_xdp_programs(struct ifobject *ifobj) +{ + ifobj->xdp_progs = xsk_xdp_progs__open_and_load(); + if (libbpf_get_error(ifobj->xdp_progs)) + return libbpf_get_error(ifobj->xdp_progs); + + return 0; +} + +static void xsk_unload_xdp_programs(struct ifobject *ifobj) +{ + xsk_xdp_progs__destroy(ifobj->xdp_progs); } static void init_iface(struct ifobject *ifobj, const char *dst_mac, const char *src_mac, @@ -1706,6 +1723,7 @@ static void init_iface(struct ifobject *ifobj, const char *dst_mac, const char * const u16 src_port, thread_func_t func_ptr) { struct in_addr ip; + int err; memcpy(ifobj->dst_mac, dst_mac, ETH_ALEN); memcpy(ifobj->src_mac, src_mac, ETH_ALEN); @@ -1720,6 +1738,12 @@ static void init_iface(struct ifobject *ifobj, const char *dst_mac, const char * ifobj->src_port = src_port; ifobj->func_ptr = func_ptr; + + err = xsk_load_xdp_programs(ifobj); + if (err) { + printf("Error loading XDP program\n"); + exit_with_error(err); + } } static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_type type) @@ -1764,8 +1788,6 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_ test->ifobj_rx->umem->frame_size = 2048; pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE); testapp_validate_traffic(test); - - pkt_stream_restore_default(test); break; case TEST_TYPE_RX_POLL: test->ifobj_rx->use_poll = true; @@ -1778,18 +1800,10 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_ testapp_validate_traffic(test); break; case TEST_TYPE_POLL_TXQ_TMOUT: - test_spec_set_name(test, "POLL_TXQ_FULL"); - test->ifobj_tx->use_poll = true; - /* create invalid frame by set umem frame_size and pkt length equal to 2048 */ - test->ifobj_tx->umem->frame_size = 2048; - pkt_stream_replace(test, 2 * DEFAULT_PKT_CNT, 2048); - testapp_validate_traffic_single_thread(test, test->ifobj_tx, type); - pkt_stream_restore_default(test); + testapp_poll_txq_tmout(test); break; case TEST_TYPE_POLL_RXQ_TMOUT: - test_spec_set_name(test, "POLL_RXQ_EMPTY"); - test->ifobj_rx->use_poll = true; - testapp_validate_traffic_single_thread(test, test->ifobj_rx, type); + testapp_poll_rxq_tmout(test); break; case TEST_TYPE_ALIGNED_INV_DESC: test_spec_set_name(test, "ALIGNED_INV_DESC"); @@ -1818,6 +1832,9 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_ case TEST_TYPE_HEADROOM: testapp_headroom(test); break; + case TEST_TYPE_XDP_DROP_HALF: + testapp_xdp_drop(test); + break; default: break; } @@ -1825,6 +1842,7 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_ if (!test->fail) ksft_test_result_pass("PASS: %s %s%s\n", mode_string(test), busy_poll_string(test), test->name); + pkt_stream_restore_default(test); } static struct ifobject *ifobject_create(void) @@ -1843,8 +1861,6 @@ static struct ifobject *ifobject_create(void) if (!ifobj->umem) goto out_umem; - ifobj->ns_fd = -1; - return ifobj; out_umem: @@ -1856,14 +1872,12 @@ out_xsk_arr: static void ifobject_delete(struct ifobject *ifobj) { - if (ifobj->ns_fd != -1) - close(ifobj->ns_fd); free(ifobj->umem); free(ifobj->xsk_arr); free(ifobj); } -static bool is_xdp_supported(struct ifobject *ifobject) +static bool is_xdp_supported(int ifindex) { int flags = XDP_FLAGS_DRV_MODE; @@ -1872,7 +1886,6 @@ static bool is_xdp_supported(struct ifobject *ifobject) BPF_MOV64_IMM(BPF_REG_0, XDP_PASS), BPF_EXIT_INSN() }; - int ifindex = if_nametoindex(ifobject->ifname); int prog_fd, insn_cnt = ARRAY_SIZE(insns); int err; @@ -1900,7 +1913,7 @@ int main(int argc, char **argv) int modes = TEST_MODE_SKB + 1; u32 i, j, failed_tests = 0; struct test_spec test; - bool shared_umem; + bool shared_netdev; /* Use libbpf 1.0 API mode */ libbpf_set_strict_mode(LIBBPF_STRICT_ALL); @@ -1915,27 +1928,27 @@ int main(int argc, char **argv) setlocale(LC_ALL, ""); parse_command_line(ifobj_tx, ifobj_rx, argc, argv); - shared_umem = !strcmp(ifobj_tx->ifname, ifobj_rx->ifname); - ifobj_tx->shared_umem = shared_umem; - ifobj_rx->shared_umem = shared_umem; + shared_netdev = (ifobj_tx->ifindex == ifobj_rx->ifindex); + ifobj_tx->shared_umem = shared_netdev; + ifobj_rx->shared_umem = shared_netdev; if (!validate_interface(ifobj_tx) || !validate_interface(ifobj_rx)) { usage(basename(argv[0])); ksft_exit_xfail(); } - init_iface(ifobj_tx, MAC1, MAC2, IP1, IP2, UDP_PORT1, UDP_PORT2, - worker_testapp_validate_tx); - init_iface(ifobj_rx, MAC2, MAC1, IP2, IP1, UDP_PORT2, UDP_PORT1, - worker_testapp_validate_rx); - - if (is_xdp_supported(ifobj_tx)) { + if (is_xdp_supported(ifobj_tx->ifindex)) { modes++; if (ifobj_zc_avail(ifobj_tx)) modes++; } + init_iface(ifobj_rx, MAC1, MAC2, IP1, IP2, UDP_PORT1, UDP_PORT2, + worker_testapp_validate_rx); + init_iface(ifobj_tx, MAC2, MAC1, IP2, IP1, UDP_PORT2, UDP_PORT1, + worker_testapp_validate_tx); + test_spec_init(&test, ifobj_tx, ifobj_rx, 0); tx_pkt_stream_default = pkt_stream_generate(ifobj_tx->umem, DEFAULT_PKT_CNT, PKT_SIZE); rx_pkt_stream_default = pkt_stream_generate(ifobj_rx->umem, DEFAULT_PKT_CNT, PKT_SIZE); @@ -1946,7 +1959,7 @@ int main(int argc, char **argv) ksft_set_plan(modes * TEST_TYPE_MAX); - for (i = 0; i < modes; i++) + for (i = 0; i < modes; i++) { for (j = 0; j < TEST_TYPE_MAX; j++) { test_spec_init(&test, ifobj_tx, ifobj_rx, i); run_pkt_test(&test, i, j); @@ -1955,9 +1968,12 @@ int main(int argc, char **argv) if (test.fail) failed_tests++; } + } pkt_stream_delete(tx_pkt_stream_default); pkt_stream_delete(rx_pkt_stream_default); + xsk_unload_xdp_programs(ifobj_tx); + xsk_unload_xdp_programs(ifobj_rx); ifobject_delete(ifobj_tx); ifobject_delete(ifobj_rx); diff --git a/tools/testing/selftests/bpf/xskxceiver.h b/tools/testing/selftests/bpf/xskxceiver.h index edb76d2def9f..3e8ec7d8ec32 100644 --- a/tools/testing/selftests/bpf/xskxceiver.h +++ b/tools/testing/selftests/bpf/xskxceiver.h @@ -5,6 +5,8 @@ #ifndef XSKXCEIVER_H_ #define XSKXCEIVER_H_ +#include "xsk_xdp_progs.skel.h" + #ifndef SOL_XDP #define SOL_XDP 283 #endif @@ -30,7 +32,6 @@ #define TEST_CONTINUE 1 #define MAX_INTERFACES 2 #define MAX_INTERFACE_NAME_CHARS 16 -#define MAX_INTERFACES_NAMESPACE_CHARS 16 #define MAX_SOCKETS 2 #define MAX_TEST_NAME_SIZE 32 #define MAX_TEARDOWN_ITER 10 @@ -86,6 +87,7 @@ enum test_type { TEST_TYPE_STATS_RX_FULL, TEST_TYPE_STATS_FILL_EMPTY, TEST_TYPE_BPF_RES, + TEST_TYPE_XDP_DROP_HALF, TEST_TYPE_MAX }; @@ -133,18 +135,19 @@ typedef void *(*thread_func_t)(void *arg); struct ifobject { char ifname[MAX_INTERFACE_NAME_CHARS]; - char nsname[MAX_INTERFACES_NAMESPACE_CHARS]; struct xsk_socket_info *xsk; struct xsk_socket_info *xsk_arr; struct xsk_umem_info *umem; thread_func_t func_ptr; validation_func_t validation_func; struct pkt_stream *pkt_stream; - int ns_fd; - int xsk_map_fd; + struct xsk_xdp_progs *xdp_progs; + struct bpf_map *xskmap; + struct bpf_program *xdp_prog; + enum test_mode mode; + int ifindex; u32 dst_ip; u32 src_ip; - u32 xdp_flags; u32 bind_flags; u16 src_port; u16 dst_port; @@ -164,6 +167,10 @@ struct test_spec { struct ifobject *ifobj_rx; struct pkt_stream *tx_pkt_stream_default; struct pkt_stream *rx_pkt_stream_default; + struct bpf_program *xdp_prog_rx; + struct bpf_program *xdp_prog_tx; + struct bpf_map *xskmap_rx; + struct bpf_map *xskmap_tx; u16 total_steps; u16 current_step; u16 nb_sockets; diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_defprio.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_defprio.sh index 71066bc4b886..5492fa5550d7 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_defprio.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_defprio.sh @@ -5,18 +5,18 @@ # prioritized according to the default priority specified at the port. # rx_octets_prio_* counters are used to verify the prioritization. # -# +-----------------------+ -# | H1 | -# | + $h1 | -# | | 192.0.2.1/28 | -# +----|------------------+ +# +----------------------------------+ +# | H1 | +# | + $h1 | +# | | 192.0.2.1/28 | +# +----|-----------------------------+ # | -# +----|------------------+ -# | SW | | -# | + $swp1 | -# | 192.0.2.2/28 | -# | APP=<prio>,1,0 | -# +-----------------------+ +# +----|-----------------------------+ +# | SW | | +# | + $swp1 | +# | 192.0.2.2/28 | +# | dcb app default-prio <prio> | +# +----------------------------------+ ALL_TESTS=" ping_ipv4 @@ -29,42 +29,6 @@ NUM_NETIFS=2 : ${HIT_TIMEOUT:=1000} # ms source $lib_dir/lib.sh -declare -a APP - -defprio_install() -{ - local dev=$1; shift - local prio=$1; shift - local app="app=$prio,1,0" - - lldptool -T -i $dev -V APP $app >/dev/null - lldpad_app_wait_set $dev - APP[$prio]=$app -} - -defprio_uninstall() -{ - local dev=$1; shift - local prio=$1; shift - local app=${APP[$prio]} - - lldptool -T -i $dev -V APP -d $app >/dev/null - lldpad_app_wait_del - unset APP[$prio] -} - -defprio_flush() -{ - local dev=$1; shift - local prio - - if ((${#APP[@]})); then - lldptool -T -i $dev -V APP -d ${APP[@]} >/dev/null - fi - lldpad_app_wait_del - APP=() -} - h1_create() { simple_if_init $h1 192.0.2.1/28 @@ -83,7 +47,7 @@ switch_create() switch_destroy() { - defprio_flush $swp1 + dcb app flush dev $swp1 default-prio ip addr del dev $swp1 192.0.2.2/28 ip link set dev $swp1 down } @@ -124,7 +88,7 @@ __test_defprio() RET=0 - defprio_install $swp1 $prio_install + dcb app add dev $swp1 default-prio $prio_install local t0=$(ethtool_stats_get $swp1 rx_frames_prio_$prio_observe) mausezahn -q $h1 -d 100m -c 10 -t arp reply @@ -134,7 +98,7 @@ __test_defprio() check_err $? "Default priority $prio_install/$prio_observe: Expected to capture 10 packets, got $((t1 - t0))." log_test "Default priority $prio_install/$prio_observe" - defprio_uninstall $swp1 $prio_install + dcb app del dev $swp1 default-prio $prio_install } test_defprio() @@ -145,7 +109,7 @@ test_defprio() __test_defprio $prio $prio done - defprio_install $swp1 3 + dcb app add dev $swp1 default-prio 3 __test_defprio 0 3 __test_defprio 1 3 __test_defprio 2 3 @@ -153,7 +117,7 @@ test_defprio() __test_defprio 5 5 __test_defprio 6 6 __test_defprio 7 7 - defprio_uninstall $swp1 3 + dcb app del dev $swp1 default-prio 3 } trap cleanup EXIT diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_bridge.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_bridge.sh index 28a570006d4d..87c41f5727c9 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_bridge.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_bridge.sh @@ -20,7 +20,7 @@ # | SW | | | # | +-|----------------------------------------------------------------|-+ | # | | + $swp1 BR $swp2 + | | -# | | APP=0,5,10 .. 7,5,17 APP=0,5,20 .. 7,5,27 | | +# | | dcb dscp-prio 10:0...17:7 dcb dscp-prio 20:0...27:7 | | # | +--------------------------------------------------------------------+ | # +---------------------------------------------------------------------------+ @@ -62,16 +62,6 @@ h2_destroy() simple_if_fini $h2 192.0.2.2/28 } -dscp_map() -{ - local base=$1; shift - local prio - - for prio in {0..7}; do - echo app=$prio,5,$((base + prio)) - done -} - switch_create() { ip link add name br1 type bridge vlan_filtering 1 @@ -81,17 +71,14 @@ switch_create() ip link set dev $swp2 master br1 ip link set dev $swp2 up - lldptool -T -i $swp1 -V APP $(dscp_map 10) >/dev/null - lldptool -T -i $swp2 -V APP $(dscp_map 20) >/dev/null - lldpad_app_wait_set $swp1 - lldpad_app_wait_set $swp2 + dcb app add dev $swp1 dscp-prio 10:0 11:1 12:2 13:3 14:4 15:5 16:6 17:7 + dcb app add dev $swp2 dscp-prio 20:0 21:1 22:2 23:3 24:4 25:5 26:6 27:7 } switch_destroy() { - lldptool -T -i $swp2 -V APP -d $(dscp_map 20) >/dev/null - lldptool -T -i $swp1 -V APP -d $(dscp_map 10) >/dev/null - lldpad_app_wait_del + dcb app del dev $swp2 dscp-prio 20:0 21:1 22:2 23:3 24:4 25:5 26:6 27:7 + dcb app del dev $swp1 dscp-prio 10:0 11:1 12:2 13:3 14:4 15:5 16:6 17:7 ip link set dev $swp2 down ip link set dev $swp2 nomaster diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_router.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_router.sh index 4cb2aa65278a..f6c23f84423e 100755 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_router.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_dscp_router.sh @@ -94,16 +94,6 @@ h2_destroy() simple_if_fini $h2 192.0.2.18/28 } -dscp_map() -{ - local base=$1; shift - local prio - - for prio in {0..7}; do - echo app=$prio,5,$((base + prio)) - done -} - switch_create() { simple_if_init $swp1 192.0.2.2/28 @@ -112,17 +102,14 @@ switch_create() tc qdisc add dev $swp1 clsact tc qdisc add dev $swp2 clsact - lldptool -T -i $swp1 -V APP $(dscp_map 0) >/dev/null - lldptool -T -i $swp2 -V APP $(dscp_map 0) >/dev/null - lldpad_app_wait_set $swp1 - lldpad_app_wait_set $swp2 + dcb app add dev $swp1 dscp-prio 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 + dcb app add dev $swp2 dscp-prio 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 } switch_destroy() { - lldptool -T -i $swp2 -V APP -d $(dscp_map 0) >/dev/null - lldptool -T -i $swp1 -V APP -d $(dscp_map 0) >/dev/null - lldpad_app_wait_del + dcb app del dev $swp2 dscp-prio 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 + dcb app del dev $swp1 dscp-prio 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 tc qdisc del dev $swp2 clsact tc qdisc del dev $swp1 clsact @@ -265,13 +252,11 @@ test_dscp_leftover() { echo "Test that last removed DSCP rule is deconfigured correctly" - lldptool -T -i $swp2 -V APP -d $(dscp_map 0) >/dev/null - lldpad_app_wait_del + dcb app del dev $swp2 dscp-prio 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 __test_update 0 zero - lldptool -T -i $swp2 -V APP $(dscp_map 0) >/dev/null - lldpad_app_wait_set $swp2 + dcb app add dev $swp2 dscp-prio 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 } trap cleanup EXIT diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh index a08c02abde12..7f7d20f22207 100755 --- a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh +++ b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh @@ -17,6 +17,18 @@ SYSFS_NET_DIR=/sys/bus/netdevsim/devices/$DEV_NAME/net/ DEBUGFS_DIR=/sys/kernel/debug/netdevsim/$DEV_NAME/ DL_HANDLE=netdevsim/$DEV_NAME +wait_for_devlink() +{ + "$@" | grep -q $DL_HANDLE +} + +devlink_wait() +{ + local timeout=$1 + + busywait "$timeout" wait_for_devlink devlink dev +} + fw_flash_test() { RET=0 @@ -256,6 +268,9 @@ netns_reload_test() ip netns del testns2 ip netns del testns1 + # Wait until netns async cleanup is done. + devlink_wait 2000 + log_test "netns reload test" } @@ -348,6 +363,9 @@ resource_test() ip netns del testns2 ip netns del testns1 + # Wait until netns async cleanup is done. + devlink_wait 2000 + log_test "resource test" } diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 3007e98a6d64..6cd8993454d7 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -38,6 +38,7 @@ TEST_PROGS += srv6_end_dt6_l3vpn_test.sh TEST_PROGS += srv6_hencap_red_l3vpn_test.sh TEST_PROGS += srv6_hl2encap_red_l2vpn_test.sh TEST_PROGS += srv6_end_next_csid_l3vpn_test.sh +TEST_PROGS += srv6_end_flavors_test.sh TEST_PROGS += vrf_strict_mode_test.sh TEST_PROGS += arp_ndisc_evict_nocarrier.sh TEST_PROGS += ndisc_unsolicited_na_test.sh @@ -45,6 +46,8 @@ TEST_PROGS += arp_ndisc_untracked_subnets.sh TEST_PROGS += stress_reuseport_listen.sh TEST_PROGS += l2_tos_ttl_inherit.sh TEST_PROGS += bind_bhash.sh +TEST_PROGS += ip_local_port_range.sh +TEST_PROGS += rps_default_mask.sh TEST_PROGS_EXTENDED := in_netns.sh setup_loopback.sh setup_veth.sh TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh TEST_GEN_FILES = socket nettest @@ -75,14 +78,61 @@ TEST_GEN_PROGS += so_incoming_cpu TEST_PROGS += sctp_vrf.sh TEST_GEN_FILES += sctp_hello TEST_GEN_FILES += csum +TEST_GEN_FILES += nat6to4.o +TEST_GEN_FILES += ip_local_port_range TEST_FILES := settings include ../lib.mk -include bpf/Makefile - $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread $(OUTPUT)/tcp_inq: LDLIBS += -lpthread $(OUTPUT)/bind_bhash: LDLIBS += -lpthread + +# Rules to generate bpf obj nat6to4.o +CLANG ?= clang +SCRATCH_DIR := $(OUTPUT)/tools +BUILD_DIR := $(SCRATCH_DIR)/build +BPFDIR := $(abspath ../../../lib/bpf) +APIDIR := $(abspath ../../../include/uapi) + +CCINCLUDE += -I../bpf +CCINCLUDE += -I../../../../usr/include/ +CCINCLUDE += -I$(SCRATCH_DIR)/include + +BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a + +MAKE_DIRS := $(BUILD_DIR)/libbpf +$(MAKE_DIRS): + mkdir -p $@ + +# Get Clang's default includes on this system, as opposed to those seen by +# '-target bpf'. This fixes "missing" files on some architectures/distros, +# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc. +# +# Use '-idirafter': Don't interfere with include mechanics except where the +# build would have failed anyways. +define get_sys_includes +$(shell $(1) $(2) -v -E - </dev/null 2>&1 \ + | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \ +$(shell $(1) $(2) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}') +endef + +ifneq ($(CROSS_COMPILE),) +CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%)) +endif + +CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG),$(CLANG_TARGET_ARCH)) + +$(OUTPUT)/nat6to4.o: nat6to4.c $(BPFOBJ) | $(MAKE_DIRS) + $(CLANG) -O2 -target bpf -c $< $(CCINCLUDE) $(CLANG_SYS_INCLUDES) -o $@ + +$(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ + $(APIDIR)/linux/bpf.h \ + | $(BUILD_DIR)/libbpf + $(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/ \ + EXTRA_CFLAGS='-g -O0' \ + DESTDIR=$(SCRATCH_DIR) prefix= all install_headers + +EXTRA_CLEAN := $(SCRATCH_DIR) diff --git a/tools/testing/selftests/net/bpf/Makefile b/tools/testing/selftests/net/bpf/Makefile deleted file mode 100644 index 4abaf16d2077..000000000000 --- a/tools/testing/selftests/net/bpf/Makefile +++ /dev/null @@ -1,51 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0 - -CLANG ?= clang -SCRATCH_DIR := $(OUTPUT)/tools -BUILD_DIR := $(SCRATCH_DIR)/build -BPFDIR := $(abspath ../../../lib/bpf) -APIDIR := $(abspath ../../../include/uapi) - -CCINCLUDE += -I../../bpf -CCINCLUDE += -I../../../../../usr/include/ -CCINCLUDE += -I$(SCRATCH_DIR)/include - -BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a - -MAKE_DIRS := $(BUILD_DIR)/libbpf $(OUTPUT)/bpf -$(MAKE_DIRS): - mkdir -p $@ - -TEST_CUSTOM_PROGS = $(OUTPUT)/bpf/nat6to4.o -all: $(TEST_CUSTOM_PROGS) - -# Get Clang's default includes on this system, as opposed to those seen by -# '-target bpf'. This fixes "missing" files on some architectures/distros, -# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc. -# -# Use '-idirafter': Don't interfere with include mechanics except where the -# build would have failed anyways. -define get_sys_includes -$(shell $(1) $(2) -v -E - </dev/null 2>&1 \ - | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \ -$(shell $(1) $(2) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}') -endef - -ifneq ($(CROSS_COMPILE),) -CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%)) -endif - -CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG),$(CLANG_TARGET_ARCH)) - -$(TEST_CUSTOM_PROGS): $(OUTPUT)/%.o: %.c $(BPFOBJ) | $(MAKE_DIRS) - $(CLANG) -O2 -target bpf -c $< $(CCINCLUDE) $(CLANG_SYS_INCLUDES) -o $@ - -$(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ - $(APIDIR)/linux/bpf.h \ - | $(BUILD_DIR)/libbpf - $(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/ \ - EXTRA_CFLAGS='-g -O0' \ - DESTDIR=$(SCRATCH_DIR) prefix= all install_headers - -EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) - diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config index bd89198cd817..cc9fd55ab869 100644 --- a/tools/testing/selftests/net/config +++ b/tools/testing/selftests/net/config @@ -3,6 +3,9 @@ CONFIG_NET_NS=y CONFIG_BPF_SYSCALL=y CONFIG_TEST_BPF=m CONFIG_NUMA=y +CONFIG_RPS=y +CONFIG_SYSFS=y +CONFIG_PROC_SYSCTL=y CONFIG_NET_VRF=y CONFIG_NET_L3_MASTER_DEV=y CONFIG_IPV6=y diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh index 5637b5dadabd..70ea8798b1f6 100755 --- a/tools/testing/selftests/net/fib_tests.sh +++ b/tools/testing/selftests/net/fib_tests.sh @@ -2065,6 +2065,8 @@ EOF ################################################################################ # main +trap cleanup EXIT + while getopts :t:pPhv o do case $o in diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile index 453ae006fbcf..91201ab3c4fc 100644 --- a/tools/testing/selftests/net/forwarding/Makefile +++ b/tools/testing/selftests/net/forwarding/Makefile @@ -4,6 +4,7 @@ TEST_PROGS = bridge_igmp.sh \ bridge_locked_port.sh \ bridge_mdb.sh \ bridge_mdb_host.sh \ + bridge_mdb_max.sh \ bridge_mdb_port_down.sh \ bridge_mld.sh \ bridge_port_isolation.sh \ diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb.sh b/tools/testing/selftests/net/forwarding/bridge_mdb.sh index 2fa5973c0c28..ae3f9462a2b6 100755 --- a/tools/testing/selftests/net/forwarding/bridge_mdb.sh +++ b/tools/testing/selftests/net/forwarding/bridge_mdb.sh @@ -742,10 +742,109 @@ cfg_test_port() cfg_test_port_l2 } +ipv4_grps_get() +{ + local max_grps=$1; shift + local i + + for i in $(seq 0 $((max_grps - 1))); do + echo "239.1.1.$i" + done +} + +ipv6_grps_get() +{ + local max_grps=$1; shift + local i + + for i in $(seq 0 $((max_grps - 1))); do + echo "ff0e::$(printf %x $i)" + done +} + +l2_grps_get() +{ + local max_grps=$1; shift + local i + + for i in $(seq 0 $((max_grps - 1))); do + echo "01:00:00:00:00:$(printf %02x $i)" + done +} + +cfg_test_dump_common() +{ + local name=$1; shift + local fn=$1; shift + local max_bridges=2 + local max_grps=256 + local max_ports=32 + local num_entries + local batch_file + local grp + local i j + + RET=0 + + # Create net devices. + for i in $(seq 1 $max_bridges); do + ip link add name br-test${i} up type bridge vlan_filtering 1 \ + mcast_snooping 1 + for j in $(seq 1 $max_ports); do + ip link add name br-test${i}-du${j} up \ + master br-test${i} type dummy + done + done + + # Create batch file with MDB entries. + batch_file=$(mktemp) + for i in $(seq 1 $max_bridges); do + for j in $(seq 1 $max_ports); do + for grp in $($fn $max_grps); do + echo "mdb add dev br-test${i} \ + port br-test${i}-du${j} grp $grp \ + permanent vid 1" >> $batch_file + done + done + done + + # Program the batch file and check for expected number of entries. + bridge -b $batch_file + for i in $(seq 1 $max_bridges); do + num_entries=$(bridge mdb show dev br-test${i} | \ + grep "permanent" | wc -l) + [[ $num_entries -eq $((max_grps * max_ports)) ]] + check_err $? "Wrong number of entries in br-test${i}" + done + + # Cleanup. + rm $batch_file + for i in $(seq 1 $max_bridges); do + ip link del dev br-test${i} + for j in $(seq $max_ports); do + ip link del dev br-test${i}-du${j} + done + done + + log_test "$name large scale dump tests" +} + +# Check large scale dump. +cfg_test_dump() +{ + echo + log_info "# Large scale dump tests" + + cfg_test_dump_common "IPv4" ipv4_grps_get + cfg_test_dump_common "IPv6" ipv6_grps_get + cfg_test_dump_common "L2" l2_grps_get +} + cfg_test() { cfg_test_host cfg_test_port + cfg_test_dump } __fwd_test_host_ip() @@ -1018,26 +1117,6 @@ fwd_test() ip -6 address del fe80::1/64 dev br0 } -igmpv3_is_in_get() -{ - local igmpv3 - - igmpv3=$(: - )"22:"$( : Type - Membership Report - )"00:"$( : Reserved - )"2a:f8:"$( : Checksum - )"00:00:"$( : Reserved - )"00:01:"$( : Number of Group Records - )"01:"$( : Record Type - IS_IN - )"00:"$( : Aux Data Len - )"00:01:"$( : Number of Sources - )"ef:01:01:01:"$( : Multicast Address - 239.1.1.1 - )"c0:00:02:02"$( : Source Address - 192.0.2.2 - ) - - echo $igmpv3 -} - ctrl_igmpv3_is_in_test() { RET=0 @@ -1049,7 +1128,7 @@ ctrl_igmpv3_is_in_test() # IS_IN ( 192.0.2.2 ) $MZ $h1.10 -c 1 -A 192.0.2.1 -B 239.1.1.1 \ - -t ip proto=2,p=$(igmpv3_is_in_get) -q + -t ip proto=2,p=$(igmpv3_is_in_get 239.1.1.1 192.0.2.2) -q bridge -d mdb show dev br0 vid 10 | grep 239.1.1.1 | grep -q 192.0.2.2 check_fail $? "Permanent entry affected by IGMP packet" @@ -1062,7 +1141,7 @@ ctrl_igmpv3_is_in_test() # IS_IN ( 192.0.2.2 ) $MZ $h1.10 -c 1 -A 192.0.2.1 -B 239.1.1.1 \ - -t ip proto=2,p=$(igmpv3_is_in_get) -q + -t ip proto=2,p=$(igmpv3_is_in_get 239.1.1.1 192.0.2.2) -q bridge -d mdb show dev br0 vid 10 | grep 239.1.1.1 | grep -v "src" | \ grep -q 192.0.2.2 @@ -1074,36 +1153,7 @@ ctrl_igmpv3_is_in_test() bridge mdb del dev br0 port $swp1 grp 239.1.1.1 vid 10 - log_test "IGMPv3 MODE_IS_INCLUE tests" -} - -mldv2_is_in_get() -{ - local hbh - local icmpv6 - - hbh=$(: - )"3a:"$( : Next Header - ICMPv6 - )"00:"$( : Hdr Ext Len - )"00:00:00:00:00:00:"$( : Options and Padding - ) - - icmpv6=$(: - )"8f:"$( : Type - MLDv2 Report - )"00:"$( : Code - )"45:39:"$( : Checksum - )"00:00:"$( : Reserved - )"00:01:"$( : Number of Group Records - )"01:"$( : Record Type - IS_IN - )"00:"$( : Aux Data Len - )"00:01:"$( : Number of Sources - )"ff:0e:00:00:00:00:00:00:"$( : Multicast address - ff0e::1 - )"00:00:00:00:00:00:00:01:"$( : - )"20:01:0d:b8:00:01:00:00:"$( : Source Address - 2001:db8:1::2 - )"00:00:00:00:00:00:00:02:"$( : - ) - - echo ${hbh}${icmpv6} + log_test "IGMPv3 MODE_IS_INCLUDE tests" } ctrl_mldv2_is_in_test() @@ -1116,8 +1166,9 @@ ctrl_mldv2_is_in_test() filter_mode include source_list 2001:db8:1::1 # IS_IN ( 2001:db8:1::2 ) + local p=$(mldv2_is_in_get fe80::1 ff0e::1 2001:db8:1::2) $MZ -6 $h1.10 -c 1 -A fe80::1 -B ff0e::1 \ - -t ip hop=1,next=0,p=$(mldv2_is_in_get) -q + -t ip hop=1,next=0,p="$p" -q bridge -d mdb show dev br0 vid 10 | grep ff0e::1 | \ grep -q 2001:db8:1::2 @@ -1131,7 +1182,7 @@ ctrl_mldv2_is_in_test() # IS_IN ( 2001:db8:1::2 ) $MZ -6 $h1.10 -c 1 -A fe80::1 -B ff0e::1 \ - -t ip hop=1,next=0,p=$(mldv2_is_in_get) -q + -t ip hop=1,next=0,p="$p" -q bridge -d mdb show dev br0 vid 10 | grep ff0e::1 | grep -v "src" | \ grep -q 2001:db8:1::2 diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh b/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh new file mode 100755 index 000000000000..ae255b662ba3 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh @@ -0,0 +1,1336 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# +-----------------------+ +------------------------+ +# | H1 (vrf) | | H2 (vrf) | +# | + $h1.10 | | + $h2.10 | +# | | 192.0.2.1/28 | | | 192.0.2.2/28 | +# | | 2001:db8:1::1/64 | | | 2001:db8:1::2/64 | +# | | | | | | +# | | + $h1.20 | | | + $h2.20 | +# | \ | 198.51.100.1/24 | | \ | 198.51.100.2/24 | +# | \ | 2001:db8:2::1/64 | | \ | 2001:db8:2::2/64 | +# | \| | | \| | +# | + $h1 | | + $h2 | +# +----|------------------+ +----|-------------------+ +# | | +# +----|--------------------------------------------------|-------------------+ +# | SW | | | +# | +--|--------------------------------------------------|-----------------+ | +# | | + $swp1 BR0 (802.1q) + $swp2 | | +# | | vid 10 vid 10 | | +# | | vid 20 vid 20 | | +# | | | | +# | +-----------------------------------------------------------------------+ | +# +---------------------------------------------------------------------------+ + +ALL_TESTS=" + test_8021d + test_8021q + test_8021qvs +" + +NUM_NETIFS=4 +source lib.sh +source tc_common.sh + +h1_create() +{ + simple_if_init $h1 + vlan_create $h1 10 v$h1 192.0.2.1/28 2001:db8:1::1/64 + vlan_create $h1 20 v$h1 198.51.100.1/24 2001:db8:2::1/64 +} + +h1_destroy() +{ + vlan_destroy $h1 20 + vlan_destroy $h1 10 + simple_if_fini $h1 +} + +h2_create() +{ + simple_if_init $h2 + vlan_create $h2 10 v$h2 192.0.2.2/28 + vlan_create $h2 20 v$h2 198.51.100.2/24 +} + +h2_destroy() +{ + vlan_destroy $h2 20 + vlan_destroy $h2 10 + simple_if_fini $h2 +} + +switch_create_8021d() +{ + log_info "802.1d tests" + + ip link add name br0 type bridge vlan_filtering 0 \ + mcast_snooping 1 \ + mcast_igmp_version 3 mcast_mld_version 2 + ip link set dev br0 up + + ip link set dev $swp1 master br0 + ip link set dev $swp1 up + bridge link set dev $swp1 fastleave on + + ip link set dev $swp2 master br0 + ip link set dev $swp2 up +} + +switch_create_8021q() +{ + local br_flags=$1; shift + + log_info "802.1q $br_flags${br_flags:+ }tests" + + ip link add name br0 type bridge vlan_filtering 1 vlan_default_pvid 0 \ + mcast_snooping 1 $br_flags \ + mcast_igmp_version 3 mcast_mld_version 2 + bridge vlan add vid 10 dev br0 self + bridge vlan add vid 20 dev br0 self + ip link set dev br0 up + + ip link set dev $swp1 master br0 + ip link set dev $swp1 up + bridge link set dev $swp1 fastleave on + bridge vlan add vid 10 dev $swp1 + bridge vlan add vid 20 dev $swp1 + + ip link set dev $swp2 master br0 + ip link set dev $swp2 up + bridge vlan add vid 10 dev $swp2 + bridge vlan add vid 20 dev $swp2 +} + +switch_create_8021qvs() +{ + switch_create_8021q "mcast_vlan_snooping 1" + bridge vlan global set dev br0 vid 10 mcast_igmp_version 3 + bridge vlan global set dev br0 vid 10 mcast_mld_version 2 + bridge vlan global set dev br0 vid 20 mcast_igmp_version 3 + bridge vlan global set dev br0 vid 20 mcast_mld_version 2 +} + +switch_destroy() +{ + ip link set dev $swp2 down + ip link set dev $swp2 nomaster + + ip link set dev $swp1 down + ip link set dev $swp1 nomaster + + ip link set dev br0 down + ip link del dev br0 +} + +setup_prepare() +{ + h1=${NETIFS[p1]} + swp1=${NETIFS[p2]} + + swp2=${NETIFS[p3]} + h2=${NETIFS[p4]} + + vrf_prepare + forwarding_enable + + h1_create + h2_create +} + +cleanup() +{ + pre_cleanup + + switch_destroy 2>/dev/null + h2_destroy + h1_destroy + + forwarding_restore + vrf_cleanup +} + +cfg_src_list() +{ + local IPs=("$@") + local IPstr=$(echo ${IPs[@]} | tr '[:space:]' , | sed 's/,$//') + + echo ${IPstr:+source_list }${IPstr} +} + +cfg_group_op() +{ + local op=$1; shift + local locus=$1; shift + local GRP=$1; shift + local state=$1; shift + local IPs=("$@") + + local source_list=$(cfg_src_list ${IPs[@]}) + + # Everything besides `bridge mdb' uses the "dev X vid Y" syntax, + # so we use it here as well and convert. + local br_locus=$(echo "$locus" | sed 's/^dev /port /') + + bridge mdb $op dev br0 $br_locus grp $GRP $state \ + filter_mode include $source_list +} + +cfg4_entries_op() +{ + local op=$1; shift + local locus=$1; shift + local state=$1; shift + local n=$1; shift + local grp=${1:-1}; shift + + local GRP=239.1.1.${grp} + local IPs=$(seq -f 192.0.2.%g 1 $((n - 1))) + cfg_group_op "$op" "$locus" "$GRP" "$state" ${IPs[@]} +} + +cfg4_entries_add() +{ + cfg4_entries_op add "$@" +} + +cfg4_entries_del() +{ + cfg4_entries_op del "$@" +} + +cfg6_entries_op() +{ + local op=$1; shift + local locus=$1; shift + local state=$1; shift + local n=$1; shift + local grp=${1:-1}; shift + + local GRP=ff0e::${grp} + local IPs=$(printf "2001:db8:1::%x\n" $(seq 1 $((n - 1)))) + cfg_group_op "$op" "$locus" "$GRP" "$state" ${IPs[@]} +} + +cfg6_entries_add() +{ + cfg6_entries_op add "$@" +} + +cfg6_entries_del() +{ + cfg6_entries_op del "$@" +} + +locus_dev_peer() +{ + local dev_kw=$1; shift + local dev=$1; shift + local vid_kw=$1; shift + local vid=$1; shift + + echo "$h1.${vid:-10}" +} + +locus_dev() +{ + local dev_kw=$1; shift + local dev=$1; shift + + echo $dev +} + +ctl4_entries_add() +{ + local locus=$1; shift + local state=$1; shift + local n=$1; shift + local grp=${1:-1}; shift + + local IPs=$(seq -f 192.0.2.%g 1 $((n - 1))) + local peer=$(locus_dev_peer $locus) + local GRP=239.1.1.${grp} + $MZ $peer -c 1 -A 192.0.2.1 -B $GRP \ + -t ip proto=2,p=$(igmpv3_is_in_get $GRP $IPs) -q + sleep 1 + + local nn=$(bridge mdb show dev br0 | grep $GRP | wc -l) + if ((nn != n)); then + echo mcast_max_groups > /dev/stderr + false + fi +} + +ctl4_entries_del() +{ + local locus=$1; shift + local state=$1; shift + local n=$1; shift + local grp=${1:-1}; shift + + local peer=$(locus_dev_peer $locus) + local GRP=239.1.1.${grp} + $MZ $peer -c 1 -A 192.0.2.1 -B 224.0.0.2 \ + -t ip proto=2,p=$(igmpv2_leave_get $GRP) -q + sleep 1 + ! bridge mdb show dev br0 | grep -q $GRP +} + +ctl6_entries_add() +{ + local locus=$1; shift + local state=$1; shift + local n=$1; shift + local grp=${1:-1}; shift + + local IPs=$(printf "2001:db8:1::%x\n" $(seq 1 $((n - 1)))) + local peer=$(locus_dev_peer $locus) + local SIP=fe80::1 + local GRP=ff0e::${grp} + local p=$(mldv2_is_in_get $SIP $GRP $IPs) + $MZ -6 $peer -c 1 -A $SIP -B $GRP -t ip hop=1,next=0,p="$p" -q + sleep 1 + + local nn=$(bridge mdb show dev br0 | grep $GRP | wc -l) + if ((nn != n)); then + echo mcast_max_groups > /dev/stderr + false + fi +} + +ctl6_entries_del() +{ + local locus=$1; shift + local state=$1; shift + local n=$1; shift + local grp=${1:-1}; shift + + local peer=$(locus_dev_peer $locus) + local SIP=fe80::1 + local GRP=ff0e::${grp} + local p=$(mldv1_done_get $SIP $GRP) + $MZ -6 $peer -c 1 -A $SIP -B $GRP -t ip hop=1,next=0,p="$p" -q + sleep 1 + ! bridge mdb show dev br0 | grep -q $GRP +} + +bridge_maxgroups_errmsg_check_cfg() +{ + local msg=$1; shift + local needle=$1; shift + + echo "$msg" | grep -q mcast_max_groups + check_err $? "Adding MDB entries failed for the wrong reason: $msg" +} + +bridge_maxgroups_errmsg_check_cfg4() +{ + bridge_maxgroups_errmsg_check_cfg "$@" +} + +bridge_maxgroups_errmsg_check_cfg6() +{ + bridge_maxgroups_errmsg_check_cfg "$@" +} + +bridge_maxgroups_errmsg_check_ctl4() +{ + : +} + +bridge_maxgroups_errmsg_check_ctl6() +{ + : +} + +bridge_port_ngroups_get() +{ + local locus=$1; shift + + bridge -j -d link show $locus | + jq '.[].mcast_n_groups' +} + +bridge_port_maxgroups_get() +{ + local locus=$1; shift + + bridge -j -d link show $locus | + jq '.[].mcast_max_groups' +} + +bridge_port_maxgroups_set() +{ + local locus=$1; shift + local max=$1; shift + + bridge link set dev $(locus_dev $locus) mcast_max_groups $max +} + +bridge_port_vlan_ngroups_get() +{ + local locus=$1; shift + + bridge -j -d vlan show $locus | + jq '.[].vlans[].mcast_n_groups' +} + +bridge_port_vlan_maxgroups_get() +{ + local locus=$1; shift + + bridge -j -d vlan show $locus | + jq '.[].vlans[].mcast_max_groups' +} + +bridge_port_vlan_maxgroups_set() +{ + local locus=$1; shift + local max=$1; shift + + bridge vlan set $locus mcast_max_groups $max +} + +test_ngroups_reporting() +{ + local CFG=$1; shift + local context=$1; shift + local locus=$1; shift + + RET=0 + + local n0=$(bridge_${context}_ngroups_get "$locus") + ${CFG}_entries_add "$locus" temp 5 + check_err $? "Couldn't add MDB entries" + local n1=$(bridge_${context}_ngroups_get "$locus") + + ((n1 == n0 + 5)) + check_err $? "Number of groups was $n0, now is $n1, but $((n0 + 5)) expected" + + ${CFG}_entries_del "$locus" temp 5 + check_err $? "Couldn't delete MDB entries" + local n2=$(bridge_${context}_ngroups_get "$locus") + + ((n2 == n0)) + check_err $? "Number of groups was $n0, now is $n2, but should be back to $n0" + + log_test "$CFG: $context: ngroups reporting" +} + +test_8021d_ngroups_reporting_cfg4() +{ + test_ngroups_reporting cfg4 port "dev $swp1" +} + +test_8021d_ngroups_reporting_ctl4() +{ + test_ngroups_reporting ctl4 port "dev $swp1" +} + +test_8021d_ngroups_reporting_cfg6() +{ + test_ngroups_reporting cfg6 port "dev $swp1" +} + +test_8021d_ngroups_reporting_ctl6() +{ + test_ngroups_reporting ctl6 port "dev $swp1" +} + +test_8021q_ngroups_reporting_cfg4() +{ + test_ngroups_reporting cfg4 port "dev $swp1 vid 10" +} + +test_8021q_ngroups_reporting_ctl4() +{ + test_ngroups_reporting ctl4 port "dev $swp1 vid 10" +} + +test_8021q_ngroups_reporting_cfg6() +{ + test_ngroups_reporting cfg6 port "dev $swp1 vid 10" +} + +test_8021q_ngroups_reporting_ctl6() +{ + test_ngroups_reporting ctl6 port "dev $swp1 vid 10" +} + +test_8021qvs_ngroups_reporting_cfg4() +{ + test_ngroups_reporting cfg4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_ngroups_reporting_ctl4() +{ + test_ngroups_reporting ctl4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_ngroups_reporting_cfg6() +{ + test_ngroups_reporting cfg6 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_ngroups_reporting_ctl6() +{ + test_ngroups_reporting ctl6 port_vlan "dev $swp1 vid 10" +} + +test_ngroups_cross_vlan() +{ + local CFG=$1; shift + + local locus1="dev $swp1 vid 10" + local locus2="dev $swp1 vid 20" + + RET=0 + + local n10=$(bridge_port_vlan_ngroups_get "$locus1") + local n20=$(bridge_port_vlan_ngroups_get "$locus2") + ${CFG}_entries_add "$locus1" temp 5 111 + check_err $? "Couldn't add MDB entries to VLAN 10" + local n11=$(bridge_port_vlan_ngroups_get "$locus1") + local n21=$(bridge_port_vlan_ngroups_get "$locus2") + + ((n11 == n10 + 5)) + check_err $? "Number of groups at VLAN 10 was $n10, now is $n11, but 5 entries added on VLAN 10, $((n10 + 5)) expected" + + ((n21 == n20)) + check_err $? "Number of groups at VLAN 20 was $n20, now is $n21, but no change expected on VLAN 20" + + ${CFG}_entries_add "$locus2" temp 5 112 + check_err $? "Couldn't add MDB entries to VLAN 20" + local n12=$(bridge_port_vlan_ngroups_get "$locus1") + local n22=$(bridge_port_vlan_ngroups_get "$locus2") + + ((n12 == n11)) + check_err $? "Number of groups at VLAN 10 was $n11, now is $n12, but no change expected on VLAN 10" + + ((n22 == n21 + 5)) + check_err $? "Number of groups at VLAN 20 was $n21, now is $n22, but 5 entries added on VLAN 20, $((n21 + 5)) expected" + + ${CFG}_entries_del "$locus1" temp 5 111 + check_err $? "Couldn't delete MDB entries from VLAN 10" + ${CFG}_entries_del "$locus2" temp 5 112 + check_err $? "Couldn't delete MDB entries from VLAN 20" + local n13=$(bridge_port_vlan_ngroups_get "$locus1") + local n23=$(bridge_port_vlan_ngroups_get "$locus2") + + ((n13 == n10)) + check_err $? "Number of groups at VLAN 10 was $n10, now is $n13, but should be back to $n10" + + ((n23 == n20)) + check_err $? "Number of groups at VLAN 20 was $n20, now is $n23, but should be back to $n20" + + log_test "$CFG: port_vlan: isolation of port and per-VLAN ngroups" +} + +test_8021qvs_ngroups_cross_vlan_cfg4() +{ + test_ngroups_cross_vlan cfg4 +} + +test_8021qvs_ngroups_cross_vlan_ctl4() +{ + test_ngroups_cross_vlan ctl4 +} + +test_8021qvs_ngroups_cross_vlan_cfg6() +{ + test_ngroups_cross_vlan cfg6 +} + +test_8021qvs_ngroups_cross_vlan_ctl6() +{ + test_ngroups_cross_vlan ctl6 +} + +test_maxgroups_zero() +{ + local CFG=$1; shift + local context=$1; shift + local locus=$1; shift + + RET=0 + local max + + max=$(bridge_${context}_maxgroups_get "$locus") + ((max == 0)) + check_err $? "Max groups on $locus should be 0, but $max reported" + + bridge_${context}_maxgroups_set "$locus" 100 + check_err $? "Failed to set max to 100" + max=$(bridge_${context}_maxgroups_get "$locus") + ((max == 100)) + check_err $? "Max groups expected to be 100, but $max reported" + + bridge_${context}_maxgroups_set "$locus" 0 + check_err $? "Couldn't set maximum to 0" + + # Test that setting 0 explicitly still serves as infinity. + ${CFG}_entries_add "$locus" temp 5 + check_err $? "Adding 5 MDB entries failed but should have passed" + ${CFG}_entries_del "$locus" temp 5 + check_err $? "Couldn't delete MDB entries" + + log_test "$CFG: $context maxgroups: reporting and treatment of 0" +} + +test_8021d_maxgroups_zero_cfg4() +{ + test_maxgroups_zero cfg4 port "dev $swp1" +} + +test_8021d_maxgroups_zero_ctl4() +{ + test_maxgroups_zero ctl4 port "dev $swp1" +} + +test_8021d_maxgroups_zero_cfg6() +{ + test_maxgroups_zero cfg6 port "dev $swp1" +} + +test_8021d_maxgroups_zero_ctl6() +{ + test_maxgroups_zero ctl6 port "dev $swp1" +} + +test_8021q_maxgroups_zero_cfg4() +{ + test_maxgroups_zero cfg4 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_zero_ctl4() +{ + test_maxgroups_zero ctl4 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_zero_cfg6() +{ + test_maxgroups_zero cfg6 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_zero_ctl6() +{ + test_maxgroups_zero ctl6 port "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_zero_cfg4() +{ + test_maxgroups_zero cfg4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_zero_ctl4() +{ + test_maxgroups_zero ctl4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_zero_cfg6() +{ + test_maxgroups_zero cfg6 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_zero_ctl6() +{ + test_maxgroups_zero ctl6 port_vlan "dev $swp1 vid 10" +} + +test_maxgroups_zero_cross_vlan() +{ + local CFG=$1; shift + + local locus0="dev $swp1" + local locus1="dev $swp1 vid 10" + local locus2="dev $swp1 vid 20" + local max + + RET=0 + + bridge_port_vlan_maxgroups_set "$locus1" 100 + check_err $? "$locus1: Failed to set max to 100" + + max=$(bridge_port_maxgroups_get "$locus0") + ((max == 0)) + check_err $? "$locus0: Max groups expected to be 0, but $max reported" + + max=$(bridge_port_vlan_maxgroups_get "$locus2") + ((max == 0)) + check_err $? "$locus2: Max groups expected to be 0, but $max reported" + + bridge_port_vlan_maxgroups_set "$locus2" 100 + check_err $? "$locus2: Failed to set max to 100" + + max=$(bridge_port_maxgroups_get "$locus0") + ((max == 0)) + check_err $? "$locus0: Max groups expected to be 0, but $max reported" + + max=$(bridge_port_vlan_maxgroups_get "$locus2") + ((max == 100)) + check_err $? "$locus2: Max groups expected to be 100, but $max reported" + + bridge_port_maxgroups_set "$locus0" 100 + check_err $? "$locus0: Failed to set max to 100" + + max=$(bridge_port_maxgroups_get "$locus0") + ((max == 100)) + check_err $? "$locus0: Max groups expected to be 100, but $max reported" + + max=$(bridge_port_vlan_maxgroups_get "$locus2") + ((max == 100)) + check_err $? "$locus2: Max groups expected to be 100, but $max reported" + + bridge_port_vlan_maxgroups_set "$locus1" 0 + check_err $? "$locus1: Failed to set max to 0" + + max=$(bridge_port_maxgroups_get "$locus0") + ((max == 100)) + check_err $? "$locus0: Max groups expected to be 100, but $max reported" + + max=$(bridge_port_vlan_maxgroups_get "$locus2") + ((max == 100)) + check_err $? "$locus2: Max groups expected to be 100, but $max reported" + + bridge_port_vlan_maxgroups_set "$locus2" 0 + check_err $? "$locus2: Failed to set max to 0" + + max=$(bridge_port_maxgroups_get "$locus0") + ((max == 100)) + check_err $? "$locus0: Max groups expected to be 100, but $max reported" + + max=$(bridge_port_vlan_maxgroups_get "$locus2") + ((max == 0)) + check_err $? "$locus2: Max groups expected to be 0 but $max reported" + + bridge_port_maxgroups_set "$locus0" 0 + check_err $? "$locus0: Failed to set max to 0" + + max=$(bridge_port_maxgroups_get "$locus0") + ((max == 0)) + check_err $? "$locus0: Max groups expected to be 0, but $max reported" + + max=$(bridge_port_vlan_maxgroups_get "$locus2") + ((max == 0)) + check_err $? "$locus2: Max groups expected to be 0, but $max reported" + + log_test "$CFG: port_vlan maxgroups: isolation of port and per-VLAN maximums" +} + +test_8021qvs_maxgroups_zero_cross_vlan_cfg4() +{ + test_maxgroups_zero_cross_vlan cfg4 +} + +test_8021qvs_maxgroups_zero_cross_vlan_ctl4() +{ + test_maxgroups_zero_cross_vlan ctl4 +} + +test_8021qvs_maxgroups_zero_cross_vlan_cfg6() +{ + test_maxgroups_zero_cross_vlan cfg6 +} + +test_8021qvs_maxgroups_zero_cross_vlan_ctl6() +{ + test_maxgroups_zero_cross_vlan ctl6 +} + +test_maxgroups_too_low() +{ + local CFG=$1; shift + local context=$1; shift + local locus=$1; shift + + RET=0 + + local n=$(bridge_${context}_ngroups_get "$locus") + local msg + + ${CFG}_entries_add "$locus" temp 5 111 + check_err $? "$locus: Couldn't add MDB entries" + + bridge_${context}_maxgroups_set "$locus" $((n+2)) + check_err $? "$locus: Setting maxgroups to $((n+2)) failed" + + msg=$(${CFG}_entries_add "$locus" temp 2 112 2>&1) + check_fail $? "$locus: Adding more entries passed when max<n" + bridge_maxgroups_errmsg_check_cfg "$msg" + + ${CFG}_entries_del "$locus" temp 5 111 + check_err $? "$locus: Couldn't delete MDB entries" + + ${CFG}_entries_add "$locus" temp 2 112 + check_err $? "$locus: Adding more entries failed" + + ${CFG}_entries_del "$locus" temp 2 112 + check_err $? "$locus: Deleting more entries failed" + + bridge_${context}_maxgroups_set "$locus" 0 + check_err $? "$locus: Couldn't set maximum to 0" + + log_test "$CFG: $context maxgroups: configure below ngroups" +} + +test_8021d_maxgroups_too_low_cfg4() +{ + test_maxgroups_too_low cfg4 port "dev $swp1" +} + +test_8021d_maxgroups_too_low_ctl4() +{ + test_maxgroups_too_low ctl4 port "dev $swp1" +} + +test_8021d_maxgroups_too_low_cfg6() +{ + test_maxgroups_too_low cfg6 port "dev $swp1" +} + +test_8021d_maxgroups_too_low_ctl6() +{ + test_maxgroups_too_low ctl6 port "dev $swp1" +} + +test_8021q_maxgroups_too_low_cfg4() +{ + test_maxgroups_too_low cfg4 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_too_low_ctl4() +{ + test_maxgroups_too_low ctl4 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_too_low_cfg6() +{ + test_maxgroups_too_low cfg6 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_too_low_ctl6() +{ + test_maxgroups_too_low ctl6 port "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_low_cfg4() +{ + test_maxgroups_too_low cfg4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_low_ctl4() +{ + test_maxgroups_too_low ctl4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_low_cfg6() +{ + test_maxgroups_too_low cfg6 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_low_ctl6() +{ + test_maxgroups_too_low ctl6 port_vlan "dev $swp1 vid 10" +} + +test_maxgroups_too_many_entries() +{ + local CFG=$1; shift + local context=$1; shift + local locus=$1; shift + + RET=0 + + local n=$(bridge_${context}_ngroups_get "$locus") + local msg + + # Configure a low maximum + bridge_${context}_maxgroups_set "$locus" $((n+1)) + check_err $? "$locus: Couldn't set maximum" + + # Try to add more entries than the configured maximum + msg=$(${CFG}_entries_add "$locus" temp 5 2>&1) + check_fail $? "Adding 5 MDB entries passed, but should have failed" + bridge_maxgroups_errmsg_check_${CFG} "$msg" + + # When adding entries through the control path, as many as possible + # get created. That's consistent with the mcast_hash_max behavior. + # So there, drop the entries explicitly. + if [[ ${CFG%[46]} == ctl ]]; then + ${CFG}_entries_del "$locus" temp 17 2>&1 + fi + + local n2=$(bridge_${context}_ngroups_get "$locus") + ((n2 == n)) + check_err $? "Number of groups was $n, but after a failed attempt to add MDB entries it changed to $n2" + + bridge_${context}_maxgroups_set "$locus" 0 + check_err $? "$locus: Couldn't set maximum to 0" + + log_test "$CFG: $context maxgroups: add too many MDB entries" +} + +test_8021d_maxgroups_too_many_entries_cfg4() +{ + test_maxgroups_too_many_entries cfg4 port "dev $swp1" +} + +test_8021d_maxgroups_too_many_entries_ctl4() +{ + test_maxgroups_too_many_entries ctl4 port "dev $swp1" +} + +test_8021d_maxgroups_too_many_entries_cfg6() +{ + test_maxgroups_too_many_entries cfg6 port "dev $swp1" +} + +test_8021d_maxgroups_too_many_entries_ctl6() +{ + test_maxgroups_too_many_entries ctl6 port "dev $swp1" +} + +test_8021q_maxgroups_too_many_entries_cfg4() +{ + test_maxgroups_too_many_entries cfg4 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_too_many_entries_ctl4() +{ + test_maxgroups_too_many_entries ctl4 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_too_many_entries_cfg6() +{ + test_maxgroups_too_many_entries cfg6 port "dev $swp1 vid 10" +} + +test_8021q_maxgroups_too_many_entries_ctl6() +{ + test_maxgroups_too_many_entries ctl6 port "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_many_entries_cfg4() +{ + test_maxgroups_too_many_entries cfg4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_many_entries_ctl4() +{ + test_maxgroups_too_many_entries ctl4 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_many_entries_cfg6() +{ + test_maxgroups_too_many_entries cfg6 port_vlan "dev $swp1 vid 10" +} + +test_8021qvs_maxgroups_too_many_entries_ctl6() +{ + test_maxgroups_too_many_entries ctl6 port_vlan "dev $swp1 vid 10" +} + +test_maxgroups_too_many_cross_vlan() +{ + local CFG=$1; shift + + RET=0 + + local locus0="dev $swp1" + local locus1="dev $swp1 vid 10" + local locus2="dev $swp1 vid 20" + local n1=$(bridge_port_vlan_ngroups_get "$locus1") + local n2=$(bridge_port_vlan_ngroups_get "$locus2") + local msg + + if ((n1 > n2)); then + local tmp=$n1 + n1=$n2 + n2=$tmp + + tmp="$locus1" + locus1="$locus2" + locus2="$tmp" + fi + + # Now 0 <= n1 <= n2. + ${CFG}_entries_add "$locus2" temp 5 112 + check_err $? "Couldn't add 5 entries" + + n2=$(bridge_port_vlan_ngroups_get "$locus2") + # Now 0 <= n1 < n2-1. + + # Setting locus1'maxgroups to n2-1 should pass. The number is + # smaller than both the absolute number of MDB entries, and in + # particular than number of locus2's number of entries, but it is + # large enough to cover locus1's entries. Thus we check that + # individual VLAN's ngroups are independent. + bridge_port_vlan_maxgroups_set "$locus1" $((n2-1)) + check_err $? "Setting ${locus1}'s maxgroups to $((n2-1)) failed" + + msg=$(${CFG}_entries_add "$locus1" temp $n2 111 2>&1) + check_fail $? "$locus1: Adding $n2 MDB entries passed, but should have failed" + bridge_maxgroups_errmsg_check_${CFG} "$msg" + + bridge_port_maxgroups_set "$locus0" $((n1 + n2 + 2)) + check_err $? "$locus0: Couldn't set maximum" + + msg=$(${CFG}_entries_add "$locus1" temp 5 111 2>&1) + check_fail $? "$locus1: Adding 5 MDB entries passed, but should have failed" + bridge_maxgroups_errmsg_check_${CFG} "$msg" + + # IGMP/MLD packets can cause several entries to be added, before + # the maximum is hit and the rest is then bounced. Remove what was + # committed, if anything. + ${CFG}_entries_del "$locus1" temp 5 111 2>/dev/null + + ${CFG}_entries_add "$locus1" temp 2 111 + check_err $? "$locus1: Adding 2 MDB entries failed, but should have passed" + + ${CFG}_entries_del "$locus1" temp 2 111 + check_err $? "Couldn't delete MDB entries" + + ${CFG}_entries_del "$locus2" temp 5 112 + check_err $? "Couldn't delete MDB entries" + + bridge_port_vlan_maxgroups_set "$locus1" 0 + check_err $? "$locus1: Couldn't set maximum to 0" + + bridge_port_maxgroups_set "$locus0" 0 + check_err $? "$locus0: Couldn't set maximum to 0" + + log_test "$CFG: port_vlan maxgroups: isolation of port and per-VLAN ngroups" +} + +test_8021qvs_maxgroups_too_many_cross_vlan_cfg4() +{ + test_maxgroups_too_many_cross_vlan cfg4 +} + +test_8021qvs_maxgroups_too_many_cross_vlan_ctl4() +{ + test_maxgroups_too_many_cross_vlan ctl4 +} + +test_8021qvs_maxgroups_too_many_cross_vlan_cfg6() +{ + test_maxgroups_too_many_cross_vlan cfg6 +} + +test_8021qvs_maxgroups_too_many_cross_vlan_ctl6() +{ + test_maxgroups_too_many_cross_vlan ctl6 +} + +test_vlan_attributes() +{ + local locus=$1; shift + local expect=$1; shift + + RET=0 + + local max=$(bridge_port_vlan_maxgroups_get "$locus") + local n=$(bridge_port_vlan_ngroups_get "$locus") + + eval "[[ $max $expect ]]" + check_err $? "$locus: maxgroups attribute expected to be $expect, but was $max" + + eval "[[ $n $expect ]]" + check_err $? "$locus: ngroups attribute expected to be $expect, but was $n" + + log_test "port_vlan: presence of ngroups and maxgroups attributes" +} + +test_8021q_vlan_attributes() +{ + test_vlan_attributes "dev $swp1 vid 10" "== null" +} + +test_8021qvs_vlan_attributes() +{ + test_vlan_attributes "dev $swp1 vid 10" "-ge 0" +} + +test_toggle_vlan_snooping() +{ + local mode=$1; shift + + RET=0 + + local CFG=cfg4 + local context=port_vlan + local locus="dev $swp1 vid 10" + + ${CFG}_entries_add "$locus" $mode 5 + check_err $? "Couldn't add MDB entries" + + bridge_${context}_maxgroups_set "$locus" 100 + check_err $? "Failed to set max to 100" + + ip link set dev br0 type bridge mcast_vlan_snooping 0 + sleep 1 + ip link set dev br0 type bridge mcast_vlan_snooping 1 + + local n=$(bridge_${context}_ngroups_get "$locus") + local nn=$(bridge mdb show dev br0 | grep $swp1 | wc -l) + ((nn == n)) + check_err $? "mcast_n_groups expected to be $nn, but $n reported" + + local max=$(bridge_${context}_maxgroups_get "$locus") + ((max == 100)) + check_err $? "Max groups expected to be 100 but $max reported" + + bridge_${context}_maxgroups_set "$locus" 0 + check_err $? "Failed to set max to 0" + + log_test "$CFG: $context: $mode: mcast_vlan_snooping toggle" +} + +test_toggle_vlan_snooping_temp() +{ + test_toggle_vlan_snooping temp +} + +test_toggle_vlan_snooping_permanent() +{ + test_toggle_vlan_snooping permanent +} + +# ngroup test suites + +test_8021d_ngroups_cfg4() +{ + test_8021d_ngroups_reporting_cfg4 +} + +test_8021d_ngroups_ctl4() +{ + test_8021d_ngroups_reporting_ctl4 +} + +test_8021d_ngroups_cfg6() +{ + test_8021d_ngroups_reporting_cfg6 +} + +test_8021d_ngroups_ctl6() +{ + test_8021d_ngroups_reporting_ctl6 +} + +test_8021q_ngroups_cfg4() +{ + test_8021q_ngroups_reporting_cfg4 +} + +test_8021q_ngroups_ctl4() +{ + test_8021q_ngroups_reporting_ctl4 +} + +test_8021q_ngroups_cfg6() +{ + test_8021q_ngroups_reporting_cfg6 +} + +test_8021q_ngroups_ctl6() +{ + test_8021q_ngroups_reporting_ctl6 +} + +test_8021qvs_ngroups_cfg4() +{ + test_8021qvs_ngroups_reporting_cfg4 + test_8021qvs_ngroups_cross_vlan_cfg4 +} + +test_8021qvs_ngroups_ctl4() +{ + test_8021qvs_ngroups_reporting_ctl4 + test_8021qvs_ngroups_cross_vlan_ctl4 +} + +test_8021qvs_ngroups_cfg6() +{ + test_8021qvs_ngroups_reporting_cfg6 + test_8021qvs_ngroups_cross_vlan_cfg6 +} + +test_8021qvs_ngroups_ctl6() +{ + test_8021qvs_ngroups_reporting_ctl6 + test_8021qvs_ngroups_cross_vlan_ctl6 +} + +# maxgroups test suites + +test_8021d_maxgroups_cfg4() +{ + test_8021d_maxgroups_zero_cfg4 + test_8021d_maxgroups_too_low_cfg4 + test_8021d_maxgroups_too_many_entries_cfg4 +} + +test_8021d_maxgroups_ctl4() +{ + test_8021d_maxgroups_zero_ctl4 + test_8021d_maxgroups_too_low_ctl4 + test_8021d_maxgroups_too_many_entries_ctl4 +} + +test_8021d_maxgroups_cfg6() +{ + test_8021d_maxgroups_zero_cfg6 + test_8021d_maxgroups_too_low_cfg6 + test_8021d_maxgroups_too_many_entries_cfg6 +} + +test_8021d_maxgroups_ctl6() +{ + test_8021d_maxgroups_zero_ctl6 + test_8021d_maxgroups_too_low_ctl6 + test_8021d_maxgroups_too_many_entries_ctl6 +} + +test_8021q_maxgroups_cfg4() +{ + test_8021q_maxgroups_zero_cfg4 + test_8021q_maxgroups_too_low_cfg4 + test_8021q_maxgroups_too_many_entries_cfg4 +} + +test_8021q_maxgroups_ctl4() +{ + test_8021q_maxgroups_zero_ctl4 + test_8021q_maxgroups_too_low_ctl4 + test_8021q_maxgroups_too_many_entries_ctl4 +} + +test_8021q_maxgroups_cfg6() +{ + test_8021q_maxgroups_zero_cfg6 + test_8021q_maxgroups_too_low_cfg6 + test_8021q_maxgroups_too_many_entries_cfg6 +} + +test_8021q_maxgroups_ctl6() +{ + test_8021q_maxgroups_zero_ctl6 + test_8021q_maxgroups_too_low_ctl6 + test_8021q_maxgroups_too_many_entries_ctl6 +} + +test_8021qvs_maxgroups_cfg4() +{ + test_8021qvs_maxgroups_zero_cfg4 + test_8021qvs_maxgroups_zero_cross_vlan_cfg4 + test_8021qvs_maxgroups_too_low_cfg4 + test_8021qvs_maxgroups_too_many_entries_cfg4 + test_8021qvs_maxgroups_too_many_cross_vlan_cfg4 +} + +test_8021qvs_maxgroups_ctl4() +{ + test_8021qvs_maxgroups_zero_ctl4 + test_8021qvs_maxgroups_zero_cross_vlan_ctl4 + test_8021qvs_maxgroups_too_low_ctl4 + test_8021qvs_maxgroups_too_many_entries_ctl4 + test_8021qvs_maxgroups_too_many_cross_vlan_ctl4 +} + +test_8021qvs_maxgroups_cfg6() +{ + test_8021qvs_maxgroups_zero_cfg6 + test_8021qvs_maxgroups_zero_cross_vlan_cfg6 + test_8021qvs_maxgroups_too_low_cfg6 + test_8021qvs_maxgroups_too_many_entries_cfg6 + test_8021qvs_maxgroups_too_many_cross_vlan_cfg6 +} + +test_8021qvs_maxgroups_ctl6() +{ + test_8021qvs_maxgroups_zero_ctl6 + test_8021qvs_maxgroups_zero_cross_vlan_ctl6 + test_8021qvs_maxgroups_too_low_ctl6 + test_8021qvs_maxgroups_too_many_entries_ctl6 + test_8021qvs_maxgroups_too_many_cross_vlan_ctl6 +} + +# other test suites + +test_8021qvs_toggle_vlan_snooping() +{ + test_toggle_vlan_snooping_temp + test_toggle_vlan_snooping_permanent +} + +# test groups + +test_8021d() +{ + # Tests for vlan_filtering 0 mcast_vlan_snooping 0. + + switch_create_8021d + setup_wait + + test_8021d_ngroups_cfg4 + test_8021d_ngroups_ctl4 + test_8021d_ngroups_cfg6 + test_8021d_ngroups_ctl6 + test_8021d_maxgroups_cfg4 + test_8021d_maxgroups_ctl4 + test_8021d_maxgroups_cfg6 + test_8021d_maxgroups_ctl6 + + switch_destroy +} + +test_8021q() +{ + # Tests for vlan_filtering 1 mcast_vlan_snooping 0. + + switch_create_8021q + setup_wait + + test_8021q_vlan_attributes + test_8021q_ngroups_cfg4 + test_8021q_ngroups_ctl4 + test_8021q_ngroups_cfg6 + test_8021q_ngroups_ctl6 + test_8021q_maxgroups_cfg4 + test_8021q_maxgroups_ctl4 + test_8021q_maxgroups_cfg6 + test_8021q_maxgroups_ctl6 + + switch_destroy +} + +test_8021qvs() +{ + # Tests for vlan_filtering 1 mcast_vlan_snooping 1. + + switch_create_8021qvs + setup_wait + + test_8021qvs_vlan_attributes + test_8021qvs_ngroups_cfg4 + test_8021qvs_ngroups_ctl4 + test_8021qvs_ngroups_cfg6 + test_8021qvs_ngroups_ctl6 + test_8021qvs_maxgroups_cfg4 + test_8021qvs_maxgroups_ctl4 + test_8021qvs_maxgroups_cfg6 + test_8021qvs_maxgroups_ctl6 + test_8021qvs_toggle_vlan_snooping + + switch_destroy +} + +trap cleanup EXIT + +setup_prepare +tests_run + +exit $EXIT_STATUS diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh index 3d8e4ebda1b6..d47499ba81c7 100755 --- a/tools/testing/selftests/net/forwarding/lib.sh +++ b/tools/testing/selftests/net/forwarding/lib.sh @@ -524,27 +524,6 @@ cmd_jq() [ ! -z "$output" ] } -lldpad_app_wait_set() -{ - local dev=$1; shift - - while lldptool -t -i $dev -V APP -c app | grep -Eq "pending|unknown"; do - echo "$dev: waiting for lldpad to push pending APP updates" - sleep 5 - done -} - -lldpad_app_wait_del() -{ - # Give lldpad a chance to push down the changes. If the device is downed - # too soon, the updates will be left pending. However, they will have - # been struck off the lldpad's DB already, so we won't be able to tell - # they are pending. Then on next test iteration this would cause - # weirdness as newly-added APP rules conflict with the old ones, - # sometimes getting stuck in an "unknown" state. - sleep 5 -} - pre_cleanup() { if [ "${PAUSE_ON_CLEANUP}" = "yes" ]; then @@ -1692,3 +1671,219 @@ hw_stats_monitor_test() log_test "${type}_stats notifications" } + +ipv4_to_bytes() +{ + local IP=$1; shift + + printf '%02x:' ${IP//./ } | + sed 's/:$//' +} + +# Convert a given IPv6 address, `IP' such that the :: token, if present, is +# expanded, and each 16-bit group is padded with zeroes to be 4 hexadecimal +# digits. An optional `BYTESEP' parameter can be given to further separate +# individual bytes of each 16-bit group. +expand_ipv6() +{ + local IP=$1; shift + local bytesep=$1; shift + + local cvt_ip=${IP/::/_} + local colons=${cvt_ip//[^:]/} + local allcol=::::::: + # IP where :: -> the appropriate number of colons: + local allcol_ip=${cvt_ip/_/${allcol:${#colons}}} + + echo $allcol_ip | tr : '\n' | + sed s/^/0000/ | + sed 's/.*\(..\)\(..\)/\1'"$bytesep"'\2/' | + tr '\n' : | + sed 's/:$//' +} + +ipv6_to_bytes() +{ + local IP=$1; shift + + expand_ipv6 "$IP" : +} + +u16_to_bytes() +{ + local u16=$1; shift + + printf "%04x" $u16 | sed 's/^/000/;s/^.*\(..\)\(..\)$/\1:\2/' +} + +# Given a mausezahn-formatted payload (colon-separated bytes given as %02x), +# possibly with a keyword CHECKSUM stashed where a 16-bit checksum should be, +# calculate checksum as per RFC 1071, assuming the CHECKSUM field (if any) +# stands for 00:00. +payload_template_calc_checksum() +{ + local payload=$1; shift + + ( + # Set input radix. + echo "16i" + # Push zero for the initial checksum. + echo 0 + + # Pad the payload with a terminating 00: in case we get an odd + # number of bytes. + echo "${payload%:}:00:" | + sed 's/CHECKSUM/00:00/g' | + tr '[:lower:]' '[:upper:]' | + # Add the word to the checksum. + sed 's/\(..\):\(..\):/\1\2+\n/g' | + # Strip the extra odd byte we pushed if left unconverted. + sed 's/\(..\):$//' + + echo "10000 ~ +" # Calculate and add carry. + echo "FFFF r - p" # Bit-flip and print. + ) | + dc | + tr '[:upper:]' '[:lower:]' +} + +payload_template_expand_checksum() +{ + local payload=$1; shift + local checksum=$1; shift + + local ckbytes=$(u16_to_bytes $checksum) + + echo "$payload" | sed "s/CHECKSUM/$ckbytes/g" +} + +payload_template_nbytes() +{ + local payload=$1; shift + + payload_template_expand_checksum "${payload%:}" 0 | + sed 's/:/\n/g' | wc -l +} + +igmpv3_is_in_get() +{ + local GRP=$1; shift + local sources=("$@") + + local igmpv3 + local nsources=$(u16_to_bytes ${#sources[@]}) + + # IS_IN ( $sources ) + igmpv3=$(: + )"22:"$( : Type - Membership Report + )"00:"$( : Reserved + )"CHECKSUM:"$( : Checksum + )"00:00:"$( : Reserved + )"00:01:"$( : Number of Group Records + )"01:"$( : Record Type - IS_IN + )"00:"$( : Aux Data Len + )"${nsources}:"$( : Number of Sources + )"$(ipv4_to_bytes $GRP):"$( : Multicast Address + )"$(for src in "${sources[@]}"; do + ipv4_to_bytes $src + echo -n : + done)"$( : Source Addresses + ) + local checksum=$(payload_template_calc_checksum "$igmpv3") + + payload_template_expand_checksum "$igmpv3" $checksum +} + +igmpv2_leave_get() +{ + local GRP=$1; shift + + local payload=$(: + )"17:"$( : Type - Leave Group + )"00:"$( : Max Resp Time - not meaningful + )"CHECKSUM:"$( : Checksum + )"$(ipv4_to_bytes $GRP)"$( : Group Address + ) + local checksum=$(payload_template_calc_checksum "$payload") + + payload_template_expand_checksum "$payload" $checksum +} + +mldv2_is_in_get() +{ + local SIP=$1; shift + local GRP=$1; shift + local sources=("$@") + + local hbh + local icmpv6 + local nsources=$(u16_to_bytes ${#sources[@]}) + + hbh=$(: + )"3a:"$( : Next Header - ICMPv6 + )"00:"$( : Hdr Ext Len + )"00:00:00:00:00:00:"$( : Options and Padding + ) + + icmpv6=$(: + )"8f:"$( : Type - MLDv2 Report + )"00:"$( : Code + )"CHECKSUM:"$( : Checksum + )"00:00:"$( : Reserved + )"00:01:"$( : Number of Group Records + )"01:"$( : Record Type - IS_IN + )"00:"$( : Aux Data Len + )"${nsources}:"$( : Number of Sources + )"$(ipv6_to_bytes $GRP):"$( : Multicast address + )"$(for src in "${sources[@]}"; do + ipv6_to_bytes $src + echo -n : + done)"$( : Source Addresses + ) + + local len=$(u16_to_bytes $(payload_template_nbytes $icmpv6)) + local sudohdr=$(: + )"$(ipv6_to_bytes $SIP):"$( : SIP + )"$(ipv6_to_bytes $GRP):"$( : DIP is multicast address + )"${len}:"$( : Upper-layer length + )"00:3a:"$( : Zero and next-header + ) + local checksum=$(payload_template_calc_checksum ${sudohdr}${icmpv6}) + + payload_template_expand_checksum "$hbh$icmpv6" $checksum +} + +mldv1_done_get() +{ + local SIP=$1; shift + local GRP=$1; shift + + local hbh + local icmpv6 + + hbh=$(: + )"3a:"$( : Next Header - ICMPv6 + )"00:"$( : Hdr Ext Len + )"00:00:00:00:00:00:"$( : Options and Padding + ) + + icmpv6=$(: + )"84:"$( : Type - MLDv1 Done + )"00:"$( : Code + )"CHECKSUM:"$( : Checksum + )"00:00:"$( : Max Resp Delay - not meaningful + )"00:00:"$( : Reserved + )"$(ipv6_to_bytes $GRP):"$( : Multicast address + ) + + local len=$(u16_to_bytes $(payload_template_nbytes $icmpv6)) + local sudohdr=$(: + )"$(ipv6_to_bytes $SIP):"$( : SIP + )"$(ipv6_to_bytes $GRP):"$( : DIP is multicast address + )"${len}:"$( : Upper-layer length + )"00:3a:"$( : Zero and next-header + ) + local checksum=$(payload_template_calc_checksum ${sudohdr}${icmpv6}) + + payload_template_expand_checksum "$hbh$icmpv6" $checksum +} diff --git a/tools/testing/selftests/net/forwarding/tc_actions.sh b/tools/testing/selftests/net/forwarding/tc_actions.sh index 1e0a62f638fe..a96cff8e7219 100755 --- a/tools/testing/selftests/net/forwarding/tc_actions.sh +++ b/tools/testing/selftests/net/forwarding/tc_actions.sh @@ -3,7 +3,8 @@ ALL_TESTS="gact_drop_and_ok_test mirred_egress_redirect_test \ mirred_egress_mirror_test matchall_mirred_egress_mirror_test \ - gact_trap_test mirred_egress_to_ingress_test" + gact_trap_test mirred_egress_to_ingress_test \ + mirred_egress_to_ingress_tcp_test" NUM_NETIFS=4 source tc_common.sh source lib.sh @@ -198,6 +199,52 @@ mirred_egress_to_ingress_test() log_test "mirred_egress_to_ingress ($tcflags)" } +mirred_egress_to_ingress_tcp_test() +{ + mirred_e2i_tf1=$(mktemp) mirred_e2i_tf2=$(mktemp) + + RET=0 + dd conv=sparse status=none if=/dev/zero bs=1M count=2 of=$mirred_e2i_tf1 + tc filter add dev $h1 protocol ip pref 100 handle 100 egress flower \ + $tcflags ip_proto tcp src_ip 192.0.2.1 dst_ip 192.0.2.2 \ + action ct commit nat src addr 192.0.2.2 pipe \ + action ct clear pipe \ + action ct commit nat dst addr 192.0.2.1 pipe \ + action ct clear pipe \ + action skbedit ptype host pipe \ + action mirred ingress redirect dev $h1 + tc filter add dev $h1 protocol ip pref 101 handle 101 egress flower \ + $tcflags ip_proto icmp \ + action mirred ingress redirect dev $h1 + tc filter add dev $h1 protocol ip pref 102 handle 102 ingress flower \ + ip_proto icmp \ + action drop + + ip vrf exec v$h1 nc --recv-only -w10 -l -p 12345 -o $mirred_e2i_tf2 & + local rpid=$! + ip vrf exec v$h1 nc -w1 --send-only 192.0.2.2 12345 <$mirred_e2i_tf1 + wait -n $rpid + cmp -s $mirred_e2i_tf1 $mirred_e2i_tf2 + check_err $? "server output check failed" + + $MZ $h1 -c 10 -p 64 -a $h1mac -b $h1mac -A 192.0.2.1 -B 192.0.2.1 \ + -t icmp "ping,id=42,seq=5" -q + tc_check_packets "dev $h1 egress" 101 10 + check_err $? "didn't mirred redirect ICMP" + tc_check_packets "dev $h1 ingress" 102 10 + check_err $? "didn't drop mirred ICMP" + local overlimits=$(tc_rule_stats_get ${h1} 101 egress .overlimits) + test ${overlimits} = 10 + check_err $? "wrong overlimits, expected 10 got ${overlimits}" + + tc filter del dev $h1 egress protocol ip pref 100 handle 100 flower + tc filter del dev $h1 egress protocol ip pref 101 handle 101 flower + tc filter del dev $h1 ingress protocol ip pref 102 handle 102 flower + + rm -f $mirred_e2i_tf1 $mirred_e2i_tf2 + log_test "mirred_egress_to_ingress_tcp ($tcflags)" +} + setup_prepare() { h1=${NETIFS[p1]} @@ -223,6 +270,8 @@ setup_prepare() cleanup() { + local tf + pre_cleanup switch_destroy @@ -233,6 +282,8 @@ cleanup() ip link set $swp2 address $swp2origmac ip link set $swp1 address $swp1origmac + + for tf in $mirred_e2i_tf1 $mirred_e2i_tf2; do rm -f $tf; done } mirred_egress_redirect_test() diff --git a/tools/testing/selftests/net/ip_local_port_range.c b/tools/testing/selftests/net/ip_local_port_range.c new file mode 100644 index 000000000000..75e3fdacdf73 --- /dev/null +++ b/tools/testing/selftests/net/ip_local_port_range.c @@ -0,0 +1,447 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause +// Copyright (c) 2023 Cloudflare + +/* Test IP_LOCAL_PORT_RANGE socket option: IPv4 + IPv6, TCP + UDP. + * + * Tests assume that net.ipv4.ip_local_port_range is [40000, 49999]. + * Don't run these directly but with ip_local_port_range.sh script. + */ + +#include <fcntl.h> +#include <netinet/ip.h> + +#include "../kselftest_harness.h" + +#ifndef IP_LOCAL_PORT_RANGE +#define IP_LOCAL_PORT_RANGE 51 +#endif + +static __u32 pack_port_range(__u16 lo, __u16 hi) +{ + return (hi << 16) | (lo << 0); +} + +static void unpack_port_range(__u32 range, __u16 *lo, __u16 *hi) +{ + *lo = range & 0xffff; + *hi = range >> 16; +} + +static int get_so_domain(int fd) +{ + int domain, err; + socklen_t len; + + len = sizeof(domain); + err = getsockopt(fd, SOL_SOCKET, SO_DOMAIN, &domain, &len); + if (err) + return -1; + + return domain; +} + +static int bind_to_loopback_any_port(int fd) +{ + union { + struct sockaddr sa; + struct sockaddr_in v4; + struct sockaddr_in6 v6; + } addr; + socklen_t addr_len; + + memset(&addr, 0, sizeof(addr)); + switch (get_so_domain(fd)) { + case AF_INET: + addr.v4.sin_family = AF_INET; + addr.v4.sin_port = htons(0); + addr.v4.sin_addr.s_addr = htonl(INADDR_LOOPBACK); + addr_len = sizeof(addr.v4); + break; + case AF_INET6: + addr.v6.sin6_family = AF_INET6; + addr.v6.sin6_port = htons(0); + addr.v6.sin6_addr = in6addr_loopback; + addr_len = sizeof(addr.v6); + break; + default: + return -1; + } + + return bind(fd, &addr.sa, addr_len); +} + +static int get_sock_port(int fd) +{ + union { + struct sockaddr sa; + struct sockaddr_in v4; + struct sockaddr_in6 v6; + } addr; + socklen_t addr_len; + int err; + + addr_len = sizeof(addr); + memset(&addr, 0, sizeof(addr)); + err = getsockname(fd, &addr.sa, &addr_len); + if (err) + return -1; + + switch (addr.sa.sa_family) { + case AF_INET: + return ntohs(addr.v4.sin_port); + case AF_INET6: + return ntohs(addr.v6.sin6_port); + default: + errno = EAFNOSUPPORT; + return -1; + } +} + +static int get_ip_local_port_range(int fd, __u32 *range) +{ + socklen_t len; + __u32 val; + int err; + + len = sizeof(val); + err = getsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &val, &len); + if (err) + return -1; + + *range = val; + return 0; +} + +FIXTURE(ip_local_port_range) {}; + +FIXTURE_SETUP(ip_local_port_range) +{ +} + +FIXTURE_TEARDOWN(ip_local_port_range) +{ +} + +FIXTURE_VARIANT(ip_local_port_range) { + int so_domain; + int so_type; + int so_protocol; +}; + +FIXTURE_VARIANT_ADD(ip_local_port_range, ip4_tcp) { + .so_domain = AF_INET, + .so_type = SOCK_STREAM, + .so_protocol = 0, +}; + +FIXTURE_VARIANT_ADD(ip_local_port_range, ip4_udp) { + .so_domain = AF_INET, + .so_type = SOCK_DGRAM, + .so_protocol = 0, +}; + +FIXTURE_VARIANT_ADD(ip_local_port_range, ip4_stcp) { + .so_domain = AF_INET, + .so_type = SOCK_STREAM, + .so_protocol = IPPROTO_SCTP, +}; + +FIXTURE_VARIANT_ADD(ip_local_port_range, ip6_tcp) { + .so_domain = AF_INET6, + .so_type = SOCK_STREAM, + .so_protocol = 0, +}; + +FIXTURE_VARIANT_ADD(ip_local_port_range, ip6_udp) { + .so_domain = AF_INET6, + .so_type = SOCK_DGRAM, + .so_protocol = 0, +}; + +FIXTURE_VARIANT_ADD(ip_local_port_range, ip6_stcp) { + .so_domain = AF_INET6, + .so_type = SOCK_STREAM, + .so_protocol = IPPROTO_SCTP, +}; + +TEST_F(ip_local_port_range, invalid_option_value) +{ + __u16 val16; + __u32 val32; + __u64 val64; + int fd, err; + + fd = socket(variant->so_domain, variant->so_type, variant->so_protocol); + ASSERT_GE(fd, 0) TH_LOG("socket failed"); + + /* Too few bytes */ + val16 = 40000; + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &val16, sizeof(val16)); + EXPECT_TRUE(err) TH_LOG("expected setsockopt(IP_LOCAL_PORT_RANGE) to fail"); + EXPECT_EQ(errno, EINVAL); + + /* Empty range: low port > high port */ + val32 = pack_port_range(40222, 40111); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &val32, sizeof(val32)); + EXPECT_TRUE(err) TH_LOG("expected setsockopt(IP_LOCAL_PORT_RANGE) to fail"); + EXPECT_EQ(errno, EINVAL); + + /* Too many bytes */ + val64 = pack_port_range(40333, 40444); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &val64, sizeof(val64)); + EXPECT_TRUE(err) TH_LOG("expected setsockopt(IP_LOCAL_PORT_RANGE) to fail"); + EXPECT_EQ(errno, EINVAL); + + err = close(fd); + ASSERT_TRUE(!err) TH_LOG("close failed"); +} + +TEST_F(ip_local_port_range, port_range_out_of_netns_range) +{ + const struct test { + __u16 range_lo; + __u16 range_hi; + } tests[] = { + { 30000, 39999 }, /* socket range below netns range */ + { 50000, 59999 }, /* socket range above netns range */ + }; + const struct test *t; + + for (t = tests; t < tests + ARRAY_SIZE(tests); t++) { + /* Bind a couple of sockets, not just one, to check + * that the range wasn't clamped to a single port from + * the netns range. That is [40000, 40000] or [49999, + * 49999], respectively for each test case. + */ + int fds[2], i; + + TH_LOG("lo %5hu, hi %5hu", t->range_lo, t->range_hi); + + for (i = 0; i < ARRAY_SIZE(fds); i++) { + int fd, err, port; + __u32 range; + + fd = socket(variant->so_domain, variant->so_type, variant->so_protocol); + ASSERT_GE(fd, 0) TH_LOG("#%d: socket failed", i); + + range = pack_port_range(t->range_lo, t->range_hi); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("#%d: setsockopt(IP_LOCAL_PORT_RANGE) failed", i); + + err = bind_to_loopback_any_port(fd); + ASSERT_TRUE(!err) TH_LOG("#%d: bind failed", i); + + /* Check that socket port range outside of ephemeral range is ignored */ + port = get_sock_port(fd); + ASSERT_GE(port, 40000) TH_LOG("#%d: expected port within netns range", i); + ASSERT_LE(port, 49999) TH_LOG("#%d: expected port within netns range", i); + + fds[i] = fd; + } + + for (i = 0; i < ARRAY_SIZE(fds); i++) + ASSERT_TRUE(close(fds[i]) == 0) TH_LOG("#%d: close failed", i); + } +} + +TEST_F(ip_local_port_range, single_port_range) +{ + const struct test { + __u16 range_lo; + __u16 range_hi; + __u16 expected; + } tests[] = { + /* single port range within ephemeral range */ + { 45000, 45000, 45000 }, + /* first port in the ephemeral range (clamp from above) */ + { 0, 40000, 40000 }, + /* last port in the ephemeral range (clamp from below) */ + { 49999, 0, 49999 }, + }; + const struct test *t; + + for (t = tests; t < tests + ARRAY_SIZE(tests); t++) { + int fd, err, port; + __u32 range; + + TH_LOG("lo %5hu, hi %5hu, expected %5hu", + t->range_lo, t->range_hi, t->expected); + + fd = socket(variant->so_domain, variant->so_type, variant->so_protocol); + ASSERT_GE(fd, 0) TH_LOG("socket failed"); + + range = pack_port_range(t->range_lo, t->range_hi); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_LOCAL_PORT_RANGE) failed"); + + err = bind_to_loopback_any_port(fd); + ASSERT_TRUE(!err) TH_LOG("bind failed"); + + port = get_sock_port(fd); + ASSERT_EQ(port, t->expected) TH_LOG("unexpected local port"); + + err = close(fd); + ASSERT_TRUE(!err) TH_LOG("close failed"); + } +} + +TEST_F(ip_local_port_range, exhaust_8_port_range) +{ + __u8 port_set = 0; + int i, fd, err; + __u32 range; + __u16 port; + int fds[8]; + + for (i = 0; i < ARRAY_SIZE(fds); i++) { + fd = socket(variant->so_domain, variant->so_type, variant->so_protocol); + ASSERT_GE(fd, 0) TH_LOG("socket failed"); + + range = pack_port_range(40000, 40007); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_LOCAL_PORT_RANGE) failed"); + + err = bind_to_loopback_any_port(fd); + ASSERT_TRUE(!err) TH_LOG("bind failed"); + + port = get_sock_port(fd); + ASSERT_GE(port, 40000) TH_LOG("expected port within sockopt range"); + ASSERT_LE(port, 40007) TH_LOG("expected port within sockopt range"); + + port_set |= 1 << (port - 40000); + fds[i] = fd; + } + + /* Check that all every port from the test range is in use */ + ASSERT_EQ(port_set, 0xff) TH_LOG("expected all ports to be busy"); + + /* Check that bind() fails because the whole range is busy */ + fd = socket(variant->so_domain, variant->so_type, variant->so_protocol); + ASSERT_GE(fd, 0) TH_LOG("socket failed"); + + range = pack_port_range(40000, 40007); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_LOCAL_PORT_RANGE) failed"); + + err = bind_to_loopback_any_port(fd); + ASSERT_TRUE(err) TH_LOG("expected bind to fail"); + ASSERT_EQ(errno, EADDRINUSE); + + err = close(fd); + ASSERT_TRUE(!err) TH_LOG("close failed"); + + for (i = 0; i < ARRAY_SIZE(fds); i++) { + err = close(fds[i]); + ASSERT_TRUE(!err) TH_LOG("close failed"); + } +} + +TEST_F(ip_local_port_range, late_bind) +{ + union { + struct sockaddr sa; + struct sockaddr_in v4; + struct sockaddr_in6 v6; + } addr; + socklen_t addr_len; + const int one = 1; + int fd, err; + __u32 range; + __u16 port; + + if (variant->so_protocol == IPPROTO_SCTP) + SKIP(return, "SCTP doesn't support IP_BIND_ADDRESS_NO_PORT"); + + fd = socket(variant->so_domain, variant->so_type, 0); + ASSERT_GE(fd, 0) TH_LOG("socket failed"); + + range = pack_port_range(40100, 40199); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_LOCAL_PORT_RANGE) failed"); + + err = setsockopt(fd, SOL_IP, IP_BIND_ADDRESS_NO_PORT, &one, sizeof(one)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_BIND_ADDRESS_NO_PORT) failed"); + + err = bind_to_loopback_any_port(fd); + ASSERT_TRUE(!err) TH_LOG("bind failed"); + + port = get_sock_port(fd); + ASSERT_EQ(port, 0) TH_LOG("getsockname failed"); + + /* Invalid destination */ + memset(&addr, 0, sizeof(addr)); + switch (variant->so_domain) { + case AF_INET: + addr.v4.sin_family = AF_INET; + addr.v4.sin_port = htons(0); + addr.v4.sin_addr.s_addr = htonl(INADDR_ANY); + addr_len = sizeof(addr.v4); + break; + case AF_INET6: + addr.v6.sin6_family = AF_INET6; + addr.v6.sin6_port = htons(0); + addr.v6.sin6_addr = in6addr_any; + addr_len = sizeof(addr.v6); + break; + default: + ASSERT_TRUE(false) TH_LOG("unsupported socket domain"); + } + + /* connect() doesn't need to succeed for late bind to happen */ + connect(fd, &addr.sa, addr_len); + + port = get_sock_port(fd); + ASSERT_GE(port, 40100); + ASSERT_LE(port, 40199); + + err = close(fd); + ASSERT_TRUE(!err) TH_LOG("close failed"); +} + +TEST_F(ip_local_port_range, get_port_range) +{ + __u16 lo, hi; + __u32 range; + int fd, err; + + fd = socket(variant->so_domain, variant->so_type, variant->so_protocol); + ASSERT_GE(fd, 0) TH_LOG("socket failed"); + + /* Get range before it will be set */ + err = get_ip_local_port_range(fd, &range); + ASSERT_TRUE(!err) TH_LOG("getsockopt(IP_LOCAL_PORT_RANGE) failed"); + + unpack_port_range(range, &lo, &hi); + ASSERT_EQ(lo, 0) TH_LOG("unexpected low port"); + ASSERT_EQ(hi, 0) TH_LOG("unexpected high port"); + + range = pack_port_range(12345, 54321); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_LOCAL_PORT_RANGE) failed"); + + /* Get range after it has been set */ + err = get_ip_local_port_range(fd, &range); + ASSERT_TRUE(!err) TH_LOG("getsockopt(IP_LOCAL_PORT_RANGE) failed"); + + unpack_port_range(range, &lo, &hi); + ASSERT_EQ(lo, 12345) TH_LOG("unexpected low port"); + ASSERT_EQ(hi, 54321) TH_LOG("unexpected high port"); + + /* Unset the port range */ + range = pack_port_range(0, 0); + err = setsockopt(fd, SOL_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range)); + ASSERT_TRUE(!err) TH_LOG("setsockopt(IP_LOCAL_PORT_RANGE) failed"); + + /* Get range after it has been unset */ + err = get_ip_local_port_range(fd, &range); + ASSERT_TRUE(!err) TH_LOG("getsockopt(IP_LOCAL_PORT_RANGE) failed"); + + unpack_port_range(range, &lo, &hi); + ASSERT_EQ(lo, 0) TH_LOG("unexpected low port"); + ASSERT_EQ(hi, 0) TH_LOG("unexpected high port"); + + err = close(fd); + ASSERT_TRUE(!err) TH_LOG("close failed"); +} + +TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/net/ip_local_port_range.sh b/tools/testing/selftests/net/ip_local_port_range.sh new file mode 100755 index 000000000000..6c6ad346eaa0 --- /dev/null +++ b/tools/testing/selftests/net/ip_local_port_range.sh @@ -0,0 +1,5 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +./in_netns.sh \ + sh -c 'sysctl -q -w net.ipv4.ip_local_port_range="40000 49999" && ./ip_local_port_range' diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh index 24bcd7b9bdb2..ef628b16fe9b 100755 --- a/tools/testing/selftests/net/mptcp/diag.sh +++ b/tools/testing/selftests/net/mptcp/diag.sh @@ -17,6 +17,11 @@ flush_pids() sleep 1.1 ip netns pids "${ns}" | xargs --no-run-if-empty kill -SIGUSR1 &>/dev/null + + for _ in $(seq 10); do + [ -z "$(ip netns pids "${ns}")" ] && break + sleep 0.1 + done } cleanup() @@ -37,15 +42,20 @@ if [ $? -ne 0 ];then exit $ksft_skip fi +get_msk_inuse() +{ + ip netns exec $ns cat /proc/net/protocols | awk '$1~/^MPTCP$/{print $3}' +} + __chk_nr() { - local condition="$1" + local command="$1" local expected=$2 local msg nr shift 2 msg=$* - nr=$(ss -inmHMN $ns | $condition) + nr=$(eval $command) printf "%-50s" "$msg" if [ $nr != $expected ]; then @@ -57,9 +67,17 @@ __chk_nr() test_cnt=$((test_cnt+1)) } +__chk_msk_nr() +{ + local condition=$1 + shift 1 + + __chk_nr "ss -inmHMN $ns | $condition" $* +} + chk_msk_nr() { - __chk_nr "grep -c token:" $* + __chk_msk_nr "grep -c token:" $* } wait_msk_nr() @@ -97,12 +115,12 @@ wait_msk_nr() chk_msk_fallback_nr() { - __chk_nr "grep -c fallback" $* + __chk_msk_nr "grep -c fallback" $* } chk_msk_remote_key_nr() { - __chk_nr "grep -c remote_key" $* + __chk_msk_nr "grep -c remote_key" $* } __chk_listen() @@ -142,6 +160,26 @@ chk_msk_listen() nr=$(ss -Ml $filter | wc -l) } +chk_msk_inuse() +{ + local expected=$1 + local listen_nr + + shift 1 + + listen_nr=$(ss -N "${ns}" -Ml | grep -c LISTEN) + expected=$((expected + listen_nr)) + + for _ in $(seq 10); do + if [ $(get_msk_inuse) -eq $expected ];then + break + fi + sleep 0.1 + done + + __chk_nr get_msk_inuse $expected $* +} + # $1: ns, $2: port wait_local_port_listen() { @@ -195,8 +233,10 @@ wait_connected $ns 10000 chk_msk_nr 2 "after MPC handshake " chk_msk_remote_key_nr 2 "....chk remote_key" chk_msk_fallback_nr 0 "....chk no fallback" +chk_msk_inuse 2 "....chk 2 msk in use" flush_pids +chk_msk_inuse 0 "....chk 0 msk in use after flush" echo "a" | \ timeout ${timeout_test} \ @@ -211,8 +251,11 @@ echo "b" | \ 127.0.0.1 >/dev/null & wait_connected $ns 10001 chk_msk_fallback_nr 1 "check fallback" +chk_msk_inuse 1 "....chk 1 msk in use" flush_pids +chk_msk_inuse 0 "....chk 0 msk in use after flush" + NR_CLIENTS=100 for I in `seq 1 $NR_CLIENTS`; do echo "a" | \ @@ -232,6 +275,9 @@ for I in `seq 1 $NR_CLIENTS`; do done wait_msk_nr $((NR_CLIENTS*2)) "many msk socket present" +chk_msk_inuse $((NR_CLIENTS*2)) "....chk many msk in use" flush_pids +chk_msk_inuse 0 "....chk 0 msk in use after flush" + exit $ret diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index 8a8266957bc5..b25a31445ded 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -627,7 +627,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, char rbuf[8192]; ssize_t len; - if (fds.events == 0) + if (fds.events == 0 || quit) break; switch (poll(&fds, 1, poll_timeout)) { @@ -733,7 +733,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, } /* leave some time for late join/announce */ - if (cfg_remove) + if (cfg_remove && !quit) usleep(cfg_wait); return 0; diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index 079f8f46849d..42e3bd1a05f5 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -780,24 +780,17 @@ do_transfer() addr_nr_ns2=${addr_nr_ns2:9} fi - local local_addr - if is_v6 "${connect_addr}"; then - local_addr="::" - else - local_addr="0.0.0.0" - fi - extra_srv_args="$extra_args $extra_srv_args" if [ "$test_link_fail" -gt 1 ];then timeout ${timeout_test} \ ip netns exec ${listener_ns} \ ./mptcp_connect -t ${timeout_poll} -l -p $port -s ${srv_proto} \ - $extra_srv_args ${local_addr} < "$sinfail" > "$sout" & + $extra_srv_args "::" < "$sinfail" > "$sout" & else timeout ${timeout_test} \ ip netns exec ${listener_ns} \ ./mptcp_connect -t ${timeout_poll} -l -p $port -s ${srv_proto} \ - $extra_srv_args ${local_addr} < "$sin" > "$sout" & + $extra_srv_args "::" < "$sin" > "$sout" & fi local spid=$! @@ -2460,6 +2453,47 @@ v4mapped_tests() fi } +mixed_tests() +{ + if reset "IPv4 sockets do not use IPv6 addresses"; then + pm_nl_set_limits $ns1 0 1 + pm_nl_set_limits $ns2 1 1 + pm_nl_add_endpoint $ns1 dead:beef:2::1 flags signal + run_tests $ns1 $ns2 10.0.1.1 0 0 0 slow + chk_join_nr 0 0 0 + fi + + # Need an IPv6 mptcp socket to allow subflows of both families + if reset "simult IPv4 and IPv6 subflows"; then + pm_nl_set_limits $ns1 0 1 + pm_nl_set_limits $ns2 1 1 + pm_nl_add_endpoint $ns1 10.0.1.1 flags signal + run_tests $ns1 $ns2 dead:beef:2::1 0 0 0 slow + chk_join_nr 1 1 1 + fi + + # cross families subflows will not be created even in fullmesh mode + if reset "simult IPv4 and IPv6 subflows, fullmesh 1x1"; then + pm_nl_set_limits $ns1 0 4 + pm_nl_set_limits $ns2 1 4 + pm_nl_add_endpoint $ns2 dead:beef:2::2 flags subflow,fullmesh + pm_nl_add_endpoint $ns1 10.0.1.1 flags signal + run_tests $ns1 $ns2 dead:beef:2::1 0 0 0 slow + chk_join_nr 1 1 1 + fi + + # fullmesh still tries to create all the possibly subflows with + # matching family + if reset "simult IPv4 and IPv6 subflows, fullmesh 2x2"; then + pm_nl_set_limits $ns1 0 4 + pm_nl_set_limits $ns2 2 4 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal + pm_nl_add_endpoint $ns1 dead:beef:2::1 flags signal + run_tests $ns1 $ns2 dead:beef:1::1 0 0 fullmesh_1 slow + chk_join_nr 4 4 4 + fi +} + backup_tests() { # single subflow, backup @@ -3132,6 +3166,7 @@ all_tests_sorted=( a@add_tests 6@ipv6_tests 4@v4mapped_tests + M@mixed_tests b@backup_tests p@add_addr_ports_tests k@syncookies_tests diff --git a/tools/testing/selftests/net/mptcp/userspace_pm.sh b/tools/testing/selftests/net/mptcp/userspace_pm.sh index ab2d581f28a1..66c5be25c13d 100755 --- a/tools/testing/selftests/net/mptcp/userspace_pm.sh +++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh @@ -43,41 +43,40 @@ rndh=$(printf %x "$sec")-$(mktemp -u XXXXXX) ns1="ns1-$rndh" ns2="ns2-$rndh" +print_title() +{ + stdbuf -o0 -e0 printf "INFO: %s\n" "${1}" +} + kill_wait() { + [ $1 -eq 0 ] && return 0 + + kill -SIGUSR1 $1 > /dev/null 2>&1 kill $1 > /dev/null 2>&1 wait $1 2>/dev/null } cleanup() { - echo "cleanup" - - rm -rf $file $client_evts $server_evts + print_title "Cleanup" # Terminate the MPTCP connection and related processes - if [ $client4_pid -ne 0 ]; then - kill -SIGUSR1 $client4_pid > /dev/null 2>&1 - fi - if [ $server4_pid -ne 0 ]; then - kill_wait $server4_pid - fi - if [ $client6_pid -ne 0 ]; then - kill -SIGUSR1 $client6_pid > /dev/null 2>&1 - fi - if [ $server6_pid -ne 0 ]; then - kill_wait $server6_pid - fi - if [ $server_evts_pid -ne 0 ]; then - kill_wait $server_evts_pid - fi - if [ $client_evts_pid -ne 0 ]; then - kill_wait $client_evts_pid - fi + local pid + for pid in $client4_pid $server4_pid $client6_pid $server6_pid\ + $server_evts_pid $client_evts_pid + do + kill_wait $pid + done + local netns for netns in "$ns1" "$ns2" ;do ip netns del "$netns" done + + rm -rf $file $client_evts $server_evts + + stdbuf -o0 -e0 printf "Done\n" } trap cleanup EXIT @@ -108,6 +107,7 @@ ip -net "$ns2" addr add dead:beef:1::2/64 dev ns2eth1 nodad ip -net "$ns2" addr add dead:beef:2::2/64 dev ns2eth1 nodad ip -net "$ns2" link set ns2eth1 up +print_title "Init" stdbuf -o0 -e0 printf "Created network namespaces ns1, ns2 \t\t\t[OK]\n" make_file() @@ -193,11 +193,16 @@ make_connection() server_serverside=$(grep "type:1," "$server_evts" | sed --unbuffered -n 's/.*\(server_side:\)\([[:digit:]]*\).*$/\2/p;q') + stdbuf -o0 -e0 printf "Established IP%s MPTCP Connection ns2 => ns1 \t\t" $is_v6 if [ "$client_token" != "" ] && [ "$server_token" != "" ] && [ "$client_serverside" = 0 ] && [ "$server_serverside" = 1 ] then - stdbuf -o0 -e0 printf "Established IP%s MPTCP Connection ns2 => ns1 \t\t[OK]\n" $is_v6 + stdbuf -o0 -e0 printf "[OK]\n" else + stdbuf -o0 -e0 printf "[FAIL]\n" + stdbuf -o0 -e0 printf "\tExpected tokens (c:%s - s:%s) and server (c:%d - s:%d)\n" \ + "${client_token}" "${server_token}" \ + "${client_serverside}" "${server_serverside}" exit 1 fi @@ -217,6 +222,48 @@ make_connection() fi } +# $1: var name ; $2: prev ret +check_expected_one() +{ + local var="${1}" + local exp="e_${var}" + local prev_ret="${2}" + + if [ "${!var}" = "${!exp}" ] + then + return 0 + fi + + if [ "${prev_ret}" = "0" ] + then + stdbuf -o0 -e0 printf "[FAIL]\n" + fi + + stdbuf -o0 -e0 printf "\tExpected value for '%s': '%s', got '%s'.\n" \ + "${var}" "${!var}" "${!exp}" + return 1 +} + +# $@: all var names to check +check_expected() +{ + local ret=0 + local var + + for var in "${@}" + do + check_expected_one "${var}" "${ret}" || ret=1 + done + + if [ ${ret} -eq 0 ] + then + stdbuf -o0 -e0 printf "[OK]\n" + return 0 + fi + + exit 1 +} + verify_announce_event() { local evt=$1 @@ -242,19 +289,14 @@ verify_announce_event() fi dport=$(sed --unbuffered -n 's/.*\(dport:\)\([[:digit:]]*\).*$/\2/p;q' "$evt") id=$(sed --unbuffered -n 's/.*\(rem_id:\)\([[:digit:]]*\).*$/\2/p;q' "$evt") - if [ "$type" = "$e_type" ] && [ "$token" = "$e_token" ] && - [ "$addr" = "$e_addr" ] && [ "$dport" = "$e_dport" ] && - [ "$id" = "$e_id" ] - then - stdbuf -o0 -e0 printf "[OK]\n" - return 0 - fi - stdbuf -o0 -e0 printf "[FAIL]\n" - exit 1 + + check_expected "type" "token" "addr" "dport" "id" } test_announce() { + print_title "Announce tests" + # Capture events on the network namespace running the server :>"$server_evts" @@ -270,7 +312,7 @@ test_announce() then stdbuf -o0 -e0 printf "[OK]\n" else - stdbuf -o0 -e0 printf "[FAIL]\n" + stdbuf -o0 -e0 printf "[FAIL]\n\ttype defined: %s\n" "${type}" exit 1 fi @@ -347,18 +389,14 @@ verify_remove_event() type=$(sed --unbuffered -n 's/.*\(type:\)\([[:digit:]]*\).*$/\2/p;q' "$evt") token=$(sed --unbuffered -n 's/.*\(token:\)\([[:digit:]]*\).*$/\2/p;q' "$evt") id=$(sed --unbuffered -n 's/.*\(rem_id:\)\([[:digit:]]*\).*$/\2/p;q' "$evt") - if [ "$type" = "$e_type" ] && [ "$token" = "$e_token" ] && - [ "$id" = "$e_id" ] - then - stdbuf -o0 -e0 printf "[OK]\n" - return 0 - fi - stdbuf -o0 -e0 printf "[FAIL]\n" - exit 1 + + check_expected "type" "token" "id" } test_remove() { + print_title "Remove tests" + # Capture events on the network namespace running the server :>"$server_evts" @@ -507,20 +545,13 @@ verify_subflow_events() daddr=$(sed --unbuffered -n 's/.*\(daddr4:\)\([0-9.]*\).*$/\2/p;q' "$evt") fi - if [ "$type" = "$e_type" ] && [ "$token" = "$e_token" ] && - [ "$daddr" = "$e_daddr" ] && [ "$e_dport" = "$dport" ] && - [ "$family" = "$e_family" ] && [ "$saddr" = "$e_saddr" ] && - [ "$e_locid" = "$locid" ] && [ "$e_remid" = "$remid" ] - then - stdbuf -o0 -e0 printf "[OK]\n" - return 0 - fi - stdbuf -o0 -e0 printf "[FAIL]\n" - exit 1 + check_expected "type" "token" "daddr" "dport" "family" "saddr" "locid" "remid" } test_subflows() { + print_title "Subflows v4 or v6 only tests" + # Capture events on the network namespace running the server :>"$server_evts" @@ -754,6 +785,8 @@ test_subflows() test_subflows_v4_v6_mix() { + print_title "Subflows v4 and v6 mix tests" + # Attempt to add a listener at 10.0.2.1:<subflow-port> ip netns exec "$ns1" ./pm_nl_ctl listen 10.0.2.1\ $app6_port > /dev/null 2>&1 & @@ -800,6 +833,8 @@ test_subflows_v4_v6_mix() test_prio() { + print_title "Prio tests" + local count # Send MP_PRIO signal from client to server machine @@ -811,7 +846,7 @@ test_prio() count=$(ip netns exec "$ns2" nstat -as | grep MPTcpExtMPPrioTx | awk '{print $2}') [ -z "$count" ] && count=0 if [ $count != 1 ]; then - stdbuf -o0 -e0 printf "[FAIL]\n" + stdbuf -o0 -e0 printf "[FAIL]\n\tCount != 1: %d\n" "${count}" exit 1 else stdbuf -o0 -e0 printf "[OK]\n" @@ -822,7 +857,7 @@ test_prio() count=$(ip netns exec "$ns1" nstat -as | grep MPTcpExtMPPrioRx | awk '{print $2}') [ -z "$count" ] && count=0 if [ $count != 1 ]; then - stdbuf -o0 -e0 printf "[FAIL]\n" + stdbuf -o0 -e0 printf "[FAIL]\n\tCount != 1: %d\n" "${count}" exit 1 else stdbuf -o0 -e0 printf "[OK]\n" @@ -863,19 +898,13 @@ verify_listener_events() sed --unbuffered -n 's/.*\(saddr4:\)\([0-9.]*\).*$/\2/p;q') fi - if [ $type ] && [ $type = $e_type ] && - [ $family ] && [ $family = $e_family ] && - [ $saddr ] && [ $saddr = $e_saddr ] && - [ $sport ] && [ $sport = $e_sport ]; then - stdbuf -o0 -e0 printf "[OK]\n" - return 0 - fi - stdbuf -o0 -e0 printf "[FAIL]\n" - exit 1 + check_expected "type" "family" "saddr" "sport" } test_listener() { + print_title "Listener tests" + # Capture events on the network namespace running the client :>$client_evts @@ -902,8 +931,10 @@ test_listener() verify_listener_events $client_evts $LISTENER_CLOSED $AF_INET 10.0.2.2 $client4_port } +print_title "Make connections" make_connection make_connection "v6" + test_announce test_remove test_subflows diff --git a/tools/testing/selftests/net/bpf/nat6to4.c b/tools/testing/selftests/net/nat6to4.c index ac54c36b25fc..ac54c36b25fc 100644 --- a/tools/testing/selftests/net/bpf/nat6to4.c +++ b/tools/testing/selftests/net/nat6to4.c diff --git a/tools/testing/selftests/net/rps_default_mask.sh b/tools/testing/selftests/net/rps_default_mask.sh new file mode 100755 index 000000000000..0fd0d2db3abc --- /dev/null +++ b/tools/testing/selftests/net/rps_default_mask.sh @@ -0,0 +1,74 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 + +readonly ksft_skip=4 +readonly cpus=$(nproc) +ret=0 + +[ $cpus -gt 2 ] || exit $ksft_skip + +readonly INITIAL_RPS_DEFAULT_MASK=$(cat /proc/sys/net/core/rps_default_mask) +readonly TAG="$(mktemp -u XXXXXX)" +readonly VETH="veth${TAG}" +readonly NETNS="ns-${TAG}" + +setup() { + ip netns add "${NETNS}" + ip -netns "${NETNS}" link set lo up +} + +cleanup() { + echo $INITIAL_RPS_DEFAULT_MASK > /proc/sys/net/core/rps_default_mask + ip netns del $NETNS +} + +chk_rps() { + local rps_mask expected_rps_mask=$4 + local dev_name=$3 + local netns=$2 + local cmd="cat" + local msg=$1 + + [ -n "$netns" ] && cmd="ip netns exec $netns $cmd" + + rps_mask=$($cmd /sys/class/net/$dev_name/queues/rx-0/rps_cpus) + printf "%-60s" "$msg" + if [ $rps_mask -eq $expected_rps_mask ]; then + echo "[ ok ]" + else + echo "[fail] expected $expected_rps_mask found $rps_mask" + ret=1 + fi +} + +trap cleanup EXIT + +echo 0 > /proc/sys/net/core/rps_default_mask +setup +chk_rps "empty rps_default_mask" $NETNS lo 0 +cleanup + +echo 1 > /proc/sys/net/core/rps_default_mask +setup +chk_rps "changing rps_default_mask dont affect existing devices" "" lo $INITIAL_RPS_DEFAULT_MASK + +echo 3 > /proc/sys/net/core/rps_default_mask +chk_rps "changing rps_default_mask dont affect existing netns" $NETNS lo 0 + +ip link add name $VETH type veth peer netns $NETNS name $VETH +ip link set dev $VETH up +ip -n $NETNS link set dev $VETH up +chk_rps "changing rps_default_mask affect newly created devices" "" $VETH 3 +chk_rps "changing rps_default_mask don't affect newly child netns[II]" $NETNS $VETH 0 +ip netns del $NETNS + +setup +chk_rps "rps_default_mask is 0 by default in child netns" "$NETNS" lo 0 + +ip netns exec $NETNS sysctl -qw net.core.rps_default_mask=1 +ip link add name $VETH type veth peer netns $NETNS name $VETH +chk_rps "changing rps_default_mask in child ns don't affect the main one" "" lo $INITIAL_RPS_DEFAULT_MASK +chk_rps "changing rps_default_mask in child ns affects new childns devices" $NETNS $VETH 1 +chk_rps "changing rps_default_mask in child ns don't affect existing devices" $NETNS lo 0 + +exit $ret diff --git a/tools/testing/selftests/net/srv6_end_flavors_test.sh b/tools/testing/selftests/net/srv6_end_flavors_test.sh new file mode 100755 index 000000000000..50563443a4ad --- /dev/null +++ b/tools/testing/selftests/net/srv6_end_flavors_test.sh @@ -0,0 +1,869 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# author: Andrea Mayer <andrea.mayer@uniroma2.it> +# author: Paolo Lungaroni <paolo.lungaroni@uniroma2.it> +# +# This script is designed to test the support for "flavors" in the SRv6 End +# behavior. +# +# Flavors defined in RFC8986 [1] represent additional operations that can modify +# or extend the existing SRv6 End, End.X and End.T behaviors. For the sake of +# convenience, we report the list of flavors described in [1] hereafter: +# - Penultimate Segment Pop (PSP); +# - Ultimate Segment Pop (USP); +# - Ultimate Segment Decapsulation (USD). +# +# The End, End.X, and End.T behaviors can support these flavors either +# individually or in combinations. +# Currently in this selftest we consider only the PSP flavor for the SRv6 End +# behavior. However, it is possible to extend the script as soon as other +# flavors will be supported in the kernel. +# +# The purpose of the PSP flavor consists in instructing the penultimate node +# listed in the SRv6 policy to remove (i.e. pop) the outermost SRH from the IPv6 +# header. +# A PSP enabled SRv6 End behavior instance processes the SRH by: +# - decrementing the Segment Left (SL) value from 1 to 0; +# - copying the last SID from the SID List into the IPv6 Destination Address +# (DA); +# - removing the SRH from the extension headers following the IPv6 header. +# +# Once the SRH is removed, the IPv6 packet is forwarded to the destination using +# the IPv6 DA updated during the PSP operation (i.e. the IPv6 DA corresponding +# to the last SID carried by the removed SRH). +# +# Although the PSP flavor can be set for any SRv6 End behavior instance on any +# SR node, it will be active only on such behaviors bound to a penultimate SID +# for a given SRv6 policy. +# SL=2 SL=1 SL=0 +# | | | +# For example, given the SRv6 policy (SID List := <X, Y, Z>): +# - a PSP enabled SRv6 End behavior bound to SID Y will apply the PSP operation +# as Segment Left (SL) is 1, corresponding to the Penultimate Segment of the +# SID List; +# - a PSP enabled SRv6 End behavior bound to SID X will *NOT* apply the PSP +# operation as the Segment Left is 2. This behavior instance will apply the +# "standard" End packet processing, ignoring the configured PSP flavor at +# all. +# +# [1] RFC8986: https://datatracker.ietf.org/doc/html/rfc8986 +# +# Network topology +# ================ +# +# The network topology used in this selftest is depicted hereafter, composed by +# two hosts (hs-1, hs-2) and four routers (rt-1, rt-2, rt-3, rt-4). +# Hosts hs-1 and hs-2 are connected to routers rt-1 and rt-2, respectively, +# allowing them to communicate with each other. +# Traffic exchanged between hs-1 and hs-2 can follow different network paths. +# The network operator, through specific SRv6 Policies can steer traffic to one +# path rather than another. In this selftest this is implemented as follows: +# +# i) The SRv6 H.Insert behavior applies SRv6 Policies on traffic received by +# connected hosts. It pushes the Segment Routing Header (SRH) after the +# IPv6 header. The SRH contains the SID List (i.e. SRv6 Policy) needed for +# steering traffic across the segments/waypoints specified in that list; +# +# ii) The SRv6 End behavior advances the active SID in the SID List carried by +# the SRH; +# +# iii) The PSP enabled SRv6 End behavior is used to remove the SRH when such +# behavior is configured on a node bound to the Penultimate Segment carried +# by the SID List. +# +# cafe::1 cafe::2 +# +--------+ +--------+ +# | | | | +# | hs-1 | | hs-2 | +# | | | | +# +---+----+ +--- +---+ +# cafe::/64 | | cafe::/64 +# | | +# +---+----+ +----+---+ +# | | fcf0:0:1:2::/64 | | +# | rt-1 +-------------------+ rt-2 | +# | | | | +# +---+----+ +----+---+ +# | . . | +# | fcf0:0:1:3::/64 . | +# | . . | +# | . . | +# fcf0:0:1:4::/64 | . | fcf0:0:2:3::/64 +# | . . | +# | . . | +# | fcf0:0:2:4::/64 . | +# | . . | +# +---+----+ +----+---+ +# | | | | +# | rt-4 +-------------------+ rt-3 | +# | | fcf0:0:3:4::/64 | | +# +---+----+ +----+---+ +# +# Every fcf0:0:x:y::/64 network interconnects the SRv6 routers rt-x with rt-y in +# the IPv6 operator network. +# +# +# Local SID table +# =============== +# +# Each SRv6 router is configured with a Local SID table in which SIDs are +# stored. Considering the given SRv6 router rt-x, at least two SIDs are +# configured in the Local SID table: +# +# Local SID table for SRv6 router rt-x +# +---------------------------------------------------------------------+ +# |fcff:x::e is associated with the SRv6 End behavior | +# |fcff:x::ef1 is associated with the SRv6 End behavior with PSP flavor | +# +---------------------------------------------------------------------+ +# +# The fcff::/16 prefix is reserved by the operator for the SIDs. Reachability of +# SIDs is ensured by proper configuration of the IPv6 operator's network and +# SRv6 routers. +# +# +# SRv6 Policies +# ============= +# +# An SRv6 ingress router applies different SRv6 Policies to the traffic received +# from connected hosts on the basis of the destination addresses. +# In case of SRv6 H.Insert behavior, the SRv6 Policy enforcement consists of +# pushing the SRH (carrying a given SID List) after the existing IPv6 header. +# Note that in the inserting mode, there is no encapsulation at all. +# +# Before applying an SRv6 Policy using the SRv6 H.Insert behavior +# +------+---------+ +# | IPv6 | Payload | +# +------+---------+ +# +# After applying an SRv6 Policy using the SRv6 H.Insert behavior +# +------+-----+---------+ +# | IPv6 | SRH | Payload | +# +------+-----+---------+ +# +# Traffic from hs-1 to hs-2 +# ------------------------- +# +# Packets generated from hs-1 and directed towards hs-2 are +# handled by rt-1 which applies the following SRv6 Policy: +# +# i.a) IPv6 traffic, SID List=fcff:3::e,fcff:4::ef1,fcff:2::ef1,cafe::2 +# +# Router rt-1 is configured to enforce the Policy (i.a) through the SRv6 +# H.Insert behavior which pushes the SRH after the existing IPv6 header. This +# Policy steers the traffic from hs-1 across rt-3, rt-4, rt-2 and finally to the +# destination hs-2. +# +# As the packet reaches the router rt-3, the SRv6 End behavior bound to SID +# fcff:3::e is triggered. The behavior updates the Segment Left (from SL=3 to +# SL=2) in the SRH, the IPv6 DA with fcff:4::ef1 and forwards the packet to the +# next router on the path, i.e. rt-4. +# +# When router rt-4 receives the packet, the PSP enabled SRv6 End behavior bound +# to SID fcff:4::ef1 is executed. Since the SL=2, the PSP operation is *NOT* +# kicked in and the behavior applies the default End processing: the Segment +# Left is decreased (from SL=2 to SL=1), the IPv6 DA is updated with the SID +# fcff:2::ef1 and the packet is forwarded to router rt-2. +# +# The PSP enabled SRv6 End behavior on rt-2 is associated with SID fcff:2::ef1 +# and is executed as the packet is received. Because SL=1, the behavior applies +# the PSP processing on the packet as follows: i) SL is decreased, i.e. from +# SL=1 to SL=0; ii) last SID (cafe::2) is copied into the IPv6 DA; iii) the +# outermost SRH is removed from the extension headers following the IPv6 header. +# Once the PSP processing is completed, the packet is forwarded to the host hs-2 +# (destination). +# +# Traffic from hs-2 to hs-1 +# ------------------------- +# +# Packets generated from hs-2 and directed to hs-1 are handled by rt-2 which +# applies the following SRv6 Policy: +# +# i.b) IPv6 traffic, SID List=fcff:1::ef1,cafe::1 +# +# Router rt-2 is configured to enforce the Policy (i.b) through the SRv6 +# H.Insert behavior which pushes the SRH after the existing IPv6 header. This +# Policy steers the traffic from hs-2 across rt-1 and finally to the +# destination hs-1 +# +# +# When the router rt-1 receives the packet, the PSP enabled SRv6 End behavior +# associated with the SID fcff:1::ef1 is triggered. Since the SL=1, +# the PSP operation takes place: i) the SL is decremented; ii) the IPv6 DA is +# set with the last SID; iii) the SRH is removed from the extension headers +# after the IPv6 header. At this point, the packet with IPv6 DA=cafe::1 is sent +# to the destination, i.e. hs-1. + +# Kselftest framework requirement - SKIP code is 4. +readonly ksft_skip=4 + +readonly RDMSUFF="$(mktemp -u XXXXXXXX)" +readonly DUMMY_DEVNAME="dum0" +readonly RT2HS_DEVNAME="veth1" +readonly LOCALSID_TABLE_ID=90 +readonly IPv6_RT_NETWORK=fcf0:0 +readonly IPv6_HS_NETWORK=cafe +readonly IPv6_TESTS_ADDR=2001:db8::1 +readonly LOCATOR_SERVICE=fcff +readonly END_FUNC=000e +readonly END_PSP_FUNC=0ef1 + +PING_TIMEOUT_SEC=4 +PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no} + +# IDs of routers and hosts are initialized during the setup of the testing +# network +ROUTERS='' +HOSTS='' + +SETUP_ERR=1 + +ret=${ksft_skip} +nsuccess=0 +nfail=0 + +log_test() +{ + local rc="$1" + local expected="$2" + local msg="$3" + + if [ "${rc}" -eq "${expected}" ]; then + nsuccess=$((nsuccess+1)) + printf "\n TEST: %-60s [ OK ]\n" "${msg}" + else + ret=1 + nfail=$((nfail+1)) + printf "\n TEST: %-60s [FAIL]\n" "${msg}" + if [ "${PAUSE_ON_FAIL}" = "yes" ]; then + echo + echo "hit enter to continue, 'q' to quit" + read a + [ "$a" = "q" ] && exit 1 + fi + fi +} + +print_log_test_results() +{ + printf "\nTests passed: %3d\n" "${nsuccess}" + printf "Tests failed: %3d\n" "${nfail}" + + # when a test fails, the value of 'ret' is set to 1 (error code). + # Conversely, when all tests are passed successfully, the 'ret' value + # is set to 0 (success code). + if [ "${ret}" -ne 1 ]; then + ret=0 + fi +} + +log_section() +{ + echo + echo "################################################################################" + echo "TEST SECTION: $*" + echo "################################################################################" +} + +test_command_or_ksft_skip() +{ + local cmd="$1" + + if [ ! -x "$(command -v "${cmd}")" ]; then + echo "SKIP: Could not run test without \"${cmd}\" tool"; + exit "${ksft_skip}" + fi +} + +get_nodename() +{ + local name="$1" + + echo "${name}-${RDMSUFF}" +} + +get_rtname() +{ + local rtid="$1" + + get_nodename "rt-${rtid}" +} + +get_hsname() +{ + local hsid="$1" + + get_nodename "hs-${hsid}" +} + +__create_namespace() +{ + local name="$1" + + ip netns add "${name}" +} + +create_router() +{ + local rtid="$1" + local nsname + + nsname="$(get_rtname "${rtid}")" + + __create_namespace "${nsname}" +} + +create_host() +{ + local hsid="$1" + local nsname + + nsname="$(get_hsname "${hsid}")" + + __create_namespace "${nsname}" +} + +cleanup() +{ + local nsname + local i + + # destroy routers + for i in ${ROUTERS}; do + nsname="$(get_rtname "${i}")" + + ip netns del "${nsname}" &>/dev/null || true + done + + # destroy hosts + for i in ${HOSTS}; do + nsname="$(get_hsname "${i}")" + + ip netns del "${nsname}" &>/dev/null || true + done + + # check whether the setup phase was completed successfully or not. In + # case of an error during the setup phase of the testing environment, + # the selftest is considered as "skipped". + if [ "${SETUP_ERR}" -ne 0 ]; then + echo "SKIP: Setting up the testing environment failed" + exit "${ksft_skip}" + fi + + exit "${ret}" +} + +add_link_rt_pairs() +{ + local rt="$1" + local rt_neighs="$2" + local neigh + local nsname + local neigh_nsname + + nsname="$(get_rtname "${rt}")" + + for neigh in ${rt_neighs}; do + neigh_nsname="$(get_rtname "${neigh}")" + + ip link add "veth-rt-${rt}-${neigh}" netns "${nsname}" \ + type veth peer name "veth-rt-${neigh}-${rt}" \ + netns "${neigh_nsname}" + done +} + +get_network_prefix() +{ + local rt="$1" + local neigh="$2" + local p="${rt}" + local q="${neigh}" + + if [ "${p}" -gt "${q}" ]; then + p="${q}"; q="${rt}" + fi + + echo "${IPv6_RT_NETWORK}:${p}:${q}" +} + +# Given the description of a router <id:op> as an input, the function returns +# the <id> token which represents the ID of the router. +# i.e. input: "12:psp" +# output: "12" +__get_srv6_rtcfg_id() +{ + local element="$1" + + echo "${element}" | cut -d':' -f1 +} + +# Given the description of a router <id:op> as an input, the function returns +# the <op> token which represents the operation (e.g. End behavior with or +# withouth flavors) configured for the node. + +# Note that when the operation represents an End behavior with a list of +# flavors, the output is the ordered version of that list. +# i.e. input: "5:usp,psp,usd" +# output: "psp,usd,usp" +__get_srv6_rtcfg_op() +{ + local element="$1" + + # return the lexicographically ordered flavors + echo "${element}" | cut -d':' -f2 | sed 's/,/\n/g' | sort | \ + xargs | sed 's/ /,/g' +} + +# Setup the basic networking for the routers +setup_rt_networking() +{ + local rt="$1" + local rt_neighs="$2" + local nsname + local net_prefix + local devname + local neigh + + nsname="$(get_rtname "${rt}")" + + for neigh in ${rt_neighs}; do + devname="veth-rt-${rt}-${neigh}" + + net_prefix="$(get_network_prefix "${rt}" "${neigh}")" + + ip -netns "${nsname}" addr \ + add "${net_prefix}::${rt}/64" dev "${devname}" nodad + + ip -netns "${nsname}" link set "${devname}" up + done + + ip -netns "${nsname}" link set lo up + + ip -netns "${nsname}" link add ${DUMMY_DEVNAME} type dummy + ip -netns "${nsname}" link set ${DUMMY_DEVNAME} up + + ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.all.accept_dad=0 + ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.default.accept_dad=0 + ip netns exec "${nsname}" sysctl -wq net.ipv6.conf.all.forwarding=1 +} + +# Setup local SIDs for an SRv6 router +setup_rt_local_sids() +{ + local rt="$1" + local rt_neighs="$2" + local net_prefix + local devname + local nsname + local neigh + + nsname="$(get_rtname "${rt}")" + + for neigh in ${rt_neighs}; do + devname="veth-rt-${rt}-${neigh}" + + net_prefix="$(get_network_prefix "${rt}" "${neigh}")" + + # set underlay network routes for SIDs reachability + ip -netns "${nsname}" -6 route \ + add "${LOCATOR_SERVICE}:${neigh}::/32" \ + table "${LOCALSID_TABLE_ID}" \ + via "${net_prefix}::${neigh}" dev "${devname}" + done + + # Local End behavior (note that "dev" is a dummy interface chosen for + # the sake of simplicity). + ip -netns "${nsname}" -6 route \ + add "${LOCATOR_SERVICE}:${rt}::${END_FUNC}" \ + table "${LOCALSID_TABLE_ID}" \ + encap seg6local action End dev "${DUMMY_DEVNAME}" + + + # all SIDs start with a common locator. Routes and SRv6 Endpoint + # behavior instaces are grouped together in the 'localsid' table. + ip -netns "${nsname}" -6 rule \ + add to "${LOCATOR_SERVICE}::/16" \ + lookup "${LOCALSID_TABLE_ID}" prio 999 + + # set default routes to unreachable + ip -netns "${nsname}" -6 route \ + add unreachable default metric 4278198272 \ + dev "${DUMMY_DEVNAME}" +} + +# This helper function builds and installs the SID List (i.e. SRv6 Policy) +# to be applied on incoming packets at the ingress node. Moreover, it +# configures the SRv6 nodes specified in the SID List to process the traffic +# according to the operations required by the Policy itself. +# args: +# $1 - destination host (i.e. cafe::x host) +# $2 - SRv6 router configured for enforcing the SRv6 Policy +# $3 - compact way to represent a list of SRv6 routers with their operations +# (i.e. behaviors) that each of them needs to perform. Every <nodeid:op> +# element constructs a SID that is associated with the behavior <op> on +# the <nodeid> node. The list of such elements forms an SRv6 Policy. +__setup_rt_policy() +{ + local dst="$1" + local encap_rt="$2" + local policy_rts="$3" + local behavior_cfg + local in_nsname + local rt_nsname + local policy='' + local function + local fullsid + local op_type + local node + local n + + in_nsname="$(get_rtname "${encap_rt}")" + + for n in ${policy_rts}; do + node="$(__get_srv6_rtcfg_id "${n}")" + op_type="$(__get_srv6_rtcfg_op "${n}")" + rt_nsname="$(get_rtname "${node}")" + + case "${op_type}" in + "noflv") + policy="${policy}${LOCATOR_SERVICE}:${node}::${END_FUNC}," + function="${END_FUNC}" + behavior_cfg="End" + ;; + + "psp") + policy="${policy}${LOCATOR_SERVICE}:${node}::${END_PSP_FUNC}," + function="${END_PSP_FUNC}" + behavior_cfg="End flavors psp" + ;; + + *) + break + ;; + esac + + fullsid="${LOCATOR_SERVICE}:${node}::${function}" + + # add SRv6 Endpoint behavior to the selected router + if ! ip -netns "${rt_nsname}" -6 route get "${fullsid}" \ + &>/dev/null; then + ip -netns "${rt_nsname}" -6 route \ + add "${fullsid}" \ + table "${LOCALSID_TABLE_ID}" \ + encap seg6local action ${behavior_cfg} \ + dev "${DUMMY_DEVNAME}" + fi + done + + # we need to remove the trailing comma to avoid inserting an empty + # address (::0) in the SID List. + policy="${policy%,}" + + # add SRv6 policy to incoming traffic sent by connected hosts + ip -netns "${in_nsname}" -6 route \ + add "${IPv6_HS_NETWORK}::${dst}" \ + encap seg6 mode inline segs "${policy}" \ + dev "${DUMMY_DEVNAME}" + + ip -netns "${in_nsname}" -6 neigh \ + add proxy "${IPv6_HS_NETWORK}::${dst}" \ + dev "${RT2HS_DEVNAME}" +} + +# see __setup_rt_policy +setup_rt_policy_ipv6() +{ + __setup_rt_policy "$1" "$2" "$3" +} + +setup_hs() +{ + local hs="$1" + local rt="$2" + local hsname + local rtname + + hsname="$(get_hsname "${hs}")" + rtname="$(get_rtname "${rt}")" + + ip netns exec "${hsname}" sysctl -wq net.ipv6.conf.all.accept_dad=0 + ip netns exec "${hsname}" sysctl -wq net.ipv6.conf.default.accept_dad=0 + + ip -netns "${hsname}" link add veth0 type veth \ + peer name "${RT2HS_DEVNAME}" netns "${rtname}" + + ip -netns "${hsname}" addr \ + add "${IPv6_HS_NETWORK}::${hs}/64" dev veth0 nodad + + ip -netns "${hsname}" link set veth0 up + ip -netns "${hsname}" link set lo up + + ip -netns "${rtname}" addr \ + add "${IPv6_HS_NETWORK}::254/64" dev "${RT2HS_DEVNAME}" nodad + + ip -netns "${rtname}" link set "${RT2HS_DEVNAME}" up + + ip netns exec "${rtname}" \ + sysctl -wq net.ipv6.conf."${RT2HS_DEVNAME}".proxy_ndp=1 +} + +setup() +{ + local i + + # create routers + ROUTERS="1 2 3 4"; readonly ROUTERS + for i in ${ROUTERS}; do + create_router "${i}" + done + + # create hosts + HOSTS="1 2"; readonly HOSTS + for i in ${HOSTS}; do + create_host "${i}" + done + + # set up the links for connecting routers + add_link_rt_pairs 1 "2 3 4" + add_link_rt_pairs 2 "3 4" + add_link_rt_pairs 3 "4" + + # set up the basic connectivity of routers and routes required for + # reachability of SIDs. + setup_rt_networking 1 "2 3 4" + setup_rt_networking 2 "1 3 4" + setup_rt_networking 3 "1 2 4" + setup_rt_networking 4 "1 2 3" + + # set up the hosts connected to routers + setup_hs 1 1 + setup_hs 2 2 + + # set up default SRv6 Endpoints (i.e. SRv6 End behavior) + setup_rt_local_sids 1 "2 3 4" + setup_rt_local_sids 2 "1 3 4" + setup_rt_local_sids 3 "1 2 4" + setup_rt_local_sids 4 "1 2 3" + + # set up SRv6 policies + # create a connection between hosts hs-1 and hs-2. + # The path between hs-1 and hs-2 traverses SRv6 aware routers. + # For each direction two path are chosen: + # + # Direction hs-1 -> hs-2 (PSP flavor) + # - rt-1 (SRv6 H.Insert policy) + # - rt-3 (SRv6 End behavior) + # - rt-4 (SRv6 End flavor PSP with SL>1, acting as End behavior) + # - rt-2 (SRv6 End flavor PSP with SL=1) + # + # Direction hs-2 -> hs-1 (PSP flavor) + # - rt-2 (SRv6 H.Insert policy) + # - rt-1 (SRv6 End flavor PSP with SL=1) + setup_rt_policy_ipv6 2 1 "3:noflv 4:psp 2:psp" + setup_rt_policy_ipv6 1 2 "1:psp" + + # testing environment was set up successfully + SETUP_ERR=0 +} + +check_rt_connectivity() +{ + local rtsrc="$1" + local rtdst="$2" + local prefix + local rtsrc_nsname + + rtsrc_nsname="$(get_rtname "${rtsrc}")" + + prefix="$(get_network_prefix "${rtsrc}" "${rtdst}")" + + ip netns exec "${rtsrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \ + "${prefix}::${rtdst}" >/dev/null 2>&1 +} + +check_and_log_rt_connectivity() +{ + local rtsrc="$1" + local rtdst="$2" + + check_rt_connectivity "${rtsrc}" "${rtdst}" + log_test $? 0 "Routers connectivity: rt-${rtsrc} -> rt-${rtdst}" +} + +check_hs_ipv6_connectivity() +{ + local hssrc="$1" + local hsdst="$2" + local hssrc_nsname + + hssrc_nsname="$(get_hsname "${hssrc}")" + + ip netns exec "${hssrc_nsname}" ping -c 1 -W "${PING_TIMEOUT_SEC}" \ + "${IPv6_HS_NETWORK}::${hsdst}" >/dev/null 2>&1 +} + +check_and_log_hs2gw_connectivity() +{ + local hssrc="$1" + + check_hs_ipv6_connectivity "${hssrc}" 254 + log_test $? 0 "IPv6 Hosts connectivity: hs-${hssrc} -> gw" +} + +check_and_log_hs_ipv6_connectivity() +{ + local hssrc="$1" + local hsdst="$2" + + check_hs_ipv6_connectivity "${hssrc}" "${hsdst}" + log_test $? 0 "IPv6 Hosts connectivity: hs-${hssrc} -> hs-${hsdst}" +} + +check_and_log_hs_connectivity() +{ + local hssrc="$1" + local hsdst="$2" + + check_and_log_hs_ipv6_connectivity "${hssrc}" "${hsdst}" +} + +router_tests() +{ + local i + local j + + log_section "IPv6 routers connectivity test" + + for i in ${ROUTERS}; do + for j in ${ROUTERS}; do + if [ "${i}" -eq "${j}" ]; then + continue + fi + + check_and_log_rt_connectivity "${i}" "${j}" + done + done +} + +host2gateway_tests() +{ + local hs + + log_section "IPv6 connectivity test among hosts and gateways" + + for hs in ${HOSTS}; do + check_and_log_hs2gw_connectivity "${hs}" + done +} + +host_srv6_end_flv_psp_tests() +{ + log_section "SRv6 connectivity test hosts (h1 <-> h2, PSP flavor)" + + check_and_log_hs_connectivity 1 2 + check_and_log_hs_connectivity 2 1 +} + +test_iproute2_supp_or_ksft_skip() +{ + local flavor="$1" + + if ! ip route help 2>&1 | grep -qo "${flavor}"; then + echo "SKIP: Missing SRv6 ${flavor} flavor support in iproute2" + exit "${ksft_skip}" + fi +} + +test_kernel_supp_or_ksft_skip() +{ + local flavor="$1" + local test_netns + + test_netns="kflv-$(mktemp -u XXXXXXXX)" + + if ! ip netns add "${test_netns}"; then + echo "SKIP: Cannot set up netns to test kernel support for flavors" + exit "${ksft_skip}" + fi + + if ! ip -netns "${test_netns}" link \ + add "${DUMMY_DEVNAME}" type dummy; then + echo "SKIP: Cannot set up dummy dev to test kernel support for flavors" + + ip netns del "${test_netns}" + exit "${ksft_skip}" + fi + + if ! ip -netns "${test_netns}" link \ + set "${DUMMY_DEVNAME}" up; then + echo "SKIP: Cannot activate dummy dev to test kernel support for flavors" + + ip netns del "${test_netns}" + exit "${ksft_skip}" + fi + + if ! ip -netns "${test_netns}" -6 route \ + add "${IPv6_TESTS_ADDR}" encap seg6local \ + action End flavors "${flavor}" dev "${DUMMY_DEVNAME}"; then + echo "SKIP: ${flavor} flavor not supported in kernel" + + ip netns del "${test_netns}" + exit "${ksft_skip}" + fi + + ip netns del "${test_netns}" +} + +test_dummy_dev_or_ksft_skip() +{ + local test_netns + + test_netns="dummy-$(mktemp -u XXXXXXXX)" + + if ! ip netns add "${test_netns}"; then + echo "SKIP: Cannot set up netns for testing dummy dev support" + exit "${ksft_skip}" + fi + + modprobe dummy &>/dev/null || true + if ! ip -netns "${test_netns}" link \ + add "${DUMMY_DEVNAME}" type dummy; then + echo "SKIP: dummy dev not supported" + + ip netns del "${test_netns}" + exit "${ksft_skip}" + fi + + ip netns del "${test_netns}" +} + +if [ "$(id -u)" -ne 0 ]; then + echo "SKIP: Need root privileges" + exit "${ksft_skip}" +fi + +# required programs to carry out this selftest +test_command_or_ksft_skip ip +test_command_or_ksft_skip ping +test_command_or_ksft_skip sysctl +test_command_or_ksft_skip grep +test_command_or_ksft_skip cut +test_command_or_ksft_skip sed +test_command_or_ksft_skip sort +test_command_or_ksft_skip xargs + +test_dummy_dev_or_ksft_skip +test_iproute2_supp_or_ksft_skip psp +test_kernel_supp_or_ksft_skip psp + +set -e +trap cleanup EXIT + +setup +set +e + +router_tests +host2gateway_tests +host_srv6_end_flv_psp_tests + +print_log_test_results diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c index 00f837c9bc6c..46a02bbd31d0 100644 --- a/tools/testing/selftests/net/tcp_mmap.c +++ b/tools/testing/selftests/net/tcp_mmap.c @@ -137,7 +137,8 @@ static void *mmap_large_buffer(size_t need, size_t *allocated) if (buffer == (void *)-1) { sz = need; buffer = mmap(NULL, sz, PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, + -1, 0); if (buffer != (void *)-1) fprintf(stderr, "MAP_HUGETLB attempt failed, look at /sys/kernel/mm/hugepages for optimal performance\n"); } diff --git a/tools/testing/selftests/net/udpgro_frglist.sh b/tools/testing/selftests/net/udpgro_frglist.sh index c9c4b9d65839..0a6359bed0b9 100755 --- a/tools/testing/selftests/net/udpgro_frglist.sh +++ b/tools/testing/selftests/net/udpgro_frglist.sh @@ -40,8 +40,8 @@ run_one() { ip -n "${PEER_NS}" link set veth1 xdp object ${BPF_FILE} section xdp tc -n "${PEER_NS}" qdisc add dev veth1 clsact - tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol ipv6 bpf object-file ../bpf/nat6to4.o section schedcls/ingress6/nat_6 direct-action - tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol ip bpf object-file ../bpf/nat6to4.o section schedcls/egress4/snat4 direct-action + tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol ipv6 bpf object-file nat6to4.o section schedcls/ingress6/nat_6 direct-action + tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol ip bpf object-file nat6to4.o section schedcls/egress4/snat4 direct-action echo ${rx_args} ip netns exec "${PEER_NS}" ./udpgso_bench_rx ${rx_args} -r & @@ -88,8 +88,8 @@ if [ ! -f ${BPF_FILE} ]; then exit -1 fi -if [ ! -f bpf/nat6to4.o ]; then - echo "Missing nat6to4 helper. Build bpfnat6to4.o selftest first" +if [ ! -f nat6to4.o ]; then + echo "Missing nat6to4 helper. Build bpf nat6to4.o selftest first" exit -1 fi diff --git a/tools/testing/selftests/net/udpgso_bench_rx.c b/tools/testing/selftests/net/udpgso_bench_rx.c index 4058c7451e70..f35a924d4a30 100644 --- a/tools/testing/selftests/net/udpgso_bench_rx.c +++ b/tools/testing/selftests/net/udpgso_bench_rx.c @@ -214,11 +214,10 @@ static void do_verify_udp(const char *data, int len) static int recv_msg(int fd, char *buf, int len, int *gso_size) { - char control[CMSG_SPACE(sizeof(uint16_t))] = {0}; + char control[CMSG_SPACE(sizeof(int))] = {0}; struct msghdr msg = {0}; struct iovec iov = {0}; struct cmsghdr *cmsg; - uint16_t *gsosizeptr; int ret; iov.iov_base = buf; @@ -237,8 +236,7 @@ static int recv_msg(int fd, char *buf, int len, int *gso_size) cmsg = CMSG_NXTHDR(&msg, cmsg)) { if (cmsg->cmsg_level == SOL_UDP && cmsg->cmsg_type == UDP_GRO) { - gsosizeptr = (uint16_t *) CMSG_DATA(cmsg); - *gso_size = *gsosizeptr; + *gso_size = *(int *)CMSG_DATA(cmsg); break; } } diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/rsvp.json b/tools/testing/selftests/tc-testing/tc-tests/filters/rsvp.json deleted file mode 100644 index bdcbaa4c5663..000000000000 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/rsvp.json +++ /dev/null @@ -1,203 +0,0 @@ -[ - { - "id": "2141", - "name": "Add rsvp filter with tcp proto and specific IP address", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto tcp session 198.168.10.64", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*session 198.168.10.64 ipproto tcp", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "5267", - "name": "Add rsvp filter with udp proto and specific IP address", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*session 1.1.1.1 ipproto udp", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "2819", - "name": "Add rsvp filter with src ip and src port", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1 sender 2.2.2.2/5021 classid 1:1", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*flowid 1:1 session 1.1.1.1 ipproto udp sender 2.2.2.2/5021", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "c967", - "name": "Add rsvp filter with tunnelid and continue action", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1 tunnelid 2 classid 1:1 action continue", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*flowid 1:1 session 1.1.1.1 ipproto udp tunnelid 2.*action order [0-9]+: gact action continue", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "5463", - "name": "Add rsvp filter with tunnel and pipe action", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1 tunnel 2 skip 1 action pipe", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*tunnel 2 skip 1 session 1.1.1.1 ipproto udp.*action order [0-9]+: gact action pipe", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "2332", - "name": "Add rsvp filter with miltiple actions", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 7 rsvp ipproto udp session 1.1.1.1 classid 1:1 action skbedit mark 7 pipe action gact drop", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*flowid 1:1 session 1.1.1.1 ipproto udp.*action order [0-9]+: skbedit mark 7 pipe.*action order [0-9]+: gact action drop", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "8879", - "name": "Add rsvp filter with tunnel and skp flag", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1 tunnel 2 skip 1 action pipe", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*tunnel 2 skip 1 session 1.1.1.1 ipproto udp.*action order [0-9]+: gact action pipe", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "8261", - "name": "List rsvp filters", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress", - "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1/1234 classid 1:1", - "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto tcp session 2.2.2.2/1234 classid 2:1" - ], - "cmdUnderTest": "$TC filter show dev $DEV1 parent ffff:", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "^filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh", - "matchCount": "2", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "8989", - "name": "Delete rsvp filter", - "category": [ - "filter", - "rsvp" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress", - "$TC filter add dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1/1234 tunnelid 9 classid 2:1" - ], - "cmdUnderTest": "$TC filter del dev $DEV1 parent ffff: protocol ip prio 1 rsvp ipproto udp session 1.1.1.1/1234 tunnelid 9 classid 2:1", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "filter protocol ip pref [0-9]+ rsvp chain [0-9]+ fh 0x.*flowid 2:1 session 1.1.1.1/1234 ipproto udp tunnelid 9", - "matchCount": "0", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - } -] diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/tcindex.json b/tools/testing/selftests/tc-testing/tc-tests/filters/tcindex.json deleted file mode 100644 index 44901db70376..000000000000 --- a/tools/testing/selftests/tc-testing/tc-tests/filters/tcindex.json +++ /dev/null @@ -1,227 +0,0 @@ -[ - { - "id": "8293", - "name": "Add tcindex filter with default action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "^filter parent ffff: protocol ip pref 1 tcindex chain 0 handle 0x0001 classid 1:1", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "7281", - "name": "Add tcindex filter with hash size and pass action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex hash 32 fall_through classid 1:1 action pass", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "^filter parent ffff: protocol ip pref.*tcindex chain [0-9]+ handle 0x0001 classid 1:1.*action order [0-9]+: gact action pass", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "b294", - "name": "Add tcindex filter with mask shift and reclassify action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex hash 32 mask 1 shift 2 fall_through classid 1:1 action reclassify", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "^filter parent ffff: protocol ip pref.*tcindex chain [0-9]+ handle 0x0001 classid 1:1.*action order [0-9]+: gact action reclassify", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "0532", - "name": "Add tcindex filter with pass_on and continue actions", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex hash 32 mask 1 shift 2 pass_on classid 1:1 action continue", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "^filter parent ffff: protocol ip pref.*tcindex chain [0-9]+ handle 0x0001 classid 1:1.*action order [0-9]+: gact action continue", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "d473", - "name": "Add tcindex filter with pipe action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex hash 32 mask 1 shift 2 fall_through classid 1:1 action pipe", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "^filter parent ffff: protocol ip pref.*tcindex chain [0-9]+ handle 0x0001 classid 1:1.*action order [0-9]+: gact action pipe", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "2940", - "name": "Add tcindex filter with miltiple actions", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress" - ], - "cmdUnderTest": "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 7 tcindex hash 32 mask 1 shift 2 fall_through classid 1:1 action skbedit mark 7 pipe action gact drop", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 7 protocol ip tcindex", - "matchPattern": "^filter parent ffff: protocol ip pref 7 tcindex.*handle 0x0001.*action.*skbedit.*mark 7 pipe.*action.*gact action drop", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "1893", - "name": "List tcindex filters", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress", - "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1", - "$TC filter add dev $DEV1 parent ffff: handle 2 protocol ip prio 1 tcindex classid 1:1" - ], - "cmdUnderTest": "$TC filter show dev $DEV1 parent ffff:", - "expExitCode": "0", - "verifyCmd": "$TC filter show dev $DEV1 parent ffff:", - "matchPattern": "handle 0x000[0-9]+ classid 1:1", - "matchCount": "2", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "2041", - "name": "Change tcindex filter with pass action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress", - "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1 action drop" - ], - "cmdUnderTest": "$TC filter change dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1 action pass", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "handle 0x0001 classid 1:1.*action order [0-9]+: gact action pass", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "9203", - "name": "Replace tcindex filter with pass action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress", - "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1 action drop" - ], - "cmdUnderTest": "$TC filter replace dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1 action pass", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "handle 0x0001 classid 1:1.*action order [0-9]+: gact action pass", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - }, - { - "id": "7957", - "name": "Delete tcindex filter with drop action", - "category": [ - "filter", - "tcindex" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$TC qdisc add dev $DEV1 ingress", - "$TC filter add dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1 action drop" - ], - "cmdUnderTest": "$TC filter del dev $DEV1 parent ffff: handle 1 protocol ip prio 1 tcindex classid 1:1 action drop", - "expExitCode": "0", - "verifyCmd": "$TC filter get dev $DEV1 parent ffff: handle 1 prio 1 protocol ip tcindex", - "matchPattern": "handle 0x0001 classid 1:1.*action order [0-9]+: gact action drop", - "matchCount": "0", - "teardown": [ - "$TC qdisc del dev $DEV1 ingress" - ] - } -] diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/atm.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/atm.json deleted file mode 100644 index f5bc8670a67d..000000000000 --- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/atm.json +++ /dev/null @@ -1,94 +0,0 @@ -[ - { - "id": "7628", - "name": "Create ATM with default setting", - "category": [ - "qdisc", - "atm" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root atm", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc atm 1: root refcnt", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "390a", - "name": "Delete ATM with valid handle", - "category": [ - "qdisc", - "atm" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true", - "$TC qdisc add dev $DUMMY handle 1: root atm" - ], - "cmdUnderTest": "$TC qdisc del dev $DUMMY handle 1: root", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc atm 1: root refcnt", - "matchCount": "0", - "teardown": [ - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "32a0", - "name": "Show ATM class", - "category": [ - "qdisc", - "atm" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root atm", - "expExitCode": "0", - "verifyCmd": "$TC class show dev $DUMMY", - "matchPattern": "class atm 1: parent 1:", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "6310", - "name": "Dump ATM stats", - "category": [ - "qdisc", - "atm" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root atm", - "expExitCode": "0", - "verifyCmd": "$TC -s qdisc show dev $DUMMY", - "matchPattern": "qdisc atm 1: root refcnt", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - } -] diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/cbq.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/cbq.json deleted file mode 100644 index 1ab21c83a122..000000000000 --- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/cbq.json +++ /dev/null @@ -1,184 +0,0 @@ -[ - { - "id": "3460", - "name": "Create CBQ with default setting", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "0592", - "name": "Create CBQ with mpu", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000 mpu 1000", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "4684", - "name": "Create CBQ with valid cell num", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000 cell 128", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "4345", - "name": "Create CBQ with invalid cell num", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000 cell 100", - "expExitCode": "1", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "0", - "teardown": [ - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "4525", - "name": "Create CBQ with valid ewma", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000 ewma 16", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "6784", - "name": "Create CBQ with invalid ewma", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000 ewma 128", - "expExitCode": "1", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "0", - "teardown": [ - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "5468", - "name": "Delete CBQ with handle", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true", - "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000" - ], - "cmdUnderTest": "$TC qdisc del dev $DUMMY handle 1: root", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc cbq 1: root refcnt [0-9]+ rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "0", - "teardown": [ - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "492a", - "name": "Show CBQ class", - "category": [ - "qdisc", - "cbq" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root cbq bandwidth 10000 avpkt 9000", - "expExitCode": "0", - "verifyCmd": "$TC class show dev $DUMMY", - "matchPattern": "class cbq 1: root rate 10Kbit \\(bounded,isolated\\) prio no-transmit", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - } -] diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/dsmark.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/dsmark.json deleted file mode 100644 index c030795f9c37..000000000000 --- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/dsmark.json +++ /dev/null @@ -1,140 +0,0 @@ -[ - { - "id": "6345", - "name": "Create DSMARK with default setting", - "category": [ - "qdisc", - "dsmark" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root dsmark indices 1024", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc dsmark 1: root refcnt [0-9]+ indices 0x0400", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "3462", - "name": "Create DSMARK with default_index setting", - "category": [ - "qdisc", - "dsmark" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root dsmark indices 1024 default_index 512", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc dsmark 1: root refcnt [0-9]+ indices 0x0400 default_index 0x0200", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "ca95", - "name": "Create DSMARK with set_tc_index flag", - "category": [ - "qdisc", - "dsmark" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root dsmark indices 1024 set_tc_index", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc dsmark 1: root refcnt [0-9]+ indices 0x0400 set_tc_index", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "a950", - "name": "Create DSMARK with multiple setting", - "category": [ - "qdisc", - "dsmark" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root dsmark indices 1024 default_index 1024 set_tc_index", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc dsmark 1: root refcnt [0-9]+ indices 0x0400 default_index 0x0400 set_tc_index", - "matchCount": "1", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "4092", - "name": "Delete DSMARK with handle", - "category": [ - "qdisc", - "dsmark" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true", - "$TC qdisc add dev $DUMMY handle 1: root dsmark indices 1024 default_index 1024" - ], - "cmdUnderTest": "$TC qdisc del dev $DUMMY handle 1: root", - "expExitCode": "0", - "verifyCmd": "$TC qdisc show dev $DUMMY", - "matchPattern": "qdisc dsmark 1: root refcnt [0-9]+ indices 0x0400", - "matchCount": "0", - "teardown": [ - "$IP link del dev $DUMMY type dummy" - ] - }, - { - "id": "5930", - "name": "Show DSMARK class", - "category": [ - "qdisc", - "dsmark" - ], - "plugins": { - "requires": "nsPlugin" - }, - "setup": [ - "$IP link add dev $DUMMY type dummy || /bin/true" - ], - "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root dsmark indices 1024", - "expExitCode": "0", - "verifyCmd": "$TC class show dev $DUMMY", - "matchPattern": "class dsmark 1:", - "matchCount": "0", - "teardown": [ - "$TC qdisc del dev $DUMMY handle 1: root", - "$IP link del dev $DUMMY type dummy" - ] - } -] diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile index f8293c6910c9..43a254f0e14d 100644 --- a/tools/testing/vsock/Makefile +++ b/tools/testing/vsock/Makefile @@ -1,8 +1,9 @@ # SPDX-License-Identifier: GPL-2.0-only -all: test +all: test vsock_perf test: vsock_test vsock_diag_test vsock_test: vsock_test.o timeout.o control.o util.o vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o +vsock_perf: vsock_perf.o CFLAGS += -g -O2 -Werror -Wall -I. -I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE .PHONY: all test clean diff --git a/tools/testing/vsock/README b/tools/testing/vsock/README index 4d5045e7d2c3..84ee217ba8ee 100644 --- a/tools/testing/vsock/README +++ b/tools/testing/vsock/README @@ -35,3 +35,37 @@ Invoke test binaries in both directions as follows: --control-port=$GUEST_IP \ --control-port=1234 \ --peer-cid=3 + +vsock_perf utility +------------------- +'vsock_perf' is a simple tool to measure vsock performance. It works in +sender/receiver modes: sender connect to peer at the specified port and +starts data transmission to the receiver. After data processing is done, +it prints several metrics(see below). + +Usage: +# run as sender +# connect to CID 2, port 1234, send 1G of data, tx buf size is 1M +./vsock_perf --sender 2 --port 1234 --bytes 1G --buf-size 1M + +Output: +tx performance: A Gbits/s + +Output explanation: +A is calculated as "number of bits to send" / "time in tx loop" + +# run as receiver +# listen port 1234, rx buf size is 1M, socket buf size is 1G, SO_RCVLOWAT is 64K +./vsock_perf --port 1234 --buf-size 1M --vsk-size 1G --rcvlowat 64K + +Output: +rx performance: A Gbits/s +total in 'read()': B sec +POLLIN wakeups: C +average in 'read()': D ns + +Output explanation: +A is calculated as "number of received bits" / "time in rx loop". +B is time, spent in 'read()' system call(excluding 'poll()') +C is number of 'poll()' wake ups with POLLIN bit set. +D is B / C, e.g. average amount of time, spent in single 'read()'. diff --git a/tools/testing/vsock/control.c b/tools/testing/vsock/control.c index 4874872fc5a3..d2deb4b15b94 100644 --- a/tools/testing/vsock/control.c +++ b/tools/testing/vsock/control.c @@ -141,6 +141,34 @@ void control_writeln(const char *str) timeout_end(); } +void control_writeulong(unsigned long value) +{ + char str[32]; + + if (snprintf(str, sizeof(str), "%lu", value) >= sizeof(str)) { + perror("snprintf"); + exit(EXIT_FAILURE); + } + + control_writeln(str); +} + +unsigned long control_readulong(void) +{ + unsigned long value; + char *str; + + str = control_readln(); + + if (!str) + exit(EXIT_FAILURE); + + value = strtoul(str, NULL, 10); + free(str); + + return value; +} + /* Return the next line from the control socket (without the trailing newline). * * The program terminates if a timeout occurs. diff --git a/tools/testing/vsock/control.h b/tools/testing/vsock/control.h index 51814b4f9ac1..c1f77fdb2c7a 100644 --- a/tools/testing/vsock/control.h +++ b/tools/testing/vsock/control.h @@ -9,7 +9,9 @@ void control_init(const char *control_host, const char *control_port, void control_cleanup(void); void control_writeln(const char *str); char *control_readln(void); +unsigned long control_readulong(void); void control_expectln(const char *str); bool control_cmpln(char *line, const char *str, bool fail); +void control_writeulong(unsigned long value); #endif /* CONTROL_H */ diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c index 2acbb7703c6a..01b636d3039a 100644 --- a/tools/testing/vsock/util.c +++ b/tools/testing/vsock/util.c @@ -395,3 +395,16 @@ void skip_test(struct test_case *test_cases, size_t test_cases_len, test_cases[test_id].skip = true; } + +unsigned long hash_djb2(const void *data, size_t len) +{ + unsigned long hash = 5381; + int i = 0; + + while (i < len) { + hash = ((hash << 5) + hash) + ((unsigned char *)data)[i]; + i++; + } + + return hash; +} diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h index a3375ad2fb7f..fb99208a95ea 100644 --- a/tools/testing/vsock/util.h +++ b/tools/testing/vsock/util.h @@ -49,4 +49,5 @@ void run_tests(const struct test_case *test_cases, void list_tests(const struct test_case *test_cases); void skip_test(struct test_case *test_cases, size_t test_cases_len, const char *test_id_str); +unsigned long hash_djb2(const void *data, size_t len); #endif /* UTIL_H */ diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c new file mode 100644 index 000000000000..a72520338f84 --- /dev/null +++ b/tools/testing/vsock/vsock_perf.c @@ -0,0 +1,427 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * vsock_perf - benchmark utility for vsock. + * + * Copyright (C) 2022 SberDevices. + * + * Author: Arseniy Krasnov <AVKrasnov@sberdevices.ru> + */ +#include <getopt.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> +#include <string.h> +#include <errno.h> +#include <unistd.h> +#include <time.h> +#include <stdint.h> +#include <poll.h> +#include <sys/socket.h> +#include <linux/vm_sockets.h> + +#define DEFAULT_BUF_SIZE_BYTES (128 * 1024) +#define DEFAULT_TO_SEND_BYTES (64 * 1024) +#define DEFAULT_VSOCK_BUF_BYTES (256 * 1024) +#define DEFAULT_RCVLOWAT_BYTES 1 +#define DEFAULT_PORT 1234 + +#define BYTES_PER_GB (1024 * 1024 * 1024ULL) +#define NSEC_PER_SEC (1000000000ULL) + +static unsigned int port = DEFAULT_PORT; +static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES; +static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES; + +static void error(const char *s) +{ + perror(s); + exit(EXIT_FAILURE); +} + +static time_t current_nsec(void) +{ + struct timespec ts; + + if (clock_gettime(CLOCK_REALTIME, &ts)) + error("clock_gettime"); + + return (ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec; +} + +/* From lib/cmdline.c. */ +static unsigned long memparse(const char *ptr) +{ + char *endptr; + + unsigned long long ret = strtoull(ptr, &endptr, 0); + + switch (*endptr) { + case 'E': + case 'e': + ret <<= 10; + case 'P': + case 'p': + ret <<= 10; + case 'T': + case 't': + ret <<= 10; + case 'G': + case 'g': + ret <<= 10; + case 'M': + case 'm': + ret <<= 10; + case 'K': + case 'k': + ret <<= 10; + endptr++; + default: + break; + } + + return ret; +} + +static void vsock_increase_buf_size(int fd) +{ + if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE, + &vsock_buf_bytes, sizeof(vsock_buf_bytes))) + error("setsockopt(SO_VM_SOCKETS_BUFFER_MAX_SIZE)"); + + if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE, + &vsock_buf_bytes, sizeof(vsock_buf_bytes))) + error("setsockopt(SO_VM_SOCKETS_BUFFER_SIZE)"); +} + +static int vsock_connect(unsigned int cid, unsigned int port) +{ + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = port, + .svm_cid = cid, + }, + }; + int fd; + + fd = socket(AF_VSOCK, SOCK_STREAM, 0); + + if (fd < 0) { + perror("socket"); + return -1; + } + + if (connect(fd, &addr.sa, sizeof(addr.svm)) < 0) { + perror("connect"); + close(fd); + return -1; + } + + return fd; +} + +static float get_gbps(unsigned long bits, time_t ns_delta) +{ + return ((float)bits / 1000000000ULL) / + ((float)ns_delta / NSEC_PER_SEC); +} + +static void run_receiver(unsigned long rcvlowat_bytes) +{ + unsigned int read_cnt; + time_t rx_begin_ns; + time_t in_read_ns; + size_t total_recv; + int client_fd; + char *data; + int fd; + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } addr = { + .svm = { + .svm_family = AF_VSOCK, + .svm_port = port, + .svm_cid = VMADDR_CID_ANY, + }, + }; + union { + struct sockaddr sa; + struct sockaddr_vm svm; + } clientaddr; + + socklen_t clientaddr_len = sizeof(clientaddr.svm); + + printf("Run as receiver\n"); + printf("Listen port %u\n", port); + printf("RX buffer %lu bytes\n", buf_size_bytes); + printf("vsock buffer %lu bytes\n", vsock_buf_bytes); + printf("SO_RCVLOWAT %lu bytes\n", rcvlowat_bytes); + + fd = socket(AF_VSOCK, SOCK_STREAM, 0); + + if (fd < 0) + error("socket"); + + if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) + error("bind"); + + if (listen(fd, 1) < 0) + error("listen"); + + client_fd = accept(fd, &clientaddr.sa, &clientaddr_len); + + if (client_fd < 0) + error("accept"); + + vsock_increase_buf_size(client_fd); + + if (setsockopt(client_fd, SOL_SOCKET, SO_RCVLOWAT, + &rcvlowat_bytes, + sizeof(rcvlowat_bytes))) + error("setsockopt(SO_RCVLOWAT)"); + + data = malloc(buf_size_bytes); + + if (!data) { + fprintf(stderr, "'malloc()' failed\n"); + exit(EXIT_FAILURE); + } + + read_cnt = 0; + in_read_ns = 0; + total_recv = 0; + rx_begin_ns = current_nsec(); + + while (1) { + struct pollfd fds = { 0 }; + + fds.fd = client_fd; + fds.events = POLLIN | POLLERR | + POLLHUP | POLLRDHUP; + + if (poll(&fds, 1, -1) < 0) + error("poll"); + + if (fds.revents & POLLERR) { + fprintf(stderr, "'poll()' error\n"); + exit(EXIT_FAILURE); + } + + if (fds.revents & POLLIN) { + ssize_t bytes_read; + time_t t; + + t = current_nsec(); + bytes_read = read(fds.fd, data, buf_size_bytes); + in_read_ns += (current_nsec() - t); + read_cnt++; + + if (!bytes_read) + break; + + if (bytes_read < 0) { + perror("read"); + exit(EXIT_FAILURE); + } + + total_recv += bytes_read; + } + + if (fds.revents & (POLLHUP | POLLRDHUP)) + break; + } + + printf("total bytes received: %zu\n", total_recv); + printf("rx performance: %f Gbits/s\n", + get_gbps(total_recv * 8, current_nsec() - rx_begin_ns)); + printf("total time in 'read()': %f sec\n", (float)in_read_ns / NSEC_PER_SEC); + printf("average time in 'read()': %f ns\n", (float)in_read_ns / read_cnt); + printf("POLLIN wakeups: %i\n", read_cnt); + + free(data); + close(client_fd); + close(fd); +} + +static void run_sender(int peer_cid, unsigned long to_send_bytes) +{ + time_t tx_begin_ns; + time_t tx_total_ns; + size_t total_send; + void *data; + int fd; + + printf("Run as sender\n"); + printf("Connect to %i:%u\n", peer_cid, port); + printf("Send %lu bytes\n", to_send_bytes); + printf("TX buffer %lu bytes\n", buf_size_bytes); + + fd = vsock_connect(peer_cid, port); + + if (fd < 0) + exit(EXIT_FAILURE); + + data = malloc(buf_size_bytes); + + if (!data) { + fprintf(stderr, "'malloc()' failed\n"); + exit(EXIT_FAILURE); + } + + memset(data, 0, buf_size_bytes); + total_send = 0; + tx_begin_ns = current_nsec(); + + while (total_send < to_send_bytes) { + ssize_t sent; + + sent = write(fd, data, buf_size_bytes); + + if (sent <= 0) + error("write"); + + total_send += sent; + } + + tx_total_ns = current_nsec() - tx_begin_ns; + + printf("total bytes sent: %zu\n", total_send); + printf("tx performance: %f Gbits/s\n", + get_gbps(total_send * 8, tx_total_ns)); + printf("total time in 'write()': %f sec\n", + (float)tx_total_ns / NSEC_PER_SEC); + + close(fd); + free(data); +} + +static const char optstring[] = ""; +static const struct option longopts[] = { + { + .name = "help", + .has_arg = no_argument, + .val = 'H', + }, + { + .name = "sender", + .has_arg = required_argument, + .val = 'S', + }, + { + .name = "port", + .has_arg = required_argument, + .val = 'P', + }, + { + .name = "bytes", + .has_arg = required_argument, + .val = 'M', + }, + { + .name = "buf-size", + .has_arg = required_argument, + .val = 'B', + }, + { + .name = "vsk-size", + .has_arg = required_argument, + .val = 'V', + }, + { + .name = "rcvlowat", + .has_arg = required_argument, + .val = 'R', + }, + {}, +}; + +static void usage(void) +{ + printf("Usage: ./vsock_perf [--help] [options]\n" + "\n" + "This is benchmarking utility, to test vsock performance.\n" + "It runs in two modes: sender or receiver. In sender mode, it\n" + "connects to the specified CID and starts data transmission.\n" + "\n" + "Options:\n" + " --help This message\n" + " --sender <cid> Sender mode (receiver default)\n" + " <cid> of the receiver to connect to\n" + " --port <port> Port (default %d)\n" + " --bytes <bytes>KMG Bytes to send (default %d)\n" + " --buf-size <bytes>KMG Data buffer size (default %d). In sender mode\n" + " it is the buffer size, passed to 'write()'. In\n" + " receiver mode it is the buffer size passed to 'read()'.\n" + " --vsk-size <bytes>KMG Socket buffer size (default %d)\n" + " --rcvlowat <bytes>KMG SO_RCVLOWAT value (default %d)\n" + "\n", DEFAULT_PORT, DEFAULT_TO_SEND_BYTES, + DEFAULT_BUF_SIZE_BYTES, DEFAULT_VSOCK_BUF_BYTES, + DEFAULT_RCVLOWAT_BYTES); + exit(EXIT_FAILURE); +} + +static long strtolx(const char *arg) +{ + long value; + char *end; + + value = strtol(arg, &end, 10); + + if (end != arg + strlen(arg)) + usage(); + + return value; +} + +int main(int argc, char **argv) +{ + unsigned long to_send_bytes = DEFAULT_TO_SEND_BYTES; + unsigned long rcvlowat_bytes = DEFAULT_RCVLOWAT_BYTES; + int peer_cid = -1; + bool sender = false; + + while (1) { + int opt = getopt_long(argc, argv, optstring, longopts, NULL); + + if (opt == -1) + break; + + switch (opt) { + case 'V': /* Peer buffer size. */ + vsock_buf_bytes = memparse(optarg); + break; + case 'R': /* SO_RCVLOWAT value. */ + rcvlowat_bytes = memparse(optarg); + break; + case 'P': /* Port to connect to. */ + port = strtolx(optarg); + break; + case 'M': /* Bytes to send. */ + to_send_bytes = memparse(optarg); + break; + case 'B': /* Size of rx/tx buffer. */ + buf_size_bytes = memparse(optarg); + break; + case 'S': /* Sender mode. CID to connect to. */ + peer_cid = strtolx(optarg); + sender = true; + break; + case 'H': /* Help. */ + usage(); + break; + default: + usage(); + } + } + + if (!sender) + run_receiver(rcvlowat_bytes); + else + run_sender(peer_cid, to_send_bytes); + + return 0; +} diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index bb6d691cb30d..67e9f9df3a8c 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -284,10 +284,14 @@ static void test_stream_msg_peek_server(const struct test_opts *opts) close(fd); } -#define MESSAGES_CNT 7 -#define MSG_EOR_IDX (MESSAGES_CNT / 2) +#define SOCK_BUF_SIZE (2 * 1024 * 1024) +#define MAX_MSG_SIZE (32 * 1024) + static void test_seqpacket_msg_bounds_client(const struct test_opts *opts) { + unsigned long curr_hash; + int page_size; + int msg_count; int fd; fd = vsock_seqpacket_connect(opts->peer_cid, 1234); @@ -296,18 +300,79 @@ static void test_seqpacket_msg_bounds_client(const struct test_opts *opts) exit(EXIT_FAILURE); } - /* Send several messages, one with MSG_EOR flag */ - for (int i = 0; i < MESSAGES_CNT; i++) - send_byte(fd, 1, (i == MSG_EOR_IDX) ? MSG_EOR : 0); + /* Wait, until receiver sets buffer size. */ + control_expectln("SRVREADY"); + + curr_hash = 0; + page_size = getpagesize(); + msg_count = SOCK_BUF_SIZE / MAX_MSG_SIZE; + + for (int i = 0; i < msg_count; i++) { + ssize_t send_size; + size_t buf_size; + int flags; + void *buf; + + /* Use "small" buffers and "big" buffers. */ + if (i & 1) + buf_size = page_size + + (rand() % (MAX_MSG_SIZE - page_size)); + else + buf_size = 1 + (rand() % page_size); + + buf = malloc(buf_size); + + if (!buf) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + memset(buf, rand() & 0xff, buf_size); + /* Set at least one MSG_EOR + some random. */ + if (i == (msg_count / 2) || (rand() & 1)) { + flags = MSG_EOR; + curr_hash++; + } else { + flags = 0; + } + + send_size = send(fd, buf, buf_size, flags); + + if (send_size < 0) { + perror("send"); + exit(EXIT_FAILURE); + } + + if (send_size != buf_size) { + fprintf(stderr, "Invalid send size\n"); + exit(EXIT_FAILURE); + } + + /* + * Hash sum is computed at both client and server in + * the same way: + * H += hash('message data') + * Such hash "controls" both data integrity and message + * bounds. After data exchange, both sums are compared + * using control socket, and if message bounds wasn't + * broken - two values must be equal. + */ + curr_hash += hash_djb2(buf, buf_size); + free(buf); + } control_writeln("SENDDONE"); + control_writeulong(curr_hash); close(fd); } static void test_seqpacket_msg_bounds_server(const struct test_opts *opts) { + unsigned long sock_buf_size; + unsigned long remote_hash; + unsigned long curr_hash; int fd; - char buf[16]; + char buf[MAX_MSG_SIZE]; struct msghdr msg = {0}; struct iovec iov = {0}; @@ -317,25 +382,57 @@ static void test_seqpacket_msg_bounds_server(const struct test_opts *opts) exit(EXIT_FAILURE); } + sock_buf_size = SOCK_BUF_SIZE; + + if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE, + &sock_buf_size, sizeof(sock_buf_size))) { + perror("setsockopt(SO_VM_SOCKETS_BUFFER_MAX_SIZE)"); + exit(EXIT_FAILURE); + } + + if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE, + &sock_buf_size, sizeof(sock_buf_size))) { + perror("setsockopt(SO_VM_SOCKETS_BUFFER_SIZE)"); + exit(EXIT_FAILURE); + } + + /* Ready to receive data. */ + control_writeln("SRVREADY"); + /* Wait, until peer sends whole data. */ control_expectln("SENDDONE"); iov.iov_base = buf; iov.iov_len = sizeof(buf); msg.msg_iov = &iov; msg.msg_iovlen = 1; - for (int i = 0; i < MESSAGES_CNT; i++) { - if (recvmsg(fd, &msg, 0) != 1) { - perror("message bound violated"); - exit(EXIT_FAILURE); - } + curr_hash = 0; + + while (1) { + ssize_t recv_size; - if ((i == MSG_EOR_IDX) ^ !!(msg.msg_flags & MSG_EOR)) { - perror("MSG_EOR"); + recv_size = recvmsg(fd, &msg, 0); + + if (!recv_size) + break; + + if (recv_size < 0) { + perror("recvmsg"); exit(EXIT_FAILURE); } + + if (msg.msg_flags & MSG_EOR) + curr_hash++; + + curr_hash += hash_djb2(msg.msg_iov[0].iov_base, recv_size); } close(fd); + remote_hash = control_readulong(); + + if (curr_hash != remote_hash) { + fprintf(stderr, "Message bounds broken\n"); + exit(EXIT_FAILURE); + } } #define MESSAGE_TRUNC_SZ 32 @@ -427,7 +524,7 @@ static void test_seqpacket_timeout_client(const struct test_opts *opts) tv.tv_usec = 0; if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, (void *)&tv, sizeof(tv)) == -1) { - perror("setsockopt 'SO_RCVTIMEO'"); + perror("setsockopt(SO_RCVTIMEO)"); exit(EXIT_FAILURE); } @@ -472,6 +569,70 @@ static void test_seqpacket_timeout_server(const struct test_opts *opts) close(fd); } +static void test_seqpacket_bigmsg_client(const struct test_opts *opts) +{ + unsigned long sock_buf_size; + ssize_t send_size; + socklen_t len; + void *data; + int fd; + + len = sizeof(sock_buf_size); + + fd = vsock_seqpacket_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE, + &sock_buf_size, &len)) { + perror("getsockopt"); + exit(EXIT_FAILURE); + } + + sock_buf_size++; + + data = malloc(sock_buf_size); + if (!data) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + send_size = send(fd, data, sock_buf_size, 0); + if (send_size != -1) { + fprintf(stderr, "expected 'send(2)' failure, got %zi\n", + send_size); + exit(EXIT_FAILURE); + } + + if (errno != EMSGSIZE) { + fprintf(stderr, "expected EMSGSIZE in 'errno', got %i\n", + errno); + exit(EXIT_FAILURE); + } + + control_writeln("CLISENT"); + + free(data); + close(fd); +} + +static void test_seqpacket_bigmsg_server(const struct test_opts *opts) +{ + int fd; + + fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + control_expectln("CLISENT"); + + close(fd); +} + #define BUF_PATTERN_1 'a' #define BUF_PATTERN_2 'b' @@ -644,7 +805,7 @@ static void test_stream_poll_rcvlowat_client(const struct test_opts *opts) if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &lowat_val, sizeof(lowat_val))) { - perror("setsockopt"); + perror("setsockopt(SO_RCVLOWAT)"); exit(EXIT_FAILURE); } @@ -754,6 +915,11 @@ static struct test_case test_cases[] = { .run_client = test_stream_poll_rcvlowat_client, .run_server = test_stream_poll_rcvlowat_server, }, + { + .name = "SOCK_SEQPACKET big message", + .run_client = test_seqpacket_bigmsg_client, + .run_server = test_seqpacket_bigmsg_server, + }, {}, }; @@ -837,6 +1003,7 @@ int main(int argc, char **argv) .peer_cid = VMADDR_CID_ANY, }; + srand(time(NULL)); init_signals(); for (;;) { |