author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-08-11 13:45:37 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-08-11 13:45:37 -0700 |
commit | 7ebfc85e2cd7b08f518b526173e9a33b56b3913b (patch) | |
tree | ecace4ab3d39c70e0832882e1b27af65094e3294 | |
parent | e091ba5cf82714c8691d978781696cd1fc2dec70 (diff) | |
parent | c2e75634cbe368065f140dd30bf8b1a0355158fd (diff) | |
Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, bpf, can and netfilter.
A little larger than usual but it's all fixes, no late features. It's
large partially because of timing, and partially because of follow ups
to stuff that got merged a week or so before the merge window and
wasn't as widely tested. Maybe the Bluetooth fixes are a little
alarming so we'll address that, but the rest seems okay and not scary.
Notably we're including a fix for the netfilter Kconfig [1], your WiFi
warning [2] and a bluetooth fix which should unblock syzbot [3].
Current release - regressions:
- Bluetooth:
- don't try to cancel uninitialized works [3]
- L2CAP: fix use-after-free caused by l2cap_chan_put
- tls: rx: fix device offload after recent rework
- devlink: fix UAF on failed reload and leftover locks in mlxsw
Current release - new code bugs:
- netfilter:
- flowtable: fix incorrect Kconfig dependencies [1]
- nf_tables: fix crash when nf_trace is enabled
- bpf:
- use proper target btf when exporting attach_btf_obj_id
- arm64: fixes for bpf trampoline support
- Bluetooth:
- ISO: unlock on error path in iso_sock_setsockopt()
- ISO: fix info leak in iso_sock_getsockopt()
- ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
- ISO: fix memory corruption on iso_pinfo.base
- ISO: fix not using the correct QoS
- hci_conn: fix updating ISO QoS PHY
- phy: dp83867: fix get nvmem cell fail
Previous releases - regressions:
- wifi: cfg80211: fix validating BSS pointers in
__cfg80211_connect_result [2]
- atm: bring back zatm uAPI after ATM had been removed
- properly fix an old bug that kept bonding ARP monitor mode from working
with software devices with lockless Tx (see the sketch after the quoted
message)
- tap: fix null-deref on skb->dev in dev_parse_header_protocol
- revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
devices and breaks others
- netfilter:
- nf_tables: many fixes rejecting cross-object linking which may
lead to UAFs
- nf_tables: fix null deref due to zeroed list head
- nf_tables: validate variable length element extension
- bgmac: fix a BUG triggered by wrong bytes_compl
- bcmgenet: indicate MAC is in charge of PHY PM
Previous releases - always broken:
- bpf:
- fix bad pointer deref in bpf_sys_bpf() injected via test infra
- disallow non-builtin bpf programs calling the prog_run command
- don't reinit map value in prealloc_lru_pop
- fix UAFs during the read of map iterator fd
- fix invalidity check for values in sk local storage map
- reject sleepable program for non-resched map iterator
- mptcp:
- move subflow cleanup in mptcp_destroy_common()
- do not queue data on closed subflows
- virtio_net: fix memory leak inside XDP_TX with mergeable
- vsock: fix memory leak when multiple threads try to connect()
- rework sk_user_data sharing to prevent psock leaks
- geneve: fix TOS inheriting for ipv4
- tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
- phy: c45 baset1: do not skip aneg configuration if clock role is
not specified
- rose: avoid overflow when /proc displays timer information
- x25: fix call timeouts in blocking connects
- can: mcp251x: fix race condition on receive interrupt
- can: j1939:
- replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
- fix memory leak of skbs in j1939_session_destroy()
Misc:
- docs: bpf: clarify that many things are not uAPI
- seg6: initialize induction variable to first valid array index (to
silence clang vs objtool warning)
- can: ems_usb: fix clang 14's -Wunaligned-access warning"
* tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
net: atm: bring back zatm uAPI
dpaa2-eth: trace the allocated address instead of page struct
net: add missing kdoc for struct genl_multicast_group::flags
nfp: fix use-after-free in area_cache_get()
MAINTAINERS: use my korg address for mt7601u
mlxsw: minimal: Fix deadlock in ports creation
bonding: fix reference count leak in balance-alb mode
net: usb: qmi_wwan: Add support for Cinterion MV32
bpf: Shut up kern_sys_bpf warning.
net/tls: Use RCU API to access tls_ctx->netdev
tls: rx: device: don't try to copy too much on detach
tls: rx: device: bound the frag walk
net_sched: cls_route: remove from list when handle is 0
selftests: forwarding: Fix failing tests with old libnet
net: refactor bpf_sk_reuseport_detach()
net: fix refcount bug in sk_psock_get (2)
selftests/bpf: Ensure sleepable program is rejected by hash map iter
selftests/bpf: Add write tests for sk local storage map iterator
selftests/bpf: Add tests for reading a dangling map iter fd
bpf: Only allow sleepable program for resched-able iterator
...
116 files changed, 1865 insertions, 646 deletions
diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst index 437de2a7a5de..a210b8a4df00 100644 --- a/Documentation/bpf/bpf_design_QA.rst +++ b/Documentation/bpf/bpf_design_QA.rst @@ -214,6 +214,12 @@ A: NO. Tracepoints are tied to internal implementation details hence they are subject to change and can break with newer kernels. BPF programs need to change accordingly when this happens. +Q: Are places where kprobes can attach part of the stable ABI? +-------------------------------------------------------------- +A: NO. The places to which kprobes can attach are internal implementation +details, which means that they are subject to change and can break with +newer kernels. BPF programs need to change accordingly when this happens. + Q: How much stack space a BPF program uses? ------------------------------------------- A: Currently all program types are limited to 512 bytes of stack @@ -273,3 +279,22 @@ cc (congestion-control) implementations. If any of these kernel functions has changed, both the in-tree and out-of-tree kernel tcp cc implementations have to be changed. The same goes for the bpf programs and they have to be adjusted accordingly. + +Q: Attaching to arbitrary kernel functions is an ABI? +----------------------------------------------------- +Q: BPF programs can be attached to many kernel functions. Do these +kernel functions become part of the ABI? + +A: NO. + +The kernel function prototypes will change, and BPF programs attaching to +them will need to change. The BPF compile-once-run-everywhere (CO-RE) +should be used in order to make it easier to adapt your BPF programs to +different versions of the kernel. + +Q: Marking a function with BTF_ID makes that function an ABI? +------------------------------------------------------------- +A: NO. + +The BTF_ID macro does not cause a function to become part of the ABI +any more than does the EXPORT_SYMBOL_GPL macro. diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst index 53a18ff7cf23..7823a069a903 100644 --- a/Documentation/networking/bonding.rst +++ b/Documentation/networking/bonding.rst @@ -1982,15 +1982,6 @@ uses the response as an indication that the link is operating. This gives some assurance that traffic is actually flowing to and from one or more peers on the local network. -The ARP monitor relies on the device driver itself to verify -that traffic is flowing. In particular, the driver must keep up to -date the last receive time, dev->last_rx. Drivers that use NETIF_F_LLTX -flag must also update netdev_queue->trans_start. If they do not, then the -ARP monitor will immediately fail any slaves using that driver, and -those slaves will stay down. If networking monitoring (tcpdump, etc) -shows the ARP requests and replies on the network, then it may be that -your device driver is not updating last_rx and trans_start. 
- 7.2 Configuring Multiple ARP Targets ------------------------------------ diff --git a/MAINTAINERS b/MAINTAINERS index 9b6c0f0bda14..f2d64020399b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9694,7 +9694,7 @@ F: arch/powerpc/platforms/powernv/copy-paste.h F: arch/powerpc/platforms/powernv/vas* IBM Power Virtual Ethernet Device Driver -M: Cristobal Forno <cforno12@linux.ibm.com> +M: Nick Child <nnac123@linux.ibm.com> L: netdev@vger.kernel.org S: Supported F: drivers/net/ethernet/ibm/ibmveth.* @@ -12843,7 +12843,7 @@ F: Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml F: drivers/net/wireless/mediatek/mt76/ MEDIATEK MT7601U WIRELESS LAN DRIVER -M: Jakub Kicinski <kubakici@wp.pl> +M: Jakub Kicinski <kuba@kernel.org> L: linux-wireless@vger.kernel.org S: Maintained F: drivers/net/wireless/mediatek/mt7601u/ diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 7ca8779ae34f..389623ae5a91 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1496,7 +1496,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) memset(&ctx, 0, sizeof(ctx)); ctx.prog = prog; - ctx.offset = kcalloc(prog->len + 1, sizeof(int), GFP_KERNEL); + ctx.offset = kvcalloc(prog->len + 1, sizeof(int), GFP_KERNEL); if (ctx.offset == NULL) { prog = orig_prog; goto out_off; @@ -1601,7 +1601,7 @@ skip_init_ctx: ctx.offset[i] *= AARCH64_INSN_SIZE; bpf_prog_fill_jited_linfo(prog, ctx.offset + 1); out_off: - kfree(ctx.offset); + kvfree(ctx.offset); kfree(jit_data); prog->aux->jit_data = NULL; } @@ -1643,7 +1643,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, int args_off, int retval_off, int run_ctx_off, bool save_ret) { - u32 *branch; + __le32 *branch; u64 enter_prog; u64 exit_prog; struct bpf_prog *p = l->link.prog; @@ -1698,7 +1698,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, if (ctx->image) { int offset = &ctx->image[ctx->idx] - branch; - *branch = A64_CBZ(1, A64_R(0), offset); + *branch = cpu_to_le32(A64_CBZ(1, A64_R(0), offset)); } /* arg1: prog */ @@ -1713,7 +1713,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, int args_off, int retval_off, int run_ctx_off, - u32 **branches) + __le32 **branches) { int i; @@ -1784,7 +1784,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT]; struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN]; bool save_ret; - u32 **branches = NULL; + __le32 **branches = NULL; /* trampoline stack layout: * [ parent ip ] @@ -1892,7 +1892,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, flags & BPF_TRAMP_F_RET_FENTRY_RET); if (fmod_ret->nr_links) { - branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), + branches = kcalloc(fmod_ret->nr_links, sizeof(__le32 *), GFP_KERNEL); if (!branches) return -ENOMEM; @@ -1916,7 +1916,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, /* update the branches saved in invoke_bpf_mod_ret with cbnz */ for (i = 0; i < fmod_ret->nr_links && ctx->image != NULL; i++) { int offset = &ctx->image[ctx->idx] - branches[i]; - *branches[i] = A64_CBNZ(1, A64_R(10), offset); + *branches[i] = cpu_to_le32(A64_CBNZ(1, A64_R(10), offset)); } for (i = 0; i < fexit->nr_links; i++) diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c index 81ce81a75fc6..681cb3786794 
100644 --- a/drivers/atm/idt77252.c +++ b/drivers/atm/idt77252.c @@ -3752,6 +3752,7 @@ static void __exit idt77252_exit(void) card = idt77252_chain; dev = card->atmdev; idt77252_chain = card->next; + del_timer_sync(&card->tst_timer); if (dev->phy->stop) dev->phy->stop(dev); diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c index 007d43e46dcb..b9dbad3a8af8 100644 --- a/drivers/net/bonding/bond_alb.c +++ b/drivers/net/bonding/bond_alb.c @@ -653,6 +653,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond) { struct slave *tx_slave = NULL; + struct net_device *dev; struct arp_pkt *arp; if (!pskb_network_may_pull(skb, sizeof(*arp))) @@ -665,6 +666,15 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond) if (!bond_slave_has_mac_rx(bond, arp->mac_src)) return NULL; + dev = ip_dev_find(dev_net(bond->dev), arp->ip_src); + if (dev) { + if (netif_is_bridge_master(dev)) { + dev_put(dev); + return NULL; + } + dev_put(dev); + } + if (arp->op_code == htons(ARPOP_REPLY)) { /* the arp must be sent on the selected rx channel */ tx_slave = rlb_choose_channel(skb, bond, arp); diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index e75acb14d066..50e60843020c 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2001,6 +2001,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, for (i = 0; i < BOND_MAX_ARP_TARGETS; i++) new_slave->target_last_arp_rx[i] = new_slave->last_rx; + new_slave->last_tx = new_slave->last_rx; + if (bond->params.miimon && !bond->params.use_carrier) { link_reporting = bond_check_dev_link(bond, slave_dev, 1); @@ -2884,8 +2886,11 @@ static void bond_arp_send(struct slave *slave, int arp_op, __be32 dest_ip, return; } - if (bond_handle_vlan(slave, tags, skb)) + if (bond_handle_vlan(slave, tags, skb)) { + slave_update_last_tx(slave); arp_xmit(skb); + } + return; } @@ -3074,8 +3079,7 @@ static int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond, curr_active_slave->last_link_up)) bond_validate_arp(bond, slave, tip, sip); else if (curr_arp_slave && (arp->ar_op == htons(ARPOP_REPLY)) && - bond_time_in_interval(bond, - dev_trans_start(curr_arp_slave->dev), 1)) + bond_time_in_interval(bond, slave_last_tx(curr_arp_slave), 1)) bond_validate_arp(bond, slave, sip, tip); out_unlock: @@ -3103,8 +3107,10 @@ static void bond_ns_send(struct slave *slave, const struct in6_addr *daddr, } addrconf_addr_solict_mult(daddr, &mcaddr); - if (bond_handle_vlan(slave, tags, skb)) + if (bond_handle_vlan(slave, tags, skb)) { + slave_update_last_tx(slave); ndisc_send_skb(skb, &mcaddr, saddr); + } } static void bond_ns_send_all(struct bonding *bond, struct slave *slave) @@ -3246,8 +3252,7 @@ static int bond_na_rcv(const struct sk_buff *skb, struct bonding *bond, curr_active_slave->last_link_up)) bond_validate_ns(bond, slave, saddr, daddr); else if (curr_arp_slave && - bond_time_in_interval(bond, - dev_trans_start(curr_arp_slave->dev), 1)) + bond_time_in_interval(bond, slave_last_tx(curr_arp_slave), 1)) bond_validate_ns(bond, slave, saddr, daddr); out: @@ -3335,12 +3340,12 @@ static void bond_loadbalance_arp_mon(struct bonding *bond) * so it can wait */ bond_for_each_slave_rcu(bond, slave, iter) { - unsigned long trans_start = dev_trans_start(slave->dev); + unsigned long last_tx = slave_last_tx(slave); bond_propose_link_state(slave, BOND_LINK_NOCHANGE); if (slave->link != BOND_LINK_UP) 
{ - if (bond_time_in_interval(bond, trans_start, 1) && + if (bond_time_in_interval(bond, last_tx, 1) && bond_time_in_interval(bond, slave->last_rx, 1)) { bond_propose_link_state(slave, BOND_LINK_UP); @@ -3365,7 +3370,7 @@ static void bond_loadbalance_arp_mon(struct bonding *bond) * when the source ip is 0, so don't take the link down * if we don't know our ip yet */ - if (!bond_time_in_interval(bond, trans_start, bond->params.missed_max) || + if (!bond_time_in_interval(bond, last_tx, bond->params.missed_max) || !bond_time_in_interval(bond, slave->last_rx, bond->params.missed_max)) { bond_propose_link_state(slave, BOND_LINK_DOWN); @@ -3431,7 +3436,7 @@ re_arm: */ static int bond_ab_arp_inspect(struct bonding *bond) { - unsigned long trans_start, last_rx; + unsigned long last_tx, last_rx; struct list_head *iter; struct slave *slave; int commit = 0; @@ -3482,9 +3487,9 @@ static int bond_ab_arp_inspect(struct bonding *bond) * - (more than missed_max*delta since receive AND * the bond has an IP address) */ - trans_start = dev_trans_start(slave->dev); + last_tx = slave_last_tx(slave); if (bond_is_active_slave(slave) && - (!bond_time_in_interval(bond, trans_start, bond->params.missed_max) || + (!bond_time_in_interval(bond, last_tx, bond->params.missed_max) || !bond_time_in_interval(bond, last_rx, bond->params.missed_max))) { bond_propose_link_state(slave, BOND_LINK_DOWN); commit++; @@ -3501,8 +3506,8 @@ static int bond_ab_arp_inspect(struct bonding *bond) */ static void bond_ab_arp_commit(struct bonding *bond) { - unsigned long trans_start; struct list_head *iter; + unsigned long last_tx; struct slave *slave; bond_for_each_slave(bond, slave, iter) { @@ -3511,10 +3516,10 @@ static void bond_ab_arp_commit(struct bonding *bond) continue; case BOND_LINK_UP: - trans_start = dev_trans_start(slave->dev); + last_tx = slave_last_tx(slave); if (rtnl_dereference(bond->curr_active_slave) != slave || (!rtnl_dereference(bond->curr_active_slave) && - bond_time_in_interval(bond, trans_start, 1))) { + bond_time_in_interval(bond, last_tx, 1))) { struct slave *current_arp_slave; current_arp_slave = rtnl_dereference(bond->current_arp_slave); @@ -5333,8 +5338,14 @@ static struct net_device *bond_sk_get_lower_dev(struct net_device *dev, static netdev_tx_t bond_tls_device_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *dev) { - if (likely(bond_get_slave_by_dev(bond, tls_get_ctx(skb->sk)->netdev))) - return bond_dev_queue_xmit(bond, skb, tls_get_ctx(skb->sk)->netdev); + struct net_device *tls_netdev = rcu_dereference(tls_get_ctx(skb->sk)->netdev); + + /* tls_netdev might become NULL, even if tls_is_sk_tx_device_offloaded + * was true, if tls_device_down is running in parallel, but it's OK, + * because bond_get_slave_by_dev has a NULL check. 
+ */ + if (likely(bond_get_slave_by_dev(bond, tls_netdev))) + return bond_dev_queue_xmit(bond, skb, tls_netdev); return bond_tx_drop(dev, skb); } #endif diff --git a/drivers/net/can/spi/mcp251x.c b/drivers/net/can/spi/mcp251x.c index e750d13c8841..c320de474f40 100644 --- a/drivers/net/can/spi/mcp251x.c +++ b/drivers/net/can/spi/mcp251x.c @@ -1070,9 +1070,6 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) mcp251x_read_2regs(spi, CANINTF, &intf, &eflag); - /* mask out flags we don't care about */ - intf &= CANINTF_RX | CANINTF_TX | CANINTF_ERR; - /* receive buffer 0 */ if (intf & CANINTF_RX0IF) { mcp251x_hw_rx(spi, 0); @@ -1082,6 +1079,18 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) if (mcp251x_is_2510(spi)) mcp251x_write_bits(spi, CANINTF, CANINTF_RX0IF, 0x00); + + /* check if buffer 1 is already known to be full, no need to re-read */ + if (!(intf & CANINTF_RX1IF)) { + u8 intf1, eflag1; + + /* intf needs to be read again to avoid a race condition */ + mcp251x_read_2regs(spi, CANINTF, &intf1, &eflag1); + + /* combine flags from both operations for error handling */ + intf |= intf1; + eflag |= eflag1; + } } /* receive buffer 1 */ @@ -1092,6 +1101,9 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) clear_intf |= CANINTF_RX1IF; } + /* mask out flags we don't care about */ + intf &= CANINTF_RX | CANINTF_TX | CANINTF_ERR; + /* any error or tx interrupt we need to clear? */ if (intf & (CANINTF_ERR | CANINTF_TX)) clear_intf |= intf & (CANINTF_ERR | CANINTF_TX); diff --git a/drivers/net/can/usb/ems_usb.c b/drivers/net/can/usb/ems_usb.c index d1e1a459c045..d31191686a54 100644 --- a/drivers/net/can/usb/ems_usb.c +++ b/drivers/net/can/usb/ems_usb.c @@ -195,7 +195,7 @@ struct __packed ems_cpc_msg { __le32 ts_sec; /* timestamp in seconds */ __le32 ts_nsec; /* timestamp in nano seconds */ - union { + union __packed { u8 generic[64]; struct cpc_can_msg can_msg; struct cpc_can_params can_params; diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index 859196898a7d..aadb0bd7c24f 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -610,6 +610,9 @@ static int felix_change_tag_protocol(struct dsa_switch *ds, old_proto_ops = felix->tag_proto_ops; + if (proto_ops == old_proto_ops) + return 0; + err = proto_ops->setup(ds); if (err) goto setup_failed; diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c index 61ed317602e7..b4034b78c0ca 100644 --- a/drivers/net/dsa/ocelot/felix_vsc9959.c +++ b/drivers/net/dsa/ocelot/felix_vsc9959.c @@ -1137,6 +1137,7 @@ static void vsc9959_tas_min_gate_lengths(struct tc_taprio_qopt_offload *taprio, { struct tc_taprio_sched_entry *entry; u64 gate_len[OCELOT_NUM_TC]; + u8 gates_ever_opened = 0; int tc, i, n; /* Initialize arrays */ @@ -1164,16 +1165,28 @@ static void vsc9959_tas_min_gate_lengths(struct tc_taprio_qopt_offload *taprio, for (tc = 0; tc < OCELOT_NUM_TC; tc++) { if (entry->gate_mask & BIT(tc)) { gate_len[tc] += entry->interval; + gates_ever_opened |= BIT(tc); } else { /* Gate closes now, record a potential new * minimum and reinitialize length */ - if (min_gate_len[tc] > gate_len[tc]) + if (min_gate_len[tc] > gate_len[tc] && + gate_len[tc]) min_gate_len[tc] = gate_len[tc]; gate_len[tc] = 0; } } } + + /* min_gate_len[tc] actually tracks minimum *open* gate time, so for + * permanently closed gates, min_gate_len[tc] will still be U64_MAX. + * Therefore they are currently indistinguishable from permanently + * open gates. 
Overwrite the gate len with 0 when we know they're + * actually permanently closed, i.e. after the loop above. + */ + for (tc = 0; tc < OCELOT_NUM_TC; tc++) + if (!(gates_ever_opened & BIT(tc))) + min_gate_len[tc] = 0; } /* Update QSYS_PORT_MAX_SDU to make sure the static guard bands added by the diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c index e11cc29d3264..06508eebb585 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c @@ -265,12 +265,10 @@ static void aq_nic_service_timer_cb(struct timer_list *t) static void aq_nic_polling_timer_cb(struct timer_list *t) { struct aq_nic_s *self = from_timer(self, t, polling_timer); - struct aq_vec_s *aq_vec = NULL; unsigned int i = 0U; - for (i = 0U, aq_vec = self->aq_vec[0]; - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) - aq_vec_isr(i, (void *)aq_vec); + for (i = 0U; self->aq_vecs > i; ++i) + aq_vec_isr(i, (void *)self->aq_vec[i]); mod_timer(&self->polling_timer, jiffies + AQ_CFG_POLLING_TIMER_INTERVAL); @@ -1014,7 +1012,6 @@ int aq_nic_get_regs_count(struct aq_nic_s *self) u64 *aq_nic_get_stats(struct aq_nic_s *self, u64 *data) { - struct aq_vec_s *aq_vec = NULL; struct aq_stats_s *stats; unsigned int count = 0U; unsigned int i = 0U; @@ -1064,11 +1061,11 @@ u64 *aq_nic_get_stats(struct aq_nic_s *self, u64 *data) data += i; for (tc = 0U; tc < self->aq_nic_cfg.tcs; tc++) { - for (i = 0U, aq_vec = self->aq_vec[0]; - aq_vec && self->aq_vecs > i; - ++i, aq_vec = self->aq_vec[i]) { + for (i = 0U; self->aq_vecs > i; ++i) { + if (!self->aq_vec[i]) + break; data += count; - count = aq_vec_get_sw_stats(aq_vec, tc, data); + count = aq_vec_get_sw_stats(self->aq_vec[i], tc, data); } } @@ -1382,7 +1379,6 @@ int aq_nic_set_loopback(struct aq_nic_s *self) int aq_nic_stop(struct aq_nic_s *self) { - struct aq_vec_s *aq_vec = NULL; unsigned int i = 0U; netif_tx_disable(self->ndev); @@ -1400,9 +1396,8 @@ int aq_nic_stop(struct aq_nic_s *self) aq_ptp_irq_free(self); - for (i = 0U, aq_vec = self->aq_vec[0]; - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) - aq_vec_stop(aq_vec); + for (i = 0U; self->aq_vecs > i; ++i) + aq_vec_stop(self->aq_vec[i]); aq_ptp_ring_stop(self); diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index 2dfc1e32bbb3..93580484a3f4 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -189,8 +189,8 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac, } slot->skb = skb; - ring->end += nr_frags + 1; netdev_sent_queue(net_dev, skb->len); + ring->end += nr_frags + 1; wmb(); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c index 14df8cfc2946..059f96f7a96f 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c @@ -21,7 +21,6 @@ #include "bnxt_ptp.h" #include "bnxt_coredump.h" #include "bnxt_nvm_defs.h" -#include "bnxt_ethtool.h" static void __bnxt_fw_recover(struct bnxt *bp) { diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c index c888ddee1fc4..7ded559842e8 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmmii.c +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c @@ -393,6 +393,9 @@ int bcmgenet_mii_probe(struct net_device *dev) if (priv->internal_phy && !GENET_IS_V5(priv)) dev->phydev->irq = PHY_MAC_INTERRUPT; + /* Indicate that the MAC is 
responsible for PHY PM */ + dev->phydev->mac_managed_pm = true; + return 0; } diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c index bfee0e4e54b1..da9973b711f4 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c @@ -1932,6 +1932,7 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev) int data_len, qidx, ret = 0, mss; struct tls_record_info *record; struct chcr_ktls_info *tx_info; + struct net_device *tls_netdev; struct tls_context *tls_ctx; struct sge_eth_txq *q; struct adapter *adap; @@ -1945,7 +1946,12 @@ static int chcr_ktls_xmit(struct sk_buff *skb, struct net_device *dev) mss = skb_is_gso(skb) ? skb_shinfo(skb)->gso_size : data_len; tls_ctx = tls_get_ctx(skb->sk); - if (unlikely(tls_ctx->netdev != dev)) + tls_netdev = rcu_dereference_bh(tls_ctx->netdev); + /* Don't quit on NULL: if tls_device_down is running in parallel, + * netdev might become NULL, even if tls_is_sk_tx_device_offloaded was + * true. Rather continue processing this packet. + */ + if (unlikely(tls_netdev && tls_netdev != dev)) goto out; tx_ctx = chcr_get_ktls_tx_context(tls_ctx); diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c index cb069a0af7b9..a5f7152a1716 100644 --- a/drivers/net/ethernet/engleder/tsnep_main.c +++ b/drivers/net/ethernet/engleder/tsnep_main.c @@ -340,14 +340,14 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count) return 0; } -static void tsnep_tx_unmap(struct tsnep_tx *tx, int count) +static void tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count) { struct device *dmadev = tx->adapter->dmadev; struct tsnep_tx_entry *entry; int i; for (i = 0; i < count; i++) { - entry = &tx->entry[(tx->read + i) % TSNEP_RING_SIZE]; + entry = &tx->entry[(index + i) % TSNEP_RING_SIZE]; if (entry->len) { if (i == 0) @@ -395,7 +395,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb, retval = tsnep_tx_map(skb, tx, count); if (retval != 0) { - tsnep_tx_unmap(tx, count); + tsnep_tx_unmap(tx, tx->write, count); dev_kfree_skb_any(entry->skb); entry->skb = NULL; @@ -464,7 +464,7 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget) if (skb_shinfo(entry->skb)->nr_frags > 0) count += skb_shinfo(entry->skb)->nr_frags; - tsnep_tx_unmap(tx, count); + tsnep_tx_unmap(tx, tx->read, count); if ((skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) && (__le32_to_cpu(entry->desc_wb->properties) & @@ -1282,7 +1282,7 @@ MODULE_DEVICE_TABLE(of, tsnep_of_match); static struct platform_driver tsnep_driver = { .driver = { .name = TSNEP, - .of_match_table = of_match_ptr(tsnep_of_match), + .of_match_table = tsnep_of_match, }, .probe = tsnep_probe, .remove = tsnep_remove, diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index cd9ec80522e7..75d51572693d 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -1660,8 +1660,8 @@ static int dpaa2_eth_add_bufs(struct dpaa2_eth_priv *priv, buf_array[i] = addr; /* tracing point */ - trace_dpaa2_eth_buf_seed(priv->net_dev, - page, DPAA2_ETH_RX_BUF_RAW_SIZE, + trace_dpaa2_eth_buf_seed(priv->net_dev, page_address(page), + DPAA2_ETH_RX_BUF_RAW_SIZE, addr, priv->rx_buf_size, bpid); } diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c index 6809b8b4c556..7282a826d81e 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c @@ -2580,6 +2580,12 @@ static void __rvu_flr_handler(struct rvu *rvu, u16 pcifunc) rvu_blklf_teardown(rvu, pcifunc, BLKADDR_NPA); rvu_reset_lmt_map_tbl(rvu, pcifunc); rvu_detach_rsrcs(rvu, NULL, pcifunc); + /* In scenarios where PF/VF drivers detach NIXLF without freeing MCAM + * entries, check and free the MCAM entries explicitly to avoid leak. + * Since LF is detached use LF number as -1. + */ + rvu_npc_free_mcam_entries(rvu, pcifunc, -1); + mutex_unlock(&rvu->flr_lock); } diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c index 583ead4dd246..1e348fd0d930 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c @@ -1097,6 +1097,9 @@ static void npc_enadis_default_entries(struct rvu *rvu, u16 pcifunc, void rvu_npc_disable_default_entries(struct rvu *rvu, u16 pcifunc, int nixlf) { + if (nixlf < 0) + return; + npc_enadis_default_entries(rvu, pcifunc, nixlf, false); /* Delete multicast and promisc MCAM entries */ @@ -1136,6 +1139,9 @@ bool rvu_npc_enable_mcam_by_entry_index(struct rvu *rvu, int entry, int intf, bo void rvu_npc_enable_default_entries(struct rvu *rvu, u16 pcifunc, int nixlf) { + if (nixlf < 0) + return; + /* Enables only broadcast match entry. Promisc/Allmulti are enabled * in set_rx_mode mbox handler. */ @@ -1675,7 +1681,7 @@ static void npc_load_kpu_profile(struct rvu *rvu) * Firmware database method. * Default KPU profile. */ - if (!request_firmware(&fw, kpu_profile, rvu->dev)) { + if (!request_firmware_direct(&fw, kpu_profile, rvu->dev)) { dev_info(rvu->dev, "Loading KPU profile from firmware: %s\n", kpu_profile); rvu->kpu_fwdata = kzalloc(fw->size, GFP_KERNEL); @@ -1939,6 +1945,7 @@ static void rvu_npc_hw_init(struct rvu *rvu, int blkaddr) static void rvu_npc_setup_interfaces(struct rvu *rvu, int blkaddr) { + struct npc_mcam_kex *mkex = rvu->kpu.mkex; struct npc_mcam *mcam = &rvu->hw->mcam; struct rvu_hwinfo *hw = rvu->hw; u64 nibble_ena, rx_kex, tx_kex; @@ -1951,15 +1958,15 @@ static void rvu_npc_setup_interfaces(struct rvu *rvu, int blkaddr) mcam->counters.max--; mcam->rx_miss_act_cntr = mcam->counters.max; - rx_kex = npc_mkex_default.keyx_cfg[NIX_INTF_RX]; - tx_kex = npc_mkex_default.keyx_cfg[NIX_INTF_TX]; + rx_kex = mkex->keyx_cfg[NIX_INTF_RX]; + tx_kex = mkex->keyx_cfg[NIX_INTF_TX]; nibble_ena = FIELD_GET(NPC_PARSE_NIBBLE, rx_kex); nibble_ena = rvu_npc_get_tx_nibble_cfg(rvu, nibble_ena); if (nibble_ena) { tx_kex &= ~NPC_PARSE_NIBBLE; tx_kex |= FIELD_PREP(NPC_PARSE_NIBBLE, nibble_ena); - npc_mkex_default.keyx_cfg[NIX_INTF_TX] = tx_kex; + mkex->keyx_cfg[NIX_INTF_TX] = tx_kex; } /* Configure RX interfaces */ diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c index a400aa22da79..7c4e1acd0f77 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_fs.c @@ -467,7 +467,8 @@ do { \ NPC_SCAN_HDR(NPC_VLAN_TAG1, NPC_LID_LB, NPC_LT_LB_CTAG, 2, 2); NPC_SCAN_HDR(NPC_VLAN_TAG2, NPC_LID_LB, NPC_LT_LB_STAG_QINQ, 2, 2); NPC_SCAN_HDR(NPC_DMAC, NPC_LID_LA, la_ltype, la_start, 6); - NPC_SCAN_HDR(NPC_SMAC, NPC_LID_LA, la_ltype, la_start, 6); + /* SMAC follows the DMAC(which is 6 bytes) */ + NPC_SCAN_HDR(NPC_SMAC, 
NPC_LID_LA, la_ltype, la_start + 6, 6); /* PF_FUNC is 2 bytes at 0th byte of NPC_LT_LA_IH_NIX_ETHER */ NPC_SCAN_HDR(NPC_PF_FUNC, NPC_LID_LA, NPC_LT_LA_IH_NIX_ETHER, 0, 2); } diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c index fb8db5888d2f..d686c7b6252f 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c @@ -632,6 +632,12 @@ int otx2_txschq_config(struct otx2_nic *pfvf, int lvl) req->num_regs++; req->reg[1] = NIX_AF_TL3X_SCHEDULE(schq); req->regval[1] = dwrr_val; + if (lvl == hw->txschq_link_cfg_lvl) { + req->num_regs++; + req->reg[2] = NIX_AF_TL3_TL2X_LINKX_CFG(schq, hw->tx_link); + /* Enable this queue and backpressure */ + req->regval[2] = BIT_ULL(13) | BIT_ULL(12); + } } else if (lvl == NIX_TXSCH_LVL_TL2) { parent = hw->txschq_list[NIX_TXSCH_LVL_TL1][0]; req->reg[0] = NIX_AF_TL2X_PARENT(schq); @@ -641,11 +647,12 @@ int otx2_txschq_config(struct otx2_nic *pfvf, int lvl) req->reg[1] = NIX_AF_TL2X_SCHEDULE(schq); req->regval[1] = TXSCH_TL1_DFLT_RR_PRIO << 24 | dwrr_val; - req->num_regs++; - req->reg[2] = NIX_AF_TL3_TL2X_LINKX_CFG(schq, hw->tx_link); - /* Enable this queue and backpressure */ - req->regval[2] = BIT_ULL(13) | BIT_ULL(12); - + if (lvl == hw->txschq_link_cfg_lvl) { + req->num_regs++; + req->reg[2] = NIX_AF_TL3_TL2X_LINKX_CFG(schq, hw->tx_link); + /* Enable this queue and backpressure */ + req->regval[2] = BIT_ULL(13) | BIT_ULL(12); + } } else if (lvl == NIX_TXSCH_LVL_TL1) { /* Default config for TL1. * For VF this is always ignored. @@ -1591,6 +1598,8 @@ void mbox_handler_nix_txsch_alloc(struct otx2_nic *pf, for (schq = 0; schq < rsp->schq[lvl]; schq++) pf->hw.txschq_list[lvl][schq] = rsp->schq_list[lvl][schq]; + + pf->hw.txschq_link_cfg_lvl = rsp->link_cfg_lvl; } EXPORT_SYMBOL(mbox_handler_nix_txsch_alloc); diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h index e795f9ee76dd..b28029cc4316 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h @@ -195,6 +195,7 @@ struct otx2_hw { u16 sqb_size; /* NIX */ + u8 txschq_link_cfg_lvl; u16 txschq_list[NIX_TXSCH_LVL_CNT][MAX_TXSCHQ_PER_FUNC]; u16 matchall_ipolicer; u32 dwrr_mtu; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c index d87bbb0be7c8..e6f64d890fb3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -506,7 +506,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, int err; attr.ttl = tun_key->ttl; - attr.fl.fl6.flowlabel = ip6_make_flowinfo(RT_TOS(tun_key->tos), tun_key->label); + attr.fl.fl6.flowlabel = ip6_make_flowinfo(tun_key->tos, tun_key->label); attr.fl.fl6.daddr = tun_key->u.ipv6.dst; attr.fl.fl6.saddr = tun_key->u.ipv6.src; @@ -620,7 +620,7 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv, attr.ttl = tun_key->ttl; - attr.fl.fl6.flowlabel = ip6_make_flowinfo(RT_TOS(tun_key->tos), tun_key->label); + attr.fl.fl6.flowlabel = ip6_make_flowinfo(tun_key->tos, tun_key->label); attr.fl.fl6.daddr = tun_key->u.ipv6.dst; attr.fl.fl6.saddr = tun_key->u.ipv6.src; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c index 6b6c7044b64a..0aef69527226 100644 --- 
a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c @@ -808,6 +808,7 @@ bool mlx5e_ktls_handle_tx_skb(struct net_device *netdev, struct mlx5e_txqsq *sq, { struct mlx5e_ktls_offload_context_tx *priv_tx; struct mlx5e_sq_stats *stats = sq->stats; + struct net_device *tls_netdev; struct tls_context *tls_ctx; int datalen; u32 seq; @@ -819,7 +820,12 @@ bool mlx5e_ktls_handle_tx_skb(struct net_device *netdev, struct mlx5e_txqsq *sq, mlx5e_tx_mpwqe_ensure_complete(sq); tls_ctx = tls_get_ctx(skb->sk); - if (WARN_ON_ONCE(tls_ctx->netdev != netdev)) + tls_netdev = rcu_dereference_bh(tls_ctx->netdev); + /* Don't WARN on NULL: if tls_device_down is running in parallel, + * netdev might become NULL, even if tls_is_sk_tx_device_offloaded was + * true. Rather continue processing this packet. + */ + if (WARN_ON_ONCE(tls_netdev && tls_netdev != netdev)) goto err_out; priv_tx = mlx5e_get_ktls_tx_priv_ctx(tls_ctx); diff --git a/drivers/net/ethernet/mellanox/mlxsw/minimal.c b/drivers/net/ethernet/mellanox/mlxsw/minimal.c index d9bf584234a6..bb1cd4bae82e 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/minimal.c +++ b/drivers/net/ethernet/mellanox/mlxsw/minimal.c @@ -328,7 +328,6 @@ static void mlxsw_m_port_module_unmap(struct mlxsw_m *mlxsw_m, u8 module) static int mlxsw_m_ports_create(struct mlxsw_m *mlxsw_m) { unsigned int max_ports = mlxsw_core_max_ports(mlxsw_m->core); - struct devlink *devlink = priv_to_devlink(mlxsw_m->core); u8 last_module = max_ports; int i; int err; @@ -357,7 +356,6 @@ static int mlxsw_m_ports_create(struct mlxsw_m *mlxsw_m) } /* Create port objects for each valid entry */ - devl_lock(devlink); for (i = 0; i < mlxsw_m->max_ports; i++) { if (mlxsw_m->module_to_port[i] > 0) { err = mlxsw_m_port_create(mlxsw_m, @@ -367,7 +365,6 @@ static int mlxsw_m_ports_create(struct mlxsw_m *mlxsw_m) goto err_module_to_port_create; } } - devl_unlock(devlink); return 0; @@ -377,7 +374,6 @@ err_module_to_port_create: mlxsw_m_port_remove(mlxsw_m, mlxsw_m->module_to_port[i]); } - devl_unlock(devlink); i = max_ports; err_module_to_port_map: for (i--; i > 0; i--) @@ -390,10 +386,8 @@ err_module_to_port_alloc: static void mlxsw_m_ports_remove(struct mlxsw_m *mlxsw_m) { - struct devlink *devlink = priv_to_devlink(mlxsw_m->core); int i; - devl_lock(devlink); for (i = 0; i < mlxsw_m->max_ports; i++) { if (mlxsw_m->module_to_port[i] > 0) { mlxsw_m_port_remove(mlxsw_m, @@ -401,7 +395,6 @@ static void mlxsw_m_ports_remove(struct mlxsw_m *mlxsw_m) mlxsw_m_port_module_unmap(mlxsw_m, i); } } - devl_unlock(devlink); kfree(mlxsw_m->module_to_port); kfree(mlxsw_m->ports); diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c index c922dfab8080..eeb1455a4e5d 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c @@ -1395,6 +1395,8 @@ nfp_port_get_module_info(struct net_device *netdev, u8 data; port = nfp_port_from_netdev(netdev); + /* update port state to get latest interface */ + set_bit(NFP_PORT_CHANGED, &port->flags); eth_port = nfp_port_get_eth_port(port); if (!eth_port) return -EOPNOTSUPP; diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c index 34c0d2ddf9ef..a8286d0032d1 100644 --- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c +++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c @@ -874,7 +874,6 @@ 
area_cache_get(struct nfp_cpp *cpp, u32 id, } /* Adjust the start address to be cache size aligned */ - cache->id = id; cache->addr = addr & ~(u64)(cache->size - 1); /* Re-init to the new ID and address */ @@ -894,6 +893,8 @@ area_cache_get(struct nfp_cpp *cpp, u32 id, return NULL; } + cache->id = id; + exit: /* Adjust offset */ *offset = addr - cache->addr; diff --git a/drivers/net/ethernet/wangxun/Kconfig b/drivers/net/ethernet/wangxun/Kconfig index baa1f0a5cc37..b4a4fa0a58f8 100644 --- a/drivers/net/ethernet/wangxun/Kconfig +++ b/drivers/net/ethernet/wangxun/Kconfig @@ -7,12 +7,12 @@ config NET_VENDOR_WANGXUN bool "Wangxun devices" default y help - If you have a network (Ethernet) card belonging to this class, say Y. + If you have a network (Ethernet) card from Wangxun(R), say Y. Note that the answer to this question doesn't directly affect the kernel: saying N will just cause the configurator to skip all - the questions about Intel cards. If you say Y, you will be asked for - your specific card in the following questions. + the questions about Wangxun(R) cards. If you say Y, you will + be asked for your specific card in the following questions. if NET_VENDOR_WANGXUN diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 018d365f9deb..7962c37b3f14 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -797,7 +797,8 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb, struct geneve_sock *gs4, struct flowi4 *fl4, const struct ip_tunnel_info *info, - __be16 dport, __be16 sport) + __be16 dport, __be16 sport, + __u8 *full_tos) { bool use_cache = ip_tunnel_dst_cache_usable(skb, info); struct geneve_dev *geneve = netdev_priv(dev); @@ -823,6 +824,8 @@ static struct rtable *geneve_get_v4_rt(struct sk_buff *skb, use_cache = false; } fl4->flowi4_tos = RT_TOS(tos); + if (full_tos) + *full_tos = tos; dst_cache = (struct dst_cache *)&info->dst_cache; if (use_cache) { @@ -876,8 +879,7 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb, use_cache = false; } - fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio), - info->key.label); + fl6->flowlabel = ip6_make_flowinfo(prio, info->key.label); dst_cache = (struct dst_cache *)&info->dst_cache; if (use_cache) { dst = dst_cache_get_ip6(dst_cache, &fl6->saddr); @@ -911,6 +913,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, const struct ip_tunnel_key *key = &info->key; struct rtable *rt; struct flowi4 fl4; + __u8 full_tos; __u8 tos, ttl; __be16 df = 0; __be16 sport; @@ -921,7 +924,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true); rt = geneve_get_v4_rt(skb, dev, gs4, &fl4, info, - geneve->cfg.info.key.tp_dst, sport); + geneve->cfg.info.key.tp_dst, sport, &full_tos); if (IS_ERR(rt)) return PTR_ERR(rt); @@ -965,7 +968,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? 
htons(IP_DF) : 0; } else { - tos = ip_tunnel_ecn_encap(fl4.flowi4_tos, ip_hdr(skb), skb); + tos = ip_tunnel_ecn_encap(full_tos, ip_hdr(skb), skb); if (geneve->cfg.ttl_inherit) ttl = ip_tunnel_get_ttl(ip_hdr(skb), skb); else @@ -1149,7 +1152,7 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb) 1, USHRT_MAX, true); rt = geneve_get_v4_rt(skb, dev, gs4, &fl4, info, - geneve->cfg.info.key.tp_dst, sport); + geneve->cfg.info.key.tp_dst, sport, NULL); if (IS_ERR(rt)) return PTR_ERR(rt); diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index f1683ce6b561..ee6087e7b2bf 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -162,6 +162,19 @@ static struct macsec_rx_sa *macsec_rxsa_get(struct macsec_rx_sa __rcu *ptr) return sa; } +static struct macsec_rx_sa *macsec_active_rxsa_get(struct macsec_rx_sc *rx_sc) +{ + struct macsec_rx_sa *sa = NULL; + int an; + + for (an = 0; an < MACSEC_NUM_AN; an++) { + sa = macsec_rxsa_get(rx_sc->sa[an]); + if (sa) + break; + } + return sa; +} + static void free_rx_sc_rcu(struct rcu_head *head) { struct macsec_rx_sc *rx_sc = container_of(head, struct macsec_rx_sc, rcu_head); @@ -500,18 +513,28 @@ static void macsec_encrypt_finish(struct sk_buff *skb, struct net_device *dev) skb->protocol = eth_hdr(skb)->h_proto; } +static unsigned int macsec_msdu_len(struct sk_buff *skb) +{ + struct macsec_dev *macsec = macsec_priv(skb->dev); + struct macsec_secy *secy = &macsec->secy; + bool sci_present = macsec_skb_cb(skb)->has_sci; + + return skb->len - macsec_hdr_len(sci_present) - secy->icv_len; +} + static void macsec_count_tx(struct sk_buff *skb, struct macsec_tx_sc *tx_sc, struct macsec_tx_sa *tx_sa) { + unsigned int msdu_len = macsec_msdu_len(skb); struct pcpu_tx_sc_stats *txsc_stats = this_cpu_ptr(tx_sc->stats); u64_stats_update_begin(&txsc_stats->syncp); if (tx_sc->encrypt) { - txsc_stats->stats.OutOctetsEncrypted += skb->len; + txsc_stats->stats.OutOctetsEncrypted += msdu_len; txsc_stats->stats.OutPktsEncrypted++; this_cpu_inc(tx_sa->stats->OutPktsEncrypted); } else { - txsc_stats->stats.OutOctetsProtected += skb->len; + txsc_stats->stats.OutOctetsProtected += msdu_len; txsc_stats->stats.OutPktsProtected++; this_cpu_inc(tx_sa->stats->OutPktsProtected); } @@ -541,9 +564,10 @@ static void macsec_encrypt_done(struct crypto_async_request *base, int err) aead_request_free(macsec_skb_cb(skb)->req); rcu_read_lock_bh(); - macsec_encrypt_finish(skb, dev); macsec_count_tx(skb, &macsec->secy.tx_sc, macsec_skb_cb(skb)->tx_sa); - len = skb->len; + /* packet is encrypted/protected so tx_bytes must be calculated */ + len = macsec_msdu_len(skb) + 2 * ETH_ALEN; + macsec_encrypt_finish(skb, dev); ret = dev_queue_xmit(skb); count_tx(dev, ret, len); rcu_read_unlock_bh(); @@ -702,6 +726,7 @@ static struct sk_buff *macsec_encrypt(struct sk_buff *skb, macsec_skb_cb(skb)->req = req; macsec_skb_cb(skb)->tx_sa = tx_sa; + macsec_skb_cb(skb)->has_sci = sci_present; aead_request_set_callback(req, 0, macsec_encrypt_done, skb); dev_hold(skb->dev); @@ -743,15 +768,17 @@ static bool macsec_post_decrypt(struct sk_buff *skb, struct macsec_secy *secy, u u64_stats_update_begin(&rxsc_stats->syncp); rxsc_stats->stats.InPktsLate++; u64_stats_update_end(&rxsc_stats->syncp); + secy->netdev->stats.rx_dropped++; return false; } if (secy->validate_frames != MACSEC_VALIDATE_DISABLED) { + unsigned int msdu_len = macsec_msdu_len(skb); u64_stats_update_begin(&rxsc_stats->syncp); if (hdr->tci_an & MACSEC_TCI_E) - rxsc_stats->stats.InOctetsDecrypted += skb->len; + 
rxsc_stats->stats.InOctetsDecrypted += msdu_len; else - rxsc_stats->stats.InOctetsValidated += skb->len; + rxsc_stats->stats.InOctetsValidated += msdu_len; u64_stats_update_end(&rxsc_stats->syncp); } @@ -764,6 +791,8 @@ static bool macsec_post_decrypt(struct sk_buff *skb, struct macsec_secy *secy, u u64_stats_update_begin(&rxsc_stats->syncp); rxsc_stats->stats.InPktsNotValid++; u64_stats_update_end(&rxsc_stats->syncp); + this_cpu_inc(rx_sa->stats->InPktsNotValid); + secy->netdev->stats.rx_errors++; return false; } @@ -856,9 +885,9 @@ static void macsec_decrypt_done(struct crypto_async_request *base, int err) macsec_finalize_skb(skb, macsec->secy.icv_len, macsec_extra_len(macsec_skb_cb(skb)->has_sci)); + len = skb->len; macsec_reset_skb(skb, macsec->secy.netdev); - len = skb->len; if (gro_cells_receive(&macsec->gro_cells, skb) == NET_RX_SUCCESS) count_rx(dev, len); @@ -1049,6 +1078,7 @@ static enum rx_handler_result handle_not_macsec(struct sk_buff *skb) u64_stats_update_begin(&secy_stats->syncp); secy_stats->stats.InPktsNoTag++; u64_stats_update_end(&secy_stats->syncp); + macsec->secy.netdev->stats.rx_dropped++; continue; } @@ -1158,6 +1188,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) u64_stats_update_begin(&secy_stats->syncp); secy_stats->stats.InPktsBadTag++; u64_stats_update_end(&secy_stats->syncp); + secy->netdev->stats.rx_errors++; goto drop_nosa; } @@ -1168,11 +1199,15 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) /* If validateFrames is Strict or the C bit in the * SecTAG is set, discard */ + struct macsec_rx_sa *active_rx_sa = macsec_active_rxsa_get(rx_sc); if (hdr->tci_an & MACSEC_TCI_C || secy->validate_frames == MACSEC_VALIDATE_STRICT) { u64_stats_update_begin(&rxsc_stats->syncp); rxsc_stats->stats.InPktsNotUsingSA++; u64_stats_update_end(&rxsc_stats->syncp); + secy->netdev->stats.rx_errors++; + if (active_rx_sa) + this_cpu_inc(active_rx_sa->stats->InPktsNotUsingSA); goto drop_nosa; } @@ -1182,6 +1217,8 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) u64_stats_update_begin(&rxsc_stats->syncp); rxsc_stats->stats.InPktsUnusedSA++; u64_stats_update_end(&rxsc_stats->syncp); + if (active_rx_sa) + this_cpu_inc(active_rx_sa->stats->InPktsUnusedSA); goto deliver; } @@ -1202,6 +1239,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) u64_stats_update_begin(&rxsc_stats->syncp); rxsc_stats->stats.InPktsLate++; u64_stats_update_end(&rxsc_stats->syncp); + macsec->secy.netdev->stats.rx_dropped++; goto drop; } } @@ -1230,6 +1268,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) deliver: macsec_finalize_skb(skb, secy->icv_len, macsec_extra_len(macsec_skb_cb(skb)->has_sci)); + len = skb->len; macsec_reset_skb(skb, secy->netdev); if (rx_sa) @@ -1237,7 +1276,6 @@ deliver: macsec_rxsc_put(rx_sc); skb_orphan(skb); - len = skb->len; ret = gro_cells_receive(&macsec->gro_cells, skb); if (ret == NET_RX_SUCCESS) count_rx(dev, len); @@ -1279,6 +1317,7 @@ nosci: u64_stats_update_begin(&secy_stats->syncp); secy_stats->stats.InPktsNoSCI++; u64_stats_update_end(&secy_stats->syncp); + macsec->secy.netdev->stats.rx_errors++; continue; } @@ -3404,6 +3443,7 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, return NETDEV_TX_OK; } + len = skb->len; skb = macsec_encrypt(skb, dev); if (IS_ERR(skb)) { if (PTR_ERR(skb) != -EINPROGRESS) @@ -3414,7 +3454,6 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, macsec_count_tx(skb, &macsec->secy.tx_sc, 
macsec_skb_cb(skb)->tx_sa); macsec_encrypt_finish(skb, dev); - len = skb->len; ret = dev_queue_xmit(skb); count_tx(dev, ret, len); return ret; @@ -3662,6 +3701,7 @@ static void macsec_get_stats64(struct net_device *dev, s->rx_dropped = dev->stats.rx_dropped; s->tx_dropped = dev->stats.tx_dropped; + s->rx_errors = dev->stats.rx_errors; } static int macsec_get_iflink(const struct net_device *dev) diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c index 1e38039c5c56..6939563d3b7c 100644 --- a/drivers/net/phy/dp83867.c +++ b/drivers/net/phy/dp83867.c @@ -535,7 +535,7 @@ static int dp83867_of_init_io_impedance(struct phy_device *phydev) cell = of_nvmem_cell_get(of_node, "io_impedance_ctrl"); if (IS_ERR(cell)) { ret = PTR_ERR(cell); - if (ret != -ENOENT) + if (ret != -ENOENT && ret != -EOPNOTSUPP) return phydev_err_probe(phydev, ret, "failed to get nvmem cell io_impedance_ctrl\n"); diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c index 29b1df03f3e8..a87a4b3ffce4 100644 --- a/drivers/net/phy/phy-c45.c +++ b/drivers/net/phy/phy-c45.c @@ -190,44 +190,42 @@ EXPORT_SYMBOL_GPL(genphy_c45_pma_setup_forced); */ static int genphy_c45_baset1_an_config_aneg(struct phy_device *phydev) { + u16 adv_l_mask, adv_l = 0; + u16 adv_m_mask, adv_m = 0; int changed = 0; - u16 adv_l = 0; - u16 adv_m = 0; int ret; + adv_l_mask = MDIO_AN_T1_ADV_L_FORCE_MS | MDIO_AN_T1_ADV_L_PAUSE_CAP | + MDIO_AN_T1_ADV_L_PAUSE_ASYM; + adv_m_mask = MDIO_AN_T1_ADV_M_MST | MDIO_AN_T1_ADV_M_B10L; + switch (phydev->master_slave_set) { case MASTER_SLAVE_CFG_MASTER_FORCE: + adv_m |= MDIO_AN_T1_ADV_M_MST; + fallthrough; case MASTER_SLAVE_CFG_SLAVE_FORCE: adv_l |= MDIO_AN_T1_ADV_L_FORCE_MS; break; case MASTER_SLAVE_CFG_MASTER_PREFERRED: + adv_m |= MDIO_AN_T1_ADV_M_MST; + fallthrough; case MASTER_SLAVE_CFG_SLAVE_PREFERRED: break; case MASTER_SLAVE_CFG_UNKNOWN: case MASTER_SLAVE_CFG_UNSUPPORTED: - return 0; + /* if master/slave role is not specified, do not overwrite it */ + adv_l_mask &= ~MDIO_AN_T1_ADV_L_FORCE_MS; + adv_m_mask &= ~MDIO_AN_T1_ADV_M_MST; + break; default: phydev_warn(phydev, "Unsupported Master/Slave mode\n"); return -EOPNOTSUPP; } - switch (phydev->master_slave_set) { - case MASTER_SLAVE_CFG_MASTER_FORCE: - case MASTER_SLAVE_CFG_MASTER_PREFERRED: - adv_m |= MDIO_AN_T1_ADV_M_MST; - break; - case MASTER_SLAVE_CFG_SLAVE_FORCE: - case MASTER_SLAVE_CFG_SLAVE_PREFERRED: - break; - default: - break; - } - adv_l |= linkmode_adv_to_mii_t1_adv_l_t(phydev->advertising); ret = phy_modify_mmd_changed(phydev, MDIO_MMD_AN, MDIO_AN_T1_ADV_L, - (MDIO_AN_T1_ADV_L_FORCE_MS | MDIO_AN_T1_ADV_L_PAUSE_CAP - | MDIO_AN_T1_ADV_L_PAUSE_ASYM), adv_l); + adv_l_mask, adv_l); if (ret < 0) return ret; if (ret > 0) @@ -236,7 +234,7 @@ static int genphy_c45_baset1_an_config_aneg(struct phy_device *phydev) adv_m |= linkmode_adv_to_mii_t1_adv_m_t(phydev->advertising); ret = phy_modify_mmd_changed(phydev, MDIO_MMD_AN, MDIO_AN_T1_ADV_M, - MDIO_AN_T1_ADV_M_MST | MDIO_AN_T1_ADV_M_B10L, adv_m); + adv_m_mask, adv_m); if (ret < 0) return ret; if (ret > 0) diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index a74b320f5b27..0c6efd792690 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -316,6 +316,12 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev) phydev->suspended_by_mdio_bus = 0; + /* If we managed to get here with the PHY state machine in a state other + * than PHY_HALTED this is an indication that something went wrong and + * we should most likely be using MAC 
managed PM and we are not. + */ + WARN_ON(phydev->state != PHY_HALTED && !phydev->mac_managed_pm); + ret = phy_init_hw(phydev); if (ret < 0) return ret; diff --git a/drivers/net/plip/plip.c b/drivers/net/plip/plip.c index dafd3e9ebbf8..c8791e9b451d 100644 --- a/drivers/net/plip/plip.c +++ b/drivers/net/plip/plip.c @@ -1111,7 +1111,7 @@ plip_open(struct net_device *dev) /* Any address will do - we take the first. We already have the first two bytes filled with 0xfc, from plip_init_dev(). */ - const struct in_ifaddr *ifa = rcu_dereference(in_dev->ifa_list); + const struct in_ifaddr *ifa = rtnl_dereference(in_dev->ifa_list); if (ifa != NULL) { dev_addr_mod(dev, 2, &ifa->ifa_local, 4); } diff --git a/drivers/net/tap.c b/drivers/net/tap.c index c3d42062559d..9e75ed3f08ce 100644 --- a/drivers/net/tap.c +++ b/drivers/net/tap.c @@ -716,10 +716,20 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control, skb_reset_mac_header(skb); skb->protocol = eth_hdr(skb)->h_proto; + rcu_read_lock(); + tap = rcu_dereference(q->tap); + if (!tap) { + kfree_skb(skb); + rcu_read_unlock(); + return total_len; + } + skb->dev = tap->dev; + if (vnet_hdr_len) { err = virtio_net_hdr_to_skb(skb, &vnet_hdr, tap_is_little_endian(q)); if (err) { + rcu_read_unlock(); drop_reason = SKB_DROP_REASON_DEV_HDR; goto err_kfree; } @@ -732,8 +742,6 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control, __vlan_get_protocol(skb, skb->protocol, &depth) != 0) skb_set_network_header(skb, depth); - rcu_read_lock(); - tap = rcu_dereference(q->tap); /* copy skb_ubuf_info for callback when skb has no error */ if (zerocopy) { skb_zcopy_init(skb, msg_control); @@ -742,14 +750,8 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control, uarg->callback(NULL, uarg, false); } - if (tap) { - skb->dev = tap->dev; - dev_queue_xmit(skb); - } else { - kfree_skb(skb); - } + dev_queue_xmit(skb); rcu_read_unlock(); - return total_len; err_kfree: diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c index 0ad468a00064..aff39bf3161d 100644 --- a/drivers/net/usb/ax88179_178a.c +++ b/drivers/net/usb/ax88179_178a.c @@ -1680,7 +1680,7 @@ static const struct driver_info ax88179_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1693,7 +1693,7 @@ static const struct driver_info ax88178a_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1706,7 +1706,7 @@ static const struct driver_info cypress_GX3_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1719,7 +1719,7 @@ static const struct driver_info dlink_dub1312_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1732,7 +1732,7 @@ static const struct driver_info sitecom_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - 
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1745,7 +1745,7 @@ static const struct driver_info samsung_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1758,7 +1758,7 @@ static const struct driver_info lenovo_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1771,7 +1771,7 @@ static const struct driver_info belkin_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1784,7 +1784,7 @@ static const struct driver_info toshiba_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1797,7 +1797,7 @@ static const struct driver_info mct_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1810,7 +1810,7 @@ static const struct driver_info at_umc2000_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1823,7 +1823,7 @@ static const struct driver_info at_umc200_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; @@ -1836,7 +1836,7 @@ static const struct driver_info at_umc2000sp_info = { .link_reset = ax88179_link_reset, .reset = ax88179_reset, .stop = ax88179_stop, - .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP, + .flags = FLAG_ETHER | FLAG_FRAMING_AX, .rx_fixup = ax88179_rx_fixup, .tx_fixup = ax88179_tx_fixup, }; diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 571a399c195d..709e3c59e340 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1390,6 +1390,8 @@ static const struct usb_device_id products[] = { {QMI_QUIRK_SET_DTR(0x1e2d, 0x00b0, 4)}, /* Cinterion CLS8 */ {QMI_FIXED_INTF(0x1e2d, 0x00b7, 0)}, /* Cinterion MV31 RmNet */ {QMI_FIXED_INTF(0x1e2d, 0x00b9, 0)}, /* Cinterion MV31 RmNet based on new baseline */ + {QMI_FIXED_INTF(0x1e2d, 0x00f3, 0)}, /* Cinterion MV32-W-A RmNet */ + {QMI_FIXED_INTF(0x1e2d, 0x00f4, 0)}, /* Cinterion MV32-W-B RmNet */ {QMI_FIXED_INTF(0x413c, 0x81a2, 8)}, /* Dell Wireless 5806 Gobi(TM) 4G LTE Mobile Broadband Card */ {QMI_FIXED_INTF(0x413c, 0x81a3, 8)}, /* Dell Wireless 5570 HSPA+ (42Mbps) Mobile Broadband Card */ {QMI_FIXED_INTF(0x413c, 0x81a4, 8)}, /* Dell Wireless 5570e HSPA+ (42Mbps) Mobile Broadband Card */ 
diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 2cb833b3006a..466da01ba2e3 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -312,7 +312,6 @@ static bool veth_skb_is_eligible_for_gro(const struct net_device *dev, static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev) { struct veth_priv *rcv_priv, *priv = netdev_priv(dev); - struct netdev_queue *queue = NULL; struct veth_rq *rq = NULL; struct net_device *rcv; int length = skb->len; @@ -330,7 +329,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev) rxq = skb_get_queue_mapping(skb); if (rxq < rcv->real_num_rx_queues) { rq = &rcv_priv->rq[rxq]; - queue = netdev_get_tx_queue(dev, rxq); /* The napi pointer is available when an XDP program is * attached or when GRO is enabled @@ -342,8 +340,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev) skb_tx_timestamp(skb); if (likely(veth_forward_skb(rcv, skb, rq, use_napi) == NET_RX_SUCCESS)) { - if (queue) - txq_trans_cond_update(queue); if (!use_napi) dev_lstats_add(dev, length); } else { diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ec8e1b3108c3..3b3eebad3977 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -1057,8 +1057,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, case XDP_TX: stats->xdp_tx++; xdpf = xdp_convert_buff_to_frame(&xdp); - if (unlikely(!xdpf)) + if (unlikely(!xdpf)) { + if (unlikely(xdp_page != page)) + put_page(xdp_page); goto err_xdp; + } err = virtnet_xdp_xmit(dev, 1, &xdpf, 0); if (unlikely(!err)) { xdp_return_frame_rx_napi(xdpf); diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 90811ab851fd..c3285242f74f 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2321,7 +2321,7 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan, fl6.flowi6_oif = oif; fl6.daddr = *daddr; fl6.saddr = *saddr; - fl6.flowlabel = ip6_make_flowinfo(RT_TOS(tos), label); + fl6.flowlabel = ip6_make_flowinfo(tos, label); fl6.flowi6_mark = skb->mark; fl6.flowi6_proto = IPPROTO_UDP; fl6.fl6_dport = dport; diff --git a/drivers/net/wireless/microchip/wilc1000/hif.c b/drivers/net/wireless/microchip/wilc1000/hif.c index b89519ab9205..eb1d1ba3a443 100644 --- a/drivers/net/wireless/microchip/wilc1000/hif.c +++ b/drivers/net/wireless/microchip/wilc1000/hif.c @@ -635,7 +635,7 @@ static inline void host_int_parse_assoc_resp_info(struct wilc_vif *vif, conn_info->req_ies_len = 0; } -inline void wilc_handle_disconnect(struct wilc_vif *vif) +void wilc_handle_disconnect(struct wilc_vif *vif) { struct host_if_drv *hif_drv = vif->hif_drv; diff --git a/drivers/net/wireless/microchip/wilc1000/hif.h b/drivers/net/wireless/microchip/wilc1000/hif.h index 69ba1d469e9f..baa2881f4465 100644 --- a/drivers/net/wireless/microchip/wilc1000/hif.h +++ b/drivers/net/wireless/microchip/wilc1000/hif.h @@ -215,5 +215,6 @@ void wilc_gnrl_async_info_received(struct wilc *wilc, u8 *buffer, u32 length); void *wilc_parse_join_bss_param(struct cfg80211_bss *bss, struct cfg80211_crypto_settings *crypto); int wilc_set_default_mgmt_key_index(struct wilc_vif *vif, u8 index); -inline void wilc_handle_disconnect(struct wilc_vif *vif); +void wilc_handle_disconnect(struct wilc_vif *vif); + #endif diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index 35d4b398c197..8bd9fd51208c 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -763,6 +763,49 @@ 
static void qeth_issue_ipa_msg(struct qeth_ipa_cmd *cmd, int rc, ipa_name, com, CARD_DEVID(card)); } +static void qeth_default_link_info(struct qeth_card *card) +{ + struct qeth_link_info *link_info = &card->info.link_info; + + QETH_CARD_TEXT(card, 2, "dftlinfo"); + link_info->duplex = DUPLEX_FULL; + + if (IS_IQD(card) || IS_VM_NIC(card)) { + link_info->speed = SPEED_10000; + link_info->port = PORT_FIBRE; + link_info->link_mode = QETH_LINK_MODE_FIBRE_SHORT; + } else { + switch (card->info.link_type) { + case QETH_LINK_TYPE_FAST_ETH: + case QETH_LINK_TYPE_LANE_ETH100: + link_info->speed = SPEED_100; + link_info->port = PORT_TP; + break; + case QETH_LINK_TYPE_GBIT_ETH: + case QETH_LINK_TYPE_LANE_ETH1000: + link_info->speed = SPEED_1000; + link_info->port = PORT_FIBRE; + break; + case QETH_LINK_TYPE_10GBIT_ETH: + link_info->speed = SPEED_10000; + link_info->port = PORT_FIBRE; + break; + case QETH_LINK_TYPE_25GBIT_ETH: + link_info->speed = SPEED_25000; + link_info->port = PORT_FIBRE; + break; + default: + dev_info(&card->gdev->dev, + "Unknown link type %x\n", + card->info.link_type); + link_info->speed = SPEED_UNKNOWN; + link_info->port = PORT_OTHER; + } + + link_info->link_mode = QETH_LINK_MODE_UNKNOWN; + } +} + static struct qeth_ipa_cmd *qeth_check_ipa_data(struct qeth_card *card, struct qeth_ipa_cmd *cmd) { @@ -790,6 +833,7 @@ static struct qeth_ipa_cmd *qeth_check_ipa_data(struct qeth_card *card, netdev_name(card->dev), card->info.chpid); qeth_issue_ipa_msg(cmd, cmd->hdr.return_code, card); netif_carrier_off(card->dev); + qeth_default_link_info(card); } return NULL; case IPA_CMD_STARTLAN: @@ -4744,92 +4788,6 @@ out_free: return rc; } -static int qeth_query_card_info_cb(struct qeth_card *card, - struct qeth_reply *reply, unsigned long data) -{ - struct qeth_ipa_cmd *cmd = (struct qeth_ipa_cmd *)data; - struct qeth_link_info *link_info = reply->param; - struct qeth_query_card_info *card_info; - - QETH_CARD_TEXT(card, 2, "qcrdincb"); - if (qeth_setadpparms_inspect_rc(cmd)) - return -EIO; - - card_info = &cmd->data.setadapterparms.data.card_info; - netdev_dbg(card->dev, - "card info: card_type=0x%02x, port_mode=0x%04x, port_speed=0x%08x\n", - card_info->card_type, card_info->port_mode, - card_info->port_speed); - - switch (card_info->port_mode) { - case CARD_INFO_PORTM_FULLDUPLEX: - link_info->duplex = DUPLEX_FULL; - break; - case CARD_INFO_PORTM_HALFDUPLEX: - link_info->duplex = DUPLEX_HALF; - break; - default: - link_info->duplex = DUPLEX_UNKNOWN; - } - - switch (card_info->card_type) { - case CARD_INFO_TYPE_1G_COPPER_A: - case CARD_INFO_TYPE_1G_COPPER_B: - link_info->speed = SPEED_1000; - link_info->port = PORT_TP; - break; - case CARD_INFO_TYPE_1G_FIBRE_A: - case CARD_INFO_TYPE_1G_FIBRE_B: - link_info->speed = SPEED_1000; - link_info->port = PORT_FIBRE; - break; - case CARD_INFO_TYPE_10G_FIBRE_A: - case CARD_INFO_TYPE_10G_FIBRE_B: - link_info->speed = SPEED_10000; - link_info->port = PORT_FIBRE; - break; - default: - switch (card_info->port_speed) { - case CARD_INFO_PORTS_10M: - link_info->speed = SPEED_10; - break; - case CARD_INFO_PORTS_100M: - link_info->speed = SPEED_100; - break; - case CARD_INFO_PORTS_1G: - link_info->speed = SPEED_1000; - break; - case CARD_INFO_PORTS_10G: - link_info->speed = SPEED_10000; - break; - case CARD_INFO_PORTS_25G: - link_info->speed = SPEED_25000; - break; - default: - link_info->speed = SPEED_UNKNOWN; - } - - link_info->port = PORT_OTHER; - } - - return 0; -} - -int qeth_query_card_info(struct qeth_card *card, - struct qeth_link_info *link_info) -{ - 
struct qeth_cmd_buffer *iob; - - QETH_CARD_TEXT(card, 2, "qcrdinfo"); - if (!qeth_adp_supported(card, IPA_SETADP_QUERY_CARD_INFO)) - return -EOPNOTSUPP; - iob = qeth_get_adapter_cmd(card, IPA_SETADP_QUERY_CARD_INFO, 0); - if (!iob) - return -ENOMEM; - - return qeth_send_ipa_cmd(card, iob, qeth_query_card_info_cb, link_info); -} - static int qeth_init_link_info_oat_cb(struct qeth_card *card, struct qeth_reply *reply_priv, unsigned long data) @@ -4839,6 +4797,7 @@ static int qeth_init_link_info_oat_cb(struct qeth_card *card, struct qeth_query_oat_physical_if *phys_if; struct qeth_query_oat_reply *reply; + QETH_CARD_TEXT(card, 2, "qoatincb"); if (qeth_setadpparms_inspect_rc(cmd)) return -EIO; @@ -4918,41 +4877,7 @@ static int qeth_init_link_info_oat_cb(struct qeth_card *card, static void qeth_init_link_info(struct qeth_card *card) { - card->info.link_info.duplex = DUPLEX_FULL; - - if (IS_IQD(card) || IS_VM_NIC(card)) { - card->info.link_info.speed = SPEED_10000; - card->info.link_info.port = PORT_FIBRE; - card->info.link_info.link_mode = QETH_LINK_MODE_FIBRE_SHORT; - } else { - switch (card->info.link_type) { - case QETH_LINK_TYPE_FAST_ETH: - case QETH_LINK_TYPE_LANE_ETH100: - card->info.link_info.speed = SPEED_100; - card->info.link_info.port = PORT_TP; - break; - case QETH_LINK_TYPE_GBIT_ETH: - case QETH_LINK_TYPE_LANE_ETH1000: - card->info.link_info.speed = SPEED_1000; - card->info.link_info.port = PORT_FIBRE; - break; - case QETH_LINK_TYPE_10GBIT_ETH: - card->info.link_info.speed = SPEED_10000; - card->info.link_info.port = PORT_FIBRE; - break; - case QETH_LINK_TYPE_25GBIT_ETH: - card->info.link_info.speed = SPEED_25000; - card->info.link_info.port = PORT_FIBRE; - break; - default: - dev_info(&card->gdev->dev, "Unknown link type %x\n", - card->info.link_type); - card->info.link_info.speed = SPEED_UNKNOWN; - card->info.link_info.port = PORT_OTHER; - } - - card->info.link_info.link_mode = QETH_LINK_MODE_UNKNOWN; - } + qeth_default_link_info(card); /* Get more accurate data via QUERY OAT: */ if (qeth_adp_supported(card, IPA_SETADP_QUERY_OAT)) { @@ -5461,6 +5386,7 @@ int qeth_set_offline(struct qeth_card *card, const struct qeth_discipline *disc, qeth_clear_working_pool_list(card); qeth_flush_local_addrs(card); card->info.promisc_mode = 0; + qeth_default_link_info(card); rc = qeth_stop_channel(&card->data); rc2 = qeth_stop_channel(&card->write); diff --git a/drivers/s390/net/qeth_ethtool.c b/drivers/s390/net/qeth_ethtool.c index b0b36b2132fe..9eba0a32e9f9 100644 --- a/drivers/s390/net/qeth_ethtool.c +++ b/drivers/s390/net/qeth_ethtool.c @@ -428,8 +428,8 @@ static int qeth_get_link_ksettings(struct net_device *netdev, struct ethtool_link_ksettings *cmd) { struct qeth_card *card = netdev->ml_priv; - struct qeth_link_info link_info; + QETH_CARD_TEXT(card, 4, "ethtglks"); cmd->base.speed = card->info.link_info.speed; cmd->base.duplex = card->info.link_info.duplex; cmd->base.port = card->info.link_info.port; @@ -439,16 +439,6 @@ static int qeth_get_link_ksettings(struct net_device *netdev, cmd->base.eth_tp_mdix = ETH_TP_MDI_INVALID; cmd->base.eth_tp_mdix_ctrl = ETH_TP_MDI_INVALID; - /* Check if we can obtain more accurate information. 
*/ - if (!qeth_query_card_info(card, &link_info)) { - if (link_info.speed != SPEED_UNKNOWN) - cmd->base.speed = link_info.speed; - if (link_info.duplex != DUPLEX_UNKNOWN) - cmd->base.duplex = link_info.duplex; - if (link_info.port != PORT_OTHER) - cmd->base.port = link_info.port; - } - qeth_set_ethtool_link_modes(cmd, card->info.link_info.link_mode); return 0; diff --git a/include/linux/bpfptr.h b/include/linux/bpfptr.h index 46e1757d06a3..79b2f78eec1a 100644 --- a/include/linux/bpfptr.h +++ b/include/linux/bpfptr.h @@ -49,7 +49,9 @@ static inline void bpfptr_add(bpfptr_t *bpfptr, size_t val) static inline int copy_from_bpfptr_offset(void *dst, bpfptr_t src, size_t offset, size_t size) { - return copy_from_sockptr_offset(dst, (sockptr_t) src, offset, size); + if (!bpfptr_is_kernel(src)) + return copy_from_user(dst, src.user + offset, size); + return copy_from_kernel_nofault(dst, src.kernel + offset, size); } static inline int copy_from_bpfptr(void *dst, bpfptr_t src, size_t size) @@ -78,7 +80,9 @@ static inline void *kvmemdup_bpfptr(bpfptr_t src, size_t len) static inline long strncpy_from_bpfptr(char *dst, bpfptr_t src, size_t count) { - return strncpy_from_sockptr(dst, (sockptr_t) src, count); + if (bpfptr_is_kernel(src)) + return strncpy_from_kernel_nofault(dst, src.kernel, count); + return strncpy_from_user(dst, src.user, count); } #endif /* _LINUX_BPFPTR_H */ diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 153b6dec9b6a..48f4b645193b 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -278,7 +278,8 @@ static inline void sk_msg_sg_copy_clear(struct sk_msg *msg, u32 start) static inline struct sk_psock *sk_psock(const struct sock *sk) { - return rcu_dereference_sk_user_data(sk); + return __rcu_dereference_sk_user_data_with_flags(sk, + SK_USER_DATA_PSOCK); } static inline void sk_psock_set_state(struct sk_psock *psock, diff --git a/include/net/ax88796.h b/include/net/ax88796.h index b658471f97f0..303100f08ab8 100644 --- a/include/net/ax88796.h +++ b/include/net/ax88796.h @@ -34,8 +34,8 @@ struct ax_plat_data { const unsigned char *buf, int star_page); void (*block_input)(struct net_device *dev, int count, struct sk_buff *skb, int ring_offset); - /* returns nonzero if a pending interrupt request might by caused by - * the ax88786. Handles all interrupts if set to NULL + /* returns nonzero if a pending interrupt request might be caused by + * the ax88796. 
Handles all interrupts if set to NULL */ int (*check_irq)(struct platform_device *pdev); }; diff --git a/include/net/bonding.h b/include/net/bonding.h index 6e78d657aa05..afd606df149a 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -161,8 +161,9 @@ struct slave { struct net_device *dev; /* first - useful for panic debug */ struct bonding *bond; /* our master */ int delay; - /* all three in jiffies */ + /* all 4 in jiffies */ unsigned long last_link_up; + unsigned long last_tx; unsigned long last_rx; unsigned long target_last_arp_rx[BOND_MAX_ARP_TARGETS]; s8 link; /* one of BOND_LINK_XXXX */ @@ -540,6 +541,16 @@ static inline unsigned long slave_last_rx(struct bonding *bond, return slave->last_rx; } +static inline void slave_update_last_tx(struct slave *slave) +{ + WRITE_ONCE(slave->last_tx, jiffies); +} + +static inline unsigned long slave_last_tx(struct slave *slave) +{ + return READ_ONCE(slave->last_tx); +} + #ifdef CONFIG_NET_POLL_CONTROLLER static inline netdev_tx_t bond_netpoll_send_skb(const struct slave *slave, struct sk_buff *skb) diff --git a/include/net/genetlink.h b/include/net/genetlink.h index 7cb3fa8310ed..56a50e1c51b9 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -11,6 +11,7 @@ /** * struct genl_multicast_group - generic netlink multicast group * @name: name of the multicast group, names are per-family + * @flags: GENL_* flags (%GENL_ADMIN_PERM or %GENL_UNS_ADMIN_PERM) */ struct genl_multicast_group { char name[GENL_NAMSIZ]; @@ -116,7 +117,7 @@ enum genl_validate_flags { * struct genl_small_ops - generic netlink operations (small version) * @cmd: command identifier * @internal_flags: flags used by the family - * @flags: flags + * @flags: GENL_* flags (%GENL_ADMIN_PERM or %GENL_UNS_ADMIN_PERM) * @validate: validation flags from enum genl_validate_flags * @doit: standard command callback * @dumpit: callback for dumpers @@ -137,7 +138,7 @@ struct genl_small_ops { * struct genl_ops - generic netlink operations * @cmd: command identifier * @internal_flags: flags used by the family - * @flags: flags + * @flags: GENL_* flags (%GENL_ADMIN_PERM or %GENL_UNS_ADMIN_PERM) * @maxattr: maximum number of attributes supported * @policy: netlink policy (takes precedence over family policy) * @validate: validation flags from enum genl_validate_flags diff --git a/include/net/mptcp.h b/include/net/mptcp.h index ac9cf7271d46..412479ebf5ad 100644 --- a/include/net/mptcp.h +++ b/include/net/mptcp.h @@ -291,4 +291,8 @@ struct mptcp_sock *bpf_mptcp_sock_from_subflow(struct sock *sk); static inline struct mptcp_sock *bpf_mptcp_sock_from_subflow(struct sock *sk) { return NULL; } #endif +#if !IS_ENABLED(CONFIG_MPTCP) +struct mptcp_sock { }; +#endif + #endif /* __NET_MPTCP_H */ diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 8bfb9c74afbf..99aae36c04b9 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -221,13 +221,18 @@ struct nft_ctx { bool report; }; +enum nft_data_desc_flags { + NFT_DATA_DESC_SETELEM = (1 << 0), +}; + struct nft_data_desc { enum nft_data_types type; + unsigned int size; unsigned int len; + unsigned int flags; }; -int nft_data_init(const struct nft_ctx *ctx, - struct nft_data *data, unsigned int size, +int nft_data_init(const struct nft_ctx *ctx, struct nft_data *data, struct nft_data_desc *desc, const struct nlattr *nla); void nft_data_hold(const struct nft_data *data, enum nft_data_types type); void nft_data_release(const struct nft_data *data, enum 
nft_data_types type); @@ -651,6 +656,7 @@ extern const struct nft_set_ext_type nft_set_ext_types[]; struct nft_set_ext_tmpl { u16 len; u8 offset[NFT_SET_EXT_NUM]; + u8 ext_len[NFT_SET_EXT_NUM]; }; /** @@ -680,7 +686,8 @@ static inline int nft_set_ext_add_length(struct nft_set_ext_tmpl *tmpl, u8 id, return -EINVAL; tmpl->offset[id] = tmpl->len; - tmpl->len += nft_set_ext_types[id].len + len; + tmpl->ext_len[id] = nft_set_ext_types[id].len + len; + tmpl->len += tmpl->ext_len[id]; return 0; } diff --git a/include/net/sock.h b/include/net/sock.h index a7273b289188..05a1bbdf5805 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -545,14 +545,26 @@ enum sk_pacing { SK_PACING_FQ = 2, }; -/* Pointer stored in sk_user_data might not be suitable for copying - * when cloning the socket. For instance, it can point to a reference - * counted object. sk_user_data bottom bit is set if pointer must not - * be copied. +/* flag bits in sk_user_data + * + * - SK_USER_DATA_NOCOPY: Pointer stored in sk_user_data might + * not be suitable for copying when cloning the socket. For instance, + * it can point to a reference counted object. sk_user_data bottom + * bit is set if pointer must not be copied. + * + * - SK_USER_DATA_BPF: Mark whether sk_user_data field is + * managed/owned by a BPF reuseport array. This bit should be set + * when sk_user_data's sk is added to the bpf's reuseport_array. + * + * - SK_USER_DATA_PSOCK: Mark whether pointer stored in + * sk_user_data points to psock type. This bit should be set + * when sk_user_data is assigned to a psock object. */ #define SK_USER_DATA_NOCOPY 1UL -#define SK_USER_DATA_BPF 2UL /* Managed by BPF */ -#define SK_USER_DATA_PTRMASK ~(SK_USER_DATA_NOCOPY | SK_USER_DATA_BPF) +#define SK_USER_DATA_BPF 2UL +#define SK_USER_DATA_PSOCK 4UL +#define SK_USER_DATA_PTRMASK ~(SK_USER_DATA_NOCOPY | SK_USER_DATA_BPF |\ + SK_USER_DATA_PSOCK) /** * sk_user_data_is_nocopy - Test if sk_user_data pointer must not be copied @@ -565,24 +577,40 @@ static inline bool sk_user_data_is_nocopy(const struct sock *sk) #define __sk_user_data(sk) ((*((void __rcu **)&(sk)->sk_user_data))) +/** + * __rcu_dereference_sk_user_data_with_flags - return the pointer + * only if argument flags all has been set in sk_user_data. 
Otherwise + * return NULL + * + * @sk: socket + * @flags: flag bits + */ +static inline void * +__rcu_dereference_sk_user_data_with_flags(const struct sock *sk, + uintptr_t flags) +{ + uintptr_t sk_user_data = (uintptr_t)rcu_dereference(__sk_user_data(sk)); + + WARN_ON_ONCE(flags & SK_USER_DATA_PTRMASK); + + if ((sk_user_data & flags) == flags) + return (void *)(sk_user_data & SK_USER_DATA_PTRMASK); + return NULL; +} + #define rcu_dereference_sk_user_data(sk) \ + __rcu_dereference_sk_user_data_with_flags(sk, 0) +#define __rcu_assign_sk_user_data_with_flags(sk, ptr, flags) \ ({ \ - void *__tmp = rcu_dereference(__sk_user_data((sk))); \ - (void *)((uintptr_t)__tmp & SK_USER_DATA_PTRMASK); \ -}) -#define rcu_assign_sk_user_data(sk, ptr) \ -({ \ - uintptr_t __tmp = (uintptr_t)(ptr); \ - WARN_ON_ONCE(__tmp & ~SK_USER_DATA_PTRMASK); \ - rcu_assign_pointer(__sk_user_data((sk)), __tmp); \ -}) -#define rcu_assign_sk_user_data_nocopy(sk, ptr) \ -({ \ - uintptr_t __tmp = (uintptr_t)(ptr); \ - WARN_ON_ONCE(__tmp & ~SK_USER_DATA_PTRMASK); \ + uintptr_t __tmp1 = (uintptr_t)(ptr), \ + __tmp2 = (uintptr_t)(flags); \ + WARN_ON_ONCE(__tmp1 & ~SK_USER_DATA_PTRMASK); \ + WARN_ON_ONCE(__tmp2 & SK_USER_DATA_PTRMASK); \ rcu_assign_pointer(__sk_user_data((sk)), \ - __tmp | SK_USER_DATA_NOCOPY); \ + __tmp1 | __tmp2); \ }) +#define rcu_assign_sk_user_data(sk, ptr) \ + __rcu_assign_sk_user_data_with_flags(sk, ptr, 0) static inline struct net *sock_net(const struct sock *sk) diff --git a/include/net/tls.h b/include/net/tls.h index b75b5727abdb..cb205f9d9473 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -237,7 +237,7 @@ struct tls_context { void *priv_ctx_tx; void *priv_ctx_rx; - struct net_device *netdev; + struct net_device __rcu *netdev; /* rw cache line */ struct cipher_context tx; diff --git a/include/uapi/linux/atm_zatm.h b/include/uapi/linux/atm_zatm.h new file mode 100644 index 000000000000..5135027b93c1 --- /dev/null +++ b/include/uapi/linux/atm_zatm.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* atm_zatm.h - Driver-specific declarations of the ZATM driver (for use by + driver-specific utilities) */ + +/* Written 1995-1999 by Werner Almesberger, EPFL LRC/ICA */ + + +#ifndef LINUX_ATM_ZATM_H +#define LINUX_ATM_ZATM_H + +/* + * Note: non-kernel programs including this file must also include + * sys/types.h for struct timeval + */ + +#include <linux/atmapi.h> +#include <linux/atmioc.h> + +#define ZATM_GETPOOL _IOW('a',ATMIOC_SARPRV+1,struct atmif_sioc) + /* get pool statistics */ +#define ZATM_GETPOOLZ _IOW('a',ATMIOC_SARPRV+2,struct atmif_sioc) + /* get statistics and zero */ +#define ZATM_SETPOOL _IOW('a',ATMIOC_SARPRV+3,struct atmif_sioc) + /* set pool parameters */ + +struct zatm_pool_info { + int ref_count; /* free buffer pool usage counters */ + int low_water,high_water; /* refill parameters */ + int rqa_count,rqu_count; /* queue condition counters */ + int offset,next_off; /* alignment optimizations: offset */ + int next_cnt,next_thres; /* repetition counter and threshold */ +}; + +struct zatm_pool_req { + int pool_num; /* pool number */ + struct zatm_pool_info info; /* actual information */ +}; + +#define ZATM_OAM_POOL 0 /* free buffer pool for OAM cells */ +#define ZATM_AAL0_POOL 1 /* free buffer pool for AAL0 cells */ +#define ZATM_AAL5_POOL_BASE 2 /* first AAL5 free buffer pool */ +#define ZATM_LAST_POOL ZATM_AAL5_POOL_BASE+10 /* max. 
64 kB */ + +#define ZATM_TIMER_HISTORY_SIZE 16 /* number of timer adjustments to + record; must be 2^n */ + +#endif diff --git a/include/uapi/linux/genetlink.h b/include/uapi/linux/genetlink.h index d83f214b4134..ddba3ca01e39 100644 --- a/include/uapi/linux/genetlink.h +++ b/include/uapi/linux/genetlink.h @@ -87,6 +87,8 @@ enum { __CTRL_ATTR_MCAST_GRP_MAX, }; +#define CTRL_ATTR_MCAST_GRP_MAX (__CTRL_ATTR_MCAST_GRP_MAX - 1) + enum { CTRL_ATTR_POLICY_UNSPEC, CTRL_ATTR_POLICY_DO, @@ -96,7 +98,6 @@ enum { CTRL_ATTR_POLICY_DUMP_MAX = __CTRL_ATTR_POLICY_DUMP_MAX - 1 }; -#define CTRL_ATTR_MCAST_GRP_MAX (__CTRL_ATTR_MCAST_GRP_MAX - 1) - +#define CTRL_ATTR_POLICY_MAX (__CTRL_ATTR_POLICY_DUMP_MAX - 1) #endif /* _UAPI__LINUX_GENERIC_NETLINK_H */ diff --git a/include/uapi/linux/netfilter_ipv6/ip6t_LOG.h b/include/uapi/linux/netfilter_ipv6/ip6t_LOG.h index 23e91a9c2583..0b7b16dbdec2 100644 --- a/include/uapi/linux/netfilter_ipv6/ip6t_LOG.h +++ b/include/uapi/linux/netfilter_ipv6/ip6t_LOG.h @@ -17,4 +17,4 @@ struct ip6t_log_info { char prefix[30]; }; -#endif /*_IPT_LOG_H*/ +#endif /* _IP6T_LOG_H */ diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index d3e734bf8056..624527401d4d 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -649,6 +649,11 @@ static int bpf_iter_init_array_map(void *priv_data, seq_info->percpu_value_buf = value_buf; } + /* bpf_iter_attach_map() acquires a map uref, and the uref may be + * released before or in the middle of iterating map elements, so + * acquire an extra map uref for iterator. + */ + bpf_map_inc_with_uref(map); seq_info->map = map; return 0; } @@ -657,6 +662,7 @@ static void bpf_iter_fini_array_map(void *priv_data) { struct bpf_iter_seq_array_map_info *seq_info = priv_data; + bpf_map_put_with_uref(seq_info->map); kfree(seq_info->percpu_value_buf); } diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index 2726a5950cfa..24b755eca0b3 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -68,13 +68,18 @@ static void bpf_iter_done_stop(struct seq_file *seq) iter_priv->done_stop = true; } +static inline bool bpf_iter_target_support_resched(const struct bpf_iter_target_info *tinfo) +{ + return tinfo->reg_info->feature & BPF_ITER_RESCHED; +} + static bool bpf_iter_support_resched(struct seq_file *seq) { struct bpf_iter_priv_data *iter_priv; iter_priv = container_of(seq->private, struct bpf_iter_priv_data, target_private); - return iter_priv->tinfo->reg_info->feature & BPF_ITER_RESCHED; + return bpf_iter_target_support_resched(iter_priv->tinfo); } /* maximum visited objects before bailing out */ @@ -537,6 +542,10 @@ int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr, if (!tinfo) return -ENOENT; + /* Only allow sleepable program for resched-able iterator */ + if (prog->aux->sleepable && !bpf_iter_target_support_resched(tinfo)) + return -EINVAL; + link = kzalloc(sizeof(*link), GFP_USER | __GFP_NOWARN); if (!link) return -ENOMEM; diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index da7578426a46..6c530a5e560a 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -311,12 +311,8 @@ static struct htab_elem *prealloc_lru_pop(struct bpf_htab *htab, void *key, struct htab_elem *l; if (node) { - u32 key_size = htab->map.key_size; - l = container_of(node, struct htab_elem, lru_node); - memcpy(l->key, key, key_size); - check_and_init_map_value(&htab->map, - l->key + round_up(key_size, 8)); + memcpy(l->key, key, htab->map.key_size); return l; } @@ -2064,6 +2060,7 @@ static int bpf_iter_init_hash_map(void 
*priv_data, seq_info->percpu_value_buf = value_buf; } + bpf_map_inc_with_uref(map); seq_info->map = map; seq_info->htab = container_of(map, struct bpf_htab, map); return 0; @@ -2073,6 +2070,7 @@ static void bpf_iter_fini_hash_map(void *priv_data) { struct bpf_iter_seq_hash_map_info *seq_info = priv_data; + bpf_map_put_with_uref(seq_info->map); kfree(seq_info->percpu_value_buf); } diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c index e2618fb5870e..85fa9dbfa8bf 100644 --- a/kernel/bpf/reuseport_array.c +++ b/kernel/bpf/reuseport_array.c @@ -21,14 +21,11 @@ static struct reuseport_array *reuseport_array(struct bpf_map *map) /* The caller must hold the reuseport_lock */ void bpf_sk_reuseport_detach(struct sock *sk) { - uintptr_t sk_user_data; + struct sock __rcu **socks; write_lock_bh(&sk->sk_callback_lock); - sk_user_data = (uintptr_t)sk->sk_user_data; - if (sk_user_data & SK_USER_DATA_BPF) { - struct sock __rcu **socks; - - socks = (void *)(sk_user_data & SK_USER_DATA_PTRMASK); + socks = __rcu_dereference_sk_user_data_with_flags(sk, SK_USER_DATA_BPF); + if (socks) { WRITE_ONCE(sk->sk_user_data, NULL); /* * Do not move this NULL assignment outside of diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 83c7136c5788..a4d40d98428a 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3886,6 +3886,7 @@ static int bpf_prog_get_info_by_fd(struct file *file, union bpf_attr __user *uattr) { struct bpf_prog_info __user *uinfo = u64_to_user_ptr(attr->info.info); + struct btf *attach_btf = bpf_prog_get_target_btf(prog); struct bpf_prog_info info; u32 info_len = attr->info.info_len; struct bpf_prog_kstats stats; @@ -4088,10 +4089,8 @@ static int bpf_prog_get_info_by_fd(struct file *file, if (prog->aux->btf) info.btf_id = btf_obj_id(prog->aux->btf); info.attach_btf_id = prog->aux->attach_btf_id; - if (prog->aux->attach_btf) - info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf); - else if (prog->aux->dst_prog) - info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf); + if (attach_btf) + info.attach_btf_obj_id = btf_obj_id(attach_btf); ulen = info.nr_func_info; info.nr_func_info = prog->aux->func_info_cnt; @@ -5072,9 +5071,6 @@ static bool syscall_prog_is_valid_access(int off, int size, BPF_CALL_3(bpf_sys_bpf, int, cmd, union bpf_attr *, attr, u32, attr_size) { - struct bpf_prog * __maybe_unused prog; - struct bpf_tramp_run_ctx __maybe_unused run_ctx; - switch (cmd) { case BPF_MAP_CREATE: case BPF_MAP_UPDATE_ELEM: @@ -5084,6 +5080,26 @@ BPF_CALL_3(bpf_sys_bpf, int, cmd, union bpf_attr *, attr, u32, attr_size) case BPF_LINK_CREATE: case BPF_RAW_TRACEPOINT_OPEN: break; + default: + return -EINVAL; + } + return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size); +} + + +/* To shut up -Wmissing-prototypes. + * This function is used by the kernel light skeleton + * to load bpf programs when modules are loaded or during kernel boot. 
+ * See tools/lib/bpf/skel_internal.h + */ +int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size); + +int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size) +{ + struct bpf_prog * __maybe_unused prog; + struct bpf_tramp_run_ctx __maybe_unused run_ctx; + + switch (cmd) { #ifdef CONFIG_BPF_JIT /* __bpf_prog_enter_sleepable used by trampoline and JIT */ case BPF_PROG_TEST_RUN: if (attr->test.data_in || attr->test.data_out || @@ -5114,11 +5130,10 @@ BPF_CALL_3(bpf_sys_bpf, int, cmd, union bpf_attr *, attr, u32, attr_size) return 0; #endif default: - return -EINVAL; + return ____bpf_sys_bpf(cmd, attr, size); } - return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size); } -EXPORT_SYMBOL(bpf_sys_bpf); +EXPORT_SYMBOL(kern_sys_bpf); static const struct bpf_func_proto bpf_sys_bpf_proto = { .func = bpf_sys_bpf, diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index 0f532e6a717f..ff87e38af8a7 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -841,7 +841,10 @@ void bpf_trampoline_put(struct bpf_trampoline *tr) * multiple rcu callbacks. */ hlist_del(&tr->hlist); - kfree(tr->fops); + if (tr->fops) { + ftrace_free_filter(tr->fops); + kfree(tr->fops); + } kfree(tr); out: mutex_unlock(&trampoline_mutex); diff --git a/net/ax25/ax25_timer.c b/net/ax25/ax25_timer.c index 85865ebfdfa2..9f7cb0a7c73f 100644 --- a/net/ax25/ax25_timer.c +++ b/net/ax25/ax25_timer.c @@ -108,10 +108,12 @@ int ax25_t1timer_running(ax25_cb *ax25) unsigned long ax25_display_timer(struct timer_list *timer) { + long delta = timer->expires - jiffies; + if (!timer_pending(timer)) return 0; - return timer->expires - jiffies; + return max(0L, delta); } EXPORT_SYMBOL(ax25_display_timer); diff --git a/net/bluetooth/aosp.c b/net/bluetooth/aosp.c index 432ae3aac9e3..1d67836e95e1 100644 --- a/net/bluetooth/aosp.c +++ b/net/bluetooth/aosp.c @@ -54,7 +54,10 @@ void aosp_do_open(struct hci_dev *hdev) /* LE Get Vendor Capabilities Command */ skb = __hci_cmd_sync(hdev, hci_opcode_pack(0x3f, 0x153), 0, NULL, HCI_CMD_TIMEOUT); - if (IS_ERR(skb)) { + if (IS_ERR_OR_NULL(skb)) { + if (!skb) + skb = ERR_PTR(-EIO); + bt_dev_err(hdev, "AOSP get vendor capabilities (%ld)", PTR_ERR(skb)); return; @@ -152,7 +155,10 @@ static int enable_quality_report(struct hci_dev *hdev) skb = __hci_cmd_sync(hdev, BQR_OPCODE, sizeof(cp), &cp, HCI_CMD_TIMEOUT); - if (IS_ERR(skb)) { + if (IS_ERR_OR_NULL(skb)) { + if (!skb) + skb = ERR_PTR(-EIO); + bt_dev_err(hdev, "Enabling Android BQR failed (%ld)", PTR_ERR(skb)); return PTR_ERR(skb); @@ -171,7 +177,10 @@ static int disable_quality_report(struct hci_dev *hdev) skb = __hci_cmd_sync(hdev, BQR_OPCODE, sizeof(cp), &cp, HCI_CMD_TIMEOUT); - if (IS_ERR(skb)) { + if (IS_ERR_OR_NULL(skb)) { + if (!skb) + skb = ERR_PTR(-EIO); + bt_dev_err(hdev, "Disabling Android BQR failed (%ld)", PTR_ERR(skb)); return PTR_ERR(skb); diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index f54864e19866..9777e7b109ee 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -1551,8 +1551,8 @@ static void cis_add(struct iso_list_data *d, struct bt_iso_qos *qos) cis->cis_id = qos->cis; cis->c_sdu = cpu_to_le16(qos->out.sdu); cis->p_sdu = cpu_to_le16(qos->in.sdu); - cis->c_phy = qos->out.phy; - cis->p_phy = qos->in.phy; + cis->c_phy = qos->out.phy ? qos->out.phy : qos->in.phy; + cis->p_phy = qos->in.phy ? 
qos->in.phy : qos->out.phy; cis->c_rtn = qos->out.rtn; cis->p_rtn = qos->in.rtn; @@ -1735,13 +1735,6 @@ struct hci_conn *hci_bind_cis(struct hci_dev *hdev, bdaddr_t *dst, if (!qos->in.latency) qos->in.latency = qos->out.latency; - /* Mirror PHYs that are disabled as SDU will be set to 0 */ - if (!qos->in.phy) - qos->in.phy = qos->out.phy; - - if (!qos->out.phy) - qos->out.phy = qos->in.phy; - if (!hci_le_set_cig_params(cis, qos)) { hci_conn_drop(cis); return ERR_PTR(-EINVAL); diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index ea33dd0cd478..485c814cf44a 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -328,14 +328,17 @@ static u8 hci_cc_delete_stored_link_key(struct hci_dev *hdev, void *data, struct sk_buff *skb) { struct hci_rp_delete_stored_link_key *rp = data; + u16 num_keys; bt_dev_dbg(hdev, "status 0x%2.2x", rp->status); if (rp->status) return rp->status; - if (rp->num_keys <= hdev->stored_num_keys) - hdev->stored_num_keys -= le16_to_cpu(rp->num_keys); + num_keys = le16_to_cpu(rp->num_keys); + + if (num_keys <= hdev->stored_num_keys) + hdev->stored_num_keys -= num_keys; else hdev->stored_num_keys = 0; diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index ff09c353e64e..ced8ad4fed4f 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -44,6 +44,9 @@ static void iso_sock_kill(struct sock *sk); /* ----- ISO socket info ----- */ #define iso_pi(sk) ((struct iso_pinfo *)sk) +#define EIR_SERVICE_DATA_LENGTH 4 +#define BASE_MAX_LENGTH (HCI_MAX_PER_AD_LENGTH - EIR_SERVICE_DATA_LENGTH) + struct iso_pinfo { struct bt_sock bt; bdaddr_t src; @@ -57,7 +60,7 @@ struct iso_pinfo { __u32 flags; struct bt_iso_qos qos; __u8 base_len; - __u8 base[HCI_MAX_PER_AD_LENGTH]; + __u8 base[BASE_MAX_LENGTH]; struct iso_conn *conn; }; @@ -370,15 +373,24 @@ done: return err; } +static struct bt_iso_qos *iso_sock_get_qos(struct sock *sk) +{ + if (sk->sk_state == BT_CONNECTED || sk->sk_state == BT_CONNECT2) + return &iso_pi(sk)->conn->hcon->iso_qos; + + return &iso_pi(sk)->qos; +} + static int iso_send_frame(struct sock *sk, struct sk_buff *skb) { struct iso_conn *conn = iso_pi(sk)->conn; + struct bt_iso_qos *qos = iso_sock_get_qos(sk); struct hci_iso_data_hdr *hdr; int len = 0; BT_DBG("sk %p len %d", sk, skb->len); - if (skb->len > iso_pi(sk)->qos.out.sdu) + if (skb->len > qos->out.sdu) return -EMSGSIZE; len = skb->len; @@ -1177,8 +1189,10 @@ static int iso_sock_setsockopt(struct socket *sock, int level, int optname, } len = min_t(unsigned int, sizeof(qos), optlen); - if (len != sizeof(qos)) - return -EINVAL; + if (len != sizeof(qos)) { + err = -EINVAL; + break; + } memset(&qos, 0, sizeof(qos)); @@ -1233,7 +1247,7 @@ static int iso_sock_getsockopt(struct socket *sock, int level, int optname, { struct sock *sk = sock->sk; int len, err = 0; - struct bt_iso_qos qos; + struct bt_iso_qos *qos; u8 base_len; u8 *base; @@ -1246,7 +1260,7 @@ static int iso_sock_getsockopt(struct socket *sock, int level, int optname, switch (optname) { case BT_DEFER_SETUP: - if (sk->sk_state != BT_BOUND && sk->sk_state != BT_LISTEN) { + if (sk->sk_state == BT_CONNECTED) { err = -EINVAL; break; } @@ -1258,13 +1272,10 @@ static int iso_sock_getsockopt(struct socket *sock, int level, int optname, break; case BT_ISO_QOS: - if (sk->sk_state == BT_CONNECTED || sk->sk_state == BT_CONNECT2) - qos = iso_pi(sk)->conn->hcon->iso_qos; - else - qos = iso_pi(sk)->qos; + qos = iso_sock_get_qos(sk); - len = min_t(unsigned int, len, sizeof(qos)); - if (copy_to_user(optval, (char *)&qos, len)) + len 
= min_t(unsigned int, len, sizeof(*qos)); + if (copy_to_user(optval, qos, len)) err = -EFAULT; break; diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 77c0aac14539..cbe0cae73434 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -1970,11 +1970,11 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm, bdaddr_t *dst, u8 link_type) { - struct l2cap_chan *c, *c1 = NULL; + struct l2cap_chan *c, *tmp, *c1 = NULL; read_lock(&chan_list_lock); - list_for_each_entry(c, &chan_list, global_l) { + list_for_each_entry_safe(c, tmp, &chan_list, global_l) { if (state && c->state != state) continue; @@ -1993,11 +1993,10 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm, dst_match = !bacmp(&c->dst, dst); if (src_match && dst_match) { c = l2cap_chan_hold_unless_zero(c); - if (!c) - continue; - - read_unlock(&chan_list_lock); - return c; + if (c) { + read_unlock(&chan_list_lock); + return c; + } } /* Closest match */ diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index 646d10401b80..6e31023b84f5 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -3819,7 +3819,7 @@ static int set_blocked_keys(struct sock *sk, struct hci_dev *hdev, void *data, hci_blocked_keys_clear(hdev); - for (i = 0; i < keys->key_count; ++i) { + for (i = 0; i < key_count; ++i) { struct blocked_key *b = kzalloc(sizeof(*b), GFP_KERNEL); if (!b) { @@ -4624,8 +4624,7 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data, u32 current_flags = __le32_to_cpu(cp->current_flags); bt_dev_dbg(hdev, "Set device flags %pMR (type 0x%x) = 0x%x", - &cp->addr.bdaddr, cp->addr.type, - __le32_to_cpu(current_flags)); + &cp->addr.bdaddr, cp->addr.type, current_flags); // We should take hci_dev_lock() early, I think.. 
conn_flags can change supported_flags = hdev->conn_flags; @@ -8936,6 +8935,8 @@ void mgmt_index_removed(struct hci_dev *hdev) HCI_MGMT_EXT_INDEX_EVENTS); /* Cancel any remaining timed work */ + if (!hci_dev_test_flag(hdev, HCI_MGMT)) + return; cancel_delayed_work_sync(&hdev->discov_off); cancel_delayed_work_sync(&hdev->service_cache); cancel_delayed_work_sync(&hdev->rpa_expired); diff --git a/net/bluetooth/msft.c b/net/bluetooth/msft.c index 14975769f678..bee6a4c656be 100644 --- a/net/bluetooth/msft.c +++ b/net/bluetooth/msft.c @@ -120,7 +120,10 @@ static bool read_supported_features(struct hci_dev *hdev, skb = __hci_cmd_sync(hdev, hdev->msft_opcode, sizeof(cp), &cp, HCI_CMD_TIMEOUT); - if (IS_ERR(skb)) { + if (IS_ERR_OR_NULL(skb)) { + if (!skb) + skb = ERR_PTR(-EIO); + bt_dev_err(hdev, "Failed to read MSFT supported features (%ld)", PTR_ERR(skb)); return false; @@ -319,8 +322,11 @@ static int msft_remove_monitor_sync(struct hci_dev *hdev, skb = __hci_cmd_sync(hdev, hdev->msft_opcode, sizeof(cp), &cp, HCI_CMD_TIMEOUT); - if (IS_ERR(skb)) + if (IS_ERR_OR_NULL(skb)) { + if (!skb) + return -EIO; return PTR_ERR(skb); + } return msft_le_cancel_monitor_advertisement_cb(hdev, hdev->msft_opcode, monitor, skb); @@ -432,8 +438,11 @@ static int msft_add_monitor_sync(struct hci_dev *hdev, HCI_CMD_TIMEOUT); kfree(cp); - if (IS_ERR(skb)) + if (IS_ERR_OR_NULL(skb)) { + if (!skb) + return -EIO; return PTR_ERR(skb); + } return msft_le_monitor_advertisement_cb(hdev, hdev->msft_opcode, monitor, skb); diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index cbc9cd5058cb..d11209367dd0 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -1628,6 +1628,7 @@ static int __init bpf_prog_test_run_init(void) int ret; ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_prog_test_kfunc_set); + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_prog_test_kfunc_set); return ret ?: register_btf_id_dtor_kfuncs(bpf_prog_test_dtor_kfunc, ARRAY_SIZE(bpf_prog_test_dtor_kfunc), THIS_MODULE); diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index f5ecfdcf57b2..b670ba03a675 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -178,7 +178,10 @@ activate_next: if (!first) return; - if (WARN_ON_ONCE(j1939_session_activate(first))) { + if (j1939_session_activate(first)) { + netdev_warn_once(first->priv->ndev, + "%s: 0x%p: Identical session is already activated.\n", + __func__, first); first->err = -EBUSY; goto activate_next; } else { diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c index 307ee1174a6e..d7d86c944d76 100644 --- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -260,6 +260,8 @@ static void __j1939_session_drop(struct j1939_session *session) static void j1939_session_destroy(struct j1939_session *session) { + struct sk_buff *skb; + if (session->transmission) { if (session->err) j1939_sk_errqueue(session, J1939_ERRQUEUE_TX_ABORT); @@ -274,7 +276,11 @@ static void j1939_session_destroy(struct j1939_session *session) WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry)); WARN_ON_ONCE(!list_empty(&session->active_session_list_entry)); - skb_queue_purge(&session->skb_queue); + while ((skb = skb_dequeue(&session->skb_queue)) != NULL) { + /* drop ref taken in j1939_session_skb_queue() */ + skb_unref(skb); + kfree_skb(skb); + } __j1939_session_drop(session); j1939_priv_put(session->priv); kfree(session); diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c index a25ec93729b9..1b7f385643b4 100644 --- 
a/net/core/bpf_sk_storage.c +++ b/net/core/bpf_sk_storage.c @@ -875,10 +875,18 @@ static int bpf_iter_init_sk_storage_map(void *priv_data, { struct bpf_iter_seq_sk_storage_map_info *seq_info = priv_data; + bpf_map_inc_with_uref(aux->map); seq_info->map = aux->map; return 0; } +static void bpf_iter_fini_sk_storage_map(void *priv_data) +{ + struct bpf_iter_seq_sk_storage_map_info *seq_info = priv_data; + + bpf_map_put_with_uref(seq_info->map); +} + static int bpf_iter_attach_map(struct bpf_prog *prog, union bpf_iter_link_info *linfo, struct bpf_iter_aux_info *aux) @@ -896,7 +904,7 @@ static int bpf_iter_attach_map(struct bpf_prog *prog, if (map->map_type != BPF_MAP_TYPE_SK_STORAGE) goto put_map; - if (prog->aux->max_rdonly_access > map->value_size) { + if (prog->aux->max_rdwr_access > map->value_size) { err = -EACCES; goto put_map; } @@ -924,7 +932,7 @@ static const struct seq_operations bpf_sk_storage_map_seq_ops = { static const struct bpf_iter_seq_info iter_seq_info = { .seq_ops = &bpf_sk_storage_map_seq_ops, .init_seq_private = bpf_iter_init_sk_storage_map, - .fini_seq_private = NULL, + .fini_seq_private = bpf_iter_fini_sk_storage_map, .seq_priv_size = sizeof(struct bpf_iter_seq_sk_storage_map_info), }; diff --git a/net/core/devlink.c b/net/core/devlink.c index 5da5c7cca98a..b50bcc18b8d9 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -5147,7 +5147,7 @@ static int devlink_param_get(struct devlink *devlink, const struct devlink_param *param, struct devlink_param_gset_ctx *ctx) { - if (!param->get) + if (!param->get || devlink->reload_failed) return -EOPNOTSUPP; return param->get(devlink, param->id, ctx); } @@ -5156,7 +5156,7 @@ static int devlink_param_set(struct devlink *devlink, const struct devlink_param *param, struct devlink_param_gset_ctx *ctx) { - if (!param->set) + if (!param->set || devlink->reload_failed) return -EOPNOTSUPP; return param->set(devlink, param->id, ctx); } diff --git a/net/core/filter.c b/net/core/filter.c index 5669248aff25..e8508aaafd27 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5063,7 +5063,10 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname, case SO_RCVLOWAT: if (val < 0) val = INT_MAX; - WRITE_ONCE(sk->sk_rcvlowat, val ? : 1); + if (sk->sk_socket && sk->sk_socket->ops->set_rcvlowat) + ret = sk->sk_socket->ops->set_rcvlowat(sk, val); + else + WRITE_ONCE(sk->sk_rcvlowat, val ? 
: 1); break; case SO_MARK: if (sk->sk_mark != val) { diff --git a/net/core/skmsg.c b/net/core/skmsg.c index cf3c24c8610d..f47338d89d5d 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -738,7 +738,9 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node) sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED); refcount_set(&psock->refcnt, 1); - rcu_assign_sk_user_data_nocopy(sk, psock); + __rcu_assign_sk_user_data_with_flags(sk, psock, + SK_USER_DATA_NOCOPY | + SK_USER_DATA_PSOCK); sock_hold(sk); out: diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 028813dfecb0..9a9fb9487d63 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -783,13 +783,22 @@ static int sock_map_init_seq_private(void *priv_data, { struct sock_map_seq_info *info = priv_data; + bpf_map_inc_with_uref(aux->map); info->map = aux->map; return 0; } +static void sock_map_fini_seq_private(void *priv_data) +{ + struct sock_map_seq_info *info = priv_data; + + bpf_map_put_with_uref(info->map); +} + static const struct bpf_iter_seq_info sock_map_iter_seq_info = { .seq_ops = &sock_map_seq_ops, .init_seq_private = sock_map_init_seq_private, + .fini_seq_private = sock_map_fini_seq_private, .seq_priv_size = sizeof(struct sock_map_seq_info), }; @@ -1369,18 +1378,27 @@ static const struct seq_operations sock_hash_seq_ops = { }; static int sock_hash_init_seq_private(void *priv_data, - struct bpf_iter_aux_info *aux) + struct bpf_iter_aux_info *aux) { struct sock_hash_seq_info *info = priv_data; + bpf_map_inc_with_uref(aux->map); info->map = aux->map; info->htab = container_of(aux->map, struct bpf_shtab, map); return 0; } +static void sock_hash_fini_seq_private(void *priv_data) +{ + struct sock_hash_seq_info *info = priv_data; + + bpf_map_put_with_uref(info->map); +} + static const struct bpf_iter_seq_info sock_hash_iter_seq_info = { .seq_ops = &sock_hash_seq_ops, .init_seq_private = sock_hash_init_seq_private, + .fini_seq_private = sock_hash_fini_seq_private, .seq_priv_size = sizeof(struct sock_hash_seq_info), }; diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 897ca4f9b791..f152e51242cb 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1311,8 +1311,7 @@ struct dst_entry *ip6_dst_lookup_tunnel(struct sk_buff *skb, fl6.daddr = info->key.u.ipv6.dst; fl6.saddr = info->key.u.ipv6.src; prio = info->key.tos; - fl6.flowlabel = ip6_make_flowinfo(RT_TOS(prio), - info->key.label); + fl6.flowlabel = ip6_make_flowinfo(prio, info->key.label); dst = ipv6_stub->ipv6_dst_lookup_flow(net, sock->sk, &fl6, NULL); diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index 2cd4a8d3b30a..b7de5e46fdd8 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -1614,7 +1614,7 @@ static void __destroy_attrs(unsigned long parsed_attrs, int max_parsed, * callback. If the callback is not available, then we skip to the next * attribute; otherwise, we call the destroy() callback. 
*/ - for (i = 0; i < max_parsed; ++i) { + for (i = SEG6_LOCAL_SRH; i < max_parsed; ++i) { if (!(parsed_attrs & SEG6_F_ATTR(i))) continue; @@ -1643,7 +1643,7 @@ static int parse_nla_optional_attrs(struct nlattr **attrs, struct seg6_action_param *param; int err, i; - for (i = 0; i < SEG6_LOCAL_MAX + 1; ++i) { + for (i = SEG6_LOCAL_SRH; i < SEG6_LOCAL_MAX + 1; ++i) { if (!(desc->optattrs & SEG6_F_ATTR(i)) || !attrs[i]) continue; @@ -1742,7 +1742,7 @@ static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt) } /* parse the required attributes */ - for (i = 0; i < SEG6_LOCAL_MAX + 1; i++) { + for (i = SEG6_LOCAL_SRH; i < SEG6_LOCAL_MAX + 1; i++) { if (desc->attrs & SEG6_F_ATTR(i)) { if (!attrs[i]) return -EINVAL; @@ -1847,7 +1847,7 @@ static int seg6_local_fill_encap(struct sk_buff *skb, attrs = slwt->desc->attrs | slwt->parsed_optattrs; - for (i = 0; i < SEG6_LOCAL_MAX + 1; i++) { + for (i = SEG6_LOCAL_SRH; i < SEG6_LOCAL_MAX + 1; i++) { if (attrs & SEG6_F_ATTR(i)) { param = &seg6_action_params[i]; err = param->put(skb, slwt); @@ -1927,7 +1927,7 @@ static int seg6_local_cmp_encap(struct lwtunnel_state *a, if (attrs_a != attrs_b) return 1; - for (i = 0; i < SEG6_LOCAL_MAX + 1; i++) { + for (i = SEG6_LOCAL_SRH; i < SEG6_LOCAL_MAX + 1; i++) { if (attrs_a & SEG6_F_ATTR(i)) { param = &seg6_action_params[i]; if (param->cmp(slwt_a, slwt_b)) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index a3f1c1461874..da4257504fad 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1240,6 +1240,9 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, info->limit > dfrag->data_len)) return 0; + if (unlikely(!__tcp_can_send(ssk))) + return -EAGAIN; + /* compute send limit */ info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags); copy = info->size_goal; @@ -1413,7 +1416,8 @@ static struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk) if (__mptcp_check_fallback(msk)) { if (!msk->first) return NULL; - return sk_stream_memory_free(msk->first) ? msk->first : NULL; + return __tcp_can_send(msk->first) && + sk_stream_memory_free(msk->first) ? 
msk->first : NULL; } /* re-use last subflow, if the burst allow that */ @@ -1564,6 +1568,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info); if (ret <= 0) { + if (ret == -EAGAIN) + continue; mptcp_push_release(ssk, &info); goto out; } @@ -2769,30 +2775,16 @@ static void __mptcp_wr_shutdown(struct sock *sk) static void __mptcp_destroy_sock(struct sock *sk) { - struct mptcp_subflow_context *subflow, *tmp; struct mptcp_sock *msk = mptcp_sk(sk); - LIST_HEAD(conn_list); pr_debug("msk=%p", msk); might_sleep(); - /* join list will be eventually flushed (with rst) at sock lock release time*/ - list_splice_init(&msk->conn_list, &conn_list); - mptcp_stop_timer(sk); sk_stop_timer(sk, &sk->sk_timer); msk->pm.status = 0; - /* clears msk->subflow, allowing the following loop to close - * even the initial subflow - */ - mptcp_dispose_initial_subflow(msk); - list_for_each_entry_safe(subflow, tmp, &conn_list, node) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - __mptcp_close_ssk(sk, ssk, subflow, 0); - } - sk->sk_prot->destroy(sk); WARN_ON_ONCE(msk->rmem_fwd_alloc); @@ -2884,24 +2876,20 @@ static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk) static int mptcp_disconnect(struct sock *sk, int flags) { - struct mptcp_subflow_context *subflow, *tmp; struct mptcp_sock *msk = mptcp_sk(sk); inet_sk_state_store(sk, TCP_CLOSE); - list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - - __mptcp_close_ssk(sk, ssk, subflow, MPTCP_CF_FASTCLOSE); - } - mptcp_stop_timer(sk); sk_stop_timer(sk, &sk->sk_timer); if (mptcp_sk(sk)->token) mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL); - mptcp_destroy_common(msk); + /* msk->subflow is still intact, the following will not free the first + * subflow + */ + mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE); msk->last_snd = NULL; WRITE_ONCE(msk->flags, 0); msk->cb_flags = 0; @@ -3051,12 +3039,17 @@ out: return newsk; } -void mptcp_destroy_common(struct mptcp_sock *msk) +void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags) { + struct mptcp_subflow_context *subflow, *tmp; struct sock *sk = (struct sock *)msk; __mptcp_clear_xmit(sk); + /* join list will be eventually flushed (with rst) at sock lock release time */ + list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) + __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags); + /* move to sk_receive_queue, sk_stream_kill_queues will purge it */ mptcp_data_lock(sk); skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue); @@ -3078,7 +3071,11 @@ static void mptcp_destroy(struct sock *sk) { struct mptcp_sock *msk = mptcp_sk(sk); - mptcp_destroy_common(msk); + /* clears msk->subflow, allowing the following to close + * even the initial subflow + */ + mptcp_dispose_initial_subflow(msk); + mptcp_destroy_common(msk, 0); sk_sockets_allocated_dec(sk); } diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 5d6043c16b09..132d50833df1 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -624,16 +624,19 @@ void mptcp_info2sockaddr(const struct mptcp_addr_info *info, struct sockaddr_storage *addr, unsigned short family); -static inline bool __mptcp_subflow_active(struct mptcp_subflow_context *subflow) +static inline bool __tcp_can_send(const struct sock *ssk) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + /* only send if our side has not closed yet */ + return ((1 << inet_sk_state_load(ssk)) 
& (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)); +} +static inline bool __mptcp_subflow_active(struct mptcp_subflow_context *subflow) +{ /* can't send if JOIN hasn't completed yet (i.e. is usable for mptcp) */ if (subflow->request_join && !subflow->fully_established) return false; - /* only send if our side has not closed yet */ - return ((1 << ssk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)); + return __tcp_can_send(mptcp_subflow_tcp_sock(subflow)); } void mptcp_subflow_set_active(struct mptcp_subflow_context *subflow); @@ -717,7 +720,7 @@ static inline void mptcp_write_space(struct sock *sk) } } -void mptcp_destroy_common(struct mptcp_sock *msk); +void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags); #define MPTCP_TOKEN_MAX_RETRIES 4 diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 901c763dcdbb..c7d49fb6e7bd 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -621,7 +621,8 @@ static void mptcp_sock_destruct(struct sock *sk) sock_orphan(sk); } - mptcp_destroy_common(mptcp_sk(sk)); + /* We don't need to clear msk->subflow, as it's still NULL at this point */ + mptcp_destroy_common(mptcp_sk(sk), 0); inet_sock_destruct(sk); } diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig index df6abbfe0079..22f15ebf6045 100644 --- a/net/netfilter/Kconfig +++ b/net/netfilter/Kconfig @@ -736,9 +736,8 @@ config NF_FLOW_TABLE config NF_FLOW_TABLE_PROCFS bool "Supply flow table statistics in procfs" - default y + depends on NF_FLOW_TABLE depends on PROC_FS - depends on SYSCTL help This option enables for the flow table offload statistics to be shown in procfs under net/netfilter/nf_flowtable. diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 9f976b11d896..3cc88998b879 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -153,6 +153,7 @@ static struct nft_trans *nft_trans_alloc_gfp(const struct nft_ctx *ctx, if (trans == NULL) return NULL; + INIT_LIST_HEAD(&trans->list); trans->msg_type = msg_type; trans->ctx = *ctx; @@ -2472,6 +2473,7 @@ err: } static struct nft_chain *nft_chain_lookup_byid(const struct net *net, + const struct nft_table *table, const struct nlattr *nla) { struct nftables_pernet *nft_net = nft_pernet(net); @@ -2482,6 +2484,7 @@ static struct nft_chain *nft_chain_lookup_byid(const struct net *net, struct nft_chain *chain = trans->ctx.chain; if (trans->msg_type == NFT_MSG_NEWCHAIN && + chain->table == table && id == nft_trans_chain_id(trans)) return chain; } @@ -3371,6 +3374,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table) } static struct nft_rule *nft_rule_lookup_byid(const struct net *net, + const struct nft_chain *chain, const struct nlattr *nla); #define NFT_RULE_MAXEXPRS 128 @@ -3417,7 +3421,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, return -EOPNOTSUPP; } else if (nla[NFTA_RULE_CHAIN_ID]) { - chain = nft_chain_lookup_byid(net, nla[NFTA_RULE_CHAIN_ID]); + chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID]); if (IS_ERR(chain)) { NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN_ID]); return PTR_ERR(chain); @@ -3459,7 +3463,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, return PTR_ERR(old_rule); } } else if (nla[NFTA_RULE_POSITION_ID]) { - old_rule = nft_rule_lookup_byid(net, nla[NFTA_RULE_POSITION_ID]); + old_rule = nft_rule_lookup_byid(net, chain, nla[NFTA_RULE_POSITION_ID]); if (IS_ERR(old_rule)) { NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION_ID]); return 
PTR_ERR(old_rule); @@ -3604,6 +3608,7 @@ err_release_expr: } static struct nft_rule *nft_rule_lookup_byid(const struct net *net, + const struct nft_chain *chain, const struct nlattr *nla) { struct nftables_pernet *nft_net = nft_pernet(net); @@ -3614,6 +3619,7 @@ static struct nft_rule *nft_rule_lookup_byid(const struct net *net, struct nft_rule *rule = nft_trans_rule(trans); if (trans->msg_type == NFT_MSG_NEWRULE && + trans->ctx.chain == chain && id == nft_trans_rule_id(trans)) return rule; } @@ -3663,7 +3669,7 @@ static int nf_tables_delrule(struct sk_buff *skb, const struct nfnl_info *info, err = nft_delrule(&ctx, rule); } else if (nla[NFTA_RULE_ID]) { - rule = nft_rule_lookup_byid(net, nla[NFTA_RULE_ID]); + rule = nft_rule_lookup_byid(net, chain, nla[NFTA_RULE_ID]); if (IS_ERR(rule)) { NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_ID]); return PTR_ERR(rule); @@ -3842,6 +3848,7 @@ static struct nft_set *nft_set_lookup_byhandle(const struct nft_table *table, } static struct nft_set *nft_set_lookup_byid(const struct net *net, + const struct nft_table *table, const struct nlattr *nla, u8 genmask) { struct nftables_pernet *nft_net = nft_pernet(net); @@ -3853,6 +3860,7 @@ static struct nft_set *nft_set_lookup_byid(const struct net *net, struct nft_set *set = nft_trans_set(trans); if (id == nft_trans_set_id(trans) && + set->table == table && nft_active_genmask(set, genmask)) return set; } @@ -3873,7 +3881,7 @@ struct nft_set *nft_set_lookup_global(const struct net *net, if (!nla_set_id) return set; - set = nft_set_lookup_byid(net, nla_set_id, genmask); + set = nft_set_lookup_byid(net, table, nla_set_id, genmask); } return set; } @@ -5195,19 +5203,13 @@ static int nft_setelem_parse_flags(const struct nft_set *set, static int nft_setelem_parse_key(struct nft_ctx *ctx, struct nft_set *set, struct nft_data *key, struct nlattr *attr) { - struct nft_data_desc desc; - int err; - - err = nft_data_init(ctx, key, NFT_DATA_VALUE_MAXLEN, &desc, attr); - if (err < 0) - return err; - - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) { - nft_data_release(key, desc.type); - return -EINVAL; - } + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = NFT_DATA_VALUE_MAXLEN, + .len = set->klen, + }; - return 0; + return nft_data_init(ctx, key, &desc, attr); } static int nft_setelem_parse_data(struct nft_ctx *ctx, struct nft_set *set, @@ -5216,24 +5218,18 @@ static int nft_setelem_parse_data(struct nft_ctx *ctx, struct nft_set *set, struct nlattr *attr) { u32 dtype; - int err; - - err = nft_data_init(ctx, data, NFT_DATA_VALUE_MAXLEN, desc, attr); - if (err < 0) - return err; if (set->dtype == NFT_DATA_VERDICT) dtype = NFT_DATA_VERDICT; else dtype = NFT_DATA_VALUE; - if (dtype != desc->type || - set->dlen != desc->len) { - nft_data_release(data, desc->type); - return -EINVAL; - } + desc->type = dtype; + desc->size = NFT_DATA_VALUE_MAXLEN; + desc->len = set->dlen; + desc->flags = NFT_DATA_DESC_SETELEM; - return 0; + return nft_data_init(ctx, data, desc, attr); } static void *nft_setelem_catchall_get(const struct net *net, @@ -5467,6 +5463,27 @@ err_set_elem_expr: return ERR_PTR(err); } +static int nft_set_ext_check(const struct nft_set_ext_tmpl *tmpl, u8 id, u32 len) +{ + len += nft_set_ext_types[id].len; + if (len > tmpl->ext_len[id] || + len > U8_MAX) + return -1; + + return 0; +} + +static int nft_set_ext_memcpy(const struct nft_set_ext_tmpl *tmpl, u8 id, + void *to, const void *from, u32 len) +{ + if (nft_set_ext_check(tmpl, id, len) < 0) + return -1; + + memcpy(to, from, len); + + return 0; +} + 
void *nft_set_elem_init(const struct nft_set *set, const struct nft_set_ext_tmpl *tmpl, const u32 *key, const u32 *key_end, @@ -5477,17 +5494,26 @@ void *nft_set_elem_init(const struct nft_set *set, elem = kzalloc(set->ops->elemsize + tmpl->len, gfp); if (elem == NULL) - return NULL; + return ERR_PTR(-ENOMEM); ext = nft_set_elem_ext(set, elem); nft_set_ext_init(ext, tmpl); - if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY)) - memcpy(nft_set_ext_key(ext), key, set->klen); - if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) - memcpy(nft_set_ext_key_end(ext), key_end, set->klen); - if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) - memcpy(nft_set_ext_data(ext), data, set->dlen); + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY) && + nft_set_ext_memcpy(tmpl, NFT_SET_EXT_KEY, + nft_set_ext_key(ext), key, set->klen) < 0) + goto err_ext_check; + + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) && + nft_set_ext_memcpy(tmpl, NFT_SET_EXT_KEY_END, + nft_set_ext_key_end(ext), key_end, set->klen) < 0) + goto err_ext_check; + + if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && + nft_set_ext_memcpy(tmpl, NFT_SET_EXT_DATA, + nft_set_ext_data(ext), data, set->dlen) < 0) + goto err_ext_check; + if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) { *nft_set_ext_expiration(ext) = get_jiffies_64() + expiration; if (expiration == 0) @@ -5497,6 +5523,11 @@ void *nft_set_elem_init(const struct nft_set *set, *nft_set_ext_timeout(ext) = timeout; return elem; + +err_ext_check: + kfree(elem); + + return ERR_PTR(-EINVAL); } static void __nft_set_elem_expr_destroy(const struct nft_ctx *ctx, @@ -5584,14 +5615,25 @@ err_expr: } static int nft_set_elem_expr_setup(struct nft_ctx *ctx, + const struct nft_set_ext_tmpl *tmpl, const struct nft_set_ext *ext, struct nft_expr *expr_array[], u32 num_exprs) { struct nft_set_elem_expr *elem_expr = nft_set_ext_expr(ext); + u32 len = sizeof(struct nft_set_elem_expr); struct nft_expr *expr; int i, err; + if (num_exprs == 0) + return 0; + + for (i = 0; i < num_exprs; i++) + len += expr_array[i]->ops->size; + + if (nft_set_ext_check(tmpl, NFT_SET_EXT_EXPRESSIONS, len) < 0) + return -EINVAL; + for (i = 0; i < num_exprs; i++) { expr = nft_setelem_expr_at(elem_expr, elem_expr->size); err = nft_expr_clone(expr, expr_array[i]); @@ -6054,17 +6096,23 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, } } - err = -ENOMEM; elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, elem.key_end.val.data, elem.data.val.data, timeout, expiration, GFP_KERNEL_ACCOUNT); - if (elem.priv == NULL) + if (IS_ERR(elem.priv)) { + err = PTR_ERR(elem.priv); goto err_parse_data; + } ext = nft_set_elem_ext(set, elem.priv); if (flags) *nft_set_ext_flags(ext) = flags; + if (ulen > 0) { + if (nft_set_ext_check(&tmpl, NFT_SET_EXT_USERDATA, ulen) < 0) { + err = -EINVAL; + goto err_elem_userdata; + } udata = nft_set_ext_userdata(ext); udata->len = ulen - 1; nla_memcpy(&udata->data, nla[NFTA_SET_ELEM_USERDATA], ulen); @@ -6073,14 +6121,14 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, *nft_set_ext_obj(ext) = obj; obj->use++; } - err = nft_set_elem_expr_setup(ctx, ext, expr_array, num_exprs); + err = nft_set_elem_expr_setup(ctx, &tmpl, ext, expr_array, num_exprs); if (err < 0) - goto err_elem_expr; + goto err_elem_free; trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set); if (trans == NULL) { err = -ENOMEM; - goto err_elem_expr; + goto err_elem_free; } ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; @@ -6126,10 +6174,10 @@ err_set_full: 
nft_setelem_remove(ctx->net, set, &elem); err_element_clash: kfree(trans); -err_elem_expr: +err_elem_free: if (obj) obj->use--; - +err_elem_userdata: nf_tables_set_elem_destroy(ctx, set, elem.priv); err_parse_data: if (nla[NFTA_SET_ELEM_DATA] != NULL) @@ -6311,8 +6359,10 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, elem.key_end.val.data, NULL, 0, 0, GFP_KERNEL_ACCOUNT); - if (elem.priv == NULL) + if (IS_ERR(elem.priv)) { + err = PTR_ERR(elem.priv); goto fail_elem_key_end; + } ext = nft_set_elem_ext(set, elem.priv); if (flags) @@ -9605,7 +9655,7 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data, tb[NFTA_VERDICT_CHAIN], genmask); } else if (tb[NFTA_VERDICT_CHAIN_ID]) { - chain = nft_chain_lookup_byid(ctx->net, + chain = nft_chain_lookup_byid(ctx->net, ctx->table, tb[NFTA_VERDICT_CHAIN_ID]); if (IS_ERR(chain)) return PTR_ERR(chain); @@ -9617,6 +9667,9 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data, return PTR_ERR(chain); if (nft_is_base_chain(chain)) return -EOPNOTSUPP; + if (desc->flags & NFT_DATA_DESC_SETELEM && + chain->flags & NFT_CHAIN_BINDING) + return -EINVAL; chain->use++; data->verdict.chain = chain; @@ -9624,7 +9677,7 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data, } desc->len = sizeof(data->verdict); - desc->type = NFT_DATA_VERDICT; + return 0; } @@ -9677,20 +9730,25 @@ nla_put_failure: } static int nft_value_init(const struct nft_ctx *ctx, - struct nft_data *data, unsigned int size, - struct nft_data_desc *desc, const struct nlattr *nla) + struct nft_data *data, struct nft_data_desc *desc, + const struct nlattr *nla) { unsigned int len; len = nla_len(nla); if (len == 0) return -EINVAL; - if (len > size) + if (len > desc->size) return -EOVERFLOW; + if (desc->len) { + if (len != desc->len) + return -EINVAL; + } else { + desc->len = len; + } nla_memcpy(data->data, nla, len); - desc->type = NFT_DATA_VALUE; - desc->len = len; + return 0; } @@ -9710,7 +9768,6 @@ static const struct nla_policy nft_data_policy[NFTA_DATA_MAX + 1] = { * * @ctx: context of the expression using the data * @data: destination struct nft_data - * @size: maximum data length * @desc: data description * @nla: netlink attribute containing data * @@ -9720,24 +9777,35 @@ static const struct nla_policy nft_data_policy[NFTA_DATA_MAX + 1] = { * The caller can indicate that it only wants to accept data of type * NFT_DATA_VALUE by passing NULL for the ctx argument. 
*/ -int nft_data_init(const struct nft_ctx *ctx, - struct nft_data *data, unsigned int size, +int nft_data_init(const struct nft_ctx *ctx, struct nft_data *data, struct nft_data_desc *desc, const struct nlattr *nla) { struct nlattr *tb[NFTA_DATA_MAX + 1]; int err; + if (WARN_ON_ONCE(!desc->size)) + return -EINVAL; + err = nla_parse_nested_deprecated(tb, NFTA_DATA_MAX, nla, nft_data_policy, NULL); if (err < 0) return err; - if (tb[NFTA_DATA_VALUE]) - return nft_value_init(ctx, data, size, desc, - tb[NFTA_DATA_VALUE]); - if (tb[NFTA_DATA_VERDICT] && ctx != NULL) - return nft_verdict_init(ctx, data, desc, tb[NFTA_DATA_VERDICT]); - return -EINVAL; + if (tb[NFTA_DATA_VALUE]) { + if (desc->type != NFT_DATA_VALUE) + return -EINVAL; + + err = nft_value_init(ctx, data, desc, tb[NFTA_DATA_VALUE]); + } else if (tb[NFTA_DATA_VERDICT] && ctx != NULL) { + if (desc->type != NFT_DATA_VERDICT) + return -EINVAL; + + err = nft_verdict_init(ctx, data, desc, tb[NFTA_DATA_VERDICT]); + } else { + err = -EINVAL; + } + + return err; } EXPORT_SYMBOL_GPL(nft_data_init); diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c index 3ddce24ac76d..cee3e4e905ec 100644 --- a/net/netfilter/nf_tables_core.c +++ b/net/netfilter/nf_tables_core.c @@ -34,25 +34,23 @@ static noinline void __nft_trace_packet(struct nft_traceinfo *info, nft_trace_notify(info); } -static inline void nft_trace_packet(struct nft_traceinfo *info, +static inline void nft_trace_packet(const struct nft_pktinfo *pkt, + struct nft_traceinfo *info, const struct nft_chain *chain, const struct nft_rule_dp *rule, enum nft_trace_types type) { if (static_branch_unlikely(&nft_trace_enabled)) { - const struct nft_pktinfo *pkt = info->pkt; - info->nf_trace = pkt->skb->nf_trace; info->rule = rule; __nft_trace_packet(info, chain, type); } } -static inline void nft_trace_copy_nftrace(struct nft_traceinfo *info) +static inline void nft_trace_copy_nftrace(const struct nft_pktinfo *pkt, + struct nft_traceinfo *info) { if (static_branch_unlikely(&nft_trace_enabled)) { - const struct nft_pktinfo *pkt = info->pkt; - if (info->trace) info->nf_trace = pkt->skb->nf_trace; } @@ -96,7 +94,6 @@ static noinline void __nft_trace_verdict(struct nft_traceinfo *info, const struct nft_chain *chain, const struct nft_regs *regs) { - const struct nft_pktinfo *pkt = info->pkt; enum nft_trace_types type; switch (regs->verdict.code) { @@ -110,7 +107,9 @@ static noinline void __nft_trace_verdict(struct nft_traceinfo *info, break; default: type = NFT_TRACETYPE_RULE; - info->nf_trace = pkt->skb->nf_trace; + + if (info->trace) + info->nf_trace = info->pkt->skb->nf_trace; break; } @@ -271,10 +270,10 @@ next_rule: switch (regs.verdict.code) { case NFT_BREAK: regs.verdict.code = NFT_CONTINUE; - nft_trace_copy_nftrace(&info); + nft_trace_copy_nftrace(pkt, &info); continue; case NFT_CONTINUE: - nft_trace_packet(&info, chain, rule, + nft_trace_packet(pkt, &info, chain, rule, NFT_TRACETYPE_RULE); continue; } @@ -318,7 +317,7 @@ next_rule: goto next_rule; } - nft_trace_packet(&info, basechain, NULL, NFT_TRACETYPE_POLICY); + nft_trace_packet(pkt, &info, basechain, NULL, NFT_TRACETYPE_POLICY); if (static_branch_unlikely(&nft_counters_enabled)) nft_update_chain_stats(basechain, pkt); diff --git a/net/netfilter/nft_bitwise.c b/net/netfilter/nft_bitwise.c index 83590afe3768..e6e402b247d0 100644 --- a/net/netfilter/nft_bitwise.c +++ b/net/netfilter/nft_bitwise.c @@ -93,7 +93,16 @@ static const struct nla_policy nft_bitwise_policy[NFTA_BITWISE_MAX + 1] = { static int 
nft_bitwise_init_bool(struct nft_bitwise *priv, const struct nlattr *const tb[]) { - struct nft_data_desc mask, xor; + struct nft_data_desc mask = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->mask), + .len = priv->len, + }; + struct nft_data_desc xor = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->xor), + .len = priv->len, + }; int err; if (tb[NFTA_BITWISE_DATA]) @@ -103,37 +112,30 @@ static int nft_bitwise_init_bool(struct nft_bitwise *priv, !tb[NFTA_BITWISE_XOR]) return -EINVAL; - err = nft_data_init(NULL, &priv->mask, sizeof(priv->mask), &mask, - tb[NFTA_BITWISE_MASK]); + err = nft_data_init(NULL, &priv->mask, &mask, tb[NFTA_BITWISE_MASK]); if (err < 0) return err; - if (mask.type != NFT_DATA_VALUE || mask.len != priv->len) { - err = -EINVAL; - goto err_mask_release; - } - err = nft_data_init(NULL, &priv->xor, sizeof(priv->xor), &xor, - tb[NFTA_BITWISE_XOR]); + err = nft_data_init(NULL, &priv->xor, &xor, tb[NFTA_BITWISE_XOR]); if (err < 0) - goto err_mask_release; - if (xor.type != NFT_DATA_VALUE || xor.len != priv->len) { - err = -EINVAL; - goto err_xor_release; - } + goto err_xor_err; return 0; -err_xor_release: - nft_data_release(&priv->xor, xor.type); -err_mask_release: +err_xor_err: nft_data_release(&priv->mask, mask.type); + return err; } static int nft_bitwise_init_shift(struct nft_bitwise *priv, const struct nlattr *const tb[]) { - struct nft_data_desc d; + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->data), + .len = sizeof(u32), + }; int err; if (tb[NFTA_BITWISE_MASK] || @@ -143,13 +145,12 @@ static int nft_bitwise_init_shift(struct nft_bitwise *priv, if (!tb[NFTA_BITWISE_DATA]) return -EINVAL; - err = nft_data_init(NULL, &priv->data, sizeof(priv->data), &d, - tb[NFTA_BITWISE_DATA]); + err = nft_data_init(NULL, &priv->data, &desc, tb[NFTA_BITWISE_DATA]); if (err < 0) return err; - if (d.type != NFT_DATA_VALUE || d.len != sizeof(u32) || - priv->data.data[0] >= BITS_PER_TYPE(u32)) { - nft_data_release(&priv->data, d.type); + + if (priv->data.data[0] >= BITS_PER_TYPE(u32)) { + nft_data_release(&priv->data, desc.type); return -EINVAL; } @@ -339,22 +340,21 @@ static const struct nft_expr_ops nft_bitwise_ops = { static int nft_bitwise_extract_u32_data(const struct nlattr * const tb, u32 *out) { - struct nft_data_desc desc; struct nft_data data; - int err = 0; + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = sizeof(data), + .len = sizeof(u32), + }; + int err; - err = nft_data_init(NULL, &data, sizeof(data), &desc, tb); + err = nft_data_init(NULL, &data, &desc, tb); if (err < 0) return err; - if (desc.type != NFT_DATA_VALUE || desc.len != sizeof(u32)) { - err = -EINVAL; - goto err; - } *out = data.data[0]; -err: - nft_data_release(&data, desc.type); - return err; + + return 0; } static int nft_bitwise_fast_init(const struct nft_ctx *ctx, diff --git a/net/netfilter/nft_cmp.c b/net/netfilter/nft_cmp.c index 777f09e4dc60..963cf831799c 100644 --- a/net/netfilter/nft_cmp.c +++ b/net/netfilter/nft_cmp.c @@ -73,20 +73,16 @@ static int nft_cmp_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) { struct nft_cmp_expr *priv = nft_expr_priv(expr); - struct nft_data_desc desc; + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->data), + }; int err; - err = nft_data_init(NULL, &priv->data, sizeof(priv->data), &desc, - tb[NFTA_CMP_DATA]); + err = nft_data_init(NULL, &priv->data, &desc, tb[NFTA_CMP_DATA]); if (err < 0) return err; - if (desc.type != NFT_DATA_VALUE) { 
- err = -EINVAL; - nft_data_release(&priv->data, desc.type); - return err; - } - err = nft_parse_register_load(tb[NFTA_CMP_SREG], &priv->sreg, desc.len); if (err < 0) return err; @@ -214,12 +210,14 @@ static int nft_cmp_fast_init(const struct nft_ctx *ctx, const struct nlattr * const tb[]) { struct nft_cmp_fast_expr *priv = nft_expr_priv(expr); - struct nft_data_desc desc; struct nft_data data; + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = sizeof(data), + }; int err; - err = nft_data_init(NULL, &data, sizeof(data), &desc, - tb[NFTA_CMP_DATA]); + err = nft_data_init(NULL, &data, &desc, tb[NFTA_CMP_DATA]); if (err < 0) return err; @@ -313,11 +311,13 @@ static int nft_cmp16_fast_init(const struct nft_ctx *ctx, const struct nlattr * const tb[]) { struct nft_cmp16_fast_expr *priv = nft_expr_priv(expr); - struct nft_data_desc desc; + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->data), + }; int err; - err = nft_data_init(NULL, &priv->data, sizeof(priv->data), &desc, - tb[NFTA_CMP_DATA]); + err = nft_data_init(NULL, &priv->data, &desc, tb[NFTA_CMP_DATA]); if (err < 0) return err; @@ -380,8 +380,11 @@ const struct nft_expr_ops nft_cmp16_fast_ops = { static const struct nft_expr_ops * nft_cmp_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) { - struct nft_data_desc desc; struct nft_data data; + struct nft_data_desc desc = { + .type = NFT_DATA_VALUE, + .size = sizeof(data), + }; enum nft_cmp_ops op; u8 sreg; int err; @@ -404,14 +407,10 @@ nft_cmp_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) return ERR_PTR(-EINVAL); } - err = nft_data_init(NULL, &data, sizeof(data), &desc, - tb[NFTA_CMP_DATA]); + err = nft_data_init(NULL, &data, &desc, tb[NFTA_CMP_DATA]); if (err < 0) return ERR_PTR(err); - if (desc.type != NFT_DATA_VALUE) - goto err1; - sreg = ntohl(nla_get_be32(tb[NFTA_CMP_SREG])); if (op == NFT_CMP_EQ || op == NFT_CMP_NEQ) { @@ -423,9 +422,6 @@ nft_cmp_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[]) return &nft_cmp16_fast_ops; } return &nft_cmp_ops; -err1: - nft_data_release(&data, desc.type); - return ERR_PTR(-EINVAL); } struct nft_expr_type nft_cmp_type __read_mostly = { diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c index 22f70b543fa2..6983e6ddeef9 100644 --- a/net/netfilter/nft_dynset.c +++ b/net/netfilter/nft_dynset.c @@ -60,7 +60,7 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr, ®s->data[priv->sreg_key], NULL, ®s->data[priv->sreg_data], timeout, 0, GFP_ATOMIC); - if (elem == NULL) + if (IS_ERR(elem)) goto err1; ext = nft_set_elem_ext(set, elem); diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c index b80f7b507349..5f28b21abc7d 100644 --- a/net/netfilter/nft_immediate.c +++ b/net/netfilter/nft_immediate.c @@ -29,20 +29,36 @@ static const struct nla_policy nft_immediate_policy[NFTA_IMMEDIATE_MAX + 1] = { [NFTA_IMMEDIATE_DATA] = { .type = NLA_NESTED }, }; +static enum nft_data_types nft_reg_to_type(const struct nlattr *nla) +{ + enum nft_data_types type; + u8 reg; + + reg = ntohl(nla_get_be32(nla)); + if (reg == NFT_REG_VERDICT) + type = NFT_DATA_VERDICT; + else + type = NFT_DATA_VALUE; + + return type; +} + static int nft_immediate_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) { struct nft_immediate_expr *priv = nft_expr_priv(expr); - struct nft_data_desc desc; + struct nft_data_desc desc = { + .size = sizeof(priv->data), + }; int err; if 
(tb[NFTA_IMMEDIATE_DREG] == NULL || tb[NFTA_IMMEDIATE_DATA] == NULL) return -EINVAL; - err = nft_data_init(ctx, &priv->data, sizeof(priv->data), &desc, - tb[NFTA_IMMEDIATE_DATA]); + desc.type = nft_reg_to_type(tb[NFTA_IMMEDIATE_DREG]); + err = nft_data_init(ctx, &priv->data, &desc, tb[NFTA_IMMEDIATE_DATA]); if (err < 0) return err; diff --git a/net/netfilter/nft_range.c b/net/netfilter/nft_range.c index 66f77484c227..832f0d725a9e 100644 --- a/net/netfilter/nft_range.c +++ b/net/netfilter/nft_range.c @@ -51,7 +51,14 @@ static int nft_range_init(const struct nft_ctx *ctx, const struct nft_expr *expr const struct nlattr * const tb[]) { struct nft_range_expr *priv = nft_expr_priv(expr); - struct nft_data_desc desc_from, desc_to; + struct nft_data_desc desc_from = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->data_from), + }; + struct nft_data_desc desc_to = { + .type = NFT_DATA_VALUE, + .size = sizeof(priv->data_to), + }; int err; u32 op; @@ -61,26 +68,16 @@ static int nft_range_init(const struct nft_ctx *ctx, const struct nft_expr *expr !tb[NFTA_RANGE_TO_DATA]) return -EINVAL; - err = nft_data_init(NULL, &priv->data_from, sizeof(priv->data_from), - &desc_from, tb[NFTA_RANGE_FROM_DATA]); + err = nft_data_init(NULL, &priv->data_from, &desc_from, + tb[NFTA_RANGE_FROM_DATA]); if (err < 0) return err; - if (desc_from.type != NFT_DATA_VALUE) { - err = -EINVAL; - goto err1; - } - - err = nft_data_init(NULL, &priv->data_to, sizeof(priv->data_to), - &desc_to, tb[NFTA_RANGE_TO_DATA]); + err = nft_data_init(NULL, &priv->data_to, &desc_to, + tb[NFTA_RANGE_TO_DATA]); if (err < 0) goto err1; - if (desc_to.type != NFT_DATA_VALUE) { - err = -EINVAL; - goto err2; - } - if (desc_from.len != desc_to.len) { err = -EINVAL; goto err2; diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c index 8490e46359ae..0555dffd80e0 100644 --- a/net/netlabel/netlabel_unlabeled.c +++ b/net/netlabel/netlabel_unlabeled.c @@ -885,7 +885,7 @@ static int netlbl_unlabel_staticadd(struct sk_buff *skb, /* Don't allow users to add both IPv4 and IPv6 addresses for a * single entry. However, allow users to create two entries, one each - * for IPv4 and IPv4, with the same LSM security context which should + * for IPv4 and IPv6, with the same LSM security context which should * achieve the same result. 
*/ if (!info->attrs[NLBL_UNLABEL_A_SECCTX] || !info->attrs[NLBL_UNLABEL_A_IFACE] || diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c index a35ab8c27866..3f935cbbaff6 100644 --- a/net/sched/cls_route.c +++ b/net/sched/cls_route.c @@ -526,7 +526,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb, rcu_assign_pointer(f->next, f1); rcu_assign_pointer(*fp, f); - if (fold && fold->handle && f->handle != fold->handle) { + if (fold) { th = to_hash(fold->handle); h = from_hash(fold->handle >> 16); b = rtnl_dereference(head->table[th]); diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index cc6eabee2830..d47b9689eba6 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -427,14 +427,10 @@ void __qdisc_run(struct Qdisc *q) unsigned long dev_trans_start(struct net_device *dev) { - unsigned long val, res; + unsigned long res = READ_ONCE(netdev_get_tx_queue(dev, 0)->trans_start); + unsigned long val; unsigned int i; - if (is_vlan_dev(dev)) - dev = vlan_dev_real_dev(dev); - else if (netif_is_macvlan(dev)) - dev = macvlan_dev_real_dev(dev); - res = READ_ONCE(netdev_get_tx_queue(dev, 0)->trans_start); for (i = 1; i < dev->num_tx_queues; i++) { val = READ_ONCE(netdev_get_tx_queue(dev, i)->trans_start); if (val && time_after(val, res)) diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index e3e6cf75aa03..0f983e5f7dde 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -71,7 +71,13 @@ static void tls_device_tx_del_task(struct work_struct *work) struct tls_offload_context_tx *offload_ctx = container_of(work, struct tls_offload_context_tx, destruct_work); struct tls_context *ctx = offload_ctx->ctx; - struct net_device *netdev = ctx->netdev; + struct net_device *netdev; + + /* Safe, because this is the destroy flow, refcount is 0, so + * tls_device_down can't store this field in parallel. + */ + netdev = rcu_dereference_protected(ctx->netdev, + !refcount_read(&ctx->refcount)); netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_TX); dev_put(netdev); @@ -81,6 +87,7 @@ static void tls_device_tx_del_task(struct work_struct *work) static void tls_device_queue_ctx_destruction(struct tls_context *ctx) { + struct net_device *netdev; unsigned long flags; bool async_cleanup; @@ -91,7 +98,14 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx) } list_del(&ctx->list); /* Remove from tls_device_list / tls_device_down_list */ - async_cleanup = ctx->netdev && ctx->tx_conf == TLS_HW; + + /* Safe, because this is the destroy flow, refcount is 0, so + * tls_device_down can't store this field in parallel. 
+ */ + netdev = rcu_dereference_protected(ctx->netdev, + !refcount_read(&ctx->refcount)); + + async_cleanup = netdev && ctx->tx_conf == TLS_HW; if (async_cleanup) { struct tls_offload_context_tx *offload_ctx = tls_offload_ctx_tx(ctx); @@ -229,7 +243,8 @@ static void tls_device_resync_tx(struct sock *sk, struct tls_context *tls_ctx, trace_tls_device_tx_resync_send(sk, seq, rcd_sn); down_read(&device_offload_lock); - netdev = tls_ctx->netdev; + netdev = rcu_dereference_protected(tls_ctx->netdev, + lockdep_is_held(&device_offload_lock)); if (netdev) err = netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn, @@ -710,7 +725,7 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx, trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type); rcu_read_lock(); - netdev = READ_ONCE(tls_ctx->netdev); + netdev = rcu_dereference(tls_ctx->netdev); if (netdev) netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn, TLS_OFFLOAD_CTX_DIR_RX); @@ -984,11 +999,17 @@ int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx) int is_decrypted = skb->decrypted; int is_encrypted = !is_decrypted; struct sk_buff *skb_iter; + int left; + left = rxm->full_len - skb->len; /* Check if all the data is decrypted already */ - skb_walk_frags(skb, skb_iter) { + skb_iter = skb_shinfo(skb)->frag_list; + while (skb_iter && left > 0) { is_decrypted &= skb_iter->decrypted; is_encrypted &= !skb_iter->decrypted; + + left -= skb_iter->len; + skb_iter = skb_iter->next; } trace_tls_device_decrypted(sk, tcp_sk(sk)->copied_seq - rxm->full_len, @@ -1029,7 +1050,7 @@ static void tls_device_attach(struct tls_context *ctx, struct sock *sk, if (sk->sk_destruct != tls_device_sk_destruct) { refcount_set(&ctx->refcount, 1); dev_hold(netdev); - ctx->netdev = netdev; + RCU_INIT_POINTER(ctx->netdev, netdev); spin_lock_irq(&tls_device_lock); list_add_tail(&ctx->list, &tls_device_list); spin_unlock_irq(&tls_device_lock); @@ -1300,7 +1321,8 @@ void tls_device_offload_cleanup_rx(struct sock *sk) struct net_device *netdev; down_read(&device_offload_lock); - netdev = tls_ctx->netdev; + netdev = rcu_dereference_protected(tls_ctx->netdev, + lockdep_is_held(&device_offload_lock)); if (!netdev) goto out; @@ -1309,7 +1331,7 @@ void tls_device_offload_cleanup_rx(struct sock *sk) if (tls_ctx->tx_conf != TLS_HW) { dev_put(netdev); - tls_ctx->netdev = NULL; + rcu_assign_pointer(tls_ctx->netdev, NULL); } else { set_bit(TLS_RX_DEV_CLOSED, &tls_ctx->flags); } @@ -1329,7 +1351,11 @@ static int tls_device_down(struct net_device *netdev) spin_lock_irqsave(&tls_device_lock, flags); list_for_each_entry_safe(ctx, tmp, &tls_device_list, list) { - if (ctx->netdev != netdev || + struct net_device *ctx_netdev = + rcu_dereference_protected(ctx->netdev, + lockdep_is_held(&device_offload_lock)); + + if (ctx_netdev != netdev || !refcount_inc_not_zero(&ctx->refcount)) continue; @@ -1346,7 +1372,7 @@ static int tls_device_down(struct net_device *netdev) /* Stop the RX and TX resync. * tls_dev_resync must not be called after tls_dev_del. */ - WRITE_ONCE(ctx->netdev, NULL); + rcu_assign_pointer(ctx->netdev, NULL); /* Start skipping the RX resync logic completely. 
*/ set_bit(TLS_RX_DEV_DEGRADED, &ctx->flags); diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c index 618cee704217..7dfc8023e0f1 100644 --- a/net/tls/tls_device_fallback.c +++ b/net/tls/tls_device_fallback.c @@ -426,7 +426,8 @@ struct sk_buff *tls_validate_xmit_skb(struct sock *sk, struct net_device *dev, struct sk_buff *skb) { - if (dev == tls_get_ctx(sk)->netdev || netif_is_bond_master(dev)) + if (dev == rcu_dereference_bh(tls_get_ctx(sk)->netdev) || + netif_is_bond_master(dev)) return skb; return tls_sw_fallback(sk, skb); diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c index f0b7c9122fba..9b79e334dbd9 100644 --- a/net/tls/tls_strp.c +++ b/net/tls/tls_strp.c @@ -41,7 +41,7 @@ static struct sk_buff *tls_strp_msg_make_copy(struct tls_strparser *strp) struct sk_buff *skb; int i, err, offset; - skb = alloc_skb_with_frags(0, strp->anchor->len, TLS_PAGE_ORDER, + skb = alloc_skb_with_frags(0, strp->stm.full_len, TLS_PAGE_ORDER, &err, strp->sk->sk_allocation); if (!skb) return NULL; diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f04abf662ec6..b4ee163154a6 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1286,6 +1286,7 @@ static void vsock_connect_timeout(struct work_struct *work) if (sk->sk_state == TCP_SYN_SENT && (sk->sk_shutdown != SHUTDOWN_MASK)) { sk->sk_state = TCP_CLOSE; + sk->sk_socket->state = SS_UNCONNECTED; sk->sk_err = ETIMEDOUT; sk_error_report(sk); vsock_transport_cancel_pkt(vsk); @@ -1391,7 +1392,14 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr, * timeout fires. */ sock_hold(sk); - schedule_delayed_work(&vsk->connect_work, timeout); + + /* If the timeout function is already scheduled, + * reschedule it, then ungrab the socket refcount to + * keep it balanced. + */ + if (mod_delayed_work(system_wq, &vsk->connect_work, + timeout)) + sock_put(sk); /* Skip ahead to preserve error code set above. 
*/ goto out_wait; diff --git a/net/wireless/sme.c b/net/wireless/sme.c index 62c773cf1b8d..27fb2a0c4052 100644 --- a/net/wireless/sme.c +++ b/net/wireless/sme.c @@ -782,9 +782,11 @@ void __cfg80211_connect_result(struct net_device *dev, #endif if (cr->status == WLAN_STATUS_SUCCESS) { - for_each_valid_link(cr, link) { - if (WARN_ON_ONCE(!cr->links[link].bss)) - break; + if (!wiphy_to_rdev(wdev->wiphy)->ops->connect) { + for_each_valid_link(cr, link) { + if (WARN_ON_ONCE(!cr->links[link].bss)) + break; + } } for_each_valid_link(cr, link) { diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c index 6bc2ac8d8146..3b55502b2965 100644 --- a/net/x25/af_x25.c +++ b/net/x25/af_x25.c @@ -719,6 +719,11 @@ static int x25_wait_for_connection_establishment(struct sock *sk) sk->sk_socket->state = SS_UNCONNECTED; break; } + rc = -ENOTCONN; + if (sk->sk_state == TCP_CLOSE) { + sk->sk_socket->state = SS_UNCONNECTED; + break; + } rc = 0; if (sk->sk_state != TCP_ESTABLISHED) { release_sock(sk); diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h index bd6f4505e7b1..70adf7b119b9 100644 --- a/tools/lib/bpf/skel_internal.h +++ b/tools/lib/bpf/skel_internal.h @@ -66,13 +66,13 @@ struct bpf_load_and_run_opts { const char *errstr; }; -long bpf_sys_bpf(__u32 cmd, void *attr, __u32 attr_size); +long kern_sys_bpf(__u32 cmd, void *attr, __u32 attr_size); static inline int skel_sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size) { #ifdef __KERNEL__ - return bpf_sys_bpf(cmd, attr, size); + return kern_sys_bpf(cmd, attr, size); #else return syscall(__NR_bpf, cmd, attr, size); #endif diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index a33874b081b6..e89685bd587c 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -28,6 +28,7 @@ #include "bpf_iter_test_kern6.skel.h" #include "bpf_iter_bpf_link.skel.h" #include "bpf_iter_ksym.skel.h" +#include "bpf_iter_sockmap.skel.h" static int duration; @@ -67,6 +68,50 @@ free_link: bpf_link__destroy(link); } +static void do_read_map_iter_fd(struct bpf_object_skeleton **skel, struct bpf_program *prog, + struct bpf_map *map) +{ + DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts); + union bpf_iter_link_info linfo; + struct bpf_link *link; + char buf[16] = {}; + int iter_fd, len; + + memset(&linfo, 0, sizeof(linfo)); + linfo.map.map_fd = bpf_map__fd(map); + opts.link_info = &linfo; + opts.link_info_len = sizeof(linfo); + link = bpf_program__attach_iter(prog, &opts); + if (!ASSERT_OK_PTR(link, "attach_map_iter")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (!ASSERT_GE(iter_fd, 0, "create_map_iter")) { + bpf_link__destroy(link); + return; + } + + /* Close link and map fd prematurely */ + bpf_link__destroy(link); + bpf_object__destroy_skeleton(*skel); + *skel = NULL; + + /* Try to let map free work to run first if map is freed */ + usleep(100); + /* Memory used by both sock map and sock local storage map are + * freed after two synchronize_rcu() calls, so wait for it + */ + kern_sync_rcu(); + kern_sync_rcu(); + + /* Read after both map fd and link fd are closed */ + while ((len = read(iter_fd, buf, sizeof(buf))) > 0) + ; + ASSERT_GE(len, 0, "read_iterator"); + + close(iter_fd); +} + static int read_fd_into_buffer(int fd, char *buf, int size) { int bufleft = size; @@ -634,6 +679,12 @@ static void test_bpf_hash_map(void) goto out; } + /* Sleepable program is prohibited for hash map iterator */ + 
linfo.map.map_fd = map_fd; + link = bpf_program__attach_iter(skel->progs.sleepable_dummy_dump, &opts); + if (!ASSERT_ERR_PTR(link, "attach_sleepable_prog_to_iter")) + goto out; + linfo.map.map_fd = map_fd; link = bpf_program__attach_iter(skel->progs.dump_bpf_hash_map, &opts); if (!ASSERT_OK_PTR(link, "attach_iter")) @@ -827,6 +878,20 @@ out: bpf_iter_bpf_array_map__destroy(skel); } +static void test_bpf_array_map_iter_fd(void) +{ + struct bpf_iter_bpf_array_map *skel; + + skel = bpf_iter_bpf_array_map__open_and_load(); + if (!ASSERT_OK_PTR(skel, "bpf_iter_bpf_array_map__open_and_load")) + return; + + do_read_map_iter_fd(&skel->skeleton, skel->progs.dump_bpf_array_map, + skel->maps.arraymap1); + + bpf_iter_bpf_array_map__destroy(skel); +} + static void test_bpf_percpu_array_map(void) { DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts); @@ -1009,6 +1074,20 @@ out: bpf_iter_bpf_sk_storage_helpers__destroy(skel); } +static void test_bpf_sk_stoarge_map_iter_fd(void) +{ + struct bpf_iter_bpf_sk_storage_map *skel; + + skel = bpf_iter_bpf_sk_storage_map__open_and_load(); + if (!ASSERT_OK_PTR(skel, "bpf_iter_bpf_sk_storage_map__open_and_load")) + return; + + do_read_map_iter_fd(&skel->skeleton, skel->progs.rw_bpf_sk_storage_map, + skel->maps.sk_stg_map); + + bpf_iter_bpf_sk_storage_map__destroy(skel); +} + static void test_bpf_sk_storage_map(void) { DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts); @@ -1044,7 +1123,15 @@ static void test_bpf_sk_storage_map(void) linfo.map.map_fd = map_fd; opts.link_info = &linfo; opts.link_info_len = sizeof(linfo); - link = bpf_program__attach_iter(skel->progs.dump_bpf_sk_storage_map, &opts); + link = bpf_program__attach_iter(skel->progs.oob_write_bpf_sk_storage_map, &opts); + err = libbpf_get_error(link); + if (!ASSERT_EQ(err, -EACCES, "attach_oob_write_iter")) { + if (!err) + bpf_link__destroy(link); + goto out; + } + + link = bpf_program__attach_iter(skel->progs.rw_bpf_sk_storage_map, &opts); if (!ASSERT_OK_PTR(link, "attach_iter")) goto out; @@ -1052,6 +1139,7 @@ static void test_bpf_sk_storage_map(void) if (!ASSERT_GE(iter_fd, 0, "create_iter")) goto free_link; + skel->bss->to_add_val = time(NULL); /* do some tests */ while ((len = read(iter_fd, buf, sizeof(buf))) > 0) ; @@ -1065,6 +1153,13 @@ static void test_bpf_sk_storage_map(void) if (!ASSERT_EQ(skel->bss->val_sum, expected_val, "val_sum")) goto close_iter; + for (i = 0; i < num_sockets; i++) { + err = bpf_map_lookup_elem(map_fd, &sock_fd[i], &val); + if (!ASSERT_OK(err, "map_lookup") || + !ASSERT_EQ(val, i + 1 + skel->bss->to_add_val, "check_map_value")) + break; + } + close_iter: close(iter_fd); free_link: @@ -1217,6 +1312,19 @@ out: bpf_iter_task_vma__destroy(skel); } +void test_bpf_sockmap_map_iter_fd(void) +{ + struct bpf_iter_sockmap *skel; + + skel = bpf_iter_sockmap__open_and_load(); + if (!ASSERT_OK_PTR(skel, "bpf_iter_sockmap__open_and_load")) + return; + + do_read_map_iter_fd(&skel->skeleton, skel->progs.copy, skel->maps.sockmap); + + bpf_iter_sockmap__destroy(skel); +} + void test_bpf_iter(void) { if (test__start_subtest("btf_id_or_null")) @@ -1267,10 +1375,14 @@ void test_bpf_iter(void) test_bpf_percpu_hash_map(); if (test__start_subtest("bpf_array_map")) test_bpf_array_map(); + if (test__start_subtest("bpf_array_map_iter_fd")) + test_bpf_array_map_iter_fd(); if (test__start_subtest("bpf_percpu_array_map")) test_bpf_percpu_array_map(); if (test__start_subtest("bpf_sk_storage_map")) test_bpf_sk_storage_map(); + if (test__start_subtest("bpf_sk_storage_map_iter_fd")) + 
test_bpf_sk_stoarge_map_iter_fd(); if (test__start_subtest("bpf_sk_storage_delete")) test_bpf_sk_storage_delete(); if (test__start_subtest("bpf_sk_storage_get")) @@ -1283,4 +1395,6 @@ void test_bpf_iter(void) test_link_iter(); if (test__start_subtest("ksym")) test_ksym_iter(); + if (test__start_subtest("bpf_sockmap_map_iter_fd")) + test_bpf_sockmap_map_iter_fd(); } diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index 02bb8cbf9194..da860b07abb5 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -3,6 +3,7 @@ #include <test_progs.h> #include <network_helpers.h> #include <bpf/btf.h> +#include "bind4_prog.skel.h" typedef int (*test_cb)(struct bpf_object *obj); @@ -407,6 +408,98 @@ static void test_func_replace_global_func(void) prog_name, false, NULL); } +static int find_prog_btf_id(const char *name, __u32 attach_prog_fd) +{ + struct bpf_prog_info info = {}; + __u32 info_len = sizeof(info); + struct btf *btf; + int ret; + + ret = bpf_obj_get_info_by_fd(attach_prog_fd, &info, &info_len); + if (ret) + return ret; + + if (!info.btf_id) + return -EINVAL; + + btf = btf__load_from_kernel_by_id(info.btf_id); + ret = libbpf_get_error(btf); + if (ret) + return ret; + + ret = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC); + btf__free(btf); + return ret; +} + +static int load_fentry(int attach_prog_fd, int attach_btf_id) +{ + LIBBPF_OPTS(bpf_prog_load_opts, opts, + .expected_attach_type = BPF_TRACE_FENTRY, + .attach_prog_fd = attach_prog_fd, + .attach_btf_id = attach_btf_id, + ); + struct bpf_insn insns[] = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }; + + return bpf_prog_load(BPF_PROG_TYPE_TRACING, + "bind4_fentry", + "GPL", + insns, + ARRAY_SIZE(insns), + &opts); +} + +static void test_fentry_to_cgroup_bpf(void) +{ + struct bind4_prog *skel = NULL; + struct bpf_prog_info info = {}; + __u32 info_len = sizeof(info); + int cgroup_fd = -1; + int fentry_fd = -1; + int btf_id; + + cgroup_fd = test__join_cgroup("/fentry_to_cgroup_bpf"); + if (!ASSERT_GE(cgroup_fd, 0, "cgroup_fd")) + return; + + skel = bind4_prog__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel")) + goto cleanup; + + skel->links.bind_v4_prog = bpf_program__attach_cgroup(skel->progs.bind_v4_prog, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.bind_v4_prog, "bpf_program__attach_cgroup")) + goto cleanup; + + btf_id = find_prog_btf_id("bind_v4_prog", bpf_program__fd(skel->progs.bind_v4_prog)); + if (!ASSERT_GE(btf_id, 0, "find_prog_btf_id")) + goto cleanup; + + fentry_fd = load_fentry(bpf_program__fd(skel->progs.bind_v4_prog), btf_id); + if (!ASSERT_GE(fentry_fd, 0, "load_fentry")) + goto cleanup; + + /* Make sure bpf_obj_get_info_by_fd works correctly when attaching + * to another BPF program. 
+ */ + + ASSERT_OK(bpf_obj_get_info_by_fd(fentry_fd, &info, &info_len), + "bpf_obj_get_info_by_fd"); + + ASSERT_EQ(info.btf_id, 0, "info.btf_id"); + ASSERT_EQ(info.attach_btf_id, btf_id, "info.attach_btf_id"); + ASSERT_GT(info.attach_btf_obj_id, 0, "info.attach_btf_obj_id"); + +cleanup: + if (cgroup_fd >= 0) + close(cgroup_fd); + if (fentry_fd >= 0) + close(fentry_fd); + bind4_prog__destroy(skel); +} + /* NOTE: affect other tests, must run in serial mode */ void serial_test_fexit_bpf2bpf(void) { @@ -430,4 +523,6 @@ void serial_test_fexit_bpf2bpf(void) test_fmod_ret_freplace(); if (test__start_subtest("func_replace_global_func")) test_func_replace_global_func(); + if (test__start_subtest("fentry_to_cgroup_bpf")) + test_fentry_to_cgroup_bpf(); } diff --git a/tools/testing/selftests/bpf/prog_tests/lru_bug.c b/tools/testing/selftests/bpf/prog_tests/lru_bug.c new file mode 100644 index 000000000000..3c7822390827 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/lru_bug.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> + +#include "lru_bug.skel.h" + +void test_lru_bug(void) +{ + struct lru_bug *skel; + int ret; + + skel = lru_bug__open_and_load(); + if (!ASSERT_OK_PTR(skel, "lru_bug__open_and_load")) + return; + ret = lru_bug__attach(skel); + if (!ASSERT_OK(ret, "lru_bug__attach")) + goto end; + usleep(1); + ASSERT_OK(skel->data->result, "prealloc_lru_pop doesn't call check_and_init_map_value"); +end: + lru_bug__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c index 0aa3cd34cbe3..d7a69217fb68 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_hash_map.c @@ -112,3 +112,12 @@ int dump_bpf_hash_map(struct bpf_iter__bpf_map_elem *ctx) return 0; } + +SEC("iter.s/bpf_map_elem") +int sleepable_dummy_dump(struct bpf_iter__bpf_map_elem *ctx) +{ + if (ctx->meta->seq_num == 0) + BPF_SEQ_PRINTF(ctx->meta->seq, "map dump starts\n"); + + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_sk_storage_map.c b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_sk_storage_map.c index 6b70ccaba301..c7b8e006b171 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_bpf_sk_storage_map.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_bpf_sk_storage_map.c @@ -16,19 +16,37 @@ struct { __u32 val_sum = 0; __u32 ipv6_sk_count = 0; +__u32 to_add_val = 0; SEC("iter/bpf_sk_storage_map") -int dump_bpf_sk_storage_map(struct bpf_iter__bpf_sk_storage_map *ctx) +int rw_bpf_sk_storage_map(struct bpf_iter__bpf_sk_storage_map *ctx) { struct sock *sk = ctx->sk; __u32 *val = ctx->value; - if (sk == (void *)0 || val == (void *)0) + if (sk == NULL || val == NULL) return 0; if (sk->sk_family == AF_INET6) ipv6_sk_count++; val_sum += *val; + + *val += to_add_val; + + return 0; +} + +SEC("iter/bpf_sk_storage_map") +int oob_write_bpf_sk_storage_map(struct bpf_iter__bpf_sk_storage_map *ctx) +{ + struct sock *sk = ctx->sk; + __u32 *val = ctx->value; + + if (sk == NULL || val == NULL) + return 0; + + *(val + 1) = 0xdeadbeef; + return 0; } diff --git a/tools/testing/selftests/bpf/progs/lru_bug.c b/tools/testing/selftests/bpf/progs/lru_bug.c new file mode 100644 index 000000000000..687081a724b3 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/lru_bug.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include <bpf/bpf_helpers.h> + +struct map_value { + 
struct task_struct __kptr *ptr; +}; + +struct { + __uint(type, BPF_MAP_TYPE_LRU_HASH); + __uint(max_entries, 1); + __type(key, int); + __type(value, struct map_value); +} lru_map SEC(".maps"); + +int pid = 0; +int result = 1; + +SEC("fentry/bpf_ktime_get_ns") +int printk(void *ctx) +{ + struct map_value v = {}; + + if (pid == bpf_get_current_task_btf()->pid) + bpf_map_update_elem(&lru_map, &(int){0}, &v, 0); + return 0; +} + +SEC("fentry/do_nanosleep") +int nanosleep(void *ctx) +{ + struct map_value val = {}, *v; + struct task_struct *current; + + bpf_map_update_elem(&lru_map, &(int){0}, &val, 0); + v = bpf_map_lookup_elem(&lru_map, &(int){0}); + if (!v) + return 0; + bpf_map_delete_elem(&lru_map, &(int){0}); + current = bpf_get_current_task_btf(); + v->ptr = current; + pid = current->pid; + bpf_ktime_get_ns(); + result = !v->ptr; + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 892306bdb47d..0e5751af6247 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -38,4 +38,5 @@ ioam6_parser toeplitz tun cmsg_sender -unix_connect
\ No newline at end of file
+unix_connect
+tap
\ No newline at end of file diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index e2dfef8b78a7..c0ee2955fe54 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -57,7 +57,7 @@ TEST_GEN_FILES += ipsec TEST_GEN_FILES += ioam6_parser TEST_GEN_FILES += gro TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa -TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls tun +TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls tun tap TEST_GEN_FILES += toeplitz TEST_GEN_FILES += cmsg_sender TEST_GEN_FILES += stress_reuseport_listen diff --git a/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh index a15d21dc035a..56eb83d1a3bd 100755 --- a/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh +++ b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh @@ -181,37 +181,43 @@ ping_ipv6() send_src_ipv4() { - $MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_dst_ipv4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_src_udp4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B 203.0.113.2 \ -d 1msec -t udp "sp=0-32768,dp=30000" } send_dst_udp4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B 203.0.113.2 \ -d 1msec -t udp "sp=20000,dp=0-32768" } send_src_ipv6() { - $MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:4::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:4::2 \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_dst_ipv6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:4::2-2001:db8:4::fd" \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B "2001:db8:4::2-2001:db8:4::fd" \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } @@ -226,13 +232,15 @@ send_flowlabel() send_src_udp6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:4::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B 2001:db8:4::2 \ -d 1msec -t udp "sp=0-32768,dp=30000" } send_dst_udp6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:4::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B 2001:db8:4::2 \ -d 1msec -t udp "sp=20000,dp=0-32768" } diff --git a/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh index a73f52efcb6c..0446db9c6f74 100755 --- a/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh +++ b/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh @@ -276,37 +276,43 @@ ping_ipv6() send_src_ipv4() { - $MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_dst_ipv4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_src_udp4() { - $MZ $h1 -q -p 64 -A 
198.51.100.2 -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B 203.0.113.2 \ -d 1msec -t udp "sp=0-32768,dp=30000" } send_dst_udp4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B 203.0.113.2 \ -d 1msec -t udp "sp=20000,dp=0-32768" } send_src_ipv6() { - $MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_dst_ipv6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } @@ -321,13 +327,15 @@ send_flowlabel() send_src_udp6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B 2001:db8:2::2 \ -d 1msec -t udp "sp=0-32768,dp=30000" } send_dst_udp6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B 2001:db8:2::2 \ -d 1msec -t udp "sp=20000,dp=0-32768" } diff --git a/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh index 8fea2c2e0b25..d40183b4eccc 100755 --- a/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh +++ b/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh @@ -278,37 +278,43 @@ ping_ipv6() send_src_ipv4() { - $MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_dst_ipv4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_src_udp4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B 203.0.113.2 \ -d 1msec -t udp "sp=0-32768,dp=30000" } send_dst_udp4() { - $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + ip vrf exec v$h1 $MZ $h1 -q -p 64 \ + -A 198.51.100.2 -B 203.0.113.2 \ -d 1msec -t udp "sp=20000,dp=0-32768" } send_src_ipv6() { - $MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } send_dst_ipv6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \ -d 1msec -c 50 -t udp "sp=20000,dp=30000" } @@ -323,13 +329,15 @@ send_flowlabel() send_src_udp6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B 2001:db8:2::2 \ -d 1msec -t udp "sp=0-32768,dp=30000" } send_dst_udp6() { - $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + ip vrf exec v$h1 $MZ -6 $h1 -q -p 64 \ + -A 2001:db8:1::2 -B 2001:db8:2::2 \ -d 1msec -t udp "sp=20000,dp=0-32768" } diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index e2ea6c126c99..24d4e9cb617e 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ 
b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -553,6 +553,18 @@ static void set_nonblock(int fd, bool nonblock) fcntl(fd, F_SETFL, flags & ~O_NONBLOCK); } +static void shut_wr(int fd) +{ + /* Close our write side, ev. give some time + * for address notification and/or checking + * the current status + */ + if (cfg_wait) + usleep(cfg_wait); + + shutdown(fd, SHUT_WR); +} + static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after_out) { struct pollfd fds = { @@ -630,14 +642,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after /* ... and peer also closed already */ break; - /* ... but we still receive. - * Close our write side, ev. give some time - * for address notification and/or checking - * the current status - */ - if (cfg_wait) - usleep(cfg_wait); - shutdown(peerfd, SHUT_WR); + shut_wr(peerfd); } else { if (errno == EINTR) continue; @@ -767,7 +772,7 @@ static int copyfd_io_mmap(int infd, int peerfd, int outfd, if (err) return err; - shutdown(peerfd, SHUT_WR); + shut_wr(peerfd); err = do_recvfile(peerfd, outfd); *in_closed_after_out = true; @@ -791,6 +796,9 @@ static int copyfd_io_sendfile(int infd, int peerfd, int outfd, err = do_sendfile(infd, peerfd, size); if (err) return err; + + shut_wr(peerfd); + err = do_recvfile(peerfd, outfd); *in_closed_after_out = true; } diff --git a/tools/testing/selftests/net/tap.c b/tools/testing/selftests/net/tap.c new file mode 100644 index 000000000000..247c3b3ac1c9 --- /dev/null +++ b/tools/testing/selftests/net/tap.c @@ -0,0 +1,434 @@ +// SPDX-License-Identifier: GPL-2.0 + +#define _GNU_SOURCE + +#include <errno.h> +#include <fcntl.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <net/if.h> +#include <linux/if_tun.h> +#include <linux/netlink.h> +#include <linux/rtnetlink.h> +#include <sys/ioctl.h> +#include <sys/socket.h> +#include <linux/virtio_net.h> +#include <netinet/ip.h> +#include <netinet/udp.h> +#include "../kselftest_harness.h" + +static const char param_dev_tap_name[] = "xmacvtap0"; +static const char param_dev_dummy_name[] = "xdummy0"; +static unsigned char param_hwaddr_src[] = { 0x00, 0xfe, 0x98, 0x14, 0x22, 0x42 }; +static unsigned char param_hwaddr_dest[] = { + 0x00, 0xfe, 0x98, 0x94, 0xd2, 0x43 +}; + +#define MAX_RTNL_PAYLOAD (2048) +#define PKT_DATA 0xCB +#define TEST_PACKET_SZ (sizeof(struct virtio_net_hdr) + ETH_HLEN + ETH_MAX_MTU) + +static struct rtattr *rtattr_add(struct nlmsghdr *nh, unsigned short type, + unsigned short len) +{ + struct rtattr *rta = + (struct rtattr *)((uint8_t *)nh + RTA_ALIGN(nh->nlmsg_len)); + rta->rta_type = type; + rta->rta_len = RTA_LENGTH(len); + nh->nlmsg_len = RTA_ALIGN(nh->nlmsg_len) + RTA_ALIGN(rta->rta_len); + return rta; +} + +static struct rtattr *rtattr_begin(struct nlmsghdr *nh, unsigned short type) +{ + return rtattr_add(nh, type, 0); +} + +static void rtattr_end(struct nlmsghdr *nh, struct rtattr *attr) +{ + uint8_t *end = (uint8_t *)nh + nh->nlmsg_len; + + attr->rta_len = end - (uint8_t *)attr; +} + +static struct rtattr *rtattr_add_str(struct nlmsghdr *nh, unsigned short type, + const char *s) +{ + struct rtattr *rta = rtattr_add(nh, type, strlen(s)); + + memcpy(RTA_DATA(rta), s, strlen(s)); + return rta; +} + +static struct rtattr *rtattr_add_strsz(struct nlmsghdr *nh, unsigned short type, + const char *s) +{ + struct rtattr *rta = rtattr_add(nh, type, strlen(s) + 1); + + strcpy(RTA_DATA(rta), s); + return rta; +} + +static struct rtattr *rtattr_add_any(struct nlmsghdr *nh, 
unsigned short type, + const void *arr, size_t len) +{ + struct rtattr *rta = rtattr_add(nh, type, len); + + memcpy(RTA_DATA(rta), arr, len); + return rta; +} + +static int dev_create(const char *dev, const char *link_type, + int (*fill_rtattr)(struct nlmsghdr *nh), + int (*fill_info_data)(struct nlmsghdr *nh)) +{ + struct { + struct nlmsghdr nh; + struct ifinfomsg info; + unsigned char data[MAX_RTNL_PAYLOAD]; + } req; + struct rtattr *link_info, *info_data; + int ret, rtnl; + + rtnl = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE); + if (rtnl < 0) { + fprintf(stderr, "%s: socket %s\n", __func__, strerror(errno)); + return 1; + } + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(req.info)); + req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE; + req.nh.nlmsg_type = RTM_NEWLINK; + + req.info.ifi_family = AF_UNSPEC; + req.info.ifi_type = 1; + req.info.ifi_index = 0; + req.info.ifi_flags = IFF_BROADCAST | IFF_UP; + req.info.ifi_change = 0xffffffff; + + rtattr_add_str(&req.nh, IFLA_IFNAME, dev); + + if (fill_rtattr) { + ret = fill_rtattr(&req.nh); + if (ret) + return ret; + } + + link_info = rtattr_begin(&req.nh, IFLA_LINKINFO); + + rtattr_add_strsz(&req.nh, IFLA_INFO_KIND, link_type); + + if (fill_info_data) { + info_data = rtattr_begin(&req.nh, IFLA_INFO_DATA); + ret = fill_info_data(&req.nh); + if (ret) + return ret; + rtattr_end(&req.nh, info_data); + } + + rtattr_end(&req.nh, link_info); + + ret = send(rtnl, &req, req.nh.nlmsg_len, 0); + if (ret < 0) + fprintf(stderr, "%s: send %s\n", __func__, strerror(errno)); + ret = (unsigned int)ret != req.nh.nlmsg_len; + + close(rtnl); + return ret; +} + +static int dev_delete(const char *dev) +{ + struct { + struct nlmsghdr nh; + struct ifinfomsg info; + unsigned char data[MAX_RTNL_PAYLOAD]; + } req; + int ret, rtnl; + + rtnl = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE); + if (rtnl < 0) { + fprintf(stderr, "%s: socket %s\n", __func__, strerror(errno)); + return 1; + } + + memset(&req, 0, sizeof(req)); + req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(req.info)); + req.nh.nlmsg_flags = NLM_F_REQUEST; + req.nh.nlmsg_type = RTM_DELLINK; + + req.info.ifi_family = AF_UNSPEC; + + rtattr_add_str(&req.nh, IFLA_IFNAME, dev); + + ret = send(rtnl, &req, req.nh.nlmsg_len, 0); + if (ret < 0) + fprintf(stderr, "%s: send %s\n", __func__, strerror(errno)); + + ret = (unsigned int)ret != req.nh.nlmsg_len; + + close(rtnl); + return ret; +} + +static int macvtap_fill_rtattr(struct nlmsghdr *nh) +{ + int ifindex; + + ifindex = if_nametoindex(param_dev_dummy_name); + if (ifindex == 0) { + fprintf(stderr, "%s: ifindex %s\n", __func__, strerror(errno)); + return -errno; + } + + rtattr_add_any(nh, IFLA_LINK, &ifindex, sizeof(ifindex)); + rtattr_add_any(nh, IFLA_ADDRESS, param_hwaddr_src, ETH_ALEN); + + return 0; +} + +static int opentap(const char *devname) +{ + int ifindex; + char buf[256]; + int fd; + struct ifreq ifr; + + ifindex = if_nametoindex(devname); + if (ifindex == 0) { + fprintf(stderr, "%s: ifindex %s\n", __func__, strerror(errno)); + return -errno; + } + + sprintf(buf, "/dev/tap%d", ifindex); + fd = open(buf, O_RDWR | O_NONBLOCK); + if (fd < 0) { + fprintf(stderr, "%s: open %s\n", __func__, strerror(errno)); + return -errno; + } + + memset(&ifr, 0, sizeof(ifr)); + strcpy(ifr.ifr_name, devname); + ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_VNET_HDR | IFF_MULTI_QUEUE; + if (ioctl(fd, TUNSETIFF, &ifr, sizeof(ifr)) < 0) + return -errno; + return fd; +} + +size_t build_eth(uint8_t *buf, uint16_t proto) +{ + struct ethhdr *eth = (struct ethhdr 
*)buf;
+
+	eth->h_proto = htons(proto);
+	memcpy(eth->h_source, param_hwaddr_src, ETH_ALEN);
+	memcpy(eth->h_dest, param_hwaddr_dest, ETH_ALEN);
+
+	return ETH_HLEN;
+}
+
+static uint32_t add_csum(const uint8_t *buf, int len)
+{
+	uint32_t sum = 0;
+	uint16_t *sbuf = (uint16_t *)buf;
+
+	while (len > 1) {
+		sum += *sbuf++;
+		len -= 2;
+	}
+
+	if (len)
+		sum += *(uint8_t *)sbuf;
+
+	return sum;
+}
+
+static uint16_t finish_ip_csum(uint32_t sum)
+{
+	uint16_t lo = sum & 0xffff;
+	uint16_t hi = sum >> 16;
+
+	return ~(lo + hi);
+
+}
+
+static uint16_t build_ip_csum(const uint8_t *buf, int len,
+			      uint32_t sum)
+{
+	sum += add_csum(buf, len);
+	return finish_ip_csum(sum);
+}
+
+static int build_ipv4_header(uint8_t *buf, int payload_len)
+{
+	struct iphdr *iph = (struct iphdr *)buf;
+
+	iph->ihl = 5;
+	iph->version = 4;
+	iph->ttl = 8;
+	iph->tot_len =
+		htons(sizeof(*iph) + sizeof(struct udphdr) + payload_len);
+	iph->id = htons(1337);
+	iph->protocol = IPPROTO_UDP;
+	iph->saddr = htonl((172 << 24) | (17 << 16) | 2);
+	iph->daddr = htonl((172 << 24) | (17 << 16) | 1);
+	iph->check = build_ip_csum(buf, iph->ihl << 2, 0);
+
+	return iph->ihl << 2;
+}
+
+static int build_udp_packet(uint8_t *buf, int payload_len, bool csum_off)
+{
+	const int ip4alen = sizeof(uint32_t);
+	struct udphdr *udph = (struct udphdr *)buf;
+	int len = sizeof(*udph) + payload_len;
+	uint32_t sum = 0;
+
+	udph->source = htons(22);
+	udph->dest = htons(58822);
+	udph->len = htons(len);
+
+	memset(buf + sizeof(struct udphdr), PKT_DATA, payload_len);
+
+	sum = add_csum(buf - 2 * ip4alen, 2 * ip4alen);
+	sum += htons(IPPROTO_UDP) + udph->len;
+
+	if (!csum_off)
+		sum += add_csum(buf, len);
+
+	udph->check = finish_ip_csum(sum);
+
+	return sizeof(*udph) + payload_len;
+}
+
+size_t build_test_packet_valid_udp_gso(uint8_t *buf, size_t payload_len)
+{
+	uint8_t *cur = buf;
+	struct virtio_net_hdr *vh = (struct virtio_net_hdr *)buf;
+
+	vh->hdr_len = ETH_HLEN + sizeof(struct iphdr) + sizeof(struct udphdr);
+	vh->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+	vh->csum_start = ETH_HLEN + sizeof(struct iphdr);
+	vh->csum_offset = __builtin_offsetof(struct udphdr, check);
+	vh->gso_type = VIRTIO_NET_HDR_GSO_UDP;
+	vh->gso_size = ETH_DATA_LEN - sizeof(struct iphdr);
+	cur += sizeof(*vh);
+
+	cur += build_eth(cur, ETH_P_IP);
+	cur += build_ipv4_header(cur, payload_len);
+	cur += build_udp_packet(cur, payload_len, true);
+
+	return cur - buf;
+}
+
+size_t build_test_packet_valid_udp_csum(uint8_t *buf, size_t payload_len)
+{
+	uint8_t *cur = buf;
+	struct virtio_net_hdr *vh = (struct virtio_net_hdr *)buf;
+
+	vh->flags = VIRTIO_NET_HDR_F_DATA_VALID;
+	vh->gso_type = VIRTIO_NET_HDR_GSO_NONE;
+	cur += sizeof(*vh);
+
+	cur += build_eth(cur, ETH_P_IP);
+	cur += build_ipv4_header(cur, payload_len);
+	cur += build_udp_packet(cur, payload_len, false);
+
+	return cur - buf;
+}
+
+size_t build_test_packet_crash_tap_invalid_eth_proto(uint8_t *buf,
+						     size_t payload_len)
+{
+	uint8_t *cur = buf;
+	struct virtio_net_hdr *vh = (struct virtio_net_hdr *)buf;
+
+	vh->hdr_len = ETH_HLEN + sizeof(struct iphdr) + sizeof(struct udphdr);
+	vh->flags = 0;
+	vh->gso_type = VIRTIO_NET_HDR_GSO_UDP;
+	vh->gso_size = ETH_DATA_LEN - sizeof(struct iphdr);
+	cur += sizeof(*vh);
+
+	cur += build_eth(cur, 0);
+	cur += sizeof(struct iphdr) + sizeof(struct udphdr);
+	cur += build_ipv4_header(cur, payload_len);
+	cur += build_udp_packet(cur, payload_len, true);
+	cur += payload_len;
+
+	return cur - buf;
+}
+
+FIXTURE(tap)
+{
+	int fd;
+};
+
+FIXTURE_SETUP(tap)
+{
+	int ret;
+
+	ret = dev_create(param_dev_dummy_name, "dummy", NULL, NULL);
+	EXPECT_EQ(ret, 0);
+
+	ret = dev_create(param_dev_tap_name, "macvtap", macvtap_fill_rtattr,
+			 NULL);
+	EXPECT_EQ(ret, 0);
+
+	self->fd = opentap(param_dev_tap_name);
+	ASSERT_GE(self->fd, 0);
+}
+
+FIXTURE_TEARDOWN(tap)
+{
+	int ret;
+
+	if (self->fd != -1)
+		close(self->fd);
+
+	ret = dev_delete(param_dev_tap_name);
+	EXPECT_EQ(ret, 0);
+
+	ret = dev_delete(param_dev_dummy_name);
+	EXPECT_EQ(ret, 0);
+}
+
+TEST_F(tap, test_packet_valid_udp_gso)
+{
+	uint8_t pkt[TEST_PACKET_SZ];
+	size_t off;
+	int ret;
+
+	memset(pkt, 0, sizeof(pkt));
+	off = build_test_packet_valid_udp_gso(pkt, 1021);
+	ret = write(self->fd, pkt, off);
+	ASSERT_EQ(ret, off);
+}
+
+TEST_F(tap, test_packet_valid_udp_csum)
+{
+	uint8_t pkt[TEST_PACKET_SZ];
+	size_t off;
+	int ret;
+
+	memset(pkt, 0, sizeof(pkt));
+	off = build_test_packet_valid_udp_csum(pkt, 1024);
+	ret = write(self->fd, pkt, off);
+	ASSERT_EQ(ret, off);
+}
+
+TEST_F(tap, test_packet_crash_tap_invalid_eth_proto)
+{
+	uint8_t pkt[TEST_PACKET_SZ];
+	size_t off;
+	int ret;
+
+	memset(pkt, 0, sizeof(pkt));
+	off = build_test_packet_crash_tap_invalid_eth_proto(pkt, 1024);
+	ret = write(self->fd, pkt, off);
+	ASSERT_EQ(ret, -1);
+	ASSERT_EQ(errno, EINVAL);
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/netfilter/nft_trans_stress.sh b/tools/testing/selftests/netfilter/nft_trans_stress.sh
index f1affd12c4b1..a7f62ad4f661 100755
--- a/tools/testing/selftests/netfilter/nft_trans_stress.sh
+++ b/tools/testing/selftests/netfilter/nft_trans_stress.sh
@@ -9,8 +9,27 @@
 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
 
-testns=testns1
+testns=testns-$(mktemp -u "XXXXXXXX")
+
 tables="foo bar baz quux"
+global_ret=0
+eret=0
+lret=0
+
+check_result()
+{
+	local r=$1
+	local OK="PASS"
+
+	if [ $r -ne 0 ] ;then
+		OK="FAIL"
+		global_ret=$r
+	fi
+
+	echo "$OK: nft $2 test returned $r"
+
+	eret=0
+}
 
 nft --version > /dev/null 2>&1
 if [ $? -ne 0 ];then
@@ -59,16 +78,66 @@ done)
 
 sleep 1
 
+ip netns exec "$testns" nft -f "$tmp"
 for i in $(seq 1 10) ; do
 	ip netns exec "$testns" nft -f "$tmp" &
 done
 
 for table in $tables;do
-	randsleep=$((RANDOM%10))
+	randsleep=$((RANDOM%2))
 	sleep $randsleep
-	ip netns exec "$testns" nft delete table inet $table 2>/dev/null
+	ip netns exec "$testns" nft delete table inet $table
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		eret=$lret
+	fi
 done
 
-randsleep=$((RANDOM%10))
-sleep $randsleep
+check_result $eret "add/delete"
+
+for i in $(seq 1 10) ; do
+	(echo "flush ruleset"; cat "$tmp") | ip netns exec "$testns" nft -f /dev/stdin
+
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		eret=$lret
+	fi
+done
+
+check_result $eret "reload"
+
+for i in $(seq 1 10) ; do
+	(echo "flush ruleset"; cat "$tmp"
+	 echo "insert rule inet foo INPUT meta nftrace set 1"
+	 echo "insert rule inet foo OUTPUT meta nftrace set 1"
+	) | ip netns exec "$testns" nft -f /dev/stdin
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		eret=$lret
+	fi
+
+	(echo "flush ruleset"; cat "$tmp"
+	) | ip netns exec "$testns" nft -f /dev/stdin
+
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		eret=$lret
+	fi
+done
+
+check_result $eret "add/delete with nftrace enabled"
+
+echo "insert rule inet foo INPUT meta nftrace set 1" >> $tmp
+echo "insert rule inet foo OUTPUT meta nftrace set 1" >> $tmp
+
+for i in $(seq 1 10) ; do
+	(echo "flush ruleset"; cat "$tmp") | ip netns exec "$testns" nft -f /dev/stdin
+
+	lret=$?
+	if [ $lret -ne 0 ]; then
+		eret=1
+	fi
+done
+
+check_result $lret "add/delete with nftrace enabled"
 
 pkill -9 ping
 
@@ -76,3 +145,5 @@ wait
 rm -f "$tmp"
 
 ip netns del "$testns"
+
+exit $global_ret
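
For reference, the add_csum()/finish_ip_csum() helpers in the tap.c selftest above implement the standard RFC 1071 one's-complement checksum used for the IPv4 header and the UDP pseudo-header. Below is a minimal standalone sketch of that fold, not part of the patch; the function name csum_fold and the sample header bytes are illustrative only.

	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>

	static uint16_t csum_fold(const void *data, size_t len)
	{
		const uint8_t *p = data;
		uint32_t sum = 0;

		/* Sum the buffer as 16-bit words; pad an odd trailing byte. */
		while (len > 1) {
			uint16_t word;

			memcpy(&word, p, sizeof(word));
			sum += word;
			p += 2;
			len -= 2;
		}
		if (len)
			sum += *p;

		/* Fold the carries back in, then take the one's complement. */
		while (sum >> 16)
			sum = (sum & 0xffff) + (sum >> 16);

		return (uint16_t)~sum;
	}

	int main(void)
	{
		/* 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed. */
		uint8_t iph[20] = {
			0x45, 0x00, 0x00, 0x1c, 0x05, 0x39, 0x00, 0x00,
			0x08, 0x11, 0x00, 0x00, 0xac, 0x11, 0x00, 0x02,
			0xac, 0x11, 0x00, 0x01,
		};
		uint16_t check = csum_fold(iph, sizeof(iph));

		/* Store the result back; re-checksumming a valid header folds to 0. */
		memcpy(&iph[10], &check, sizeof(check));
		printf("recheck = 0x%04x\n", csum_fold(iph, sizeof(iph)));
		return 0;
	}

The UDP path in the patch builds its pseudo-header contribution the same way, seeding the running sum with the two IPv4 addresses that sit immediately before the UDP header (the add_csum(buf - 2 * ip4alen, ...) call) before folding.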