diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-07-16 19:28:34 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-07-16 19:28:34 -0700 |
commit | 51835949dda3783d4639cfa74ce13a3c9829de00 (patch) | |
tree | 2b593de5eba6ecc73f7c58fc65fdaffae45c7323 /net/sched/cls_flower.c | |
parent | 0434dbe32053d07d658165be681505120c6b1abc (diff) | |
parent | 77ae5e5b00720372af2860efdc4bc652ac682696 (diff) |
Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextHEADmaster
Pull networking updates from Jakub Kicinski:
"Not much excitement - a handful of large patchsets (devmem among them)
did not make it in time.
Core & protocols:
- Use local_lock in addition to local_bh_disable() to protect per-CPU
resources in networking, a step closer for local_bh_disable() not
to act as a big lock on PREEMPT_RT
- Use flex array for netdevice priv area, ensure its cache alignment
- Add a sysctl knob to allow user to specify a default rto_min at
socket init time. Bit of a big hammer but multiple companies were
independently carrying such patch downstream so clearly it's useful
- Support scheduling transmission of packets based on CLOCK_TAI
- Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
off using cpusets
- Support multiple L2TPv3 UDP tunnels using the same 5-tuple address
- Allow configuration of multipath hash seed, to both allow
synchronizing hashing of two routers, and preventing partial
accidental sync
- Improve TCP compliance with RFC 9293 for simultaneous connect()
- Support sending NAT keepalives in IPsec ESP in UDP states.
Userspace IKE daemon had to do this before, but the kernel can
better keep track of it
- Support sending supervision HSR frames with MAC addresses stored in
ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled
- Introduce IPPROTO_SMC for selecting SMC when socket is created
- Allow UDP GSO transmit from devices with no checksum offload
- openvswitch: add packet sampling via psample, separating the
sampled traffic from "upcall" packets sent to user space for
forwarding
- nf_tables: shrink memory consumption for transaction objects
Things we sprinkled into general kernel code:
- Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
QCA6390) [ Already merged separately - Linus ]
- Add IRQ information in sysfs for auxiliary bus
- Introduce guard definition for local_lock
- Add aligned flavor of __cacheline_group_{begin, end}() markings for
grouping fields in structures
BPF:
- Notify user space (via epoll) when a struct_ops object is getting
detached/unregistered
- Add new kfuncs for a generic, open-coded bits iterator
- Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
bpf_list_head
- Support resilient split BTF which cuts down on duplication and
makes BTF as compact as possible WRT BTF from modules
- Add support for dumping kfunc prototypes from BTF which enables
both detecting as well as dumping compilable prototypes for kfuncs
- riscv64 BPF JIT improvements in particular to add 12-argument
support for BPF trampolines and to utilize bpf_prog_pack for the
latter
- Add the capability to offload the netfilter flowtable in XDP layer
through kfuncs
Driver API:
- Allow users to configure IRQ tresholds between which automatic IRQ
moderation can choose
- Expand Power Sourcing (PoE) status with power, class and failure
reason. Support setting power limits
- Track additional RSS contexts in the core, make sure configuration
changes don't break them
- Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
ESP data paths
- Support updating firmware on SFP modules
Tests and tooling:
- mptcp: use net/lib.sh to manage netns
- TCP-AO and TCP-MD5: replace debug prints used by tests with
tracepoints
- openvswitch: make test self-contained (don't depend on OvS CLI
tools)
Drivers:
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- increase the max total outstanding PTP TX packets to 4
- add timestamping statistics support
- implement netdev_queue_mgmt_ops
- support new RSS context API
- Intel (100G, ice, idpf):
- implement FEC statistics and dumping signal quality indicators
- support E825C products (with 56Gbps PHYs)
- nVidia/Mellanox:
- support HW-GRO
- mlx4/mlx5: support per-queue statistics via netlink
- obey the max number of EQs setting in sub-functions
- AMD/Solarflare:
- support new RSS context API
- AMD/Pensando:
- ionic: rework fix for doorbell miss to lower overhead and
skip it on new HW
- Wangxun:
- txgbe: support Flow Director perfect filters
- Ethernet NICs consumer, embedded and virtual:
- Add driver for Tehuti Networks TN40xx chips
- Add driver for Meta's internal NIC chips
- Add driver for Ethernet MAC on Airoha EN7581 SoCs
- Add driver for Renesas Ethernet-TSN devices
- Google cloud vNIC:
- flow steering support
- Microsoft vNIC:
- support page sizes other than 4KB on ARM64
- vmware vNIC:
- support latency measurement (update to version 9)
- VirtIO net:
- support for Byte Queue Limits
- support configuring thresholds for automatic IRQ moderation
- support for AF_XDP Rx zero-copy
- Synopsys (stmmac):
- support for STM32MP13 SoC
- let platforms select the right PCS implementation
- TI:
- icssg-prueth: add multicast filtering support
- icssg-prueth: enable PTP timestamping and PPS
- Renesas:
- ravb: improve Rx performance 30-400% by using page pool,
theaded NAPI and timer-based IRQ coalescing
- ravb: add MII support for R-Car V4M
- Cadence (macb):
- macb: add ARP support to Wake-On-LAN
- Cortina:
- use phylib for RX and TX pause configuration
- Ethernet switches:
- nVidia/Mellanox:
- support configuration of multipath hash seed
- report more accurate max MTU
- use page_pool to improve Rx performance
- MediaTek:
- mt7530: add support for bridge port isolation
- Qualcomm:
- qca8k: add support for bridge port isolation
- Microchip:
- lan9371/2: add 100BaseTX PHY support
- NXP:
- vsc73xx: implement VLAN operations
- Ethernet PHYs:
- aquantia: enable support for aqr115c
- aquantia: add support for PHY LEDs
- realtek: add support for rtl8224 2.5Gbps PHY
- xpcs: add memory-mapped device support
- add BroadR-Reach link mode and support in Broadcom's PHY driver
- CAN:
- add document for ISO 15765-2 protocol support
- mcp251xfd: workaround for erratum DS80000789E, use timestamps to
catch when device returns incorrect FIFO status
- WiFi:
- mac80211/cfg80211:
- parse Transmit Power Envelope (TPE) data in mac80211 instead
of in drivers
- improvements for 6 GHz regulatory flexibility
- multi-link improvements
- support multiple radios per wiphy
- remove DEAUTH_NEED_MGD_TX_PREP flag
- Intel (iwlwifi):
- bump FW API to 91 for BZ/SC devices
- report 64-bit radiotap timestamp
- enable P2P low latency by default
- handle Transmit Power Envelope (TPE) advertised by AP
- remove support for older FW for new devices
- fast resume (keeping the device configured)
- mvm: re-enable Multi-Link Operation (MLO)
- aggregation (A-MSDU) optimizations
- MediaTek (mt76):
- mt7925 Multi-Link Operation (MLO) support
- Qualcomm (ath10k):
- LED support for various chipsets
- Qualcomm (ath12k):
- remove unsupported Tx monitor handling
- support channel 2 in 6 GHz band
- support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
- supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
Advertisements (EMA)
- support dynamic VLAN
- add panic handler for resetting the firmware state
- DebugFS support for datapath statistics
- WCN7850: support for Wake on WLAN
- Microchip (wilc1000):
- read MAC address during probe to make it visible to user space
- suspend/resume improvements
- TI (wl18xx):
- support newer firmware versions
- RealTek (rtw89):
- preparation for RTL8852BE-VT support
- Wake on WLAN support for WiFi 6 chips
- 36-bit PCI DMA support
- RealTek (rtlwifi):
- RTL8192DU support
- Broadcom (brcmfmac):
- Management Frame Protection support (to enable WPA3)
- Bluetooth:
- qualcomm: use the power sequencer for QCA6390
- btusb: mediatek: add ISO data transmission functions
- hci_bcm4377: add BCM4388 support
- btintel: add support for BlazarU core
- btintel: add support for Whale Peak2
- btnxpuart: add support for AW693 A1 chipset
- btnxpuart: add support for IW615 chipset
- btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"
* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
tcp: Replace strncpy() with strscpy()
wifi: ath12k: fix build vs old compiler
tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
eth: fbnic: Add L2 address programming
eth: fbnic: Add basic Rx handling
eth: fbnic: Add basic Tx handling
eth: fbnic: Add link detection
eth: fbnic: Add initial messaging to notify FW of our presence
eth: fbnic: Implement Rx queue alloc/start/stop/free
eth: fbnic: Implement Tx queue alloc/start/stop/free
eth: fbnic: Allocate a netdevice and napi vectors with queues
eth: fbnic: Add FW communication mechanism
eth: fbnic: Add message parsing for FW messages
eth: fbnic: Add register init to set PCIe/Ethernet device config
eth: fbnic: Allocate core device specific structures and devlink interface
eth: fbnic: Add scaffolding for Meta's NIC driver
PCI: Add Meta Platforms vendor ID
net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
...
Diffstat (limited to 'net/sched/cls_flower.c')
-rw-r--r-- | net/sched/cls_flower.c | 132 |
1 files changed, 108 insertions, 24 deletions
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index fd9a6f20b60b..e280c27cb9f9 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -41,6 +41,16 @@ #define TCA_FLOWER_KEY_CT_FLAGS_MASK \ (TCA_FLOWER_KEY_CT_FLAGS_MAX - 1) +#define TCA_FLOWER_KEY_FLAGS_POLICY_MASK \ + (TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT | \ + TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST) + +#define TCA_FLOWER_KEY_ENC_FLAGS_POLICY_MASK \ + (TCA_FLOWER_KEY_FLAGS_TUNNEL_CSUM | \ + TCA_FLOWER_KEY_FLAGS_TUNNEL_DONT_FRAGMENT | \ + TCA_FLOWER_KEY_FLAGS_TUNNEL_OAM | \ + TCA_FLOWER_KEY_FLAGS_TUNNEL_CRIT_OPT) + struct fl_flow_key { struct flow_dissector_key_meta meta; struct flow_dissector_key_control control; @@ -669,8 +679,10 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = { [TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK] = { .type = NLA_U16 }, [TCA_FLOWER_KEY_ENC_UDP_DST_PORT] = { .type = NLA_U16 }, [TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK] = { .type = NLA_U16 }, - [TCA_FLOWER_KEY_FLAGS] = { .type = NLA_U32 }, - [TCA_FLOWER_KEY_FLAGS_MASK] = { .type = NLA_U32 }, + [TCA_FLOWER_KEY_FLAGS] = NLA_POLICY_MASK(NLA_BE32, + TCA_FLOWER_KEY_FLAGS_POLICY_MASK), + [TCA_FLOWER_KEY_FLAGS_MASK] = NLA_POLICY_MASK(NLA_BE32, + TCA_FLOWER_KEY_FLAGS_POLICY_MASK), [TCA_FLOWER_KEY_ICMPV4_TYPE] = { .type = NLA_U8 }, [TCA_FLOWER_KEY_ICMPV4_TYPE_MASK] = { .type = NLA_U8 }, [TCA_FLOWER_KEY_ICMPV4_CODE] = { .type = NLA_U8 }, @@ -732,6 +744,10 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = { [TCA_FLOWER_KEY_SPI_MASK] = { .type = NLA_U32 }, [TCA_FLOWER_L2_MISS] = NLA_POLICY_MAX(NLA_U8, 1), [TCA_FLOWER_KEY_CFM] = { .type = NLA_NESTED }, + [TCA_FLOWER_KEY_ENC_FLAGS] = NLA_POLICY_MASK(NLA_BE32, + TCA_FLOWER_KEY_ENC_FLAGS_POLICY_MASK), + [TCA_FLOWER_KEY_ENC_FLAGS_MASK] = NLA_POLICY_MASK(NLA_BE32, + TCA_FLOWER_KEY_ENC_FLAGS_POLICY_MASK), }; static const struct nla_policy @@ -1155,19 +1171,29 @@ static void fl_set_key_flag(u32 flower_key, u32 flower_mask, } } -static int fl_set_key_flags(struct nlattr **tb, u32 *flags_key, - u32 *flags_mask, struct netlink_ext_ack *extack) +static int fl_set_key_flags(struct nlattr *tca_opts, struct nlattr **tb, + bool encap, u32 *flags_key, u32 *flags_mask, + struct netlink_ext_ack *extack) { + int fl_key, fl_mask; u32 key, mask; + if (encap) { + fl_key = TCA_FLOWER_KEY_ENC_FLAGS; + fl_mask = TCA_FLOWER_KEY_ENC_FLAGS_MASK; + } else { + fl_key = TCA_FLOWER_KEY_FLAGS; + fl_mask = TCA_FLOWER_KEY_FLAGS_MASK; + } + /* mask is mandatory for flags */ - if (!tb[TCA_FLOWER_KEY_FLAGS_MASK]) { + if (NL_REQ_ATTR_CHECK(extack, tca_opts, tb, fl_mask)) { NL_SET_ERR_MSG(extack, "Missing flags mask"); return -EINVAL; } - key = be32_to_cpu(nla_get_be32(tb[TCA_FLOWER_KEY_FLAGS])); - mask = be32_to_cpu(nla_get_be32(tb[TCA_FLOWER_KEY_FLAGS_MASK])); + key = be32_to_cpu(nla_get_be32(tb[fl_key])); + mask = be32_to_cpu(nla_get_be32(tb[fl_mask])); *flags_key = 0; *flags_mask = 0; @@ -1178,6 +1204,21 @@ static int fl_set_key_flags(struct nlattr **tb, u32 *flags_key, TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST, FLOW_DIS_FIRST_FRAG); + fl_set_key_flag(key, mask, flags_key, flags_mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_CSUM, + FLOW_DIS_F_TUNNEL_CSUM); + + fl_set_key_flag(key, mask, flags_key, flags_mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_DONT_FRAGMENT, + FLOW_DIS_F_TUNNEL_DONT_FRAGMENT); + + fl_set_key_flag(key, mask, flags_key, flags_mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_OAM, FLOW_DIS_F_TUNNEL_OAM); + + fl_set_key_flag(key, mask, flags_key, flags_mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_CRIT_OPT, + FLOW_DIS_F_TUNNEL_CRIT_OPT); + return 0; } @@ -1825,9 +1866,9 @@ static int fl_set_key_cfm(struct nlattr **tb, return 0; } -static int fl_set_key(struct net *net, struct nlattr **tb, - struct fl_flow_key *key, struct fl_flow_key *mask, - struct netlink_ext_ack *extack) +static int fl_set_key(struct net *net, struct nlattr *tca_opts, + struct nlattr **tb, struct fl_flow_key *key, + struct fl_flow_key *mask, struct netlink_ext_ack *extack) { __be16 ethertype; int ret = 0; @@ -2059,9 +2100,18 @@ static int fl_set_key(struct net *net, struct nlattr **tb, if (ret) return ret; - if (tb[TCA_FLOWER_KEY_FLAGS]) - ret = fl_set_key_flags(tb, &key->control.flags, + if (tb[TCA_FLOWER_KEY_FLAGS]) { + ret = fl_set_key_flags(tca_opts, tb, false, + &key->control.flags, &mask->control.flags, extack); + if (ret) + return ret; + } + + if (tb[TCA_FLOWER_KEY_ENC_FLAGS]) + ret = fl_set_key_flags(tca_opts, tb, true, + &key->enc_control.flags, + &mask->enc_control.flags, extack); return ret; } @@ -2152,7 +2202,8 @@ static void fl_init_dissector(struct flow_dissector *dissector, FL_KEY_SET_IF_MASKED(mask, keys, cnt, FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS, enc_ipv6); if (FL_KEY_IS_MASKED(mask, enc_ipv4) || - FL_KEY_IS_MASKED(mask, enc_ipv6)) + FL_KEY_IS_MASKED(mask, enc_ipv6) || + FL_KEY_IS_MASKED(mask, enc_control)) FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_CONTROL, enc_control); FL_KEY_SET_IF_MASKED(mask, keys, cnt, @@ -2310,6 +2361,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, { struct cls_fl_head *head = fl_head_dereference(tp); bool rtnl_held = !(flags & TCA_ACT_FLAGS_NO_RTNL); + struct nlattr *tca_opts = tca[TCA_OPTIONS]; struct cls_fl_filter *fold = *arg; bool bound_to_filter = false; struct cls_fl_filter *fnew; @@ -2318,7 +2370,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, bool in_ht; int err; - if (!tca[TCA_OPTIONS]) { + if (!tca_opts) { err = -EINVAL; goto errout_fold; } @@ -2336,7 +2388,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, } err = nla_parse_nested_deprecated(tb, TCA_FLOWER_MAX, - tca[TCA_OPTIONS], fl_policy, NULL); + tca_opts, fl_policy, NULL); if (err < 0) goto errout_tb; @@ -2412,7 +2464,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, bound_to_filter = true; } - err = fl_set_key(net, tb, &fnew->key, &mask->key, extack); + err = fl_set_key(net, tca_opts, tb, &fnew->key, &mask->key, extack); if (err) goto unbind_filter; @@ -2752,18 +2804,19 @@ static void *fl_tmplt_create(struct net *net, struct tcf_chain *chain, struct nlattr **tca, struct netlink_ext_ack *extack) { + struct nlattr *tca_opts = tca[TCA_OPTIONS]; struct fl_flow_tmplt *tmplt; struct nlattr **tb; int err; - if (!tca[TCA_OPTIONS]) + if (!tca_opts) return ERR_PTR(-EINVAL); tb = kcalloc(TCA_FLOWER_MAX + 1, sizeof(struct nlattr *), GFP_KERNEL); if (!tb) return ERR_PTR(-ENOBUFS); err = nla_parse_nested_deprecated(tb, TCA_FLOWER_MAX, - tca[TCA_OPTIONS], fl_policy, NULL); + tca_opts, fl_policy, NULL); if (err) goto errout_tb; @@ -2773,7 +2826,8 @@ static void *fl_tmplt_create(struct net *net, struct tcf_chain *chain, goto errout_tb; } tmplt->chain = chain; - err = fl_set_key(net, tb, &tmplt->dummy_key, &tmplt->mask, extack); + err = fl_set_key(net, tca_opts, tb, &tmplt->dummy_key, + &tmplt->mask, extack); if (err) goto errout_tmplt; @@ -3049,12 +3103,22 @@ static void fl_get_key_flag(u32 dissector_key, u32 dissector_mask, } } -static int fl_dump_key_flags(struct sk_buff *skb, u32 flags_key, u32 flags_mask) +static int fl_dump_key_flags(struct sk_buff *skb, bool encap, + u32 flags_key, u32 flags_mask) { - u32 key, mask; + int fl_key, fl_mask; __be32 _key, _mask; + u32 key, mask; int err; + if (encap) { + fl_key = TCA_FLOWER_KEY_ENC_FLAGS; + fl_mask = TCA_FLOWER_KEY_ENC_FLAGS_MASK; + } else { + fl_key = TCA_FLOWER_KEY_FLAGS; + fl_mask = TCA_FLOWER_KEY_FLAGS_MASK; + } + if (!memchr_inv(&flags_mask, 0, sizeof(flags_mask))) return 0; @@ -3067,14 +3131,29 @@ static int fl_dump_key_flags(struct sk_buff *skb, u32 flags_key, u32 flags_mask) TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST, FLOW_DIS_FIRST_FRAG); + fl_get_key_flag(flags_key, flags_mask, &key, &mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_CSUM, + FLOW_DIS_F_TUNNEL_CSUM); + + fl_get_key_flag(flags_key, flags_mask, &key, &mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_DONT_FRAGMENT, + FLOW_DIS_F_TUNNEL_DONT_FRAGMENT); + + fl_get_key_flag(flags_key, flags_mask, &key, &mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_OAM, FLOW_DIS_F_TUNNEL_OAM); + + fl_get_key_flag(flags_key, flags_mask, &key, &mask, + TCA_FLOWER_KEY_FLAGS_TUNNEL_CRIT_OPT, + FLOW_DIS_F_TUNNEL_CRIT_OPT); + _key = cpu_to_be32(key); _mask = cpu_to_be32(mask); - err = nla_put(skb, TCA_FLOWER_KEY_FLAGS, 4, &_key); + err = nla_put(skb, fl_key, 4, &_key); if (err) return err; - return nla_put(skb, TCA_FLOWER_KEY_FLAGS_MASK, 4, &_mask); + return nla_put(skb, fl_mask, 4, &_mask); } static int fl_dump_key_geneve_opt(struct sk_buff *skb, @@ -3581,7 +3660,8 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, if (fl_dump_key_ct(skb, &key->ct, &mask->ct)) goto nla_put_failure; - if (fl_dump_key_flags(skb, key->control.flags, mask->control.flags)) + if (fl_dump_key_flags(skb, false, key->control.flags, + mask->control.flags)) goto nla_put_failure; if (fl_dump_key_val(skb, &key->hash.hash, TCA_FLOWER_KEY_HASH, @@ -3592,6 +3672,10 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, if (fl_dump_key_cfm(skb, &key->cfm, &mask->cfm)) goto nla_put_failure; + if (fl_dump_key_flags(skb, true, key->enc_control.flags, + mask->enc_control.flags)) + goto nla_put_failure; + return 0; nla_put_failure: |