diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-29 11:57:23 -0700 |
commit | 9d31d2338950293ec19d9b095fbaa9030899dcb4 (patch) | |
tree | e688040d0557c24a2eeb9f6c9c223d949f6f7ef9 /include/net | |
parent | 635de956a7f5a6ffcb04f29d70630c64c717b56b (diff) | |
parent | 4a52dd8fefb45626dace70a63c0738dbd83b7edb (diff) |
Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its headers files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hops group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifing ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS change race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
current offset+length API was a poor fit for modern SFP which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074 a 802.11ax device
- Bluetooth: Broadcom BCM4330 and BMC4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress ratelimitting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
Diffstat (limited to 'include/net')
50 files changed, 683 insertions, 385 deletions
diff --git a/include/net/addrconf.h b/include/net/addrconf.h index 18f783dcd55f..78ea3e332688 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -233,7 +233,6 @@ void ipv6_mc_unmap(struct inet6_dev *idev); void ipv6_mc_remap(struct inet6_dev *idev); void ipv6_mc_init_dev(struct inet6_dev *idev); void ipv6_mc_destroy_dev(struct inet6_dev *idev); -int ipv6_mc_check_icmpv6(struct sk_buff *skb); int ipv6_mc_check_mld(struct sk_buff *skb); void addrconf_dad_failure(struct sk_buff *skb, struct inet6_ifaddr *ifp); diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h index ba2f439bc04d..ea4ae551c426 100644 --- a/include/net/bluetooth/hci.h +++ b/include/net/bluetooth/hci.h @@ -320,6 +320,7 @@ enum { HCI_BREDR_ENABLED, HCI_LE_SCAN_INTERRUPTED, HCI_WIDEBAND_SPEECH_ENABLED, + HCI_EVENT_FILTER_CONFIGURED, HCI_DUT_MODE, HCI_VENDOR_DIAG, diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index ebdd4afe30d2..c73ac52af186 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -584,6 +584,11 @@ struct hci_dev { #if IS_ENABLED(CONFIG_BT_MSFTEXT) __u16 msft_opcode; void *msft_data; + bool msft_curve_validity; +#endif + +#if IS_ENABLED(CONFIG_BT_AOSPEXT) + bool aosp_capable; #endif int (*open)(struct hci_dev *hdev); @@ -704,6 +709,7 @@ struct hci_chan { struct sk_buff_head data_q; unsigned int sent; __u8 state; + bool amp; }; struct hci_conn_params { @@ -1238,6 +1244,13 @@ static inline void hci_set_msft_opcode(struct hci_dev *hdev, __u16 opcode) #endif } +static inline void hci_set_aosp_capable(struct hci_dev *hdev) +{ +#if IS_ENABLED(CONFIG_BT_AOSPEXT) + hdev->aosp_capable = true; +#endif +} + int hci_dev_open(__u16 dev); int hci_dev_close(__u16 dev); int hci_dev_do_close(struct hci_dev *hdev); @@ -1742,8 +1755,8 @@ void hci_mgmt_chan_unregister(struct hci_mgmt_chan *c); #define DISCOV_INTERLEAVED_INQUIRY_LEN 0x04 #define DISCOV_BREDR_INQUIRY_LEN 0x08 #define DISCOV_LE_RESTART_DELAY msecs_to_jiffies(200) /* msec */ -#define DISCOV_LE_FAST_ADV_INT_MIN 100 /* msec */ -#define DISCOV_LE_FAST_ADV_INT_MAX 150 /* msec */ +#define DISCOV_LE_FAST_ADV_INT_MIN 0x00A0 /* 100 msec */ +#define DISCOV_LE_FAST_ADV_INT_MAX 0x00F0 /* 150 msec */ void mgmt_fill_version_info(void *ver); int mgmt_new_settings(struct hci_dev *hdev); diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h index 61800a7b6192..3c4f550e5a8b 100644 --- a/include/net/bluetooth/l2cap.h +++ b/include/net/bluetooth/l2cap.h @@ -494,6 +494,7 @@ struct l2cap_le_credits { #define L2CAP_ECRED_MIN_MTU 64 #define L2CAP_ECRED_MIN_MPS 64 +#define L2CAP_ECRED_MAX_CID 5 struct l2cap_ecred_conn_req { __le16 psm; diff --git a/include/net/bluetooth/mgmt.h b/include/net/bluetooth/mgmt.h index 839a2028009e..a7cffb069565 100644 --- a/include/net/bluetooth/mgmt.h +++ b/include/net/bluetooth/mgmt.h @@ -578,6 +578,7 @@ struct mgmt_rp_add_advertising { #define MGMT_ADV_PARAM_TIMEOUT BIT(13) #define MGMT_ADV_PARAM_INTERVALS BIT(14) #define MGMT_ADV_PARAM_TX_POWER BIT(15) +#define MGMT_ADV_PARAM_SCAN_RSP BIT(16) #define MGMT_ADV_FLAG_SEC_MASK (MGMT_ADV_FLAG_SEC_1M | MGMT_ADV_FLAG_SEC_2M | \ MGMT_ADV_FLAG_SEC_CODED) diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h index 0e85713f56df..2926f1f00d65 100644 --- a/include/net/bpf_sk_storage.h +++ b/include/net/bpf_sk_storage.h @@ -27,7 +27,6 @@ struct bpf_local_storage_elem; struct bpf_sk_storage_diag; struct sk_buff; struct nlattr; -struct sock; #ifdef CONFIG_BPF_SYSCALL int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk); diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index 911fae42b0c0..5224f885a99a 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -11,6 +11,7 @@ */ #include <linux/ethtool.h> +#include <uapi/linux/rfkill.h> #include <linux/netdevice.h> #include <linux/debugfs.h> #include <linux/list.h> @@ -359,7 +360,7 @@ struct ieee80211_sta_he_cap { }; /** - * struct ieee80211_sband_iftype_data + * struct ieee80211_sband_iftype_data - sband data per interface type * * This structure encapsulates sband data that is relevant for the * interface types defined in @types_mask. Each type in the @@ -3520,6 +3521,8 @@ struct cfg80211_pmsr_result { * @non_trigger_based: use non trigger based ranging for the measurement * If neither @trigger_based nor @non_trigger_based is set, * EDCA based ranging will be used. + * @lmr_feedback: negotiate for I2R LMR feedback. Only valid if either + * @trigger_based or @non_trigger_based is set. * * See also nl80211 for the respective attribute documentation. */ @@ -3531,7 +3534,8 @@ struct cfg80211_pmsr_ftm_request_peer { request_lci:1, request_civicloc:1, trigger_based:1, - non_trigger_based:1; + non_trigger_based:1, + lmr_feedback:1; u8 num_bursts_exp; u8 burst_duration; u8 ftms_per_burst; @@ -5606,7 +5610,7 @@ static inline bool cfg80211_channel_is_psc(struct ieee80211_channel *chan) * which is, for this function, given as a bitmap of indices of * rates in the band's bitrate table. */ -struct ieee80211_rate * +const struct ieee80211_rate * ieee80211_get_response_rate(struct ieee80211_supported_band *sband, u32 basic_rates, int bitrate); @@ -6633,11 +6637,19 @@ void cfg80211_notify_new_peer_candidate(struct net_device *dev, */ /** - * wiphy_rfkill_set_hw_state - notify cfg80211 about hw block state + * wiphy_rfkill_set_hw_state_reason - notify cfg80211 about hw block state * @wiphy: the wiphy * @blocked: block status + * @reason: one of reasons in &enum rfkill_hard_block_reasons */ -void wiphy_rfkill_set_hw_state(struct wiphy *wiphy, bool blocked); +void wiphy_rfkill_set_hw_state_reason(struct wiphy *wiphy, bool blocked, + enum rfkill_hard_block_reasons reason); + +static inline void wiphy_rfkill_set_hw_state(struct wiphy *wiphy, bool blocked) +{ + wiphy_rfkill_set_hw_state_reason(wiphy, blocked, + RFKILL_HARD_BLOCK_SIGNAL); +} /** * wiphy_rfkill_start_polling - start polling rfkill @@ -6731,7 +6743,7 @@ cfg80211_vendor_cmd_alloc_reply_skb(struct wiphy *wiphy, int approxlen) int cfg80211_vendor_cmd_reply(struct sk_buff *skb); /** - * cfg80211_vendor_cmd_get_sender + * cfg80211_vendor_cmd_get_sender - get the current sender netlink ID * @wiphy: the wiphy * * Return the current netlink port ID in a vendor command handler. diff --git a/include/net/devlink.h b/include/net/devlink.h index 853420db5d32..7c984cadfec4 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -98,11 +98,13 @@ struct devlink_port_pci_vf_attrs { * @controller: Associated controller number * @sf: Associated PCI SF for of the PCI PF for this port. * @pf: Associated PCI PF number for this port. + * @external: when set, indicates if a port is for an external controller */ struct devlink_port_pci_sf_attrs { u32 controller; u32 sf; u16 pf; + u8 external:1; }; /** @@ -1508,7 +1510,8 @@ void devlink_port_attrs_pci_pf_set(struct devlink_port *devlink_port, u32 contro void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 controller, u16 pf, u16 vf, bool external); void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, - u32 controller, u16 pf, u32 sf); + u32 controller, u16 pf, u32 sf, + bool external); int devlink_sb_register(struct devlink *devlink, unsigned int sb_index, u32 size, u16 ingress_pools_count, u16 egress_pools_count, u16 ingress_tc_count, diff --git a/include/net/dsa.h b/include/net/dsa.h index 83a933e563fe..e1a2610a0e06 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -49,10 +49,12 @@ struct phylink_link_state; #define DSA_TAG_PROTO_XRS700X_VALUE 19 #define DSA_TAG_PROTO_OCELOT_8021Q_VALUE 20 #define DSA_TAG_PROTO_SEVILLE_VALUE 21 +#define DSA_TAG_PROTO_BRCM_LEGACY_VALUE 22 enum dsa_tag_protocol { DSA_TAG_PROTO_NONE = DSA_TAG_PROTO_NONE_VALUE, DSA_TAG_PROTO_BRCM = DSA_TAG_PROTO_BRCM_VALUE, + DSA_TAG_PROTO_BRCM_LEGACY = DSA_TAG_PROTO_BRCM_LEGACY_VALUE, DSA_TAG_PROTO_BRCM_PREPEND = DSA_TAG_PROTO_BRCM_PREPEND_VALUE, DSA_TAG_PROTO_DSA = DSA_TAG_PROTO_DSA_VALUE, DSA_TAG_PROTO_EDSA = DSA_TAG_PROTO_EDSA_VALUE, @@ -115,20 +117,6 @@ struct dsa_netdevice_ops { #define MODULE_ALIAS_DSA_TAG_DRIVER(__proto) \ MODULE_ALIAS(DSA_TAG_DRIVER_ALIAS __stringify(__proto##_VALUE)) -struct dsa_skb_cb { - struct sk_buff *clone; -}; - -struct __dsa_skb_cb { - struct dsa_skb_cb cb; - u8 priv[48 - sizeof(struct dsa_skb_cb)]; -}; - -#define DSA_SKB_CB(skb) ((struct dsa_skb_cb *)((skb)->cb)) - -#define DSA_SKB_CB_PRIV(skb) \ - ((void *)(skb)->cb + offsetof(struct __dsa_skb_cb, priv)) - struct dsa_switch_tree { struct list_head list; @@ -147,6 +135,11 @@ struct dsa_switch_tree { /* Tagging protocol operations */ const struct dsa_device_ops *tag_ops; + /* Default tagging protocol preferred by the switches in this + * tree. + */ + enum dsa_tag_protocol default_proto; + /* * Configuration data for the platform device that owns * this dsa switch tree instance. @@ -258,7 +251,7 @@ struct dsa_port { unsigned int index; const char *name; struct dsa_port *cpu_dp; - const char *mac; + u8 mac[ETH_ALEN]; struct device_node *dn; unsigned int ageing_time; bool vlan_filtering; @@ -491,6 +484,20 @@ static inline bool dsa_port_is_vlan_filtering(const struct dsa_port *dp) return dp->vlan_filtering; } +static inline +struct net_device *dsa_port_to_bridge_port(const struct dsa_port *dp) +{ + if (!dp->bridge_dev) + return NULL; + + if (dp->lag_dev) + return dp->lag_dev; + else if (dp->hsr_dev) + return dp->hsr_dev; + + return dp->slave; +} + typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid, bool is_static, void *data); struct dsa_switch_ops { @@ -561,6 +568,8 @@ struct dsa_switch_ops { int port, uint64_t *data); void (*get_stats64)(struct dsa_switch *ds, int port, struct rtnl_link_stats64 *s); + void (*self_test)(struct dsa_switch *ds, int port, + struct ethtool_test *etest, u64 *data); /* * ethtool Wake-on-LAN @@ -717,8 +726,8 @@ struct dsa_switch_ops { struct ifreq *ifr); int (*port_hwtstamp_set)(struct dsa_switch *ds, int port, struct ifreq *ifr); - bool (*port_txtstamp)(struct dsa_switch *ds, int port, - struct sk_buff *clone, unsigned int type); + void (*port_txtstamp)(struct dsa_switch *ds, int port, + struct sk_buff *skb); bool (*port_rxtstamp)(struct dsa_switch *ds, int port, struct sk_buff *skb, unsigned int type); diff --git a/include/net/flow.h b/include/net/flow.h index 39d0cedcddee..6f5e70240071 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -59,7 +59,6 @@ union flowi_uli { __le16 sport; } dnports; - __be32 spi; __be32 gre_key; struct { @@ -90,7 +89,6 @@ struct flowi4 { #define fl4_dport uli.ports.dport #define fl4_icmp_type uli.icmpt.type #define fl4_icmp_code uli.icmpt.code -#define fl4_ipsec_spi uli.spi #define fl4_mh_type uli.mht.type #define fl4_gre_key uli.gre_key } __attribute__((__aligned__(BITS_PER_LONG/8))); @@ -150,7 +148,6 @@ struct flowi6 { #define fl6_dport uli.ports.dport #define fl6_icmp_type uli.icmpt.type #define fl6_icmp_code uli.icmpt.code -#define fl6_ipsec_spi uli.spi #define fl6_mh_type uli.mht.type #define fl6_gre_key uli.gre_key __u32 mp_hash; diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index cc10b10dc3a1..ffd386ea0dbb 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -350,7 +350,7 @@ static inline bool flow_keys_have_l4(const struct flow_keys *keys) u32 flow_hash_from_keys(struct flow_keys *keys); void skb_flow_get_icmp_tci(const struct sk_buff *skb, struct flow_dissector_key_icmp *key_icmp, - void *data, int thoff, int hlen); + const void *data, int thoff, int hlen); static inline bool dissector_uses_key(const struct flow_dissector *flow_dissector, enum flow_dissector_key_id key_id) @@ -368,8 +368,8 @@ static inline void *skb_flow_dissector_target(struct flow_dissector *flow_dissec struct bpf_flow_dissector { struct bpf_flow_keys *flow_keys; const struct sk_buff *skb; - void *data; - void *data_end; + const void *data; + const void *data_end; }; static inline void diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h index e6bd8ebf9ac3..dc5c1e69cd9f 100644 --- a/include/net/flow_offload.h +++ b/include/net/flow_offload.h @@ -147,6 +147,7 @@ enum flow_action_id { FLOW_ACTION_MPLS_POP, FLOW_ACTION_MPLS_MANGLE, FLOW_ACTION_GATE, + FLOW_ACTION_PPPOE_PUSH, NUM_FLOW_ACTIONS, }; @@ -234,6 +235,8 @@ struct flow_action_entry { u32 index; u32 burst; u64 rate_bytes_ps; + u64 burst_pkt; + u64 rate_pkt_ps; u32 mtu; } police; struct { /* FLOW_ACTION_CT */ @@ -272,6 +275,9 @@ struct flow_action_entry { u32 num_entries; struct action_gate_entry *entries; } gate; + struct { /* FLOW_ACTION_PPPOE_PUSH */ + u16 sid; + } pppoe; }; struct flow_action_cookie *cookie; /* user defined action cookie */ }; diff --git a/include/net/gro.h b/include/net/gro.h index 8a6eb5303cc4..01edaf3fdda0 100644 --- a/include/net/gro.h +++ b/include/net/gro.h @@ -3,10 +3,23 @@ #ifndef _NET_IPV6_GRO_H #define _NET_IPV6_GRO_H +#include <linux/indirect_call_wrapper.h> + +struct list_head; +struct sk_buff; + INDIRECT_CALLABLE_DECLARE(struct sk_buff *ipv6_gro_receive(struct list_head *, struct sk_buff *)); INDIRECT_CALLABLE_DECLARE(int ipv6_gro_complete(struct sk_buff *, int)); INDIRECT_CALLABLE_DECLARE(struct sk_buff *inet_gro_receive(struct list_head *, struct sk_buff *)); INDIRECT_CALLABLE_DECLARE(int inet_gro_complete(struct sk_buff *, int)); + +#define indirect_call_gro_receive_inet(cb, f2, f1, head, skb) \ +({ \ + unlikely(gro_recursion_inc_test(skb)) ? \ + NAPI_GRO_CB(skb)->flush |= 1, NULL : \ + INDIRECT_CALL_INET(cb, f2, f1, head, skb); \ +}) + #endif /* _NET_IPV6_GRO_H */ diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index 8bf5906073bc..71bb4cc4d05d 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -78,6 +78,7 @@ struct inet6_ifaddr { struct ip6_sf_socklist { unsigned int sl_max; unsigned int sl_count; + struct rcu_head rcu; struct in6_addr sl_addr[]; }; @@ -91,18 +92,18 @@ struct ipv6_mc_socklist { int ifindex; unsigned int sfmode; /* MCAST_{INCLUDE,EXCLUDE} */ struct ipv6_mc_socklist __rcu *next; - rwlock_t sflock; - struct ip6_sf_socklist *sflist; + struct ip6_sf_socklist __rcu *sflist; struct rcu_head rcu; }; struct ip6_sf_list { - struct ip6_sf_list *sf_next; + struct ip6_sf_list __rcu *sf_next; struct in6_addr sf_addr; unsigned long sf_count[2]; /* include/exclude counts */ unsigned char sf_gsresp; /* include in g & s response? */ unsigned char sf_oldin; /* change state */ unsigned char sf_crcount; /* retrans. left to send */ + struct rcu_head rcu; }; #define MAF_TIMER_RUNNING 0x01 @@ -114,19 +115,19 @@ struct ip6_sf_list { struct ifmcaddr6 { struct in6_addr mca_addr; struct inet6_dev *idev; - struct ifmcaddr6 *next; - struct ip6_sf_list *mca_sources; - struct ip6_sf_list *mca_tomb; + struct ifmcaddr6 __rcu *next; + struct ip6_sf_list __rcu *mca_sources; + struct ip6_sf_list __rcu *mca_tomb; unsigned int mca_sfmode; unsigned char mca_crcount; unsigned long mca_sfcount[2]; - struct timer_list mca_timer; + struct delayed_work mca_work; unsigned int mca_flags; int mca_users; refcount_t mca_refcnt; - spinlock_t mca_lock; unsigned long mca_cstamp; unsigned long mca_tstamp; + struct rcu_head rcu; }; /* Anycast stuff */ @@ -165,9 +166,8 @@ struct inet6_dev { struct list_head addr_list; - struct ifmcaddr6 *mc_list; - struct ifmcaddr6 *mc_tomb; - spinlock_t mc_lock; + struct ifmcaddr6 __rcu *mc_list; + struct ifmcaddr6 __rcu *mc_tomb; unsigned char mc_qrv; /* Query Robustness Variable */ unsigned char mc_gq_running; @@ -179,9 +179,18 @@ struct inet6_dev { unsigned long mc_qri; /* Query Response Interval */ unsigned long mc_maxdelay; - struct timer_list mc_gq_timer; /* general query timer */ - struct timer_list mc_ifc_timer; /* interface change timer */ - struct timer_list mc_dad_timer; /* dad complete mc timer */ + struct delayed_work mc_gq_work; /* general query work */ + struct delayed_work mc_ifc_work; /* interface change work */ + struct delayed_work mc_dad_work; /* dad complete mc work */ + struct delayed_work mc_query_work; /* mld query work */ + struct delayed_work mc_report_work; /* mld report work */ + + struct sk_buff_head mc_query_queue; /* mld query queue */ + struct sk_buff_head mc_report_queue; /* mld report queue */ + + spinlock_t mc_query_lock; /* mld query queue lock */ + spinlock_t mc_report_lock; /* mld query report lock */ + struct mutex mc_lock; /* mld global lock */ struct ifacaddr6 *ac_list; rwlock_t lock; diff --git a/include/net/ipv6.h b/include/net/ipv6.h index bd1f396cc9c7..448bf2b34759 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -30,6 +30,7 @@ */ #define NEXTHDR_HOP 0 /* Hop-by-hop option header. */ +#define NEXTHDR_IPV4 4 /* IPv4 in IPv6 */ #define NEXTHDR_TCP 6 /* TCP segment. */ #define NEXTHDR_UDP 17 /* UDP message. */ #define NEXTHDR_IPV6 41 /* IPv6 in IPv6 */ diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h index 8fce558b5fea..afbce90c4480 100644 --- a/include/net/ipv6_stubs.h +++ b/include/net/ipv6_stubs.h @@ -66,6 +66,8 @@ struct ipv6_stub { int (*ipv6_fragment)(struct net *net, struct sock *sk, struct sk_buff *skb, int (*output)(struct net *, struct sock *, struct sk_buff *)); + struct net_device *(*ipv6_dev_find)(struct net *net, const struct in6_addr *addr, + struct net_device *dev); }; extern const struct ipv6_stub *ipv6_stub __read_mostly; diff --git a/include/net/lapb.h b/include/net/lapb.h index eee73442a1ba..124ee122f2c8 100644 --- a/include/net/lapb.h +++ b/include/net/lapb.h @@ -92,7 +92,7 @@ struct lapb_cb { unsigned short n2, n2count; unsigned short t1, t2; struct timer_list t1timer, t2timer; - bool t1timer_stop, t2timer_stop; + bool t1timer_running, t2timer_running; /* Internal control information */ struct sk_buff_head write_queue; diff --git a/include/net/mac80211.h b/include/net/mac80211.h index 2d1d629e5d14..445b66c6eb7e 100644 --- a/include/net/mac80211.h +++ b/include/net/mac80211.h @@ -1768,10 +1768,7 @@ struct ieee80211_vif *wdev_to_ieee80211_vif(struct wireless_dev *wdev); * * This can be used by mac80211 drivers with direct cfg80211 APIs * (like the vendor commands) that needs to get the wdev for a vif. - * - * Note that this function may return %NULL if the given wdev isn't - * associated with a vif that the driver knows about (e.g. monitor - * or AP_VLAN interfaces.) + * This can also be useful to get the netdev associated to a vif. */ struct wireless_dev *ieee80211_vif_to_wdev(struct ieee80211_vif *vif); @@ -2399,6 +2396,12 @@ struct ieee80211_txq { * @IEEE80211_HW_SUPPORTS_RX_DECAP_OFFLOAD: Hardware supports rx decapsulation * offload * + * @IEEE80211_HW_SUPPORTS_CONC_MON_RX_DECAP: Hardware supports concurrent rx + * decapsulation offload and passing raw 802.11 frames for monitor iface. + * If this is supported, the driver must pass both 802.3 frames for real + * usage and 802.11 frames with %RX_FLAG_ONLY_MONITOR set for monitor to + * the stack. + * * @NUM_IEEE80211_HW_FLAGS: number of hardware flags, used for sizing arrays */ enum ieee80211_hw_flags { @@ -2453,6 +2456,7 @@ enum ieee80211_hw_flags { IEEE80211_HW_AMPDU_KEYBORDER_SUPPORT, IEEE80211_HW_SUPPORTS_TX_ENCAP_OFFLOAD, IEEE80211_HW_SUPPORTS_RX_DECAP_OFFLOAD, + IEEE80211_HW_SUPPORTS_CONC_MON_RX_DECAP, /* keep last, obviously */ NUM_IEEE80211_HW_FLAGS diff --git a/include/net/mld.h b/include/net/mld.h index 496bddb59942..c07359808493 100644 --- a/include/net/mld.h +++ b/include/net/mld.h @@ -92,6 +92,9 @@ struct mld2_query { #define MLD_EXP_MIN_LIMIT 32768UL #define MLDV1_MRD_MAX_COMPAT (MLD_EXP_MIN_LIMIT - 1) +#define MLD_MAX_QUEUE 8 +#define MLD_MAX_SKBS 32 + static inline unsigned long mldv2_mrc(const struct mld2_query *mlh2) { /* RFC3810, 5.1.3. Maximum Response Code */ diff --git a/include/net/mptcp.h b/include/net/mptcp.h index 5694370be3d4..83f23774b908 100644 --- a/include/net/mptcp.h +++ b/include/net/mptcp.h @@ -30,8 +30,27 @@ struct mptcp_ext { ack64:1, mpc_map:1, frozen:1, - __unused:1; - /* one byte hole */ + reset_transient:1; + u8 reset_reason:4; +}; + +#define MPTCP_RM_IDS_MAX 8 + +struct mptcp_rm_list { + u8 ids[MPTCP_RM_IDS_MAX]; + u8 nr; +}; + +struct mptcp_addr_info { + u8 id; + sa_family_t family; + __be16 port; + union { + struct in_addr addr; +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + struct in6_addr addr6; +#endif + }; }; struct mptcp_out_options { @@ -39,18 +58,13 @@ struct mptcp_out_options { u16 suboptions; u64 sndr_key; u64 rcvr_key; - union { - struct in_addr addr; -#if IS_ENABLED(CONFIG_MPTCP_IPV6) - struct in6_addr addr6; -#endif - }; - u8 addr_id; - u16 port; u64 ahmac; - u8 rm_id; + struct mptcp_addr_info addr; + struct mptcp_rm_list rm_list; u8 join_id; u8 backup; + u8 reset_reason:4; + u8 reset_transient:1; u32 nonce; u64 thmac; u32 token; @@ -149,6 +163,16 @@ void mptcp_seq_show(struct seq_file *seq); int mptcp_subflow_init_cookie_req(struct request_sock *req, const struct sock *sk_listener, struct sk_buff *skb); + +__be32 mptcp_get_reset_option(const struct sk_buff *skb); + +static inline __be32 mptcp_reset_option(const struct sk_buff *skb) +{ + if (skb_ext_exist(skb, SKB_EXT_MPTCP)) + return mptcp_get_reset_option(skb); + + return htonl(0u); +} #else static inline void mptcp_init(void) @@ -229,6 +253,8 @@ static inline int mptcp_subflow_init_cookie_req(struct request_sock *req, { return 0; /* TCP fallback */ } + +static inline __be32 mptcp_reset_option(const struct sk_buff *skb) { return htonl(0u); } #endif /* CONFIG_MPTCP */ #if IS_ENABLED(CONFIG_MPTCP_IPV6) diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index dcaee24a4d87..fa5887143f0d 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -22,7 +22,6 @@ #include <net/netns/nexthop.h> #include <net/netns/ieee802154_6lowpan.h> #include <net/netns/sctp.h> -#include <net/netns/dccp.h> #include <net/netns/netfilter.h> #include <net/netns/x_tables.h> #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) @@ -130,9 +129,6 @@ struct net { #if defined(CONFIG_IP_SCTP) || defined(CONFIG_IP_SCTP_MODULE) struct netns_sctp sctp; #endif -#if defined(CONFIG_IP_DCCP) || defined(CONFIG_IP_DCCP_MODULE) - struct netns_dccp dccp; -#endif #ifdef CONFIG_NETFILTER struct netns_nf nf; struct netns_xt xt; @@ -142,15 +138,6 @@ struct net { #if defined(CONFIG_NF_TABLES) || defined(CONFIG_NF_TABLES_MODULE) struct netns_nftables nft; #endif -#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6) - struct netns_nf_frag nf_frag; - struct ctl_table_header *nf_frag_frags_hdr; -#endif - struct sock *nfnl; - struct sock *nfnl_stash; -#if IS_ENABLED(CONFIG_NF_CT_NETLINK_TIMEOUT) - struct list_head nfct_timeout_list; -#endif #endif #ifdef CONFIG_WEXT_CORE struct sk_buff_head wext_nlevents; @@ -407,7 +394,6 @@ int register_pernet_device(struct pernet_operations *); void unregister_pernet_device(struct pernet_operations *); struct ctl_table; -struct ctl_table_header; #ifdef CONFIG_SYSCTL int net_sysctl_init(void); diff --git a/include/net/netfilter/ipv4/nf_defrag_ipv4.h b/include/net/netfilter/ipv4/nf_defrag_ipv4.h index bcbd724cc048..7fda9ce9f694 100644 --- a/include/net/netfilter/ipv4/nf_defrag_ipv4.h +++ b/include/net/netfilter/ipv4/nf_defrag_ipv4.h @@ -3,6 +3,7 @@ #define _NF_DEFRAG_IPV4_H struct net; -int nf_defrag_ipv4_enable(struct net *); +int nf_defrag_ipv4_enable(struct net *net); +void nf_defrag_ipv4_disable(struct net *net); #endif /* _NF_DEFRAG_IPV4_H */ diff --git a/include/net/netfilter/ipv6/nf_conntrack_ipv6.h b/include/net/netfilter/ipv6/nf_conntrack_ipv6.h index 7b3c873f8839..e95483192d1b 100644 --- a/include/net/netfilter/ipv6/nf_conntrack_ipv6.h +++ b/include/net/netfilter/ipv6/nf_conntrack_ipv6.h @@ -4,7 +4,4 @@ extern const struct nf_conntrack_l4proto nf_conntrack_l4proto_icmpv6; -#include <linux/sysctl.h> -extern struct ctl_table nf_ct_ipv6_sysctl_table[]; - #endif /* _NF_CONNTRACK_IPV6_H*/ diff --git a/include/net/netfilter/ipv6/nf_defrag_ipv6.h b/include/net/netfilter/ipv6/nf_defrag_ipv6.h index 6d31cd041143..0fd8a4159662 100644 --- a/include/net/netfilter/ipv6/nf_defrag_ipv6.h +++ b/include/net/netfilter/ipv6/nf_defrag_ipv6.h @@ -5,7 +5,8 @@ #include <linux/skbuff.h> #include <linux/types.h> -int nf_defrag_ipv6_enable(struct net *); +int nf_defrag_ipv6_enable(struct net *net); +void nf_defrag_ipv6_disable(struct net *net); int nf_ct_frag6_init(void); void nf_ct_frag6_cleanup(void); @@ -13,4 +14,10 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user); struct inet_frags_ctl; +struct nft_ct_frag6_pernet { + struct ctl_table_header *nf_frag_frags_hdr; + struct fqdir *fqdir; + unsigned int users; +}; + #endif /* _NF_DEFRAG_IPV6_H */ diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h index 439379ca9ffa..06dc6db70d18 100644 --- a/include/net/netfilter/nf_conntrack.h +++ b/include/net/netfilter/nf_conntrack.h @@ -44,9 +44,23 @@ union nf_conntrack_expect_proto { }; struct nf_conntrack_net { + /* only used when new connection is allocated: */ + atomic_t count; + unsigned int expect_count; + u8 sysctl_auto_assign_helper; + bool auto_assign_helper_warned; + + /* only used from work queues, configuration plane, and so on: */ unsigned int users4; unsigned int users6; unsigned int users_bridge; +#ifdef CONFIG_SYSCTL + struct ctl_table_header *sysctl_header; +#endif +#ifdef CONFIG_NF_CONNTRACK_EVENTS + struct delayed_work ecache_dwork; + struct netns_ct *ct_net; +#endif }; #include <linux/types.h> @@ -324,6 +338,7 @@ struct nf_conn *nf_ct_tmpl_alloc(struct net *net, void nf_ct_tmpl_free(struct nf_conn *tmpl); u32 nf_ct_get_id(const struct nf_conn *ct); +u32 nf_conntrack_count(const struct net *net); static inline void nf_ct_set(struct sk_buff *skb, struct nf_conn *ct, enum ip_conntrack_info info) diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h index eb81f9195e28..d00ba6048e44 100644 --- a/include/net/netfilter/nf_conntrack_ecache.h +++ b/include/net/netfilter/nf_conntrack_ecache.h @@ -171,12 +171,18 @@ void nf_ct_expect_event_report(enum ip_conntrack_expect_events event, struct nf_conntrack_expect *exp, u32 portid, int report); +void nf_conntrack_ecache_work(struct net *net, enum nf_ct_ecache_state state); + void nf_conntrack_ecache_pernet_init(struct net *net); void nf_conntrack_ecache_pernet_fini(struct net *net); int nf_conntrack_ecache_init(void); void nf_conntrack_ecache_fini(void); +static inline bool nf_conntrack_ecache_dwork_pending(const struct net *net) +{ + return net->ct.ecache_dwork_pending; +} #else /* CONFIG_NF_CONNTRACK_EVENTS */ static inline void nf_ct_expect_event_report(enum ip_conntrack_expect_events e, @@ -186,6 +192,11 @@ static inline void nf_ct_expect_event_report(enum ip_conntrack_expect_events e, { } +static inline void nf_conntrack_ecache_work(struct net *net, + enum nf_ct_ecache_state s) +{ +} + static inline void nf_conntrack_ecache_pernet_init(struct net *net) { } @@ -203,26 +214,6 @@ static inline void nf_conntrack_ecache_fini(void) { } +static inline bool nf_conntrack_ecache_dwork_pending(const struct net *net) { return false; } #endif /* CONFIG_NF_CONNTRACK_EVENTS */ - -static inline void nf_conntrack_ecache_delayed_work(struct net *net) -{ -#ifdef CONFIG_NF_CONNTRACK_EVENTS - if (!delayed_work_pending(&net->ct.ecache_dwork)) { - schedule_delayed_work(&net->ct.ecache_dwork, HZ); - net->ct.ecache_dwork_pending = true; - } -#endif -} - -static inline void nf_conntrack_ecache_work(struct net *net) -{ -#ifdef CONFIG_NF_CONNTRACK_EVENTS - if (net->ct.ecache_dwork_pending) { - net->ct.ecache_dwork_pending = false; - mod_delayed_work(system_wq, &net->ct.ecache_dwork, 0); - } -#endif -} - #endif /*_NF_CONNTRACK_ECACHE_H*/ diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h index 54c4d5c908a5..51d8eb99764d 100644 --- a/include/net/netfilter/nf_flow_table.h +++ b/include/net/netfilter/nf_flow_table.h @@ -21,6 +21,8 @@ struct nf_flow_key { struct flow_dissector_key_control control; struct flow_dissector_key_control enc_control; struct flow_dissector_key_basic basic; + struct flow_dissector_key_vlan vlan; + struct flow_dissector_key_vlan cvlan; union { struct flow_dissector_key_ipv4_addrs ipv4; struct flow_dissector_key_ipv6_addrs ipv6; @@ -86,8 +88,17 @@ static inline bool nf_flowtable_hw_offload(struct nf_flowtable *flowtable) enum flow_offload_tuple_dir { FLOW_OFFLOAD_DIR_ORIGINAL = IP_CT_DIR_ORIGINAL, FLOW_OFFLOAD_DIR_REPLY = IP_CT_DIR_REPLY, - FLOW_OFFLOAD_DIR_MAX = IP_CT_DIR_MAX }; +#define FLOW_OFFLOAD_DIR_MAX IP_CT_DIR_MAX + +enum flow_offload_xmit_type { + FLOW_OFFLOAD_XMIT_UNSPEC = 0, + FLOW_OFFLOAD_XMIT_NEIGH, + FLOW_OFFLOAD_XMIT_XFRM, + FLOW_OFFLOAD_XMIT_DIRECT, +}; + +#define NF_FLOW_TABLE_ENCAP_MAX 2 struct flow_offload_tuple { union { @@ -107,15 +118,31 @@ struct flow_offload_tuple { u8 l3proto; u8 l4proto; + struct { + u16 id; + __be16 proto; + } encap[NF_FLOW_TABLE_ENCAP_MAX]; /* All members above are keys for lookups, see flow_offload_hash(). */ struct { } __hash; - u8 dir; - + u8 dir:2, + xmit_type:2, + encap_num:2, + in_vlan_ingress:2; u16 mtu; - - struct dst_entry *dst_cache; + union { + struct { + struct dst_entry *dst_cache; + u32 dst_cookie; + }; + struct { + u32 ifidx; + u32 hw_ifidx; + u8 h_source[ETH_ALEN]; + u8 h_dest[ETH_ALEN]; + } out; + }; }; struct flow_offload_tuple_rhash { @@ -158,7 +185,23 @@ static inline __s32 nf_flow_timeout_delta(unsigned int timeout) struct nf_flow_route { struct { - struct dst_entry *dst; + struct dst_entry *dst; + struct { + u32 ifindex; + struct { + u16 id; + __be16 proto; + } encap[NF_FLOW_TABLE_ENCAP_MAX]; + u8 num_encaps:2, + ingress_vlans:2; + } in; + struct { + u32 ifindex; + u32 hw_ifindex; + u8 h_source[ETH_ALEN]; + u8 h_dest[ETH_ALEN]; + } out; + enum flow_offload_xmit_type xmit_type; } tuple[FLOW_OFFLOAD_DIR_MAX]; }; @@ -229,12 +272,12 @@ void nf_flow_table_free(struct nf_flowtable *flow_table); void flow_offload_teardown(struct flow_offload *flow); -int nf_flow_snat_port(const struct flow_offload *flow, - struct sk_buff *skb, unsigned int thoff, - u8 protocol, enum flow_offload_tuple_dir dir); -int nf_flow_dnat_port(const struct flow_offload *flow, - struct sk_buff *skb, unsigned int thoff, - u8 protocol, enum flow_offload_tuple_dir dir); +void nf_flow_snat_port(const struct flow_offload *flow, + struct sk_buff *skb, unsigned int thoff, + u8 protocol, enum flow_offload_tuple_dir dir); +void nf_flow_dnat_port(const struct flow_offload *flow, + struct sk_buff *skb, unsigned int thoff, + u8 protocol, enum flow_offload_tuple_dir dir); struct flow_ports { __be16 source, dest; diff --git a/include/net/netfilter/nf_log.h b/include/net/netfilter/nf_log.h index 716db4a0fed8..e55eedc84ed7 100644 --- a/include/net/netfilter/nf_log.h +++ b/include/net/netfilter/nf_log.h @@ -68,7 +68,6 @@ void nf_log_unbind_pf(struct net *net, u_int8_t pf); int nf_logger_find_get(int pf, enum nf_log_type type); void nf_logger_put(int pf, enum nf_log_type type); -void nf_logger_request_module(int pf, enum nf_log_type type); #define MODULE_ALIAS_NF_LOGGER(family, type) \ MODULE_ALIAS("nf-logger-" __stringify(family) "-" __stringify(type)) @@ -99,28 +98,4 @@ struct nf_log_buf; struct nf_log_buf *nf_log_buf_open(void); __printf(2, 3) int nf_log_buf_add(struct nf_log_buf *m, const char *f, ...); void nf_log_buf_close(struct nf_log_buf *m); - -/* common logging functions */ -int nf_log_dump_udp_header(struct nf_log_buf *m, const struct sk_buff *skb, - u8 proto, int fragment, unsigned int offset); -int nf_log_dump_tcp_header(struct nf_log_buf *m, const struct sk_buff *skb, - u8 proto, int fragment, unsigned int offset, - unsigned int logflags); -void nf_log_dump_sk_uid_gid(struct net *net, struct nf_log_buf *m, - struct sock *sk); -void nf_log_dump_vlan(struct nf_log_buf *m, const struct sk_buff *skb); -void nf_log_dump_packet_common(struct nf_log_buf *m, u_int8_t pf, - unsigned int hooknum, const struct sk_buff *skb, - const struct net_device *in, - const struct net_device *out, - const struct nf_loginfo *loginfo, - const char *prefix); -void nf_log_l2packet(struct net *net, u_int8_t pf, - __be16 protocol, - unsigned int hooknum, - const struct sk_buff *skb, - const struct net_device *in, - const struct net_device *out, - const struct nf_loginfo *loginfo, const char *prefix); - #endif /* _NF_LOG_H */ diff --git a/include/net/netfilter/nf_nat.h b/include/net/netfilter/nf_nat.h index 0d412dd63707..987111ae5240 100644 --- a/include/net/netfilter/nf_nat.h +++ b/include/net/netfilter/nf_nat.h @@ -104,8 +104,6 @@ unsigned int nf_nat_inet_fn(void *priv, struct sk_buff *skb, const struct nf_hook_state *state); -int nf_xfrm_me_harder(struct net *n, struct sk_buff *s, unsigned int family); - static inline int nf_nat_initialized(struct nf_conn *ct, enum nf_nat_manip_type manip) { diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 5aaced6bf13e..27eeb613bb4e 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -13,6 +13,7 @@ #include <net/netfilter/nf_flow_table.h> #include <net/netlink.h> #include <net/flow_offload.h> +#include <net/netns/generic.h> #define NFT_MAX_HOOKS (NF_INET_INGRESS + 1) @@ -496,6 +497,7 @@ struct nft_set { u8 dlen; u8 num_exprs; struct nft_expr *exprs[NFT_SET_EXPR_MAX]; + struct list_head catchall_list; unsigned char data[] __attribute__((aligned(__alignof__(u64)))); }; @@ -521,6 +523,10 @@ struct nft_set *nft_set_lookup_global(const struct net *net, const struct nlattr *nla_set_id, u8 genmask); +struct nft_set_ext *nft_set_catchall_lookup(const struct net *net, + const struct nft_set *set); +void *nft_set_catchall_gc(const struct nft_set *set); + static inline unsigned long nft_set_gc_interval(const struct nft_set *set) { return set->gc_int ? msecs_to_jiffies(set->gc_int) : HZ; @@ -867,6 +873,8 @@ struct nft_expr_ops { int (*offload)(struct nft_offload_ctx *ctx, struct nft_flow_rule *flow, const struct nft_expr *expr); + void (*offload_stats)(struct nft_expr *expr, + const struct flow_stats *stats); u32 offload_flags; const struct nft_expr_type *type; void *data; @@ -1498,13 +1506,16 @@ struct nft_trans_chain { struct nft_trans_table { bool update; - bool enable; + u8 state; + u32 flags; }; #define nft_trans_table_update(trans) \ (((struct nft_trans_table *)trans->data)->update) -#define nft_trans_table_enable(trans) \ - (((struct nft_trans_table *)trans->data)->enable) +#define nft_trans_table_state(trans) \ + (((struct nft_trans_table *)trans->data)->state) +#define nft_trans_table_flags(trans) \ + (((struct nft_trans_table *)trans->data)->flags) struct nft_trans_elem { struct nft_set *set; @@ -1559,4 +1570,27 @@ void nf_tables_trans_destroy_flush_work(void); int nf_msecs_to_jiffies64(const struct nlattr *nla, u64 *result); __be64 nf_jiffies64_to_msecs(u64 input); +#ifdef CONFIG_MODULES +__printf(2, 3) int nft_request_module(struct net *net, const char *fmt, ...); +#else +static inline int nft_request_module(struct net *net, const char *fmt, ...) { return -ENOENT; } +#endif + +struct nftables_pernet { + struct list_head tables; + struct list_head commit_list; + struct list_head module_list; + struct list_head notify_list; + struct mutex commit_mutex; + unsigned int base_seq; + u8 validate_state; +}; + +extern unsigned int nf_tables_net_id; + +static inline struct nftables_pernet *nft_pernet(const struct net *net) +{ + return net_generic(net, nf_tables_net_id); +} + #endif /* _NET_NF_TABLES_H */ diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h index 1d34fe154fe0..f9d95ff82df8 100644 --- a/include/net/netfilter/nf_tables_offload.h +++ b/include/net/netfilter/nf_tables_offload.h @@ -4,11 +4,16 @@ #include <net/flow_offload.h> #include <net/netfilter/nf_tables.h> +enum nft_offload_reg_flags { + NFT_OFFLOAD_F_NETWORK2HOST = (1 << 0), +}; + struct nft_offload_reg { u32 key; u32 len; u32 base_offset; u32 offset; + u32 flags; struct nft_data data; struct nft_data mask; }; @@ -45,6 +50,7 @@ struct nft_flow_key { struct flow_dissector_key_ports tp; struct flow_dissector_key_ip ip; struct flow_dissector_key_vlan vlan; + struct flow_dissector_key_vlan cvlan; struct flow_dissector_key_eth_addrs eth_addrs; struct flow_dissector_key_meta meta; } __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */ @@ -68,16 +74,21 @@ void nft_flow_rule_set_addr_type(struct nft_flow_rule *flow, struct nft_rule; struct nft_flow_rule *nft_flow_rule_create(struct net *net, const struct nft_rule *rule); +int nft_flow_rule_stats(const struct nft_chain *chain, const struct nft_rule *rule); void nft_flow_rule_destroy(struct nft_flow_rule *flow); int nft_flow_rule_offload_commit(struct net *net); -#define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg) \ +#define NFT_OFFLOAD_MATCH_FLAGS(__key, __base, __field, __len, __reg, __flags) \ (__reg)->base_offset = \ offsetof(struct nft_flow_key, __base); \ (__reg)->offset = \ offsetof(struct nft_flow_key, __base.__field); \ (__reg)->len = __len; \ (__reg)->key = __key; \ + (__reg)->flags = __flags; + +#define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg) \ + NFT_OFFLOAD_MATCH_FLAGS(__key, __base, __field, __len, __reg, 0) #define NFT_OFFLOAD_MATCH_EXACT(__key, __base, __field, __len, __reg) \ NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg) \ diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h index 806454e767bf..ad0a95c2335e 100644 --- a/include/net/netns/conntrack.h +++ b/include/net/netns/conntrack.h @@ -24,9 +24,9 @@ struct nf_generic_net { struct nf_tcp_net { unsigned int timeouts[TCP_CONNTRACK_TIMEOUT_MAX]; - int tcp_loose; - int tcp_be_liberal; - int tcp_max_retrans; + u8 tcp_loose; + u8 tcp_be_liberal; + u8 tcp_max_retrans; }; enum udp_conntrack { @@ -45,7 +45,7 @@ struct nf_icmp_net { #ifdef CONFIG_NF_CT_PROTO_DCCP struct nf_dccp_net { - int dccp_loose; + u8 dccp_loose; unsigned int dccp_timeout[CT_DCCP_MAX + 1]; }; #endif @@ -93,22 +93,15 @@ struct ct_pcpu { }; struct netns_ct { - atomic_t count; - unsigned int expect_count; #ifdef CONFIG_NF_CONNTRACK_EVENTS - struct delayed_work ecache_dwork; bool ecache_dwork_pending; #endif - bool auto_assign_helper_warned; -#ifdef CONFIG_SYSCTL - struct ctl_table_header *sysctl_header; -#endif - unsigned int sysctl_log_invalid; /* Log invalid packets */ - int sysctl_events; - int sysctl_acct; - int sysctl_auto_assign_helper; - int sysctl_tstamp; - int sysctl_checksum; + u8 sysctl_log_invalid; /* Log invalid packets */ + u8 sysctl_events; + u8 sysctl_acct; + u8 sysctl_auto_assign_helper; + u8 sysctl_tstamp; + u8 sysctl_checksum; struct ct_pcpu __percpu *pcpu_lists; struct ip_conntrack_stat __percpu *stat; diff --git a/include/net/netns/dccp.h b/include/net/netns/dccp.h deleted file mode 100644 index cdbc4f5b8390..000000000000 --- a/include/net/netns/dccp.h +++ /dev/null @@ -1,12 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef __NETNS_DCCP_H__ -#define __NETNS_DCCP_H__ - -struct sock; - -struct netns_dccp { - struct sock *v4_ctl_sk; - struct sock *v6_ctl_sk; -}; - -#endif diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 70a2a085dd1a..f6af8d96d3c6 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -11,7 +11,6 @@ #include <linux/rcupdate.h> #include <linux/siphash.h> -struct tcpm_hash_bucket; struct ctl_table_header; struct ipv4_devconf; struct fib_rules_ops; @@ -33,14 +32,18 @@ struct inet_hashinfo; struct inet_timewait_death_row { atomic_t tw_count; + char tw_pad[L1_CACHE_BYTES - sizeof(atomic_t)]; - struct inet_hashinfo *hashinfo ____cacheline_aligned_in_smp; + struct inet_hashinfo *hashinfo; int sysctl_max_tw_buckets; }; struct tcp_fastopen_context; struct netns_ipv4 { + /* Please keep tcp_death_row at first field in netns_ipv4 */ + struct inet_timewait_death_row tcp_death_row ____cacheline_aligned_in_smp; + #ifdef CONFIG_SYSCTL struct ctl_table_header *forw_hdr; struct ctl_table_header *frags_hdr; @@ -54,17 +57,17 @@ struct netns_ipv4 { struct mutex ra_mutex; #ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_rules_ops *rules_ops; - bool fib_has_custom_rules; - unsigned int fib_rules_require_fldissect; struct fib_table __rcu *fib_main; struct fib_table __rcu *fib_default; + unsigned int fib_rules_require_fldissect; + bool fib_has_custom_rules; #endif bool fib_has_custom_local_routes; + bool fib_offload_disabled; #ifdef CONFIG_IP_ROUTE_CLASSID int fib_num_tclassid_users; #endif struct hlist_head *fib_table_hash; - bool fib_offload_disabled; struct sock *fibnl; struct sock * __percpu *icmp_sk; @@ -73,52 +76,43 @@ struct netns_ipv4 { struct inet_peer_base *peers; struct sock * __percpu *tcp_sk; struct fqdir *fqdir; -#ifdef CONFIG_NETFILTER - struct xt_table *iptable_filter; - struct xt_table *iptable_mangle; - struct xt_table *iptable_raw; - struct xt_table *arptable_filter; -#ifdef CONFIG_SECURITY - struct xt_table *iptable_security; -#endif - struct xt_table *nat_table; -#endif - int sysctl_icmp_echo_ignore_all; - int sysctl_icmp_echo_ignore_broadcasts; - int sysctl_icmp_ignore_bogus_error_responses; + u8 sysctl_icmp_echo_ignore_all; + u8 sysctl_icmp_echo_enable_probe; + u8 sysctl_icmp_echo_ignore_broadcasts; + u8 sysctl_icmp_ignore_bogus_error_responses; + u8 sysctl_icmp_errors_use_inbound_ifaddr; int sysctl_icmp_ratelimit; int sysctl_icmp_ratemask; - int sysctl_icmp_errors_use_inbound_ifaddr; struct local_ports ip_local_ports; - int sysctl_tcp_ecn; - int sysctl_tcp_ecn_fallback; + u8 sysctl_tcp_ecn; + u8 sysctl_tcp_ecn_fallback; - int sysctl_ip_default_ttl; - int sysctl_ip_no_pmtu_disc; - int sysctl_ip_fwd_use_pmtu; - int sysctl_ip_fwd_update_priority; - int sysctl_ip_nonlocal_bind; - int sysctl_ip_autobind_reuse; + u8 sysctl_ip_default_ttl; + u8 sysctl_ip_no_pmtu_disc; + u8 sysctl_ip_fwd_use_pmtu; + u8 sysctl_ip_fwd_update_priority; + u8 sysctl_ip_nonlocal_bind; + u8 sysctl_ip_autobind_reuse; /* Shall we try to damage output packets if routing dev changes? */ - int sysctl_ip_dynaddr; - int sysctl_ip_early_demux; + u8 sysctl_ip_dynaddr; + u8 sysctl_ip_early_demux; #ifdef CONFIG_NET_L3_MASTER_DEV - int sysctl_raw_l3mdev_accept; + u8 sysctl_raw_l3mdev_accept; #endif - int sysctl_tcp_early_demux; - int sysctl_udp_early_demux; + u8 sysctl_tcp_early_demux; + u8 sysctl_udp_early_demux; - int sysctl_nexthop_compat_mode; + u8 sysctl_nexthop_compat_mode; - int sysctl_fwmark_reflect; - int sysctl_tcp_fwmark_accept; + u8 sysctl_fwmark_reflect; + u8 sysctl_tcp_fwmark_accept; #ifdef CONFIG_NET_L3_MASTER_DEV - int sysctl_tcp_l3mdev_accept; + u8 sysctl_tcp_l3mdev_accept; #endif - int sysctl_tcp_mtu_probing; + u8 sysctl_tcp_mtu_probing; int sysctl_tcp_mtu_probe_floor; int sysctl_tcp_base_mss; int sysctl_tcp_min_snd_mss; @@ -126,55 +120,55 @@ struct netns_ipv4 { u32 sysctl_tcp_probe_interval; int sysctl_tcp_keepalive_time; - int sysctl_tcp_keepalive_probes; int sysctl_tcp_keepalive_intvl; + u8 sysctl_tcp_keepalive_probes; - int sysctl_tcp_syn_retries; - int sysctl_tcp_synack_retries; - int sysctl_tcp_syncookies; + u8 sysctl_tcp_syn_retries; + u8 sysctl_tcp_synack_retries; + u8 sysctl_tcp_syncookies; int sysctl_tcp_reordering; - int sysctl_tcp_retries1; - int sysctl_tcp_retries2; - int sysctl_tcp_orphan_retries; + u8 sysctl_tcp_retries1; + u8 sysctl_tcp_retries2; + u8 sysctl_tcp_orphan_retries; + u8 sysctl_tcp_tw_reuse; int sysctl_tcp_fin_timeout; unsigned int sysctl_tcp_notsent_lowat; - int sysctl_tcp_tw_reuse; - int sysctl_tcp_sack; - int sysctl_tcp_window_scaling; - int sysctl_tcp_timestamps; - int sysctl_tcp_early_retrans; - int sysctl_tcp_recovery; - int sysctl_tcp_thin_linear_timeouts; - int sysctl_tcp_slow_start_after_idle; - int sysctl_tcp_retrans_collapse; - int sysctl_tcp_stdurg; - int sysctl_tcp_rfc1337; - int sysctl_tcp_abort_on_overflow; - int sysctl_tcp_fack; + u8 sysctl_tcp_sack; + u8 sysctl_tcp_window_scaling; + u8 sysctl_tcp_timestamps; + u8 sysctl_tcp_early_retrans; + u8 sysctl_tcp_recovery; + u8 sysctl_tcp_thin_linear_timeouts; + u8 sysctl_tcp_slow_start_after_idle; + u8 sysctl_tcp_retrans_collapse; + u8 sysctl_tcp_stdurg; + u8 sysctl_tcp_rfc1337; + u8 sysctl_tcp_abort_on_overflow; + u8 sysctl_tcp_fack; /* obsolete */ int sysctl_tcp_max_reordering; - int sysctl_tcp_dsack; - int sysctl_tcp_app_win; int sysctl_tcp_adv_win_scale; - int sysctl_tcp_frto; - int sysctl_tcp_nometrics_save; - int sysctl_tcp_no_ssthresh_metrics_save; - int sysctl_tcp_moderate_rcvbuf; - int sysctl_tcp_tso_win_divisor; - int sysctl_tcp_workaround_signed_windows; + u8 sysctl_tcp_dsack; + u8 sysctl_tcp_app_win; + u8 sysctl_tcp_frto; + u8 sysctl_tcp_nometrics_save; + u8 sysctl_tcp_no_ssthresh_metrics_save; + u8 sysctl_tcp_moderate_rcvbuf; + u8 sysctl_tcp_tso_win_divisor; + u8 sysctl_tcp_workaround_signed_windows; int sysctl_tcp_limit_output_bytes; int sysctl_tcp_challenge_ack_limit; - int sysctl_tcp_min_tso_segs; int sysctl_tcp_min_rtt_wlen; - int sysctl_tcp_autocorking; + u8 sysctl_tcp_min_tso_segs; + u8 sysctl_tcp_autocorking; + u8 sysctl_tcp_reflect_tos; + u8 sysctl_tcp_comp_sack_nr; int sysctl_tcp_invalid_ratelimit; int sysctl_tcp_pacing_ss_ratio; int sysctl_tcp_pacing_ca_ratio; int sysctl_tcp_wmem[3]; int sysctl_tcp_rmem[3]; - int sysctl_tcp_comp_sack_nr; unsigned long sysctl_tcp_comp_sack_delay_ns; unsigned long sysctl_tcp_comp_sack_slack_ns; - struct inet_timewait_death_row tcp_death_row; int sysctl_max_syn_backlog; int sysctl_tcp_fastopen; const struct tcp_congestion_ops __rcu *tcp_congestion_control; @@ -183,20 +177,19 @@ struct netns_ipv4 { unsigned int sysctl_tcp_fastopen_blackhole_timeout; atomic_t tfo_active_disable_times; unsigned long tfo_active_disable_stamp; - int sysctl_tcp_reflect_tos; int sysctl_udp_wmem_min; int sysctl_udp_rmem_min; - int sysctl_fib_notify_on_flag_change; + u8 sysctl_fib_notify_on_flag_change; #ifdef CONFIG_NET_L3_MASTER_DEV - int sysctl_udp_l3mdev_accept; + u8 sysctl_udp_l3mdev_accept; #endif + u8 sysctl_igmp_llm_reports; int sysctl_igmp_max_memberships; int sysctl_igmp_max_msf; - int sysctl_igmp_llm_reports; int sysctl_igmp_qrv; struct ping_group_range ping_group_range; @@ -217,8 +210,8 @@ struct netns_ipv4 { #endif #endif #ifdef CONFIG_IP_ROUTE_MULTIPATH - int sysctl_fib_multipath_use_neigh; - int sysctl_fib_multipath_hash_policy; + u8 sysctl_fib_multipath_use_neigh; + u8 sysctl_fib_multipath_hash_policy; #endif struct fib_notifier_ops *notifier_ops; diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h index 21c0debbd39e..6153c8067009 100644 --- a/include/net/netns/ipv6.h +++ b/include/net/netns/ipv6.h @@ -20,7 +20,6 @@ struct netns_sysctl_ipv6 { struct ctl_table_header *frags_hdr; struct ctl_table_header *xfrm6_hdr; #endif - int bindv6only; int flush_delay; int ip6_rt_max_size; int ip6_rt_gc_min_interval; @@ -29,21 +28,22 @@ struct netns_sysctl_ipv6 { int ip6_rt_gc_elasticity; int ip6_rt_mtu_expires; int ip6_rt_min_advmss; - int multipath_hash_policy; - int flowlabel_consistency; - int auto_flowlabels; + u8 bindv6only; + u8 multipath_hash_policy; + u8 flowlabel_consistency; + u8 auto_flowlabels; int icmpv6_time; - int icmpv6_echo_ignore_all; - int icmpv6_echo_ignore_multicast; - int icmpv6_echo_ignore_anycast; + u8 icmpv6_echo_ignore_all; + u8 icmpv6_echo_ignore_multicast; + u8 icmpv6_echo_ignore_anycast; DECLARE_BITMAP(icmpv6_ratemask, ICMPV6_MSG_MAX + 1); unsigned long *icmpv6_ratemask_ptr; - int anycast_src_echo_reply; - int ip_nonlocal_bind; - int fwmark_reflect; + u8 anycast_src_echo_reply; + u8 ip_nonlocal_bind; + u8 fwmark_reflect; + u8 flowlabel_state_ranges; int idgen_retries; int idgen_delay; - int flowlabel_state_ranges; int flowlabel_reflect; int max_dst_opts_cnt; int max_hbh_opts_cnt; @@ -51,24 +51,18 @@ struct netns_sysctl_ipv6 { int max_hbh_opts_len; int seg6_flowlabel; bool skip_notify_on_dev_down; - int fib_notify_on_flag_change; + u8 fib_notify_on_flag_change; }; struct netns_ipv6 { + /* Keep ip6_dst_ops at the beginning of netns_sysctl_ipv6 */ + struct dst_ops ip6_dst_ops; + struct netns_sysctl_ipv6 sysctl; struct ipv6_devconf *devconf_all; struct ipv6_devconf *devconf_dflt; struct inet_peer_base *peers; struct fqdir *fqdir; -#ifdef CONFIG_NETFILTER - struct xt_table *ip6table_filter; - struct xt_table *ip6table_mangle; - struct xt_table *ip6table_raw; -#ifdef CONFIG_SECURITY - struct xt_table *ip6table_security; -#endif - struct xt_table *ip6table_nat; -#endif struct fib6_info *fib6_null_entry; struct rt6_info *ip6_null_entry; struct rt6_statistics *rt6_stats; @@ -76,7 +70,6 @@ struct netns_ipv6 { struct hlist_head *fib_table_hash; struct fib6_table *fib6_main_tbl; struct list_head fib6_walkers; - struct dst_ops ip6_dst_ops; rwlock_t fib6_walker_lock; spinlock_t fib6_gc_lock; unsigned int ip6_rt_gc_expire; diff --git a/include/net/netns/mib.h b/include/net/netns/mib.h index 59b2c3a3db42..7e373664b1e7 100644 --- a/include/net/netns/mib.h +++ b/include/net/netns/mib.h @@ -5,22 +5,19 @@ #include <net/snmp.h> struct netns_mib { - DEFINE_SNMP_STAT(struct tcp_mib, tcp_statistics); DEFINE_SNMP_STAT(struct ipstats_mib, ip_statistics); +#if IS_ENABLED(CONFIG_IPV6) + DEFINE_SNMP_STAT(struct ipstats_mib, ipv6_statistics); +#endif + + DEFINE_SNMP_STAT(struct tcp_mib, tcp_statistics); DEFINE_SNMP_STAT(struct linux_mib, net_statistics); - DEFINE_SNMP_STAT(struct udp_mib, udp_statistics); - DEFINE_SNMP_STAT(struct udp_mib, udplite_statistics); - DEFINE_SNMP_STAT(struct icmp_mib, icmp_statistics); - DEFINE_SNMP_STAT_ATOMIC(struct icmpmsg_mib, icmpmsg_statistics); + DEFINE_SNMP_STAT(struct udp_mib, udp_statistics); #if IS_ENABLED(CONFIG_IPV6) - struct proc_dir_entry *proc_net_devsnmp6; DEFINE_SNMP_STAT(struct udp_mib, udp_stats_in6); - DEFINE_SNMP_STAT(struct udp_mib, udplite_stats_in6); - DEFINE_SNMP_STAT(struct ipstats_mib, ipv6_statistics); - DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6_statistics); - DEFINE_SNMP_STAT_ATOMIC(struct icmpv6msg_mib, icmpv6msg_statistics); #endif + #ifdef CONFIG_XFRM_STATISTICS DEFINE_SNMP_STAT(struct linux_xfrm_mib, xfrm_statistics); #endif @@ -30,6 +27,19 @@ struct netns_mib { #ifdef CONFIG_MPTCP DEFINE_SNMP_STAT(struct mptcp_mib, mptcp_statistics); #endif + + DEFINE_SNMP_STAT(struct udp_mib, udplite_statistics); +#if IS_ENABLED(CONFIG_IPV6) + DEFINE_SNMP_STAT(struct udp_mib, udplite_stats_in6); +#endif + + DEFINE_SNMP_STAT(struct icmp_mib, icmp_statistics); + DEFINE_SNMP_STAT_ATOMIC(struct icmpmsg_mib, icmpmsg_statistics); +#if IS_ENABLED(CONFIG_IPV6) + DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6_statistics); + DEFINE_SNMP_STAT_ATOMIC(struct icmpv6msg_mib, icmpv6msg_statistics); + struct proc_dir_entry *proc_net_devsnmp6; +#endif }; #endif diff --git a/include/net/netns/netfilter.h b/include/net/netns/netfilter.h index ca043342c0eb..15e2b13fb0c0 100644 --- a/include/net/netns/netfilter.h +++ b/include/net/netns/netfilter.h @@ -28,11 +28,5 @@ struct netns_nf { #if IS_ENABLED(CONFIG_DECNET) struct nf_hook_entries __rcu *hooks_decnet[NF_DN_NUMHOOKS]; #endif -#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4) - bool defrag_ipv4; -#endif -#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6) - bool defrag_ipv6; -#endif }; #endif diff --git a/include/net/netns/nftables.h b/include/net/netns/nftables.h index 6c0806bd8d1e..8c77832d0240 100644 --- a/include/net/netns/nftables.h +++ b/include/net/netns/nftables.h @@ -5,14 +5,7 @@ #include <linux/list.h> struct netns_nftables { - struct list_head tables; - struct list_head commit_list; - struct list_head module_list; - struct list_head notify_list; - struct mutex commit_mutex; - unsigned int base_seq; u8 gencursor; - u8 validate_state; }; #endif diff --git a/include/net/netns/x_tables.h b/include/net/netns/x_tables.h index 9bc5a12fdbb0..d02316ec2906 100644 --- a/include/net/netns/x_tables.h +++ b/include/net/netns/x_tables.h @@ -5,17 +5,8 @@ #include <linux/list.h> #include <linux/netfilter_defs.h> -struct ebt_table; - struct netns_xt { - struct list_head tables[NFPROTO_NUMPROTO]; bool notrack_deprecated_warning; bool clusterip_deprecated_warning; -#if defined(CONFIG_BRIDGE_NF_EBTABLES) || \ - defined(CONFIG_BRIDGE_NF_EBTABLES_MODULE) - struct ebt_table *broute_table; - struct ebt_table *frame_filter; - struct ebt_table *frame_nat; -#endif }; #endif diff --git a/include/net/nexthop.h b/include/net/nexthop.h index a10a319d7eb2..10e1777877e6 100644 --- a/include/net/nexthop.h +++ b/include/net/nexthop.h @@ -40,6 +40,12 @@ struct nh_config { struct nlattr *nh_grp; u16 nh_grp_type; + u16 nh_grp_res_num_buckets; + unsigned long nh_grp_res_idle_timer; + unsigned long nh_grp_res_unbalanced_timer; + bool nh_grp_res_has_num_buckets; + bool nh_grp_res_has_idle_timer; + bool nh_grp_res_has_unbalanced_timer; struct nlattr *nh_encap; u16 nh_encap_type; @@ -63,6 +69,32 @@ struct nh_info { }; }; +struct nh_res_bucket { + struct nh_grp_entry __rcu *nh_entry; + atomic_long_t used_time; + unsigned long migrated_time; + bool occupied; + u8 nh_flags; +}; + +struct nh_res_table { + struct net *net; + u32 nhg_id; + struct delayed_work upkeep_dw; + + /* List of NHGEs that have too few buckets ("uw" for underweight). + * Reclaimed buckets will be given to entries in this list. + */ + struct list_head uw_nh_entries; + unsigned long unbalanced_since; + + u32 idle_timer; + u32 unbalanced_timer; + + u16 num_nh_buckets; + struct nh_res_bucket nh_buckets[]; +}; + struct nh_grp_entry { struct nexthop *nh; u8 weight; @@ -70,7 +102,14 @@ struct nh_grp_entry { union { struct { atomic_t upper_bound; - } mpath; + } hthr; + struct { + /* Member on uw_nh_entries. */ + struct list_head uw_nh_entry; + + u16 count_buckets; + u16 wants_buckets; + } res; }; struct list_head nh_list; @@ -80,9 +119,13 @@ struct nh_grp_entry { struct nh_group { struct nh_group *spare; /* spare group for removals */ u16 num_nh; - bool mpath; + bool is_multipath; + bool hash_threshold; + bool resilient; bool fdb_nh; bool has_v4; + + struct nh_res_table __rcu *res_table; struct nh_grp_entry nh_entries[]; }; @@ -112,11 +155,15 @@ struct nexthop { enum nexthop_event_type { NEXTHOP_EVENT_DEL, NEXTHOP_EVENT_REPLACE, + NEXTHOP_EVENT_RES_TABLE_PRE_REPLACE, + NEXTHOP_EVENT_BUCKET_REPLACE, }; enum nh_notifier_info_type { NH_NOTIFIER_INFO_TYPE_SINGLE, NH_NOTIFIER_INFO_TYPE_GRP, + NH_NOTIFIER_INFO_TYPE_RES_TABLE, + NH_NOTIFIER_INFO_TYPE_RES_BUCKET, }; struct nh_notifier_single_info { @@ -143,6 +190,19 @@ struct nh_notifier_grp_info { struct nh_notifier_grp_entry_info nh_entries[]; }; +struct nh_notifier_res_bucket_info { + u16 bucket_index; + unsigned int idle_timer_ms; + bool force; + struct nh_notifier_single_info old_nh; + struct nh_notifier_single_info new_nh; +}; + +struct nh_notifier_res_table_info { + u16 num_nh_buckets; + struct nh_notifier_single_info nhs[]; +}; + struct nh_notifier_info { struct net *net; struct netlink_ext_ack *extack; @@ -151,6 +211,8 @@ struct nh_notifier_info { union { struct nh_notifier_single_info *nh; struct nh_notifier_grp_info *nh_grp; + struct nh_notifier_res_table_info *nh_res_table; + struct nh_notifier_res_bucket_info *nh_res_bucket; }; }; @@ -158,6 +220,10 @@ int register_nexthop_notifier(struct net *net, struct notifier_block *nb, struct netlink_ext_ack *extack); int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb); void nexthop_set_hw_flags(struct net *net, u32 id, bool offload, bool trap); +void nexthop_bucket_set_hw_flags(struct net *net, u32 id, u16 bucket_index, + bool offload, bool trap); +void nexthop_res_grp_activity_update(struct net *net, u32 id, u16 num_buckets, + unsigned long *activity); /* caller is holding rcu or rtnl; no reference taken to nexthop */ struct nexthop *nexthop_find_by_id(struct net *net, u32 id); @@ -212,7 +278,7 @@ static inline bool nexthop_is_multipath(const struct nexthop *nh) struct nh_group *nh_grp; nh_grp = rcu_dereference_rtnl(nh->nh_grp); - return nh_grp->mpath; + return nh_grp->is_multipath; } return false; } @@ -227,7 +293,7 @@ static inline unsigned int nexthop_num_path(const struct nexthop *nh) struct nh_group *nh_grp; nh_grp = rcu_dereference_rtnl(nh->nh_grp); - if (nh_grp->mpath) + if (nh_grp->is_multipath) rc = nh_grp->num_nh; } @@ -308,7 +374,7 @@ struct fib_nh_common *nexthop_fib_nhc(struct nexthop *nh, int nhsel) struct nh_group *nh_grp; nh_grp = rcu_dereference_rtnl(nh->nh_grp); - if (nh_grp->mpath) { + if (nh_grp->is_multipath) { nh = nexthop_mpath_select(nh_grp, nhsel); if (!nh) return NULL; diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index 15b1b30f454e..f5c1bee0cd6a 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -188,4 +188,13 @@ struct tc_taprio_qopt_offload *taprio_offload_get(struct tc_taprio_qopt_offload *offload); void taprio_offload_free(struct tc_taprio_qopt_offload *offload); +/* Ensure skb_mstamp_ns, which might have been populated with the txtime, is + * not mistaken for a software timestamp, because this will otherwise prevent + * the dispatch of hardware timestamps to the socket. + */ +static inline void skb_txtime_consumed(struct sk_buff *skb) +{ + skb->tstamp = ktime_set(0, 0); +} + #endif diff --git a/include/net/psample.h b/include/net/psample.h index 68ae16bb0a4a..e328c5127757 100644 --- a/include/net/psample.h +++ b/include/net/psample.h @@ -14,6 +14,19 @@ struct psample_group { struct rcu_head rcu; }; +struct psample_metadata { + u32 trunc_size; + int in_ifindex; + int out_ifindex; + u16 out_tc; + u64 out_tc_occ; /* bytes */ + u64 latency; /* nanoseconds */ + u8 out_tc_valid:1, + out_tc_occ_valid:1, + latency_valid:1, + unused:5; +}; + struct psample_group *psample_group_get(struct net *net, u32 group_num); void psample_group_take(struct psample_group *group); void psample_group_put(struct psample_group *group); @@ -21,15 +34,13 @@ void psample_group_put(struct psample_group *group); #if IS_ENABLED(CONFIG_PSAMPLE) void psample_sample_packet(struct psample_group *group, struct sk_buff *skb, - u32 trunc_size, int in_ifindex, int out_ifindex, - u32 sample_rate); + u32 sample_rate, const struct psample_metadata *md); #else static inline void psample_sample_packet(struct psample_group *group, - struct sk_buff *skb, u32 trunc_size, - int in_ifindex, int out_ifindex, - u32 sample_rate) + struct sk_buff *skb, u32 sample_rate, + const struct psample_metadata *md) { } diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 2d6eb60c58c8..f7a6e14491fb 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -1242,6 +1242,20 @@ static inline void psched_ratecfg_getrate(struct tc_ratespec *res, res->linklayer = (r->linklayer & TC_LINKLAYER_MASK); } +struct psched_pktrate { + u64 rate_pkts_ps; /* packets per second */ + u32 mult; + u8 shift; +}; + +static inline u64 psched_pkt2t_ns(const struct psched_pktrate *r, + unsigned int pkt_num) +{ + return ((u64)pkt_num * r->mult) >> r->shift; +} + +void psched_ppscfg_precompute(struct psched_pktrate *r, u64 pktrate64); + /* Mini Qdisc serves for specific needs of ingress/clsact Qdisc. * The fast path only needs to access filter list and to update stats */ diff --git a/include/net/selftests.h b/include/net/selftests.h new file mode 100644 index 000000000000..e65e8d230d33 --- /dev/null +++ b/include/net/selftests.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _NET_SELFTESTS +#define _NET_SELFTESTS + +#include <linux/ethtool.h> + +#if IS_ENABLED(CONFIG_NET_SELFTESTS) + +void net_selftest(struct net_device *ndev, struct ethtool_test *etest, + u64 *buf); +int net_selftest_get_count(void); +void net_selftest_get_strings(u8 *data); + +#else + +static inline void net_selftest(struct net_device *ndev, struct ethtool_test *etest, + u64 *buf) +{ +} + +static inline int net_selftest_get_count(void) +{ + return 0; +} + +static inline void net_selftest_get_strings(u8 *data) +{ +} + +#endif +#endif /* _NET_SELFTESTS */ diff --git a/include/net/sock.h b/include/net/sock.h index 8487f58da36d..42bc5e1a627f 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1118,6 +1118,7 @@ struct inet_hashinfo; struct raw_hashinfo; struct smc_hashinfo; struct module; +struct sk_psock; /* * caches using SLAB_TYPESAFE_BY_RCU should let .next pointer from nulls nodes @@ -1188,6 +1189,11 @@ struct proto { void (*unhash)(struct sock *sk); void (*rehash)(struct sock *sk); int (*get_port)(struct sock *sk, unsigned short snum); +#ifdef CONFIG_BPF_SYSCALL + int (*psock_update_sk_prot)(struct sock *sk, + struct sk_psock *psock, + bool restore); +#endif /* Keeping track of sockets in use */ #ifdef CONFIG_PROC_FS diff --git a/include/net/switchdev.h b/include/net/switchdev.h index b7fc7d0f54e2..f1a5a9a3634d 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -68,6 +68,7 @@ enum switchdev_obj_id { }; struct switchdev_obj { + struct list_head list; struct net_device *orig_dev; enum switchdev_obj_id id; u32 flags; @@ -208,6 +209,7 @@ struct switchdev_notifier_fdb_info { const unsigned char *addr; u16 vid; u8 added_by_user:1, + is_local:1, offloaded:1; }; diff --git a/include/net/tc_act/tc_police.h b/include/net/tc_act/tc_police.h index 6d1e26b709b5..72649512dcdd 100644 --- a/include/net/tc_act/tc_police.h +++ b/include/net/tc_act/tc_police.h @@ -10,10 +10,13 @@ struct tcf_police_params { s64 tcfp_burst; u32 tcfp_mtu; s64 tcfp_mtu_ptoks; + s64 tcfp_pkt_burst; struct psched_ratecfg rate; bool rate_present; struct psched_ratecfg peak; bool peak_present; + struct psched_pktrate ppsrate; + bool pps_present; struct rcu_head rcu; }; @@ -24,6 +27,7 @@ struct tcf_police { spinlock_t tcfp_lock ____cacheline_aligned_in_smp; s64 tcfp_toks; s64 tcfp_ptoks; + s64 tcfp_pkttoks; s64 tcfp_t_c; }; @@ -97,6 +101,54 @@ static inline u32 tcf_police_burst(const struct tc_action *act) return burst; } +static inline u64 tcf_police_rate_pkt_ps(const struct tc_action *act) +{ + struct tcf_police *police = to_police(act); + struct tcf_police_params *params; + + params = rcu_dereference_protected(police->params, + lockdep_is_held(&police->tcf_lock)); + return params->ppsrate.rate_pkts_ps; +} + +static inline u32 tcf_police_burst_pkt(const struct tc_action *act) +{ + struct tcf_police *police = to_police(act); + struct tcf_police_params *params; + u32 burst; + + params = rcu_dereference_protected(police->params, + lockdep_is_held(&police->tcf_lock)); + + /* + * "rate" pkts "burst" nanoseconds + * ------------ * ------------------- + * 1 second 2^6 ticks + * + * ------------------------------------ + * NSEC_PER_SEC nanoseconds + * ------------------------ + * 2^6 ticks + * + * "rate" pkts "burst" nanoseconds 2^6 ticks + * = ------------ * ------------------- * ------------------------ + * 1 second 2^6 ticks NSEC_PER_SEC nanoseconds + * + * "rate" * "burst" + * = ---------------- pkts/nanosecond + * NSEC_PER_SEC^2 + * + * + * "rate" * "burst" + * = ---------------- pkts/second + * NSEC_PER_SEC + */ + burst = div_u64(params->tcfp_pkt_burst * params->ppsrate.rate_pkts_ps, + NSEC_PER_SEC); + + return burst; +} + static inline u32 tcf_police_tcfp_mtu(const struct tc_action *act) { struct tcf_police *police = to_police(act); diff --git a/include/net/tcp.h b/include/net/tcp.h index 963cd86d12dd..d05193cb0d99 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -883,36 +883,11 @@ struct tcp_skb_cb { struct inet6_skb_parm h6; #endif } header; /* For incoming skbs */ - struct { - __u32 flags; - struct sock *sk_redir; - void *data_end; - } bpf; }; }; #define TCP_SKB_CB(__skb) ((struct tcp_skb_cb *)&((__skb)->cb[0])) -static inline void bpf_compute_data_end_sk_skb(struct sk_buff *skb) -{ - TCP_SKB_CB(skb)->bpf.data_end = skb->data + skb_headlen(skb); -} - -static inline bool tcp_skb_bpf_ingress(const struct sk_buff *skb) -{ - return TCP_SKB_CB(skb)->bpf.flags & BPF_F_INGRESS; -} - -static inline struct sock *tcp_skb_bpf_redirect_fetch(struct sk_buff *skb) -{ - return TCP_SKB_CB(skb)->bpf.sk_redir; -} - -static inline void tcp_skb_bpf_redirect_clear(struct sk_buff *skb) -{ - TCP_SKB_CB(skb)->bpf.sk_redir = NULL; -} - extern const struct inet_connection_sock_af_ops ipv4_specific; #if IS_ENABLED(CONFIG_IPV6) @@ -1060,44 +1035,56 @@ struct rate_sample { }; struct tcp_congestion_ops { - struct list_head list; - u32 key; - u32 flags; - - /* initialize private data (optional) */ - void (*init)(struct sock *sk); - /* cleanup private data (optional) */ - void (*release)(struct sock *sk); +/* fast path fields are put first to fill one cache line */ /* return slow start threshold (required) */ u32 (*ssthresh)(struct sock *sk); + /* do new cwnd calculation (required) */ void (*cong_avoid)(struct sock *sk, u32 ack, u32 acked); + /* call before changing ca_state (optional) */ void (*set_state)(struct sock *sk, u8 new_state); + /* call when cwnd event occurs (optional) */ void (*cwnd_event)(struct sock *sk, enum tcp_ca_event ev); + /* call when ack arrives (optional) */ void (*in_ack_event)(struct sock *sk, u32 flags); - /* new value of cwnd after loss (required) */ - u32 (*undo_cwnd)(struct sock *sk); + /* hook for packet ack accounting (optional) */ void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample); + /* override sysctl_tcp_min_tso_segs */ u32 (*min_tso_segs)(struct sock *sk); - /* returns the multiplier used in tcp_sndbuf_expand (optional) */ - u32 (*sndbuf_expand)(struct sock *sk); + /* call when packets are delivered to update cwnd and pacing rate, * after all the ca_state processing. (optional) */ void (*cong_control)(struct sock *sk, const struct rate_sample *rs); + + + /* new value of cwnd after loss (required) */ + u32 (*undo_cwnd)(struct sock *sk); + /* returns the multiplier used in tcp_sndbuf_expand (optional) */ + u32 (*sndbuf_expand)(struct sock *sk); + +/* control/slow paths put last */ /* get info for inet_diag (optional) */ size_t (*get_info)(struct sock *sk, u32 ext, int *attr, union tcp_cc_info *info); - char name[TCP_CA_NAME_MAX]; - struct module *owner; -}; + char name[TCP_CA_NAME_MAX]; + struct module *owner; + struct list_head list; + u32 key; + u32 flags; + + /* initialize private data (optional) */ + void (*init)(struct sock *sk); + /* cleanup private data (optional) */ + void (*release)(struct sock *sk); +} ____cacheline_aligned_in_smp; int tcp_register_congestion_control(struct tcp_congestion_ops *type); void tcp_unregister_congestion_control(struct tcp_congestion_ops *type); @@ -2222,25 +2209,26 @@ void tcp_update_ulp(struct sock *sk, struct proto *p, __MODULE_INFO(alias, alias_userspace, name); \ __MODULE_INFO(alias, alias_tcp_ulp, "tcp-ulp-" name) +#ifdef CONFIG_NET_SOCK_MSG struct sk_msg; struct sk_psock; -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SYSCALL struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); +int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore); void tcp_bpf_clone(const struct sock *sk, struct sock *newsk); -#else -static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) -{ -} -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SYSCALL */ -#ifdef CONFIG_NET_SOCK_MSG int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes, int flags); -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags); #endif /* CONFIG_NET_SOCK_MSG */ +#if !defined(CONFIG_BPF_SYSCALL) || !defined(CONFIG_NET_SOCK_MSG) +static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) +{ +} +#endif + #ifdef CONFIG_CGROUP_BPF static inline void bpf_skops_init_skb(struct bpf_sock_ops_kern *skops, struct sk_buff *skb, diff --git a/include/net/udp.h b/include/net/udp.h index a132a02b2f2c..360df454356c 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -329,6 +329,8 @@ struct sock *__udp6_lib_lookup(struct net *net, struct sk_buff *skb); struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb, __be16 sport, __be16 dport); +int udp_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t recv_actor); /* UDP uses skb->dev_scratch to cache as much information as possible and avoid * possibly multiple cache miss on dequeue() @@ -515,9 +517,33 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk, return segs; } -#ifdef CONFIG_BPF_STREAM_PARSER +static inline void udp_post_segment_fix_csum(struct sk_buff *skb) +{ + /* UDP-lite can't land here - no GRO */ + WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov); + + /* UDP packets generated with UDP_SEGMENT and traversing: + * + * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx) + * + * can reach an UDP socket with CHECKSUM_NONE, because + * __iptunnel_pull_header() converts CHECKSUM_PARTIAL into NONE. + * SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets with no UDP tunnel will + * have a valid checksum, as the GRO engine validates the UDP csum + * before the aggregation and nobody strips such info in between. + * Instead of adding another check in the tunnel fastpath, we can force + * a valid csum after the segmentation. + * Additionally fixup the UDP CB. + */ + UDP_SKB_CB(skb)->cscov = skb->len; + if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid) + skb->csum_valid = 1; +} + +#ifdef CONFIG_BPF_SYSCALL struct sk_psock; struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); -#endif /* BPF_STREAM_PARSER */ +int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore); +#endif #endif /* _UDP_H */ diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index cc17bc957548..9c0722c6d7ac 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -80,19 +80,6 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); void __xsk_map_flush(void); -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, - u32 key) -{ - struct xsk_map *m = container_of(map, struct xsk_map, map); - struct xdp_sock *xs; - - if (key >= map->max_entries) - return NULL; - - xs = READ_ONCE(m->xsk_map[key]); - return xs; -} - #else static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) @@ -109,12 +96,6 @@ static inline void __xsk_map_flush(void) { } -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, - u32 key) -{ - return NULL; -} - #endif /* CONFIG_XDP_SOCKETS */ #endif /* _LINUX_XDP_SOCK_H */ |