diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-21 08:41:32 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-21 08:41:32 -0800 |
commit | 609d3bc6230514a8ca79b377775b17e8c3d9ac93 (patch) | |
tree | ce86c28363fb90b8f87e3e81db71fb382b0001d1 | |
parent | 878cf96f686c59b82ee76c2b233c41b5fc3c0936 (diff) | |
parent | 19e72b064fc32cd58f6fc0b1eb64ac2e4f770e76 (diff) |
Merge tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, netfilter and can.
Current release - regressions:
- bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func
- rxrpc:
- fix security setting propagation
- fix null-deref in rxrpc_unuse_local()
- fix switched parameters in peer tracing
Current release - new code bugs:
- rxrpc:
- fix I/O thread startup getting skipped
- fix locking issues in rxrpc_put_peer_locked()
- fix I/O thread stop
- fix uninitialised variable in rxperf server
- fix the return value of rxrpc_new_incoming_call()
- microchip: vcap: fix initialization of value and mask
- nfp: fix unaligned io read of capabilities word
Previous releases - regressions:
- stop in-kernel socket users from corrupting socket's task_frag
- stream: purge sk_error_queue in sk_stream_kill_queues()
- openvswitch: fix flow lookup to use unmasked key
- dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
- devlink:
- hold region lock when flushing snapshots
- protect devlink dump by the instance lock
Previous releases - always broken:
- bpf:
- prevent leak of lsm program after failed attach
- resolve fext program type when checking map compatibility
- skbuff: account for tail adjustment during pull operations
- macsec: fix net device access prior to holding a lock
- bonding: switch back when high prio link up
- netfilter: flowtable: really fix NAT IPv6 offload
- enetc: avoid buffer leaks on xdp_do_redirect() failure
- unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
- dsa: microchip: remove IRQF_TRIGGER_FALLING in
request_threaded_irq"
* tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
net: fec: check the return value of build_skb()
net: simplify sk_page_frag
Treewide: Stop corrupting socket's task_frag
net: Introduce sk_use_task_frag in struct sock.
mctp: Remove device type check at unregister
net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
can: flexcan: avoid unbalanced pm_runtime_enable warning
Documentation: devlink: add missing toc entry for etas_es58x devlink doc
mctp: serial: Fix starting value for frame check sequence
nfp: fix unaligned io read of capabilities word
net: stream: purge sk_error_queue in sk_stream_kill_queues()
myri10ge: Fix an error handling path in myri10ge_probe()
net: microchip: vcap: Fix initialization of value and mask
rxrpc: Fix the return value of rxrpc_new_incoming_call()
rxrpc: rxperf: Fix uninitialised variable
rxrpc: Fix I/O thread stop
rxrpc: Fix switched parameters in peer tracing
rxrpc: Fix locking issues in rxrpc_put_peer_locked()
rxrpc: Fix I/O thread startup getting skipped
...
77 files changed, 861 insertions, 257 deletions
diff --git a/Documentation/bpf/map_sk_storage.rst b/Documentation/bpf/map_sk_storage.rst index 047e16c8aaa8..4e9d23ab9ecd 100644 --- a/Documentation/bpf/map_sk_storage.rst +++ b/Documentation/bpf/map_sk_storage.rst @@ -34,13 +34,12 @@ bpf_sk_storage_get() void *bpf_sk_storage_get(struct bpf_map *map, void *sk, void *value, u64 flags) -Socket-local storage can be retrieved using the ``bpf_sk_storage_get()`` -helper. The helper gets the storage from ``sk`` that is associated with ``map``. -If the ``BPF_LOCAL_STORAGE_GET_F_CREATE`` flag is used then -``bpf_sk_storage_get()`` will create the storage for ``sk`` if it does not -already exist. ``value`` can be used together with -``BPF_LOCAL_STORAGE_GET_F_CREATE`` to initialize the storage value, otherwise it -will be zero initialized. Returns a pointer to the storage on success, or +Socket-local storage for ``map`` can be retrieved from socket ``sk`` using the +``bpf_sk_storage_get()`` helper. If the ``BPF_LOCAL_STORAGE_GET_F_CREATE`` +flag is used then ``bpf_sk_storage_get()`` will create the storage for ``sk`` +if it does not already exist. ``value`` can be used together with +``BPF_LOCAL_STORAGE_GET_F_CREATE`` to initialize the storage value, otherwise +it will be zero initialized. Returns a pointer to the storage on success, or ``NULL`` in case of failure. .. note:: @@ -54,9 +53,9 @@ bpf_sk_storage_delete() long bpf_sk_storage_delete(struct bpf_map *map, void *sk) -Socket-local storage can be deleted using the ``bpf_sk_storage_delete()`` -helper. The helper deletes the storage from ``sk`` that is identified by -``map``. Returns ``0`` on success, or negative error in case of failure. +Socket-local storage for ``map`` can be deleted from socket ``sk`` using the +``bpf_sk_storage_delete()`` helper. Returns ``0`` on success, or negative +error in case of failure. User space ---------- @@ -68,16 +67,20 @@ bpf_map_update_elem() int bpf_map_update_elem(int map_fd, const void *key, const void *value, __u64 flags) -Socket-local storage for the socket identified by ``key`` belonging to -``map_fd`` can be added or updated using the ``bpf_map_update_elem()`` libbpf -function. ``key`` must be a pointer to a valid ``fd`` in the user space -program. The ``flags`` parameter can be used to control the update behaviour: +Socket-local storage for map ``map_fd`` can be added or updated locally to a +socket using the ``bpf_map_update_elem()`` libbpf function. The socket is +identified by a `socket` ``fd`` stored in the pointer ``key``. The pointer +``value`` has the data to be added or updated to the socket ``fd``. The type +and size of ``value`` should be the same as the value type of the map +definition. -- ``BPF_ANY`` will create storage for ``fd`` or update existing storage. -- ``BPF_NOEXIST`` will create storage for ``fd`` only if it did not already - exist, otherwise the call will fail with ``-EEXIST``. -- ``BPF_EXIST`` will update existing storage for ``fd`` if it already exists, - otherwise the call will fail with ``-ENOENT``. +The ``flags`` parameter can be used to control the update behaviour: + +- ``BPF_ANY`` will create storage for `socket` ``fd`` or update existing storage. +- ``BPF_NOEXIST`` will create storage for `socket` ``fd`` only if it did not + already exist, otherwise the call will fail with ``-EEXIST``. +- ``BPF_EXIST`` will update existing storage for `socket` ``fd`` if it already + exists, otherwise the call will fail with ``-ENOENT``. Returns ``0`` on success, or negative error in case of failure. @@ -88,10 +91,10 @@ bpf_map_lookup_elem() int bpf_map_lookup_elem(int map_fd, const void *key, void *value) -Socket-local storage for the socket identified by ``key`` belonging to -``map_fd`` can be retrieved using the ``bpf_map_lookup_elem()`` libbpf -function. ``key`` must be a pointer to a valid ``fd`` in the user space -program. Returns ``0`` on success, or negative error in case of failure. +Socket-local storage for map ``map_fd`` can be retrieved from a socket using +the ``bpf_map_lookup_elem()`` libbpf function. The storage is retrieved from +the socket identified by a `socket` ``fd`` stored in the pointer +``key``. Returns ``0`` on success, or negative error in case of failure. bpf_map_delete_elem() ~~~~~~~~~~~~~~~~~~~~~ @@ -100,9 +103,10 @@ bpf_map_delete_elem() int bpf_map_delete_elem(int map_fd, const void *key) -Socket-local storage for the socket identified by ``key`` belonging to -``map_fd`` can be deleted using the ``bpf_map_delete_elem()`` libbpf -function. Returns ``0`` on success, or negative error in case of failure. +Socket-local storage for map ``map_fd`` can be deleted from a socket using the +``bpf_map_delete_elem()`` libbpf function. The storage is deleted from the +socket identified by a `socket` ``fd`` stored in the pointer ``key``. Returns +``0`` on success, or negative error in case of failure. Examples ======== diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index 4b653d040627..fee4d3968309 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -50,6 +50,7 @@ parameters, info versions, and other features it supports. :maxdepth: 1 bnxt + etas_es58x hns3 ionic ice diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst index 1120d71f28d7..49db1d11d7c4 100644 --- a/Documentation/networking/nf_conntrack-sysctl.rst +++ b/Documentation/networking/nf_conntrack-sysctl.rst @@ -163,6 +163,39 @@ nf_conntrack_timestamp - BOOLEAN Enable connection tracking flow timestamping. +nf_conntrack_sctp_timeout_closed - INTEGER (seconds) + default 10 + +nf_conntrack_sctp_timeout_cookie_wait - INTEGER (seconds) + default 3 + +nf_conntrack_sctp_timeout_cookie_echoed - INTEGER (seconds) + default 3 + +nf_conntrack_sctp_timeout_established - INTEGER (seconds) + default 432000 (5 days) + +nf_conntrack_sctp_timeout_shutdown_sent - INTEGER (seconds) + default 0.3 + +nf_conntrack_sctp_timeout_shutdown_recd - INTEGER (seconds) + default 0.3 + +nf_conntrack_sctp_timeout_shutdown_ack_sent - INTEGER (seconds) + default 3 + +nf_conntrack_sctp_timeout_heartbeat_sent - INTEGER (seconds) + default 30 + + This timeout is used to setup conntrack entry on secondary paths. + Default is set to hb_interval. + +nf_conntrack_sctp_timeout_heartbeat_acked - INTEGER (seconds) + default 210 + + This timeout is used to setup conntrack entry on secondary paths. + Default is set to (hb_interval * path_max_retrans + rto_max) + nf_conntrack_udp_timeout - INTEGER (seconds) default 30 diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 0e58a3187345..757f4692b5bd 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -1030,6 +1030,9 @@ randomize: sock.socket->sk->sk_allocation = GFP_NOIO; msock.socket->sk->sk_allocation = GFP_NOIO; + sock.socket->sk->sk_use_task_frag = false; + msock.socket->sk->sk_use_task_frag = false; + sock.socket->sk->sk_priority = TC_PRIO_INTERACTIVE_BULK; msock.socket->sk->sk_priority = TC_PRIO_INTERACTIVE; diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index e379ccc63c52..592cfa8b765a 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -512,6 +512,7 @@ static int sock_xmit(struct nbd_device *nbd, int index, int send, noreclaim_flag = memalloc_noreclaim_save(); do { sock->sk->sk_allocation = GFP_NOIO | __GFP_MEMALLOC; + sock->sk->sk_use_task_frag = false; msg.msg_name = NULL; msg.msg_namelen = 0; msg.msg_control = NULL; diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c b/drivers/isdn/hardware/mISDN/hfcmulti.c index 4f7eaa17fb27..e840609c50eb 100644 --- a/drivers/isdn/hardware/mISDN/hfcmulti.c +++ b/drivers/isdn/hardware/mISDN/hfcmulti.c @@ -3217,6 +3217,7 @@ static int hfcm_l1callback(struct dchannel *dch, u_int cmd) { struct hfc_multi *hc = dch->hw; + struct sk_buff_head free_queue; u_long flags; switch (cmd) { @@ -3245,6 +3246,7 @@ hfcm_l1callback(struct dchannel *dch, u_int cmd) l1_event(dch->l1, HW_POWERUP_IND); break; case HW_DEACT_REQ: + __skb_queue_head_init(&free_queue); /* start deactivation */ spin_lock_irqsave(&hc->lock, flags); if (hc->ctype == HFC_TYPE_E1) { @@ -3264,20 +3266,21 @@ hfcm_l1callback(struct dchannel *dch, u_int cmd) plxsd_checksync(hc, 0); } } - skb_queue_purge(&dch->squeue); + skb_queue_splice_init(&dch->squeue, &free_queue); if (dch->tx_skb) { - dev_kfree_skb(dch->tx_skb); + __skb_queue_tail(&free_queue, dch->tx_skb); dch->tx_skb = NULL; } dch->tx_idx = 0; if (dch->rx_skb) { - dev_kfree_skb(dch->rx_skb); + __skb_queue_tail(&free_queue, dch->rx_skb); dch->rx_skb = NULL; } test_and_clear_bit(FLG_TX_BUSY, &dch->Flags); if (test_and_clear_bit(FLG_BUSY_TIMER, &dch->Flags)) del_timer(&dch->timer); spin_unlock_irqrestore(&hc->lock, flags); + __skb_queue_purge(&free_queue); break; case HW_POWERUP_REQ: spin_lock_irqsave(&hc->lock, flags); @@ -3384,6 +3387,9 @@ handle_dmsg(struct mISDNchannel *ch, struct sk_buff *skb) case PH_DEACTIVATE_REQ: test_and_clear_bit(FLG_L2_ACTIVATED, &dch->Flags); if (dch->dev.D.protocol != ISDN_P_TE_S0) { + struct sk_buff_head free_queue; + + __skb_queue_head_init(&free_queue); spin_lock_irqsave(&hc->lock, flags); if (debug & DEBUG_HFCMULTI_MSG) printk(KERN_DEBUG @@ -3405,14 +3411,14 @@ handle_dmsg(struct mISDNchannel *ch, struct sk_buff *skb) /* deactivate */ dch->state = 1; } - skb_queue_purge(&dch->squeue); + skb_queue_splice_init(&dch->squeue, &free_queue); if (dch->tx_skb) { - dev_kfree_skb(dch->tx_skb); + __skb_queue_tail(&free_queue, dch->tx_skb); dch->tx_skb = NULL; } dch->tx_idx = 0; if (dch->rx_skb) { - dev_kfree_skb(dch->rx_skb); + __skb_queue_tail(&free_queue, dch->rx_skb); dch->rx_skb = NULL; } test_and_clear_bit(FLG_TX_BUSY, &dch->Flags); @@ -3424,6 +3430,7 @@ handle_dmsg(struct mISDNchannel *ch, struct sk_buff *skb) #endif ret = 0; spin_unlock_irqrestore(&hc->lock, flags); + __skb_queue_purge(&free_queue); } else ret = l1_event(dch->l1, hh->prim); break; diff --git a/drivers/isdn/hardware/mISDN/hfcpci.c b/drivers/isdn/hardware/mISDN/hfcpci.c index e964a8dd8512..c0331b268010 100644 --- a/drivers/isdn/hardware/mISDN/hfcpci.c +++ b/drivers/isdn/hardware/mISDN/hfcpci.c @@ -1617,16 +1617,19 @@ hfcpci_l2l1D(struct mISDNchannel *ch, struct sk_buff *skb) test_and_clear_bit(FLG_L2_ACTIVATED, &dch->Flags); spin_lock_irqsave(&hc->lock, flags); if (hc->hw.protocol == ISDN_P_NT_S0) { + struct sk_buff_head free_queue; + + __skb_queue_head_init(&free_queue); /* prepare deactivation */ Write_hfc(hc, HFCPCI_STATES, 0x40); - skb_queue_purge(&dch->squeue); + skb_queue_splice_init(&dch->squeue, &free_queue); if (dch->tx_skb) { - dev_kfree_skb(dch->tx_skb); + __skb_queue_tail(&free_queue, dch->tx_skb); dch->tx_skb = NULL; } dch->tx_idx = 0; if (dch->rx_skb) { - dev_kfree_skb(dch->rx_skb); + __skb_queue_tail(&free_queue, dch->rx_skb); dch->rx_skb = NULL; } test_and_clear_bit(FLG_TX_BUSY, &dch->Flags); @@ -1639,10 +1642,12 @@ hfcpci_l2l1D(struct mISDNchannel *ch, struct sk_buff *skb) hc->hw.mst_m &= ~HFCPCI_MASTER; Write_hfc(hc, HFCPCI_MST_MODE, hc->hw.mst_m); ret = 0; + spin_unlock_irqrestore(&hc->lock, flags); + __skb_queue_purge(&free_queue); } else { ret = l1_event(dch->l1, hh->prim); + spin_unlock_irqrestore(&hc->lock, flags); } - spin_unlock_irqrestore(&hc->lock, flags); break; } if (!ret) diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c index 651f2f8f685b..1efd17979f24 100644 --- a/drivers/isdn/hardware/mISDN/hfcsusb.c +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c @@ -326,20 +326,24 @@ hfcusb_l2l1D(struct mISDNchannel *ch, struct sk_buff *skb) test_and_clear_bit(FLG_L2_ACTIVATED, &dch->Flags); if (hw->protocol == ISDN_P_NT_S0) { + struct sk_buff_head free_queue; + + __skb_queue_head_init(&free_queue); hfcsusb_ph_command(hw, HFC_L1_DEACTIVATE_NT); spin_lock_irqsave(&hw->lock, flags); - skb_queue_purge(&dch->squeue); + skb_queue_splice_init(&dch->squeue, &free_queue); if (dch->tx_skb) { - dev_kfree_skb(dch->tx_skb); + __skb_queue_tail(&free_queue, dch->tx_skb); dch->tx_skb = NULL; } dch->tx_idx = 0; if (dch->rx_skb) { - dev_kfree_skb(dch->rx_skb); + __skb_queue_tail(&free_queue, dch->rx_skb); dch->rx_skb = NULL; } test_and_clear_bit(FLG_TX_BUSY, &dch->Flags); spin_unlock_irqrestore(&hw->lock, flags); + __skb_queue_purge(&free_queue); #ifdef FIXME if (test_and_clear_bit(FLG_L1_BUSY, &dch->Flags)) dchannel_sched_event(&hc->dch, D_CLEARBUSY); @@ -1330,7 +1334,7 @@ tx_iso_complete(struct urb *urb) printk("\n"); } - dev_kfree_skb(tx_skb); + dev_consume_skb_irq(tx_skb); tx_skb = NULL; if (fifo->dch && get_next_dframe(fifo->dch)) tx_skb = fifo->dch->tx_skb; diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index f7767afe116b..b4c65783960a 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2654,8 +2654,9 @@ static void bond_miimon_link_change(struct bonding *bond, static void bond_miimon_commit(struct bonding *bond) { - struct list_head *iter; struct slave *slave, *primary; + bool do_failover = false; + struct list_head *iter; bond_for_each_slave(bond, slave, iter) { switch (slave->link_new_state) { @@ -2699,8 +2700,9 @@ static void bond_miimon_commit(struct bonding *bond) bond_miimon_link_change(bond, slave, BOND_LINK_UP); - if (!bond->curr_active_slave || slave == primary) - goto do_failover; + if (!rcu_access_pointer(bond->curr_active_slave) || slave == primary || + slave->prio > rcu_dereference(bond->curr_active_slave)->prio) + do_failover = true; continue; @@ -2721,7 +2723,7 @@ static void bond_miimon_commit(struct bonding *bond) bond_miimon_link_change(bond, slave, BOND_LINK_DOWN); if (slave == rcu_access_pointer(bond->curr_active_slave)) - goto do_failover; + do_failover = true; continue; @@ -2732,8 +2734,9 @@ static void bond_miimon_commit(struct bonding *bond) continue; } + } -do_failover: + if (do_failover) { block_netpoll_tx(); bond_select_active_slave(bond); unblock_netpoll_tx(); @@ -3531,6 +3534,7 @@ static int bond_ab_arp_inspect(struct bonding *bond) */ static void bond_ab_arp_commit(struct bonding *bond) { + bool do_failover = false; struct list_head *iter; unsigned long last_tx; struct slave *slave; @@ -3560,8 +3564,9 @@ static void bond_ab_arp_commit(struct bonding *bond) slave_info(bond->dev, slave->dev, "link status definitely up\n"); if (!rtnl_dereference(bond->curr_active_slave) || - slave == rtnl_dereference(bond->primary_slave)) - goto do_failover; + slave == rtnl_dereference(bond->primary_slave) || + slave->prio > rtnl_dereference(bond->curr_active_slave)->prio) + do_failover = true; } @@ -3580,7 +3585,7 @@ static void bond_ab_arp_commit(struct bonding *bond) if (slave == rtnl_dereference(bond->curr_active_slave)) { RCU_INIT_POINTER(bond->current_arp_slave, NULL); - goto do_failover; + do_failover = true; } continue; @@ -3604,8 +3609,9 @@ static void bond_ab_arp_commit(struct bonding *bond) slave->link_new_state); continue; } + } -do_failover: + if (do_failover) { block_netpoll_tx(); bond_select_active_slave(bond); unblock_netpoll_tx(); diff --git a/drivers/net/can/flexcan/flexcan-core.c b/drivers/net/can/flexcan/flexcan-core.c index 0aeff34e5ae1..6d638c93977b 100644 --- a/drivers/net/can/flexcan/flexcan-core.c +++ b/drivers/net/can/flexcan/flexcan-core.c @@ -2349,9 +2349,15 @@ static int __maybe_unused flexcan_noirq_resume(struct device *device) if (netif_running(dev)) { int err; - err = pm_runtime_force_resume(device); - if (err) - return err; + /* For the wakeup in auto stop mode, no need to gate on the + * clock here, hardware will do this automatically. + */ + if (!(device_may_wakeup(device) && + priv->devtype_data.quirks & FLEXCAN_QUIRK_AUTO_STOP_MODE)) { + err = pm_runtime_force_resume(device); + if (err) + return err; + } if (device_may_wakeup(device)) flexcan_enable_wakeup_irq(priv, false); diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c index f688124d6d66..ef341c4254fc 100644 --- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c +++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c @@ -545,6 +545,7 @@ static int kvaser_usb_hydra_send_simple_cmd(struct kvaser_usb *dev, u8 cmd_no, int channel) { struct kvaser_cmd *cmd; + size_t cmd_len; int err; cmd = kzalloc(sizeof(*cmd), GFP_KERNEL); @@ -552,6 +553,7 @@ static int kvaser_usb_hydra_send_simple_cmd(struct kvaser_usb *dev, return -ENOMEM; cmd->header.cmd_no = cmd_no; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); if (channel < 0) { kvaser_usb_hydra_set_cmd_dest_he (cmd, KVASER_USB_HYDRA_HE_ADDRESS_ILLEGAL); @@ -568,7 +570,7 @@ static int kvaser_usb_hydra_send_simple_cmd(struct kvaser_usb *dev, kvaser_usb_hydra_set_cmd_transid (cmd, kvaser_usb_hydra_get_next_transid(dev)); - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); if (err) goto end; @@ -584,6 +586,7 @@ kvaser_usb_hydra_send_simple_cmd_async(struct kvaser_usb_net_priv *priv, { struct kvaser_cmd *cmd; struct kvaser_usb *dev = priv->dev; + size_t cmd_len; int err; cmd = kzalloc(sizeof(*cmd), GFP_ATOMIC); @@ -591,14 +594,14 @@ kvaser_usb_hydra_send_simple_cmd_async(struct kvaser_usb_net_priv *priv, return -ENOMEM; cmd->header.cmd_no = cmd_no; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); kvaser_usb_hydra_set_cmd_dest_he (cmd, dev->card_data.hydra.channel_to_he[priv->channel]); kvaser_usb_hydra_set_cmd_transid (cmd, kvaser_usb_hydra_get_next_transid(dev)); - err = kvaser_usb_send_cmd_async(priv, cmd, - kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd_async(priv, cmd, cmd_len); if (err) kfree(cmd); @@ -742,6 +745,7 @@ static int kvaser_usb_hydra_get_single_capability(struct kvaser_usb *dev, { struct kvaser_usb_dev_card_data *card_data = &dev->card_data; struct kvaser_cmd *cmd; + size_t cmd_len; u32 value = 0; u32 mask = 0; u16 cap_cmd_res; @@ -753,13 +757,14 @@ static int kvaser_usb_hydra_get_single_capability(struct kvaser_usb *dev, return -ENOMEM; cmd->header.cmd_no = CMD_GET_CAPABILITIES_REQ; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); cmd->cap_req.cap_cmd = cpu_to_le16(cap_cmd_req); kvaser_usb_hydra_set_cmd_dest_he(cmd, card_data->hydra.sysdbg_he); kvaser_usb_hydra_set_cmd_transid (cmd, kvaser_usb_hydra_get_next_transid(dev)); - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); if (err) goto end; @@ -1578,6 +1583,7 @@ static int kvaser_usb_hydra_get_busparams(struct kvaser_usb_net_priv *priv, struct kvaser_usb *dev = priv->dev; struct kvaser_usb_net_hydra_priv *hydra = priv->sub_priv; struct kvaser_cmd *cmd; + size_t cmd_len; int err; if (!hydra) @@ -1588,6 +1594,7 @@ static int kvaser_usb_hydra_get_busparams(struct kvaser_usb_net_priv *priv, return -ENOMEM; cmd->header.cmd_no = CMD_GET_BUSPARAMS_REQ; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); kvaser_usb_hydra_set_cmd_dest_he (cmd, dev->card_data.hydra.channel_to_he[priv->channel]); kvaser_usb_hydra_set_cmd_transid @@ -1597,7 +1604,7 @@ static int kvaser_usb_hydra_get_busparams(struct kvaser_usb_net_priv *priv, reinit_completion(&priv->get_busparams_comp); - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); if (err) return err; @@ -1624,6 +1631,7 @@ static int kvaser_usb_hydra_set_bittiming(const struct net_device *netdev, struct kvaser_cmd *cmd; struct kvaser_usb_net_priv *priv = netdev_priv(netdev); struct kvaser_usb *dev = priv->dev; + size_t cmd_len; int err; cmd = kzalloc(sizeof(*cmd), GFP_KERNEL); @@ -1631,6 +1639,7 @@ static int kvaser_usb_hydra_set_bittiming(const struct net_device *netdev, return -ENOMEM; cmd->header.cmd_no = CMD_SET_BUSPARAMS_REQ; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); memcpy(&cmd->set_busparams_req.busparams_nominal, busparams, sizeof(cmd->set_busparams_req.busparams_nominal)); @@ -1639,7 +1648,7 @@ static int kvaser_usb_hydra_set_bittiming(const struct net_device *netdev, kvaser_usb_hydra_set_cmd_transid (cmd, kvaser_usb_hydra_get_next_transid(dev)); - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); kfree(cmd); @@ -1652,6 +1661,7 @@ static int kvaser_usb_hydra_set_data_bittiming(const struct net_device *netdev, struct kvaser_cmd *cmd; struct kvaser_usb_net_priv *priv = netdev_priv(netdev); struct kvaser_usb *dev = priv->dev; + size_t cmd_len; int err; cmd = kzalloc(sizeof(*cmd), GFP_KERNEL); @@ -1659,6 +1669,7 @@ static int kvaser_usb_hydra_set_data_bittiming(const struct net_device *netdev, return -ENOMEM; cmd->header.cmd_no = CMD_SET_BUSPARAMS_FD_REQ; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); memcpy(&cmd->set_busparams_req.busparams_data, busparams, sizeof(cmd->set_busparams_req.busparams_data)); @@ -1676,7 +1687,7 @@ static int kvaser_usb_hydra_set_data_bittiming(const struct net_device *netdev, kvaser_usb_hydra_set_cmd_transid (cmd, kvaser_usb_hydra_get_next_transid(dev)); - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); kfree(cmd); @@ -1804,6 +1815,7 @@ static int kvaser_usb_hydra_get_software_info(struct kvaser_usb *dev) static int kvaser_usb_hydra_get_software_details(struct kvaser_usb *dev) { struct kvaser_cmd *cmd; + size_t cmd_len; int err; u32 flags; struct kvaser_usb_dev_card_data *card_data = &dev->card_data; @@ -1813,6 +1825,7 @@ static int kvaser_usb_hydra_get_software_details(struct kvaser_usb *dev) return -ENOMEM; cmd->header.cmd_no = CMD_GET_SOFTWARE_DETAILS_REQ; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); cmd->sw_detail_req.use_ext_cmd = 1; kvaser_usb_hydra_set_cmd_dest_he (cmd, KVASER_USB_HYDRA_HE_ADDRESS_ILLEGAL); @@ -1820,7 +1833,7 @@ static int kvaser_usb_hydra_get_software_details(struct kvaser_usb *dev) kvaser_usb_hydra_set_cmd_transid (cmd, kvaser_usb_hydra_get_next_transid(dev)); - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); if (err) goto end; @@ -1938,6 +1951,7 @@ static int kvaser_usb_hydra_set_opt_mode(const struct kvaser_usb_net_priv *priv) { struct kvaser_usb *dev = priv->dev; struct kvaser_cmd *cmd; + size_t cmd_len; int err; if ((priv->can.ctrlmode & @@ -1953,6 +1967,7 @@ static int kvaser_usb_hydra_set_opt_mode(const struct kvaser_usb_net_priv *priv) return -ENOMEM; cmd->header.cmd_no = CMD_SET_DRIVERMODE_REQ; + cmd_len = kvaser_usb_hydra_cmd_size(cmd); kvaser_usb_hydra_set_cmd_dest_he (cmd, dev->card_data.hydra.channel_to_he[priv->channel]); kvaser_usb_hydra_set_cmd_transid @@ -1962,7 +1977,7 @@ static int kvaser_usb_hydra_set_opt_mode(const struct kvaser_usb_net_priv *priv) else cmd->set_ctrlmode.mode = KVASER_USB_HYDRA_CTRLMODE_NORMAL; - err = kvaser_usb_send_cmd(dev, cmd, kvaser_usb_hydra_cmd_size(cmd)); + err = kvaser_usb_send_cmd(dev, cmd, cmd_len); kfree(cmd); return err; diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 423f944cc34c..9b20c2ee6d62 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -1996,8 +1996,7 @@ static int ksz_irq_common_setup(struct ksz_device *dev, struct ksz_irq *kirq) irq_create_mapping(kirq->domain, n); ret = request_threaded_irq(kirq->irq_num, NULL, ksz_irq_thread_fn, - IRQF_ONESHOT | IRQF_TRIGGER_FALLING, - kirq->name, kirq); + IRQF_ONESHOT, kirq->name, kirq); if (ret) goto out; diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index e74c6b406172..908fa89444c9 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -2919,9 +2919,6 @@ static void mt753x_phylink_get_caps(struct dsa_switch *ds, int port, config->mac_capabilities = MAC_ASYM_PAUSE | MAC_SYM_PAUSE | MAC_10 | MAC_100 | MAC_1000FD; - if ((priv->id == ID_MT7531) && mt753x_is_mac_port(port)) - config->mac_capabilities |= MAC_2500FD; - /* This driver does not make use of the speed, duplex, pause or the * advertisement in its mac_config, so it is safe to mark this driver * as non-legacy. diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index ba4fff8690aa..242b8b325504 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -689,13 +689,12 @@ static void mv88e6352_phylink_get_caps(struct mv88e6xxx_chip *chip, int port, /* Port 4 supports automedia if the serdes is associated with it. */ if (port == 4) { - mv88e6xxx_reg_lock(chip); err = mv88e6352_g2_scratch_port_has_serdes(chip, port); if (err < 0) dev_err(chip->dev, "p%d: failed to read scratch\n", port); if (err <= 0) - goto unlock; + return; cmode = mv88e6352_get_port4_serdes_cmode(chip); if (cmode < 0) @@ -703,8 +702,6 @@ static void mv88e6352_phylink_get_caps(struct mv88e6xxx_chip *chip, int port, port); else mv88e6xxx_translate_cmode(cmode, supported); -unlock: - mv88e6xxx_reg_unlock(chip); } } @@ -831,7 +828,9 @@ static void mv88e6xxx_get_caps(struct dsa_switch *ds, int port, { struct mv88e6xxx_chip *chip = ds->priv; + mv88e6xxx_reg_lock(chip); chip->info->ops->phylink_get_caps(chip, port, config); + mv88e6xxx_reg_unlock(chip); if (mv88e6xxx_phy_is_internal(ds, port)) { __set_bit(PHY_INTERFACE_MODE_INTERNAL, @@ -3307,7 +3306,7 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port) struct phylink_config pl_config = {}; unsigned long caps; - mv88e6xxx_get_caps(ds, port, &pl_config); + chip->info->ops->phylink_get_caps(chip, port, &pl_config); caps = pl_config.mac_capabilities; diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 8671591cb750..3a79ead5219a 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1489,23 +1489,6 @@ static void enetc_xdp_drop(struct enetc_bdr *rx_ring, int rx_ring_first, rx_ring->stats.xdp_drops++; } -static void enetc_xdp_free(struct enetc_bdr *rx_ring, int rx_ring_first, - int rx_ring_last) -{ - while (rx_ring_first != rx_ring_last) { - struct enetc_rx_swbd *rx_swbd = &rx_ring->rx_swbd[rx_ring_first]; - - if (rx_swbd->page) { - dma_unmap_page(rx_ring->dev, rx_swbd->dma, PAGE_SIZE, - rx_swbd->dir); - __free_page(rx_swbd->page); - rx_swbd->page = NULL; - } - enetc_bdr_idx_inc(rx_ring, &rx_ring_first); - } - rx_ring->stats.xdp_redirect_failures++; -} - static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, struct napi_struct *napi, int work_limit, struct bpf_prog *prog) @@ -1527,8 +1510,8 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, int orig_i, orig_cleaned_cnt; struct xdp_buff xdp_buff; struct sk_buff *skb; - int tmp_orig_i, err; u32 bd_status; + int err; rxbd = enetc_rxbd(rx_ring, i); bd_status = le32_to_cpu(rxbd->r.lstatus); @@ -1615,18 +1598,16 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, break; } - tmp_orig_i = orig_i; - - while (orig_i != i) { - enetc_flip_rx_buff(rx_ring, - &rx_ring->rx_swbd[orig_i]); - enetc_bdr_idx_inc(rx_ring, &orig_i); - } - err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); if (unlikely(err)) { - enetc_xdp_free(rx_ring, tmp_orig_i, i); + enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_redirect_failures++; } else { + while (orig_i != i) { + enetc_flip_rx_buff(rx_ring, + &rx_ring->rx_swbd[orig_i]); + enetc_bdr_idx_inc(rx_ring, &orig_i); + } xdp_redirect_frm_cnt++; rx_ring->stats.xdp_redirect++; } diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 5528b0af82ae..644f3c963730 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1674,6 +1674,14 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) * bridging applications. */ skb = build_skb(page_address(page), PAGE_SIZE); + if (unlikely(!skb)) { + page_pool_recycle_direct(rxq->page_pool, page); + ndev->stats.rx_dropped++; + + netdev_err_once(ndev, "build_skb failed!\n"); + goto rx_processing_done; + } + skb_reserve(skb, data_start); skb_put(skb, pkt_len - sub_len); skb_mark_for_recycle(skb); diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 97290fc0fddd..3c0c35ecea10 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -7525,7 +7525,7 @@ static void igb_vf_reset_msg(struct igb_adapter *adapter, u32 vf) { struct e1000_hw *hw = &adapter->hw; unsigned char *vf_mac = adapter->vf_data[vf].vf_mac_addresses; - u32 reg, msgbuf[3]; + u32 reg, msgbuf[3] = {}; u8 *addr = (u8 *)(&msgbuf[1]); /* process all the same items cleared in a function level reset */ diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index 1e7e7071f64d..df3e26c0cf01 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -94,6 +94,8 @@ struct igc_ring { u8 queue_index; /* logical index of the ring*/ u8 reg_idx; /* physical index of the ring */ bool launchtime_enable; /* true if LaunchTime is enabled */ + ktime_t last_tx_cycle; /* end of the cycle with a launchtime transmission */ + ktime_t last_ff_cycle; /* Last cycle with an active first flag */ u32 start_time; u32 end_time; @@ -182,6 +184,7 @@ struct igc_adapter { ktime_t base_time; ktime_t cycle_time; + bool qbv_enable; /* OS defined structs */ struct pci_dev *pdev; diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h index f7311aeb293b..a7b22639cfcd 100644 --- a/drivers/net/ethernet/intel/igc/igc_defines.h +++ b/drivers/net/ethernet/intel/igc/igc_defines.h @@ -321,6 +321,8 @@ #define IGC_ADVTXD_L4LEN_SHIFT 8 /* Adv ctxt L4LEN shift */ #define IGC_ADVTXD_MSS_SHIFT 16 /* Adv ctxt MSS shift */ +#define IGC_ADVTXD_TSN_CNTX_FIRST 0x00000080 + /* Transmit Control */ #define IGC_TCTL_EN 0x00000002 /* enable Tx */ #define IGC_TCTL_PSP 0x00000008 /* pad short packets */ diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 1586e1e435c6..44b1740dc098 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1000,25 +1000,118 @@ static int igc_write_mc_addr_list(struct net_device *netdev) return netdev_mc_count(netdev); } -static __le32 igc_tx_launchtime(struct igc_adapter *adapter, ktime_t txtime) +static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime, + bool *first_flag, bool *insert_empty) { + struct igc_adapter *adapter = netdev_priv(ring->netdev); ktime_t cycle_time = adapter->cycle_time; ktime_t base_time = adapter->base_time; + ktime_t now = ktime_get_clocktai(); + ktime_t baset_est, end_of_cycle; u32 launchtime; + s64 n; - /* FIXME: when using ETF together with taprio, we may have a - * case where 'delta' is larger than the cycle_time, this may - * cause problems if we don't read the current value of - * IGC_BASET, as the value writen into the launchtime - * descriptor field may be misinterpreted. + n = div64_s64(ktime_sub_ns(now, base_time), cycle_time); + + baset_est = ktime_add_ns(base_time, cycle_time * (n)); + end_of_cycle = ktime_add_ns(baset_est, cycle_time); + + if (ktime_compare(txtime, end_of_cycle) >= 0) { + if (baset_est != ring->last_ff_cycle) { + *first_flag = true; + ring->last_ff_cycle = baset_est; + + if (ktime_compare(txtime, ring->last_tx_cycle) > 0) + *insert_empty = true; + } + } + + /* Introducing a window at end of cycle on which packets + * potentially not honor launchtime. Window of 5us chosen + * considering software update the tail pointer and packets + * are dma'ed to packet buffer. */ - div_s64_rem(ktime_sub_ns(txtime, base_time), cycle_time, &launchtime); + if ((ktime_sub_ns(end_of_cycle, now) < 5 * NSEC_PER_USEC)) + netdev_warn(ring->netdev, "Packet with txtime=%llu may not be honoured\n", + txtime); + + ring->last_tx_cycle = end_of_cycle; + + launchtime = ktime_sub_ns(txtime, baset_est); + if (launchtime > 0) + div_s64_rem(launchtime, cycle_time, &launchtime); + else + launchtime = 0; return cpu_to_le32(launchtime); } +static int igc_init_empty_frame(struct igc_ring *ring, + struct igc_tx_buffer *buffer, + struct sk_buff *skb) +{ + unsigned int size; + dma_addr_t dma; + + size = skb_headlen(skb); + + dma = dma_map_single(ring->dev, skb->data, size, DMA_TO_DEVICE); + if (dma_mapping_error(ring->dev, dma)) { + netdev_err_once(ring->netdev, "Failed to map DMA for TX\n"); + return -ENOMEM; + } + + buffer->skb = skb; + buffer->protocol = 0; + buffer->bytecount = skb->len; + buffer->gso_segs = 1; + buffer->time_stamp = jiffies; + dma_unmap_len_set(buffer, len, skb->len); + dma_unmap_addr_set(buffer, dma, dma); + + return 0; +} + +static int igc_init_tx_empty_descriptor(struct igc_ring *ring, + struct sk_buff *skb, + struct igc_tx_buffer *first) +{ + union igc_adv_tx_desc *desc; + u32 cmd_type, olinfo_status; + int err; + + if (!igc_desc_unused(ring)) + return -EBUSY; + + err = igc_init_empty_frame(ring, first, skb); + if (err) + return err; + + cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT | + IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD | + first->bytecount; + olinfo_status = first->bytecount << IGC_ADVTXD_PAYLEN_SHIFT; + + desc = IGC_TX_DESC(ring, ring->next_to_use); + desc->read.cmd_type_len = cpu_to_le32(cmd_type); + desc->read.olinfo_status = cpu_to_le32(olinfo_status); + desc->read.buffer_addr = cpu_to_le64(dma_unmap_addr(first, dma)); + + netdev_tx_sent_queue(txring_txq(ring), skb->len); + + first->next_to_watch = desc; + + ring->next_to_use++; + if (ring->next_to_use == ring->count) + ring->next_to_use = 0; + + return 0; +} + +#define IGC_EMPTY_FRAME_SIZE 60 + static void igc_tx_ctxtdesc(struct igc_ring *tx_ring, - struct igc_tx_buffer *first, + __le32 launch_time, bool first_flag, u32 vlan_macip_lens, u32 type_tucmd, u32 mss_l4len_idx) { @@ -1037,26 +1130,17 @@ static void igc_tx_ctxtdesc(struct igc_ring *tx_ring, if (test_bit(IGC_RING_FLAG_TX_CTX_IDX, &tx_ring->flags)) mss_l4len_idx |= tx_ring->reg_idx << 4; + if (first_flag) + mss_l4len_idx |= IGC_ADVTXD_TSN_CNTX_FIRST; + context_desc->vlan_macip_lens = cpu_to_le32(vlan_macip_lens); context_desc->type_tucmd_mlhl = cpu_to_le32(type_tucmd); context_desc->mss_l4len_idx = cpu_to_le32(mss_l4len_idx); - - /* We assume there is always a valid Tx time available. Invalid times - * should have been handled by the upper layers. - */ - if (tx_ring->launchtime_enable) { - struct igc_adapter *adapter = netdev_priv(tx_ring->netdev); - ktime_t txtime = first->skb->tstamp; - - skb_txtime_consumed(first->skb); - context_desc->launch_time = igc_tx_launchtime(adapter, - txtime); - } else { - context_desc->launch_time = 0; - } + context_desc->launch_time = launch_time; } -static void igc_tx_csum(struct igc_ring *tx_ring, struct igc_tx_buffer *first) +static void igc_tx_csum(struct igc_ring *tx_ring, struct igc_tx_buffer *first, + __le32 launch_time, bool first_flag) { struct sk_buff *skb = first->skb; u32 vlan_macip_lens = 0; @@ -1096,7 +1180,8 @@ no_csum: vlan_macip_lens |= skb_network_offset(skb) << IGC_ADVTXD_MACLEN_SHIFT; vlan_macip_lens |= first->tx_flags & IGC_TX_FLAGS_VLAN_MASK; - igc_tx_ctxtdesc(tx_ring, first, vlan_macip_lens, type_tucmd, 0); + igc_tx_ctxtdesc(tx_ring, launch_time, first_flag, + vlan_macip_lens, type_tucmd, 0); } static int __igc_maybe_stop_tx(struct igc_ring *tx_ring, const u16 size) @@ -1320,6 +1405,7 @@ dma_error: static int igc_tso(struct igc_ring *tx_ring, struct igc_tx_buffer *first, + __le32 launch_time, bool first_flag, u8 *hdr_len) { u32 vlan_macip_lens, type_tucmd, mss_l4len_idx; @@ -1406,8 +1492,8 @@ static int igc_tso(struct igc_ring *tx_ring, vlan_macip_lens |= (ip.hdr - skb->data) << IGC_ADVTXD_MACLEN_SHIFT; vlan_macip_lens |= first->tx_flags & IGC_TX_FLAGS_VLAN_MASK; - igc_tx_ctxtdesc(tx_ring, first, vlan_macip_lens, - type_tucmd, mss_l4len_idx); + igc_tx_ctxtdesc(tx_ring, launch_time, first_flag, + vlan_macip_lens, type_tucmd, mss_l4len_idx); return 1; } @@ -1415,11 +1501,14 @@ static int igc_tso(struct igc_ring *tx_ring, static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, struct igc_ring *tx_ring) { + bool first_flag = false, insert_empty = false; u16 count = TXD_USE_COUNT(skb_headlen(skb)); __be16 protocol = vlan_get_protocol(skb); struct igc_tx_buffer *first; + __le32 launch_time = 0; u32 tx_flags = 0; unsigned short f; + ktime_t txtime; u8 hdr_len = 0; int tso = 0; @@ -1433,11 +1522,40 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, count += TXD_USE_COUNT(skb_frag_size( &skb_shinfo(skb)->frags[f])); - if (igc_maybe_stop_tx(tx_ring, count + 3)) { + if (igc_maybe_stop_tx(tx_ring, count + 5)) { /* this is a hard error */ return NETDEV_TX_BUSY; } + if (!tx_ring->launchtime_enable) + goto done; + + txtime = skb->tstamp; + skb->tstamp = ktime_set(0, 0); + launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty); + + if (insert_empty) { + struct igc_tx_buffer *empty_info; + struct sk_buff *empty; + void *data; + + empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; + empty = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); + if (!empty) + goto done; + + data = skb_put(empty, IGC_EMPTY_FRAME_SIZE); + memset(data, 0, IGC_EMPTY_FRAME_SIZE); + + igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); + + if (igc_init_tx_empty_descriptor(tx_ring, + empty, + empty_info) < 0) + dev_kfree_skb_any(empty); + } + +done: /* record the location of the first descriptor for this packet */ first = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; first->type = IGC_TX_BUFFER_TYPE_SKB; @@ -1474,11 +1592,11 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, first->tx_flags = tx_flags; first->protocol = protocol; - tso = igc_tso(tx_ring, first, &hdr_len); + tso = igc_tso(tx_ring, first, launch_time, first_flag, &hdr_len); if (tso < 0) goto out_drop; else if (!tso) - igc_tx_csum(tx_ring, first); + igc_tx_csum(tx_ring, first, launch_time, first_flag); igc_tx_map(tx_ring, first, hdr_len); @@ -5925,10 +6043,16 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, bool queue_configured[IGC_MAX_TX_QUEUES] = { }; u32 start_time = 0, end_time = 0; size_t n; + int i; + + adapter->qbv_enable = qopt->enable; if (!qopt->enable) return igc_tsn_clear_schedule(adapter); + if (qopt->base_time < 0) + return -ERANGE; + if (adapter->base_time) return -EALREADY; @@ -5940,10 +6064,24 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, for (n = 0; n < qopt->num_entries; n++) { struct tc_taprio_sched_entry *e = &qopt->entries[n]; - int i; end_time += e->interval; + /* If any of the conditions below are true, we need to manually + * control the end time of the cycle. + * 1. Qbv users can specify a cycle time that is not equal + * to the total GCL intervals. Hence, recalculation is + * necessary here to exclude the time interval that + * exceeds the cycle time. + * 2. According to IEEE Std. 802.1Q-2018 section 8.6.9.2, + * once the end of the list is reached, it will switch + * to the END_OF_CYCLE state and leave the gates in the + * same state until the next cycle is started. + */ + if (end_time > adapter->cycle_time || + n + 1 == qopt->num_entries) + end_time = adapter->cycle_time; + for (i = 0; i < adapter->num_tx_queues; i++) { struct igc_ring *ring = adapter->tx_ring[i]; @@ -5964,6 +6102,18 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, start_time += e->interval; } + /* Check whether a queue gets configured. + * If not, set the start and end time to be end time. + */ + for (i = 0; i < adapter->num_tx_queues; i++) { + if (!queue_configured[i]) { + struct igc_ring *ring = adapter->tx_ring[i]; + + ring->start_time = end_time; + ring->end_time = end_time; + } + } + return 0; } diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c index f975ed807da1..bb10d7b65232 100644 --- a/drivers/net/ethernet/intel/igc/igc_tsn.c +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c @@ -36,7 +36,7 @@ static unsigned int igc_tsn_new_flags(struct igc_adapter *adapter) { unsigned int new_flags = adapter->flags & ~IGC_FLAG_TSN_ANY_ENABLED; - if (adapter->base_time) + if (adapter->qbv_enable) new_flags |= IGC_FLAG_TSN_QBV_ENABLED; if (is_any_launchtime(adapter)) @@ -140,15 +140,8 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter) wr32(IGC_STQT(i), ring->start_time); wr32(IGC_ENDQT(i), ring->end_time); - if (adapter->base_time) { - /* If we have a base_time we are in "taprio" - * mode and we need to be strict about the - * cycles: only transmit a packet if it can be - * completed during that cycle. - */ - txqctl |= IGC_TXQCTL_STRICT_CYCLE | - IGC_TXQCTL_STRICT_END; - } + txqctl |= IGC_TXQCTL_STRICT_CYCLE | + IGC_TXQCTL_STRICT_END; if (ring->launchtime_enable) txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT; diff --git a/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs.c b/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs.c index 895bfff550d2..e0b206247f2e 100644 --- a/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs.c +++ b/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs.c @@ -83,6 +83,8 @@ static void vcap_debugfs_show_rule_keyfield(struct vcap_control *vctrl, hex = true; break; case VCAP_FIELD_U128: + value = data->u128.value; + mask = data->u128.mask; if (key == VCAP_KF_L3_IP6_SIP || key == VCAP_KF_L3_IP6_DIP) { u8 nvalue[16], nmask[16]; diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c index 8073d7a90a26..c5687d94ea88 100644 --- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c +++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c @@ -3912,6 +3912,7 @@ abort_with_slices: myri10ge_free_slices(mgp); abort_with_firmware: + kfree(mgp->msix_vectors); myri10ge_dummy_rdma(mgp, 0); abort_with_ioremap: diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c index 2314cf55e821..09053373288f 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -2509,7 +2509,7 @@ static int nfp_net_read_caps(struct nfp_net *nn) { /* Get some of the read-only fields from the BAR */ nn->cap = nn_readl(nn, NFP_NET_CFG_CAP); - nn->cap_w1 = nn_readq(nn, NFP_NET_CFG_CAP_WORD1); + nn->cap_w1 = nn_readl(nn, NFP_NET_CFG_CAP_WORD1); nn->max_mtu = nn_readl(nn, NFP_NET_CFG_MAX_MTU); /* ABI 4.x and ctrl vNIC always use chained metadata, in other cases diff --git a/drivers/net/ethernet/rdc/r6040.c b/drivers/net/ethernet/rdc/r6040.c index eecd52ed1ed2..f4d434c379e7 100644 --- a/drivers/net/ethernet/rdc/r6040.c +++ b/drivers/net/ethernet/rdc/r6040.c @@ -1159,10 +1159,12 @@ static int r6040_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) err = register_netdev(dev); if (err) { dev_err(&pdev->dev, "Failed to register net device\n"); - goto err_out_mdio_unregister; + goto err_out_phy_disconnect; } return 0; +err_out_phy_disconnect: + phy_disconnect(dev->phydev); err_out_mdio_unregister: mdiobus_unregister(lp->mii_bus); err_out_mdio: @@ -1186,6 +1188,7 @@ static void r6040_remove_one(struct pci_dev *pdev) struct r6040_private *lp = netdev_priv(dev); unregister_netdev(dev); + phy_disconnect(dev->phydev); mdiobus_unregister(lp->mii_bus); mdiobus_free(lp->mii_bus); netif_napi_del(&lp->napi); diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 33f723a9f471..b4e0fc7f65bd 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -2903,12 +2903,12 @@ static int ravb_remove(struct platform_device *pdev) priv->desc_bat_dma); /* Set reset mode */ ravb_write(ndev, CCC_OPC_RESET, CCC); - pm_runtime_put_sync(&pdev->dev); unregister_netdev(ndev); if (info->nc_queues) netif_napi_del(&priv->napi[RAVB_NC]); netif_napi_del(&priv->napi[RAVB_BE]); ravb_mdio_release(priv); + pm_runtime_put_sync(&pdev->dev); pm_runtime_disable(&pdev->dev); reset_control_assert(priv->rstc); free_netdev(ndev); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index ec64b65dee34..c6951c976f5d 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -7099,6 +7099,7 @@ int stmmac_dvr_probe(struct device *device, priv->wq = create_singlethread_workqueue("stmmac_wq"); if (!priv->wq) { dev_err(priv->device, "failed to create workqueue\n"); + ret = -ENOMEM; goto error_wq_init; } diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index 9decb0c7961b..ecbde83b5243 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -2878,7 +2878,6 @@ static int am65_cpsw_nuss_remove(struct platform_device *pdev) return 0; } -#ifdef CONFIG_PM_SLEEP static int am65_cpsw_nuss_suspend(struct device *dev) { struct am65_cpsw_common *common = dev_get_drvdata(dev); @@ -2964,10 +2963,9 @@ static int am65_cpsw_nuss_resume(struct device *dev) return 0; } -#endif /* CONFIG_PM_SLEEP */ static const struct dev_pm_ops am65_cpsw_nuss_dev_pm_ops = { - SET_SYSTEM_SLEEP_PM_OPS(am65_cpsw_nuss_suspend, am65_cpsw_nuss_resume) + SYSTEM_SLEEP_PM_OPS(am65_cpsw_nuss_suspend, am65_cpsw_nuss_resume) }; static struct platform_driver am65_cpsw_nuss_driver = { diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 937f5b1f04ff..bf8ac7a3ded7 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -2593,7 +2593,7 @@ static int macsec_upd_offload(struct sk_buff *skb, struct genl_info *info) const struct macsec_ops *ops; struct macsec_context ctx; struct macsec_dev *macsec; - int ret; + int ret = 0; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -2606,28 +2606,36 @@ static int macsec_upd_offload(struct sk_buff *skb, struct genl_info *info) macsec_genl_offload_policy, NULL)) return -EINVAL; + rtnl_lock(); + dev = get_dev_from_nl(genl_info_net(info), attrs); - if (IS_ERR(dev)) - return PTR_ERR(dev); + if (IS_ERR(dev)) { + ret = PTR_ERR(dev); + goto out; + } macsec = macsec_priv(dev); - if (!tb_offload[MACSEC_OFFLOAD_ATTR_TYPE]) - return -EINVAL; + if (!tb_offload[MACSEC_OFFLOAD_ATTR_TYPE]) { + ret = -EINVAL; + goto out; + } offload = nla_get_u8(tb_offload[MACSEC_OFFLOAD_ATTR_TYPE]); if (macsec->offload == offload) - return 0; + goto out; /* Check if the offloading mode is supported by the underlying layers */ if (offload != MACSEC_OFFLOAD_OFF && - !macsec_check_offload(offload, macsec)) - return -EOPNOTSUPP; + !macsec_check_offload(offload, macsec)) { + ret = -EOPNOTSUPP; + goto out; + } /* Check if the net device is busy. */ - if (netif_running(dev)) - return -EBUSY; - - rtnl_lock(); + if (netif_running(dev)) { + ret = -EBUSY; + goto out; + } prev_offload = macsec->offload; macsec->offload = offload; @@ -2662,7 +2670,7 @@ static int macsec_upd_offload(struct sk_buff *skb, struct genl_info *info) rollback: macsec->offload = prev_offload; - +out: rtnl_unlock(); return ret; } diff --git a/drivers/net/mctp/mctp-serial.c b/drivers/net/mctp/mctp-serial.c index 7cd103fd34ef..9f9eaf896047 100644 --- a/drivers/net/mctp/mctp-serial.c +++ b/drivers/net/mctp/mctp-serial.c @@ -35,6 +35,8 @@ #define BYTE_FRAME 0x7e #define BYTE_ESC 0x7d +#define FCS_INIT 0xffff + static DEFINE_IDA(mctp_serial_ida); enum mctp_serial_state { @@ -123,7 +125,7 @@ static void mctp_serial_tx_work(struct work_struct *work) buf[2] = dev->txlen; if (!dev->txpos) - dev->txfcs = crc_ccitt(0, buf + 1, 2); + dev->txfcs = crc_ccitt(FCS_INIT, buf + 1, 2); txlen = write_chunk(dev, buf + dev->txpos, 3 - dev->txpos); if (txlen <= 0) { @@ -303,7 +305,7 @@ static void mctp_serial_push_header(struct mctp_serial *dev, unsigned char c) case 1: if (c == MCTP_SERIAL_VERSION) { dev->rxpos++; - dev->rxfcs = crc_ccitt_byte(0, c); + dev->rxfcs = crc_ccitt_byte(FCS_INIT, c); } else { dev->rxstate = STATE_ERR; } diff --git a/drivers/net/wireguard/timers.c b/drivers/net/wireguard/timers.c index b5706b6718b1..53d8a57a0dfa 100644 --- a/drivers/net/wireguard/timers.c +++ b/drivers/net/wireguard/timers.c @@ -46,7 +46,7 @@ static void wg_expired_retransmit_handshake(struct timer_list *timer) if (peer->timer_handshake_attempts > MAX_TIMER_HANDSHAKES) { pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d attempts, giving up\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, MAX_TIMER_HANDSHAKES + 2); + &peer->endpoint.addr, (int)MAX_TIMER_HANDSHAKES + 2); del_timer(&peer->timer_send_keepalive); /* We drop all packets without a keypair and don't try again, @@ -64,7 +64,7 @@ static void wg_expired_retransmit_handshake(struct timer_list *timer) ++peer->timer_handshake_attempts; pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d seconds, retrying (try %d)\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, REKEY_TIMEOUT, + &peer->endpoint.addr, (int)REKEY_TIMEOUT, peer->timer_handshake_attempts + 1); /* We clear the endpoint address src address, in case this is @@ -94,7 +94,7 @@ static void wg_expired_new_handshake(struct timer_list *timer) pr_debug("%s: Retrying handshake with peer %llu (%pISpfsc) because we stopped hearing back after %d seconds\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, KEEPALIVE_TIMEOUT + REKEY_TIMEOUT); + &peer->endpoint.addr, (int)(KEEPALIVE_TIMEOUT + REKEY_TIMEOUT)); /* We clear the endpoint address src address, in case this is the cause * of trouble. */ @@ -126,7 +126,7 @@ static void wg_queued_expired_zero_key_material(struct work_struct *work) pr_debug("%s: Zeroing out all keys for peer %llu (%pISpfsc), since we haven't received a new one in %d seconds\n", peer->device->dev->name, peer->internal_id, - &peer->endpoint.addr, REJECT_AFTER_TIME * 3); + &peer->endpoint.addr, (int)REJECT_AFTER_TIME * 3); wg_noise_handshake_clear(&peer->handshake); wg_noise_keypairs_clear(&peer->keypairs); wg_peer_put(peer); diff --git a/drivers/nfc/pn533/pn533.c b/drivers/nfc/pn533/pn533.c index d9f6367b9993..f0cac1900552 100644 --- a/drivers/nfc/pn533/pn533.c +++ b/drivers/nfc/pn533/pn533.c @@ -1295,6 +1295,8 @@ static int pn533_poll_dep_complete(struct pn533 *dev, void *arg, if (IS_ERR(resp)) return PTR_ERR(resp); + memset(&nfc_target, 0, sizeof(struct nfc_target)); + rsp = (struct pn533_cmd_jump_dep_response *)resp->data; rc = rsp->status & PN533_CMD_RET_MASK; @@ -1926,6 +1928,8 @@ static int pn533_in_dep_link_up_complete(struct pn533 *dev, void *arg, dev_dbg(dev->dev, "Creating new target\n"); + memset(&nfc_target, 0, sizeof(struct nfc_target)); + nfc_target.supported_protocols = NFC_PROTO_NFC_DEP_MASK; nfc_target.nfcid1_len = 10; memcpy(nfc_target.nfcid1, rsp->nfcid3t, nfc_target.nfcid1_len); diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index b69b89166b6b..8cedc1ef496c 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1537,6 +1537,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid) queue->sock->sk->sk_rcvtimeo = 10 * HZ; queue->sock->sk->sk_allocation = GFP_ATOMIC; + queue->sock->sk->sk_use_task_frag = false; nvme_tcp_set_queue_io_cpu(queue); queue->request = NULL; queue->data_remaining = 0; diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index 5fb1f364e815..1d1cf641937c 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -738,6 +738,7 @@ iscsi_sw_tcp_conn_bind(struct iscsi_cls_session *cls_session, sk->sk_reuse = SK_CAN_REUSE; sk->sk_sndtimeo = 15 * HZ; /* FIXME: make it configurable */ sk->sk_allocation = GFP_ATOMIC; + sk->sk_use_task_frag = false; sk_set_memalloc(sk); sock_no_linger(sk); diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c index f8b326eed54d..a2b2da1255dd 100644 --- a/drivers/usb/usbip/usbip_common.c +++ b/drivers/usb/usbip/usbip_common.c @@ -315,6 +315,7 @@ int usbip_recv(struct socket *sock, void *buf, int size) do { sock->sk->sk_allocation = GFP_NOIO; + sock->sk->sk_use_task_frag = false; result = sock_recvmsg(sock, &msg, MSG_WAITALL); if (result <= 0) diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index e80252a83225..7bc7b5e03c51 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -2944,6 +2944,7 @@ generic_ip_connect(struct TCP_Server_Info *server) cifs_dbg(FYI, "Socket created\n"); server->ssocket = socket; socket->sk->sk_allocation = GFP_NOFS; + socket->sk->sk_use_task_frag = false; if (sfamily == AF_INET6) cifs_reclassify_socket6(socket); else diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index 8b80ca0cd65f..4450721ec83c 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -645,6 +645,7 @@ static void add_sock(struct socket *sock, struct connection *con) if (dlm_config.ci_protocol == DLM_PROTO_SCTP) sk->sk_state_change = lowcomms_state_change; sk->sk_allocation = GFP_NOFS; + sk->sk_use_task_frag = false; sk->sk_error_report = lowcomms_error_report; release_sock(sk); } @@ -1769,6 +1770,7 @@ static int dlm_listen_for_all(void) listen_con.sock = sock; sock->sk->sk_allocation = GFP_NOFS; + sock->sk->sk_use_task_frag = false; sock->sk->sk_data_ready = lowcomms_listen_data_ready; release_sock(sock->sk); diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c index 37d222bdfc8c..a07b24d170f2 100644 --- a/fs/ocfs2/cluster/tcp.c +++ b/fs/ocfs2/cluster/tcp.c @@ -1602,6 +1602,7 @@ static void o2net_start_connect(struct work_struct *work) sc->sc_sock = sock; /* freed by sc_kref_release */ sock->sk->sk_allocation = GFP_ATOMIC; + sock->sk->sk_use_task_frag = false; myaddr.sin_family = AF_INET; myaddr.sin_addr.s_addr = mynode->nd_ipv4_address; diff --git a/include/net/sock.h b/include/net/sock.h index ecea3dcc2217..dcd72e6285b2 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -318,6 +318,9 @@ struct sk_filter; * @sk_stamp: time stamp of last packet received * @sk_stamp_seq: lock for accessing sk_stamp on 32 bit architectures only * @sk_tsflags: SO_TIMESTAMPING flags + * @sk_use_task_frag: allow sk_page_frag() to use current->task_frag. + * Sockets that can be used under memory reclaim should + * set this to false. * @sk_bind_phc: SO_TIMESTAMPING bind PHC index of PTP virtual clock * for timestamping * @sk_tskey: counter to disambiguate concurrent tstamp requests @@ -512,6 +515,7 @@ struct sock { u8 sk_txtime_deadline_mode : 1, sk_txtime_report_errors : 1, sk_txtime_unused : 6; + bool sk_use_task_frag; struct socket *sk_socket; void *sk_user_data; @@ -2560,16 +2564,14 @@ static inline void sk_stream_moderate_sndbuf(struct sock *sk) * Both direct reclaim and page faults can nest inside other * socket operations and end up recursing into sk_page_frag() * while it's already in use: explicitly avoid task page_frag - * usage if the caller is potentially doing any of them. - * This assumes that page fault handlers use the GFP_NOFS flags. + * when users disable sk_use_task_frag. * * Return: a per task page_frag if context allows that, * otherwise a per socket one. */ static inline struct page_frag *sk_page_frag(struct sock *sk) { - if ((sk->sk_allocation & (__GFP_DIRECT_RECLAIM | __GFP_MEMALLOC | __GFP_FS)) == - (__GFP_DIRECT_RECLAIM | __GFP_FS)) + if (sk->sk_use_task_frag) return ¤t->task_frag; return &sk->sk_frag; diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index 049b52e7aa6a..c6cfed00d0c6 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -471,7 +471,7 @@ TRACE_EVENT(rxrpc_peer, TP_STRUCT__entry( __field(unsigned int, peer ) __field(int, ref ) - __field(int, why ) + __field(enum rxrpc_peer_trace, why ) ), TP_fast_assign( diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 6cca66b39d01..ba3fff17e2f9 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2090,6 +2090,7 @@ static unsigned int __bpf_prog_ret0_warn(const void *ctx, bool bpf_prog_map_compatible(struct bpf_map *map, const struct bpf_prog *fp) { + enum bpf_prog_type prog_type = resolve_prog_type(fp); bool ret; if (fp->kprobe_override) @@ -2100,12 +2101,12 @@ bool bpf_prog_map_compatible(struct bpf_map *map, /* There's no owner yet where we could check for * compatibility. */ - map->owner.type = fp->type; + map->owner.type = prog_type; map->owner.jited = fp->jited; map->owner.xdp_has_frags = fp->aux->xdp_has_frags; ret = true; } else { - ret = map->owner.type == fp->type && + ret = map->owner.type == prog_type && map->owner.jited == fp->jited && map->owner.xdp_has_frags == fp->aux->xdp_has_frags; } diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c index c19719f48ce0..fa3e9225aedc 100644 --- a/kernel/bpf/dispatcher.c +++ b/kernel/bpf/dispatcher.c @@ -125,6 +125,11 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs) __BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func); + /* Make sure all the callers executing the previous/old half of the + * image leave it, so following update call can modify it safely. + */ + synchronize_rcu(); + if (new) d->image_off = noff; } diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 35972afb6850..64131f88c553 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3518,9 +3518,9 @@ static int bpf_prog_attach(const union bpf_attr *attr) case BPF_PROG_TYPE_LSM: if (ptype == BPF_PROG_TYPE_LSM && prog->expected_attach_type != BPF_LSM_CGROUP) - return -EINVAL; - - ret = cgroup_bpf_prog_attach(attr, ptype, prog); + ret = -EINVAL; + else + ret = cgroup_bpf_prog_attach(attr, ptype, prog); break; default: ret = -EINVAL; diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 07db2f436d44..d9120f14684b 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -868,6 +868,7 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket) } csocket->sk->sk_allocation = GFP_NOIO; + csocket->sk->sk_use_task_frag = false; file = sock_alloc_file(csocket, 0, NULL); if (IS_ERR(file)) { pr_err("%s (%d): failed to map fd\n", diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index dfa237fbd5a3..1d06e114ba3f 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -446,6 +446,7 @@ int ceph_tcp_connect(struct ceph_connection *con) if (ret) return ret; sock->sk->sk_allocation = GFP_NOFS; + sock->sk->sk_use_task_frag = false; #ifdef CONFIG_LOCKDEP lockdep_set_class(&sock->sk->sk_lock, &socket_class); diff --git a/net/core/devlink.c b/net/core/devlink.c index 6004bd0ccee4..032d6d0a5ce6 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -1648,10 +1648,13 @@ static int devlink_nl_cmd_get_dumpit(struct sk_buff *msg, continue; } + devl_lock(devlink); err = devlink_nl_fill(msg, devlink, DEVLINK_CMD_NEW, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NLM_F_MULTI); + devl_unlock(devlink); devlink_put(devlink); + if (err) goto out; idx++; @@ -11925,8 +11928,10 @@ void devl_region_destroy(struct devlink_region *region) devl_assert_locked(devlink); /* Free all snapshots of region */ + mutex_lock(®ion->snapshot_lock); list_for_each_entry_safe(snapshot, ts, ®ion->snapshot_list, list) devlink_region_snapshot_del(region, snapshot); + mutex_unlock(®ion->snapshot_lock); list_del(®ion->list); mutex_destroy(®ion->snapshot_lock); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 3cbba7099c0f..4a0eb5593275 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2482,6 +2482,9 @@ void *__pskb_pull_tail(struct sk_buff *skb, int delta) insp = list; } else { /* Eaten partially. */ + if (skb_is_gso(skb) && !list->head_frag && + skb_headlen(list)) + skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY; if (skb_shared(list)) { /* Sucks! We need to fork list. :-( */ diff --git a/net/core/sock.c b/net/core/sock.c index d2587d8712db..f954d5893e79 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3390,6 +3390,7 @@ void sock_init_data(struct socket *sock, struct sock *sk) sk->sk_rcvbuf = READ_ONCE(sysctl_rmem_default); sk->sk_sndbuf = READ_ONCE(sysctl_wmem_default); sk->sk_state = TCP_CLOSE; + sk->sk_use_task_frag = true; sk_set_socket(sk, sock); sock_set_flag(sk, SOCK_ZAPPED); diff --git a/net/core/stream.c b/net/core/stream.c index 5b1fe2b82eac..cd06750dd329 100644 --- a/net/core/stream.c +++ b/net/core/stream.c @@ -196,6 +196,12 @@ void sk_stream_kill_queues(struct sock *sk) /* First the read buffer. */ __skb_queue_purge(&sk->sk_receive_queue); + /* Next, the error queue. + * We need to use queue lock, because other threads might + * add packets to the queue without socket lock being held. + */ + skb_queue_purge(&sk->sk_error_queue); + /* Next, the write queue. */ WARN_ON_ONCE(!skb_queue_empty(&sk->sk_write_queue)); diff --git a/net/mctp/device.c b/net/mctp/device.c index 99a3bda8852f..acb97b257428 100644 --- a/net/mctp/device.c +++ b/net/mctp/device.c @@ -429,12 +429,6 @@ static void mctp_unregister(struct net_device *dev) struct mctp_dev *mdev; mdev = mctp_dev_get_rtnl(dev); - if (mdev && !mctp_known(dev)) { - // Sanity check, should match what was set in mctp_register - netdev_warn(dev, "%s: BUG mctp_ptr set for unknown type %d", - __func__, dev->type); - return; - } if (!mdev) return; @@ -451,14 +445,8 @@ static int mctp_register(struct net_device *dev) struct mctp_dev *mdev; /* Already registered? */ - mdev = rtnl_dereference(dev->mctp_ptr); - - if (mdev) { - if (!mctp_known(dev)) - netdev_warn(dev, "%s: BUG mctp_ptr set for unknown type %d", - __func__, dev->type); + if (rtnl_dereference(dev->mctp_ptr)) return 0; - } /* only register specific types */ if (!mctp_known(dev)) diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index c9f598505642..2a5ed71c82c3 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2841,6 +2841,11 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, sockptr_t ptr, unsigned int len) break; case IP_VS_SO_SET_DELDEST: ret = ip_vs_del_dest(svc, &udest); + break; + default: + WARN_ON_ONCE(1); + ret = -EINVAL; + break; } out_unlock: diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c index 0fdcdb2c9ae4..4d9b99abe37d 100644 --- a/net/netfilter/nf_flow_table_offload.c +++ b/net/netfilter/nf_flow_table_offload.c @@ -383,12 +383,12 @@ static void flow_offload_ipv6_mangle(struct nf_flow_rule *flow_rule, const __be32 *addr, const __be32 *mask) { struct flow_action_entry *entry; - int i, j; + int i; - for (i = 0, j = 0; i < sizeof(struct in6_addr) / sizeof(u32); i += sizeof(u32), j++) { + for (i = 0; i < sizeof(struct in6_addr) / sizeof(u32); i++) { entry = flow_action_entry_next(flow_rule); flow_offload_mangle(entry, FLOW_ACT_MANGLE_HDR_TYPE_IP6, - offset + i, &addr[j], mask); + offset + i * sizeof(u32), &addr[i], mask); } } diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index 932bcf766d63..9ca721c9fa71 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -973,6 +973,7 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info) struct sw_flow_mask mask; struct sk_buff *reply; struct datapath *dp; + struct sw_flow_key *key; struct sw_flow_actions *acts; struct sw_flow_match match; u32 ufid_flags = ovs_nla_get_ufid_flags(a[OVS_FLOW_ATTR_UFID_FLAGS]); @@ -1000,24 +1001,26 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info) } /* Extract key. */ - ovs_match_init(&match, &new_flow->key, false, &mask); + key = kzalloc(sizeof(*key), GFP_KERNEL); + if (!key) { + error = -ENOMEM; + goto err_kfree_key; + } + + ovs_match_init(&match, key, false, &mask); error = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY], a[OVS_FLOW_ATTR_MASK], log); if (error) goto err_kfree_flow; + ovs_flow_mask_key(&new_flow->key, key, true, &mask); + /* Extract flow identifier. */ error = ovs_nla_get_identifier(&new_flow->id, a[OVS_FLOW_ATTR_UFID], - &new_flow->key, log); + key, log); if (error) goto err_kfree_flow; - /* unmasked key is needed to match when ufid is not used. */ - if (ovs_identifier_is_key(&new_flow->id)) - match.key = new_flow->id.unmasked_key; - - ovs_flow_mask_key(&new_flow->key, &new_flow->key, true, &mask); - /* Validate actions. */ error = ovs_nla_copy_actions(net, a[OVS_FLOW_ATTR_ACTIONS], &new_flow->key, &acts, log); @@ -1044,7 +1047,7 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info) if (ovs_identifier_is_ufid(&new_flow->id)) flow = ovs_flow_tbl_lookup_ufid(&dp->table, &new_flow->id); if (!flow) - flow = ovs_flow_tbl_lookup(&dp->table, &new_flow->key); + flow = ovs_flow_tbl_lookup(&dp->table, key); if (likely(!flow)) { rcu_assign_pointer(new_flow->sf_acts, acts); @@ -1114,6 +1117,8 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info) if (reply) ovs_notify(&dp_flow_genl_family, reply, info); + + kfree(key); return 0; err_unlock_ovs: @@ -1123,6 +1128,8 @@ err_kfree_acts: ovs_nla_free_flow_actions(acts); err_kfree_flow: ovs_flow_free(new_flow, false); +err_kfree_key: + kfree(key); error: return error; } diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index e7dccab7b741..18092526d3c8 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -287,6 +287,7 @@ struct rxrpc_local { struct hlist_node link; struct socket *socket; /* my UDP socket */ struct task_struct *io_thread; + struct completion io_thread_ready; /* Indication that the I/O thread started */ struct rxrpc_sock __rcu *service; /* Service(s) listening on this endpoint */ struct rw_semaphore defrag_sem; /* control re-enablement of IP DF bit */ struct sk_buff_head rx_queue; /* Received packets */ @@ -811,9 +812,9 @@ extern struct workqueue_struct *rxrpc_workqueue; */ int rxrpc_service_prealloc(struct rxrpc_sock *, gfp_t); void rxrpc_discard_prealloc(struct rxrpc_sock *); -bool rxrpc_new_incoming_call(struct rxrpc_local *, struct rxrpc_peer *, - struct rxrpc_connection *, struct sockaddr_rxrpc *, - struct sk_buff *); +int rxrpc_new_incoming_call(struct rxrpc_local *, struct rxrpc_peer *, + struct rxrpc_connection *, struct sockaddr_rxrpc *, + struct sk_buff *); void rxrpc_accept_incoming_calls(struct rxrpc_local *); int rxrpc_user_charge_accept(struct rxrpc_sock *, unsigned long); @@ -1072,7 +1073,6 @@ void rxrpc_destroy_all_peers(struct rxrpc_net *); struct rxrpc_peer *rxrpc_get_peer(struct rxrpc_peer *, enum rxrpc_peer_trace); struct rxrpc_peer *rxrpc_get_peer_maybe(struct rxrpc_peer *, enum rxrpc_peer_trace); void rxrpc_put_peer(struct rxrpc_peer *, enum rxrpc_peer_trace); -void rxrpc_put_peer_locked(struct rxrpc_peer *, enum rxrpc_peer_trace); /* * proc.c diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c index d1850863507f..c02401656fa9 100644 --- a/net/rxrpc/call_accept.c +++ b/net/rxrpc/call_accept.c @@ -326,11 +326,11 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx, * If we want to report an error, we mark the skb with the packet type and * abort code and return false. */ -bool rxrpc_new_incoming_call(struct rxrpc_local *local, - struct rxrpc_peer *peer, - struct rxrpc_connection *conn, - struct sockaddr_rxrpc *peer_srx, - struct sk_buff *skb) +int rxrpc_new_incoming_call(struct rxrpc_local *local, + struct rxrpc_peer *peer, + struct rxrpc_connection *conn, + struct sockaddr_rxrpc *peer_srx, + struct sk_buff *skb) { const struct rxrpc_security *sec = NULL; struct rxrpc_skb_priv *sp = rxrpc_skb(skb); @@ -342,7 +342,7 @@ bool rxrpc_new_incoming_call(struct rxrpc_local *local, /* Don't set up a call for anything other than the first DATA packet. */ if (sp->hdr.seq != 1 || sp->hdr.type != RXRPC_PACKET_TYPE_DATA) - return true; /* Just discard */ + return 0; /* Just discard */ rcu_read_lock(); @@ -413,7 +413,7 @@ bool rxrpc_new_incoming_call(struct rxrpc_local *local, _leave(" = %p{%d}", call, call->debug_id); rxrpc_input_call_event(call, skb); rxrpc_put_call(call, rxrpc_call_put_input); - return true; + return 0; unsupported_service: trace_rxrpc_abort(0, "INV", sp->hdr.cid, sp->hdr.callNumber, sp->hdr.seq, @@ -425,10 +425,10 @@ no_call: reject: rcu_read_unlock(); _leave(" = f [%u]", skb->mark); - return false; + return -EPROTO; discard: rcu_read_unlock(); - return true; + return 0; } /* diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c index be5eb8cdf549..89dcf60b1158 100644 --- a/net/rxrpc/call_object.c +++ b/net/rxrpc/call_object.c @@ -217,6 +217,7 @@ static struct rxrpc_call *rxrpc_alloc_client_call(struct rxrpc_sock *rx, call->tx_total_len = p->tx_total_len; call->key = key_get(cp->key); call->local = rxrpc_get_local(cp->local, rxrpc_local_get_call); + call->security_level = cp->security_level; if (p->kernel) __set_bit(RXRPC_CALL_KERNEL, &call->flags); if (cp->upgrade) diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c index a08e33c9e54b..87efa0373aed 100644 --- a/net/rxrpc/conn_client.c +++ b/net/rxrpc/conn_client.c @@ -551,8 +551,6 @@ static void rxrpc_activate_one_channel(struct rxrpc_connection *conn, call->conn = rxrpc_get_connection(conn, rxrpc_conn_get_activate_call); call->cid = conn->proto.cid | channel; call->call_id = call_id; - call->security = conn->security; - call->security_ix = conn->security_ix; call->dest_srx.srx_service = conn->service_id; trace_rxrpc_connect_call(call); diff --git a/net/rxrpc/io_thread.c b/net/rxrpc/io_thread.c index d83ae3193032..1ad067d66fb6 100644 --- a/net/rxrpc/io_thread.c +++ b/net/rxrpc/io_thread.c @@ -292,7 +292,7 @@ protocol_error: skb->mark = RXRPC_SKB_MARK_REJECT_ABORT; reject_packet: rxrpc_reject_packet(local, skb); - return ret; + return 0; } /* @@ -384,7 +384,7 @@ static int rxrpc_input_packet_on_conn(struct rxrpc_connection *conn, if (rxrpc_to_client(sp)) goto bad_message; if (rxrpc_new_incoming_call(conn->local, conn->peer, conn, - peer_srx, skb)) + peer_srx, skb) == 0) return 0; goto reject_packet; } @@ -425,6 +425,9 @@ int rxrpc_io_thread(void *data) struct rxrpc_local *local = data; struct rxrpc_call *call; struct sk_buff *skb; + bool should_stop; + + complete(&local->io_thread_ready); skb_queue_head_init(&rx_queue); @@ -476,13 +479,14 @@ int rxrpc_io_thread(void *data) } set_current_state(TASK_INTERRUPTIBLE); + should_stop = kthread_should_stop(); if (!skb_queue_empty(&local->rx_queue) || !list_empty(&local->call_attend_q)) { __set_current_state(TASK_RUNNING); continue; } - if (kthread_should_stop()) + if (should_stop) break; schedule(); } diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index 44222923c0d1..270b63d8f37a 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -97,6 +97,7 @@ static struct rxrpc_local *rxrpc_alloc_local(struct rxrpc_net *rxnet, local->rxnet = rxnet; INIT_HLIST_NODE(&local->link); init_rwsem(&local->defrag_sem); + init_completion(&local->io_thread_ready); skb_queue_head_init(&local->rx_queue); INIT_LIST_HEAD(&local->call_attend_q); local->client_bundles = RB_ROOT; @@ -189,6 +190,7 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net) goto error_sock; } + wait_for_completion(&local->io_thread_ready); local->io_thread = io_thread; _leave(" = 0"); return 0; @@ -357,10 +359,11 @@ struct rxrpc_local *rxrpc_use_local(struct rxrpc_local *local, */ void rxrpc_unuse_local(struct rxrpc_local *local, enum rxrpc_local_trace why) { - unsigned int debug_id = local->debug_id; + unsigned int debug_id; int r, u; if (local) { + debug_id = local->debug_id; r = refcount_read(&local->ref); u = atomic_dec_return(&local->active_users); trace_rxrpc_local(debug_id, why, r, u); diff --git a/net/rxrpc/peer_event.c b/net/rxrpc/peer_event.c index 6685bf917aa6..552ba84a255c 100644 --- a/net/rxrpc/peer_event.c +++ b/net/rxrpc/peer_event.c @@ -235,6 +235,7 @@ static void rxrpc_peer_keepalive_dispatch(struct rxrpc_net *rxnet, struct rxrpc_peer *peer; const u8 mask = ARRAY_SIZE(rxnet->peer_keepalive) - 1; time64_t keepalive_at; + bool use; int slot; spin_lock(&rxnet->peer_hash_lock); @@ -247,9 +248,10 @@ static void rxrpc_peer_keepalive_dispatch(struct rxrpc_net *rxnet, if (!rxrpc_get_peer_maybe(peer, rxrpc_peer_get_keepalive)) continue; - if (__rxrpc_use_local(peer->local, rxrpc_local_use_peer_keepalive)) { - spin_unlock(&rxnet->peer_hash_lock); + use = __rxrpc_use_local(peer->local, rxrpc_local_use_peer_keepalive); + spin_unlock(&rxnet->peer_hash_lock); + if (use) { keepalive_at = peer->last_tx_at + RXRPC_KEEPALIVE_TIME; slot = keepalive_at - base; _debug("%02x peer %u t=%d {%pISp}", @@ -270,9 +272,11 @@ static void rxrpc_peer_keepalive_dispatch(struct rxrpc_net *rxnet, spin_lock(&rxnet->peer_hash_lock); list_add_tail(&peer->keepalive_link, &rxnet->peer_keepalive[slot & mask]); + spin_unlock(&rxnet->peer_hash_lock); rxrpc_unuse_local(peer->local, rxrpc_local_unuse_peer_keepalive); } - rxrpc_put_peer_locked(peer, rxrpc_peer_put_keepalive); + rxrpc_put_peer(peer, rxrpc_peer_put_keepalive); + spin_lock(&rxnet->peer_hash_lock); } spin_unlock(&rxnet->peer_hash_lock); diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index 608946dcc505..4eecea2be307 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -226,7 +226,7 @@ struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp, rxrpc_peer_init_rtt(peer); peer->cong_ssthresh = RXRPC_TX_MAX_WINDOW; - trace_rxrpc_peer(peer->debug_id, why, 1); + trace_rxrpc_peer(peer->debug_id, 1, why); } _leave(" = %p", peer); @@ -382,7 +382,7 @@ struct rxrpc_peer *rxrpc_get_peer(struct rxrpc_peer *peer, enum rxrpc_peer_trace int r; __refcount_inc(&peer->ref, &r); - trace_rxrpc_peer(peer->debug_id, why, r + 1); + trace_rxrpc_peer(peer->debug_id, r + 1, why); return peer; } @@ -439,25 +439,6 @@ void rxrpc_put_peer(struct rxrpc_peer *peer, enum rxrpc_peer_trace why) } /* - * Drop a ref on a peer record where the caller already holds the - * peer_hash_lock. - */ -void rxrpc_put_peer_locked(struct rxrpc_peer *peer, enum rxrpc_peer_trace why) -{ - unsigned int debug_id = peer->debug_id; - bool dead; - int r; - - dead = __refcount_dec_and_test(&peer->ref, &r); - trace_rxrpc_peer(debug_id, r - 1, why); - if (dead) { - hash_del_rcu(&peer->hash_link); - list_del_init(&peer->keepalive_link); - rxrpc_free_peer(peer); - } -} - -/* * Make sure all peer records have been discarded. */ void rxrpc_destroy_all_peers(struct rxrpc_net *rxnet) diff --git a/net/rxrpc/rxperf.c b/net/rxrpc/rxperf.c index 66f5eea291ff..d33a109e846c 100644 --- a/net/rxrpc/rxperf.c +++ b/net/rxrpc/rxperf.c @@ -275,7 +275,7 @@ static void rxperf_deliver_to_call(struct work_struct *work) struct rxperf_call *call = container_of(work, struct rxperf_call, work); enum rxperf_call_state state; u32 abort_code, remote_abort = 0; - int ret; + int ret = 0; if (call->state == RXPERF_CALL_COMPLETE) return; diff --git a/net/rxrpc/security.c b/net/rxrpc/security.c index 209f2c25a0da..ab968f65a490 100644 --- a/net/rxrpc/security.c +++ b/net/rxrpc/security.c @@ -67,13 +67,13 @@ const struct rxrpc_security *rxrpc_security_lookup(u8 security_index) */ int rxrpc_init_client_call_security(struct rxrpc_call *call) { - const struct rxrpc_security *sec; + const struct rxrpc_security *sec = &rxrpc_no_security; struct rxrpc_key_token *token; struct key *key = call->key; int ret; if (!key) - return 0; + goto found; ret = key_validate(key); if (ret < 0) @@ -88,7 +88,7 @@ int rxrpc_init_client_call_security(struct rxrpc_call *call) found: call->security = sec; - _leave(" = 0"); + call->security_ix = sec->security_index; return 0; } diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c index 9fa7e37f7155..cde1e65f16b4 100644 --- a/net/rxrpc/sendmsg.c +++ b/net/rxrpc/sendmsg.c @@ -625,7 +625,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len) if (call->tx_total_len != -1 || call->tx_pending || call->tx_top != 0) - goto error_put; + goto out_put_unlock; call->tx_total_len = p.call.tx_total_len; } } diff --git a/net/sched/ematch.c b/net/sched/ematch.c index 4ce681361851..5c1235e6076a 100644 --- a/net/sched/ematch.c +++ b/net/sched/ematch.c @@ -255,6 +255,8 @@ static int tcf_em_validate(struct tcf_proto *tp, * the value carried. */ if (em_hdr->flags & TCF_EM_SIMPLE) { + if (em->ops->datalen > 0) + goto errout; if (data_len < sizeof(u32)) goto errout; em->data = *(u32 *) data; diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index c0506d0d7478..aaa5b2741b79 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1882,6 +1882,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt, sk->sk_write_space = xs_udp_write_space; sk->sk_state_change = xs_local_state_change; sk->sk_error_report = xs_error_report; + sk->sk_use_task_frag = false; xprt_clear_connected(xprt); @@ -2082,6 +2083,7 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock) sk->sk_user_data = xprt; sk->sk_data_ready = xs_data_ready; sk->sk_write_space = xs_udp_write_space; + sk->sk_use_task_frag = false; xprt_set_connected(xprt); @@ -2249,6 +2251,7 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock) sk->sk_state_change = xs_tcp_state_change; sk->sk_write_space = xs_tcp_write_space; sk->sk_error_report = xs_error_report; + sk->sk_use_task_frag = false; /* socket options */ sock_reset_flag(sk, SOCK_LINGER); diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index ede2b2a140a4..f0c2293f1d3b 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1999,13 +1999,20 @@ restart_locked: unix_state_lock(sk); err = 0; - if (unix_peer(sk) == other) { + if (sk->sk_type == SOCK_SEQPACKET) { + /* We are here only when racing with unix_release_sock() + * is clearing @other. Never change state to TCP_CLOSE + * unlike SOCK_DGRAM wants. + */ + unix_state_unlock(sk); + err = -EPIPE; + } else if (unix_peer(sk) == other) { unix_peer(sk) = NULL; unix_dgram_peer_wake_disconnect_wakeup(sk, other); + sk->sk_state = TCP_CLOSE; unix_state_unlock(sk); - sk->sk_state = TCP_CLOSE; unix_dgram_disconnected(sk, other); sock_put(other); err = -ECONNREFUSED; diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index d6fece1ed982..74a54295c164 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -489,6 +489,7 @@ static int espintcp_init_sk(struct sock *sk) /* avoid using task_frag */ sk->sk_allocation = GFP_ATOMIC; + sk->sk_use_task_frag = false; return 0; diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index 612f699dc4f7..63cd4ab70171 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -16,6 +16,7 @@ CONFIG_CRYPTO_USER_API_HASH=y CONFIG_DYNAMIC_FTRACE=y CONFIG_FPROBE=y CONFIG_FTRACE_SYSCALLS=y +CONFIG_FUNCTION_ERROR_INJECTION=y CONFIG_FUNCTION_TRACER=y CONFIG_GENEVE=y CONFIG_IKCONFIG=y diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c index d1e32e792536..20f5fa0fcec9 100644 --- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c +++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c @@ -4,6 +4,8 @@ #include <network_helpers.h> #include <bpf/btf.h> #include "bind4_prog.skel.h" +#include "freplace_progmap.skel.h" +#include "xdp_dummy.skel.h" typedef int (*test_cb)(struct bpf_object *obj); @@ -500,6 +502,50 @@ cleanup: bind4_prog__destroy(skel); } +static void test_func_replace_progmap(void) +{ + struct bpf_cpumap_val value = { .qsize = 1 }; + struct freplace_progmap *skel = NULL; + struct xdp_dummy *tgt_skel = NULL; + __u32 key = 0; + int err; + + skel = freplace_progmap__open(); + if (!ASSERT_OK_PTR(skel, "prog_open")) + return; + + tgt_skel = xdp_dummy__open_and_load(); + if (!ASSERT_OK_PTR(tgt_skel, "tgt_prog_load")) + goto out; + + err = bpf_program__set_attach_target(skel->progs.xdp_cpumap_prog, + bpf_program__fd(tgt_skel->progs.xdp_dummy_prog), + "xdp_dummy_prog"); + if (!ASSERT_OK(err, "set_attach_target")) + goto out; + + err = freplace_progmap__load(skel); + if (!ASSERT_OK(err, "obj_load")) + goto out; + + /* Prior to fixing the kernel, loading the PROG_TYPE_EXT 'redirect' + * program above will cause the map owner type of 'cpumap' to be set to + * PROG_TYPE_EXT. This in turn will cause the bpf_map_update_elem() + * below to fail, because the program we are inserting into the map is + * of PROG_TYPE_XDP. After fixing the kernel, the initial ownership will + * be correctly resolved to the *target* of the PROG_TYPE_EXT program + * (i.e., PROG_TYPE_XDP) and the map update will succeed. + */ + value.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_drop_prog); + err = bpf_map_update_elem(bpf_map__fd(skel->maps.cpu_map), + &key, &value, 0); + ASSERT_OK(err, "map_update"); + +out: + xdp_dummy__destroy(tgt_skel); + freplace_progmap__destroy(skel); +} + /* NOTE: affect other tests, must run in serial mode */ void serial_test_fexit_bpf2bpf(void) { @@ -525,4 +571,6 @@ void serial_test_fexit_bpf2bpf(void) test_func_replace_global_func(); if (test__start_subtest("fentry_to_cgroup_bpf")) test_fentry_to_cgroup_bpf(); + if (test__start_subtest("func_replace_progmap")) + test_func_replace_progmap(); } diff --git a/tools/testing/selftests/bpf/progs/freplace_progmap.c b/tools/testing/selftests/bpf/progs/freplace_progmap.c new file mode 100644 index 000000000000..81b56b9aa7d6 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/freplace_progmap.c @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +struct { + __uint(type, BPF_MAP_TYPE_CPUMAP); + __type(key, __u32); + __type(value, struct bpf_cpumap_val); + __uint(max_entries, 1); +} cpu_map SEC(".maps"); + +SEC("xdp/cpumap") +int xdp_drop_prog(struct xdp_md *ctx) +{ + return XDP_DROP; +} + +SEC("freplace") +int xdp_cpumap_prog(struct xdp_md *ctx) +{ + return bpf_redirect_map(&cpu_map, 0, XDP_PASS); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/rcu_read_lock.c b/tools/testing/selftests/bpf/progs/rcu_read_lock.c index 125f908024d3..5cecbdbbb16e 100644 --- a/tools/testing/selftests/bpf/progs/rcu_read_lock.c +++ b/tools/testing/selftests/bpf/progs/rcu_read_lock.c @@ -288,13 +288,13 @@ out: SEC("?fentry.s/" SYS_PREFIX "sys_getpgid") int task_untrusted_non_rcuptr(void *ctx) { - struct task_struct *task, *last_wakee; + struct task_struct *task, *group_leader; task = bpf_get_current_task_btf(); bpf_rcu_read_lock(); - /* the pointer last_wakee marked as untrusted */ - last_wakee = task->real_parent->last_wakee; - (void)bpf_task_storage_get(&map_a, last_wakee, 0, 0); + /* the pointer group_leader marked as untrusted */ + group_leader = task->real_parent->group_leader; + (void)bpf_task_storage_get(&map_a, group_leader, 0, 0); bpf_rcu_read_unlock(); return 0; } diff --git a/tools/testing/selftests/bpf/progs/task_kfunc_failure.c b/tools/testing/selftests/bpf/progs/task_kfunc_failure.c index 87fa1db9d9b5..1b47b94dbca0 100644 --- a/tools/testing/selftests/bpf/progs/task_kfunc_failure.c +++ b/tools/testing/selftests/bpf/progs/task_kfunc_failure.c @@ -73,7 +73,7 @@ int BPF_PROG(task_kfunc_acquire_trusted_walked, struct task_struct *task, u64 cl struct task_struct *acquired; /* Can't invoke bpf_task_acquire() on a trusted pointer obtained from walking a struct. */ - acquired = bpf_task_acquire(task->last_wakee); + acquired = bpf_task_acquire(task->group_leader); bpf_task_release(acquired); return 0; diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile index 0f3921908b07..8e3b786a748f 100644 --- a/tools/testing/selftests/drivers/net/bonding/Makefile +++ b/tools/testing/selftests/drivers/net/bonding/Makefile @@ -7,7 +7,8 @@ TEST_PROGS := \ bond-lladdr-target.sh \ dev_addr_lists.sh \ mode-1-recovery-updelay.sh \ - mode-2-recovery-updelay.sh + mode-2-recovery-updelay.sh \ + option_prio.sh TEST_FILES := \ lag_lib.sh \ diff --git a/tools/testing/selftests/drivers/net/bonding/option_prio.sh b/tools/testing/selftests/drivers/net/bonding/option_prio.sh new file mode 100755 index 000000000000..c32eebff5005 --- /dev/null +++ b/tools/testing/selftests/drivers/net/bonding/option_prio.sh @@ -0,0 +1,245 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Test bonding option prio +# + +ALL_TESTS=" + prio_arp_ip_target_test + prio_miimon_test +" + +REQUIRE_MZ=no +REQUIRE_JQ=no +NUM_NETIFS=0 +lib_dir=$(dirname "$0") +source "$lib_dir"/net_forwarding_lib.sh + +destroy() +{ + ip link del bond0 &>/dev/null + ip link del br0 &>/dev/null + ip link del veth0 &>/dev/null + ip link del veth1 &>/dev/null + ip link del veth2 &>/dev/null + ip netns del ns1 &>/dev/null + ip link del veth3 &>/dev/null +} + +cleanup() +{ + pre_cleanup + + destroy +} + +skip() +{ + local skip=1 + ip link add name bond0 type bond mode 1 miimon 100 &>/dev/null + ip link add name veth0 type veth peer name veth0_p + ip link set veth0 master bond0 + + # check if iproute support prio option + ip link set dev veth0 type bond_slave prio 10 + [[ $? -ne 0 ]] && skip=0 + + # check if bonding support prio option + ip -d link show veth0 | grep -q "prio 10" + [[ $? -ne 0 ]] && skip=0 + + ip link del bond0 &>/dev/null + ip link del veth0 + + return $skip +} + +active_slave="" +check_active_slave() +{ + local target_active_slave=$1 + active_slave="$(cat /sys/class/net/bond0/bonding/active_slave)" + test "$active_slave" = "$target_active_slave" + check_err $? "Current active slave is $active_slave but not $target_active_slave" +} + + +# Test bonding prio option with mode=$mode monitor=$monitor +# and primary_reselect=$primary_reselect +prio_test() +{ + RET=0 + + local monitor=$1 + local mode=$2 + local primary_reselect=$3 + + local bond_ip4="192.169.1.2" + local peer_ip4="192.169.1.1" + local bond_ip6="2009:0a:0b::02" + local peer_ip6="2009:0a:0b::01" + + + # create veths + ip link add name veth0 type veth peer name veth0_p + ip link add name veth1 type veth peer name veth1_p + ip link add name veth2 type veth peer name veth2_p + + # create bond + if [[ "$monitor" == "miimon" ]];then + ip link add name bond0 type bond mode $mode miimon 100 primary veth1 primary_reselect $primary_reselect + elif [[ "$monitor" == "arp_ip_target" ]];then + ip link add name bond0 type bond mode $mode arp_interval 1000 arp_ip_target $peer_ip4 primary veth1 primary_reselect $primary_reselect + elif [[ "$monitor" == "ns_ip6_target" ]];then + ip link add name bond0 type bond mode $mode arp_interval 1000 ns_ip6_target $peer_ip6 primary veth1 primary_reselect $primary_reselect + fi + ip link set bond0 up + ip link set veth0 master bond0 + ip link set veth1 master bond0 + ip link set veth2 master bond0 + # check bonding member prio value + ip link set dev veth0 type bond_slave prio 0 + ip link set dev veth1 type bond_slave prio 10 + ip link set dev veth2 type bond_slave prio 11 + ip -d link show veth0 | grep -q 'prio 0' + check_err $? "veth0 prio is not 0" + ip -d link show veth1 | grep -q 'prio 10' + check_err $? "veth0 prio is not 10" + ip -d link show veth2 | grep -q 'prio 11' + check_err $? "veth0 prio is not 11" + + ip link set veth0 up + ip link set veth1 up + ip link set veth2 up + ip link set veth0_p up + ip link set veth1_p up + ip link set veth2_p up + + # prepare ping target + ip link add name br0 type bridge + ip link set br0 up + ip link set veth0_p master br0 + ip link set veth1_p master br0 + ip link set veth2_p master br0 + ip link add name veth3 type veth peer name veth3_p + ip netns add ns1 + ip link set veth3_p master br0 up + ip link set veth3 netns ns1 up + ip netns exec ns1 ip addr add $peer_ip4/24 dev veth3 + ip netns exec ns1 ip addr add $peer_ip6/64 dev veth3 + ip addr add $bond_ip4/24 dev bond0 + ip addr add $bond_ip6/64 dev bond0 + sleep 5 + + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 1." + ping6 $peer_ip6 -c5 -I bond0 &>/dev/null + check_err $? "ping6 failed 1." + + # active salve should be the primary slave + check_active_slave veth1 + + # active slave should be the higher prio slave + ip link set $active_slave down + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 2." + check_active_slave veth2 + + # when only 1 slave is up + ip link set $active_slave down + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 3." + check_active_slave veth0 + + # when a higher prio slave change to up + ip link set veth2 up + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 4." + case $primary_reselect in + "0") + check_active_slave "veth2" + ;; + "1") + check_active_slave "veth0" + ;; + "2") + check_active_slave "veth0" + ;; + esac + local pre_active_slave=$active_slave + + # when the primary slave change to up + ip link set veth1 up + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 5." + case $primary_reselect in + "0") + check_active_slave "veth1" + ;; + "1") + check_active_slave "$pre_active_slave" + ;; + "2") + check_active_slave "$pre_active_slave" + ip link set $active_slave down + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 6." + check_active_slave "veth1" + ;; + esac + + # Test changing bond salve prio + if [[ "$primary_reselect" == "0" ]];then + ip link set dev veth0 type bond_slave prio 1000000 + ip link set dev veth1 type bond_slave prio 0 + ip link set dev veth2 type bond_slave prio -50 + ip -d link show veth0 | grep -q 'prio 1000000' + check_err $? "veth0 prio is not 1000000" + ip -d link show veth1 | grep -q 'prio 0' + check_err $? "veth1 prio is not 0" + ip -d link show veth2 | grep -q 'prio -50' + check_err $? "veth3 prio is not -50" + check_active_slave "veth1" + + ip link set $active_slave down + ping $peer_ip4 -c5 -I bond0 &>/dev/null + check_err $? "ping failed 7." + check_active_slave "veth0" + fi + + cleanup + + log_test "prio_test" "Test bonding option 'prio' with mode=$mode monitor=$monitor and primary_reselect=$primary_reselect" +} + +prio_miimon_test() +{ + local mode + local primary_reselect + + for mode in 1 5 6; do + for primary_reselect in 0 1 2; do + prio_test "miimon" $mode $primary_reselect + done + done +} + +prio_arp_ip_target_test() +{ + local primary_reselect + + for primary_reselect in 0 1 2; do + prio_test "arp_ip_target" 1 $primary_reselect + done +} + +if skip;then + log_test_skip "option_prio.sh" "Current iproute doesn't support 'prio'." + exit 0 +fi + +trap cleanup EXIT + +tests_run + +exit "$EXIT_STATUS" diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh index 9de1d123f4f5..a08c02abde12 100755 --- a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh +++ b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh @@ -496,8 +496,8 @@ dummy_reporter_test() check_reporter_info dummy healthy 3 3 10 true - echo 8192> $DEBUGFS_DIR/health/binary_len - check_fail $? "Failed set dummy reporter binary len to 8192" + echo 8192 > $DEBUGFS_DIR/health/binary_len + check_err $? "Failed set dummy reporter binary len to 8192" local dump=$(devlink health dump show $DL_HANDLE reporter dummy -j) check_err $? "Failed show dump of dummy reporter" diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh index 109900c817be..b64d98ca0df7 100755 --- a/tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh +++ b/tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh @@ -47,6 +47,17 @@ if [ -d "${NETDEVSIM_PATH}/devices/netdevsim${DEV_ADDR}" ]; then exit 1 fi +check_netdev_down() +{ + state=$(cat /sys/class/net/${NETDEV}/flags) + + if [ $((state & 1)) -ne 0 ]; then + echo "WARNING: unexpected interface UP, disable NetworkManager?" + + ip link set dev $NETDEV down + fi +} + init_test() { RET=0 @@ -151,6 +162,7 @@ trap_stats_test() RET=0 + check_netdev_down for trap_name in $(devlink_traps_get); do devlink_trap_stats_idle_test $trap_name check_err $? "Stats of trap $trap_name not idle when netdev down" @@ -254,6 +266,7 @@ trap_group_stats_test() RET=0 + check_netdev_down for group_name in $(devlink_trap_groups_get); do devlink_trap_group_stats_idle_test $group_name check_err $? "Stats of trap group $group_name not idle when netdev down" |