summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/freescale
AgeCommit message (Collapse)Author
2023-03-22net: enetc: fix aggregate RMON counters not showing the rangesVladimir Oltean
When running "ethtool -S eno0 --groups rmon" without an explicit "--src emac|pmac" argument, the kernel will not report rx-rmon-etherStatsPkts64to64Octets, rx-rmon-etherStatsPkts65to127Octets, etc. This is because on ETHTOOL_MAC_STATS_SRC_AGGREGATE, we do not populate the "ranges" argument. ocelot_port_get_rmon_stats() does things differently and things work there. I had forgotten to make sure that the code is structured the same way in both drivers, so do that now. Fixes: cf52bd238b75 ("net: enetc: add support for MAC Merge statistics counters") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230321232831.1200905-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-16net: Use of_property_read_bool() for boolean propertiesRob Herring
It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. Convert reading boolean properties to of_property_read_bool(). Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can Acked-by: Kalle Valo <kvalo@kernel.org> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-24Merge tag 'phy-for-6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "This features a bunch of new device support, a couple of new drivers, yaml conversion and updates of a few drivers. Core support: - New devm_of_phy_optional_get() API with users and conversion New hardware support: - Mediatek MT7986 phy support - Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250 USB3 phy support, SM6350 combo phy support, SM6125 UFS PHY support amd SM8350 & SM8450 combo phy support - Qualcomm SNPS eUSB2 eUSB2 repeater drivers - Allwinner F1C100s USB PHY support - Tegra xusb support for Tegra234 Updates: - Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy - G4 mode support in Qualcomm UFS phy and support for various SoCs - Yaml conversion for Meson usb2 phy - TI Type C support for usb phy for j721 - Yaml conversion for Tegra xusb binding" * tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (106 commits) phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater phy: qcom: Add QCOM SNPS eUSB2 repeater driver dt-bindings: phy: qcom,snps-eusb2-phy: Add phys property for the repeater dt-bindings: phy: Add qcom,snps-eusb2-repeater schema file dt-bindings: phy: amlogic,g12a-usb3-pcie-phy: add missing optional phy-supply property phy: rockchip-typec: Fix unsigned comparison with less than zero phy: rockchip-typec: fix tcphy_get_mode error case phy: qcom: snps-eusb2: Add missing headers phy: qcom-qmp-combo: Add support for SM8550 phy: qcom-qmp: Add v6 DP register offsets phy: qcom-qmp: pcs-usb: Add v6 register offsets dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp: Document SM8550 compatible phy: qcom: Add QCOM SNPS eUSB2 driver dt-bindings: phy: Add qcom,snps-eusb2-phy schema file phy: qcom-qmp-pcie: Add support for SM8550 g3x2 and g4x2 PCIEs phy: qcom-qmp: qserdes-lane-shared: Add v6 register offsets phy: qcom-qmp: qserdes-txrx: Add v6.20 register offsets phy: qcom-qmp: pcs-pcie: Add v6.20 register offsets phy: qcom-qmp: pcs-pcie: Add v6 register offsets phy: qcom-qmp: pcs: Add v6.20 register offsets ...
2023-02-20net: dpaa2-eth: do not always set xsk support in xdp_features flagLorenzo Bianconi
Do not always add NETDEV_XDP_ACT_XSK_ZEROCOPY bit in xdp_features flag but check if the NIC really supports it. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/3dba6ea42dc343a9f2d7d1a6a6a6c173235e1ebf.1676471386.git.lorenzo@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-10Daniel Borkmann says:Jakub Kicinski
==================== pull-request: bpf-next 2023-02-11 We've added 96 non-merge commits during the last 14 day(s) which contain a total of 152 files changed, 4884 insertions(+), 962 deletions(-). There is a minor conflict in drivers/net/ethernet/intel/ice/ice_main.c between commit 5b246e533d01 ("ice: split probe into smaller functions") from the net-next tree and commit 66c0e13ad236 ("drivers: net: turn on XDP features") from the bpf-next tree. Remove the hunk given ice_cfg_netdev() is otherwise there a 2nd time, and add XDP features to the existing ice_cfg_netdev() one: [...] ice_set_netdev_features(netdev); netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT | NETDEV_XDP_ACT_XSK_ZEROCOPY; ice_set_ops(netdev); [...] Stephen's merge conflict mail: https://lore.kernel.org/bpf/20230207101951.21a114fa@canb.auug.org.au/ The main changes are: 1) Add support for BPF trampoline on s390x which finally allows to remove many test cases from the BPF CI's DENYLIST.s390x, from Ilya Leoshkevich. 2) Add multi-buffer XDP support to ice driver, from Maciej Fijalkowski. 3) Add capability to export the XDP features supported by the NIC. Along with that, add a XDP compliance test tool, from Lorenzo Bianconi & Marek Majtyka. 4) Add __bpf_kfunc tag for marking kernel functions as kfuncs, from David Vernet. 5) Add a deep dive documentation about the verifier's register liveness tracking algorithm, from Eduard Zingerman. 6) Fix and follow-up cleanups for resolve_btfids to be compiled as a host program to avoid cross compile issues, from Jiri Olsa & Ian Rogers. 7) Batch of fixes to the BPF selftest for xdp_hw_metadata which resulted when testing on different NICs, from Jesper Dangaard Brouer. 8) Fix libbpf to better detect kernel version code on Debian, from Hao Xiang. 9) Extend libbpf to add an option for when the perf buffer should wake up, from Jon Doron. 10) Follow-up fix on xdp_metadata selftest to just consume on TX completion, from Stanislav Fomichev. 11) Extend the kfuncs.rst document with description on kfunc lifecycle & stability expectations, from David Vernet. 12) Fix bpftool prog profile to skip attaching to offline CPUs, from Tonghao Zhang. ==================== Link: https://lore.kernel.org/r/20230211002037.8489-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07net: enetc: add support for MAC Merge statistics countersVladimir Oltean
Add PF driver support for the following: - Viewing the standardized MAC Merge layer counters. - Viewing the standardized Ethernet MAC and RMON counters associated with the pMAC. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230206094531.444988-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07net: enetc: add support for MAC Merge layerVladimir Oltean
Add PF driver support for viewing and changing the MAC Merge sublayer parameters, and seeing the verification state machine's current state. The verification handshake with the link partner is driven by hardware. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230206094531.444988-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-06net: enetc: act upon mqprio queue config in taprio offloadVladimir Oltean
We assume that the mqprio queue configuration from taprio has a simple 1:1 mapping between prio and traffic class, and one TX queue per TC. That might not be the case. Actually parse and act upon the mqprio config. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06net: enetc: act upon the requested mqprio queue configurationVladimir Oltean
Regardless of the requested queue count per traffic class, the enetc driver allocates a number of TX rings equal to the number of TCs, and hardcodes a queue configuration of "1@0 1@1 ... 1@max-tc". Other configurations are silently ignored and treated the same. Improve that by allowing what the user requests to be actually fulfilled. This allows more than one TX ring per traffic class. For example: $ tc qdisc add dev eno0 root handle 1: mqprio num_tc 4 \ map 0 0 1 1 2 2 3 3 queues 2@0 2@2 2@4 2@6 [ 146.267648] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0 [ 146.273451] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0 [ 146.283280] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 1 [ 146.293987] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 1 [ 146.300467] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 2 [ 146.306866] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 2 [ 146.313261] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 3 [ 146.319622] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 3 $ tc qdisc del dev eno0 root [ 178.238418] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0 [ 178.244369] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0 [ 178.251486] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 0 [ 178.258006] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 0 [ 178.265038] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 0 [ 178.271557] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 0 [ 178.277910] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 0 [ 178.284281] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 0 $ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1 [ 186.113162] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0 [ 186.118764] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 1 [ 186.124374] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 2 [ 186.130765] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 3 [ 186.136404] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 4 [ 186.142049] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 5 [ 186.147674] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 6 [ 186.153305] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 7 The driver used to set TC_MQPRIO_HW_OFFLOAD_TCS, near which there is this comment in the UAPI header: TC_MQPRIO_HW_OFFLOAD_TCS, /* offload TCs, no queue counts */ which is what enetc was doing up until now (and no longer is; we offload queue counts too), remove that assignment. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06net: enetc: request mqprio to validate the queue countsVladimir Oltean
The enetc driver does not validate the mqprio queue configuration, so it currently allows things like this: $ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 queues 3@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1 But also things like this, completely omitting the queue configuration: $ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 hw 1 By requesting validation via the mqprio capability structure, this is no longer allowed, and we bring what is accepted by hardware in line with what is accepted by software. The check that num_tc <= real_num_tx_queues also becomes superfluous and can be dropped, because mqprio_validate_queue_counts() validates that no TXQ range exceeds real_num_tx_queues. That is a stronger check, because there is at least 1 TXQ per TC, so there are at least as many TXQs as TCs. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-03net: enetc: ensure we always have a minimum number of TXQs for stackVladimir Oltean
Currently it can happen that an mqprio qdisc is installed with num_tc 8, and this will reserve 8 (out of 8) TXQs for the network stack. Then we can attach an XDP program, and this will crop 2 TXQs, leaving just 6 for mqprio. That's not what the user requested, and we should fail it. On the other hand, if mqprio isn't requested, we still give the 8 TXQs to the network stack (with hashing among a single traffic class), but then, cropping 2 TXQs for XDP is fine, because the user didn't explicitly ask for any number of TXQs, so no expectations are violated. Simply put, the logic that mqprio should impose a minimum number of TXQs for the network never existed. Let's say (more or less arbitrarily) that without mqprio, the driver expects a minimum number of TXQs equal to the number of CPUs (on NXP LS1028A, that is either 1, or 2). And with mqprio, mqprio gives the minimum required number of TXQs. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-03net: enetc: recalculate num_real_tx_queues when XDP program attachesVladimir Oltean
Since the blamed net-next commit, enetc_setup_xdp_prog() no longer goes through enetc_open(), and therefore, the function which was supposed to detect whether a BPF program exists (in order to crop some TX queues from network stack usage), enetc_num_stack_tx_queues(), no longer gets called. We can move the netif_set_real_num_rx_queues() call to enetc_alloc_msix() (probe time), since it is a runtime invariant. We can do the same thing with netif_set_real_num_tx_queues(), and let enetc_reconfigure_xdp_cb() explicitly recalculate and change the number of stack TX queues. Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-03net: enetc: allow the enetc_reconfigure() callback to failVladimir Oltean
enetc_reconfigure() was modified in commit c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()") to take an optional callback that runs while the netdev is down, but this callback currently cannot fail. Code up the error handling so that the interface is restarted with the old resources if the callback fails. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-03net: enetc: simplify enetc_num_stack_tx_queues()Vladimir Oltean
We keep a pointer to the xdp_prog in the private netdev structure as well; what's replicated per RX ring is done so just for more convenient access from the NAPI poll procedure. Simplify enetc_num_stack_tx_queues() by looking at priv->xdp_prog rather than iterating through the information replicated per RX ring. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-03net: fman: memac: Convert to devm_of_phy_optional_get()Geert Uytterhoeven
Use the new devm_of_phy_optional_get() helper instead of open-coding the same operation. As devm_of_phy_optional_get() returns NULL if either the PHY cannot be found, or if support for the PHY framework is not enabled, it is no longer needed to check for -ENODEV or -ENOSYS. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Sean Anderson <sean.anderson@seco.com> Link: https://lore.kernel.org/r/f2d801cd73cca36a7162819289480d7fc91fcc7e.1674584626.git.geert+renesas@glider.be Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-02-02net: fec: do not double-parse 'phy-reset-active-high' propertyDmitry Torokhov
Conversion to gpiod API done in commit 468ba54bd616 ("fec: convert to gpio descriptor") clashed with gpiolib applying the same quirk to the reset GPIO polarity (introduced in commit b02c85c9458c). This results in the reset line being left active/device being left in reset state when reset line is "active low". Remove handling of 'phy-reset-active-high' property from the driver and rely on gpiolib to apply needed adjustments to avoid ending up with the double inversion/flipped logic. Fixes: 468ba54bd616 ("fec: convert to gpio descriptor") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230201215320.528319-2-dmitry.torokhov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-02net: fec: restore handling of PHY reset line as optionalDmitry Torokhov
Conversion of the driver to gpiod API done in 468ba54bd616 ("fec: convert to gpio descriptor") incorrectly made reset line mandatory and resulted in aborting driver probe in cases where reset line was not specified (note: this way of specifying PHY reset line is actually deprecated). Switch to using devm_gpiod_get_optional() and skip manipulating reset line if it can not be located. Fixes: 468ba54bd616 ("fec: convert to gpio descriptor") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230201215320.528319-1-dmitry.torokhov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-02drivers: net: turn on XDP featuresMarek Majtyka
A summary of the flags being set for various drivers is given below. Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features that can be turned off and on at runtime. This means that these flags may be set and unset under RTNL lock protection by the driver. Hence, READ_ONCE must be used by code loading the flag value. Also, these flags are not used for synchronization against the availability of XDP resources on a device. It is merely a hint, and hence the read may race with the actual teardown of XDP resources on the device. This may change in the future, e.g. operations taking a reference on the XDP resources of the driver, and in turn inhibiting turning off this flag. However, for now, it can only be used as a hint to check whether device supports becoming a redirection target. Turn 'hw-offload' feature flag on for: - netronome (nfp) - netdevsim. Turn 'native' and 'zerocopy' features flags on for: - intel (i40e, ice, ixgbe, igc) - mellanox (mlx5). - stmmac - netronome (nfp) Turn 'native' features flags on for: - amazon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2, enetc) - funeth - intel (igb) - marvell (mvneta, mvpp2, octeontx2) - mellanox (mlx4) - mtk_eth_soc - qlogic (qede) - sfc - socionext (netsec) - ti (cpsw) - tap - tsnep - veth - xen - virtio_net. Turn 'basic' (tx, pass, aborted and drop) features flags on for: - netronome (nfp) - cavium (thunder) - hyperv. Turn 'redirect_target' feature flag on for: - amanzon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2) - intel (i40e, ice, igb, ixgbe) - ti (cpsw) - marvell (mvneta, mvpp2) - sfc - socionext (netsec) - qlogic (qede) - mellanox (mlx5) - tap - veth - virtio_net - xen Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Marek Majtyka <alardam@gmail.com> Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/core/gro.c 7d2c89b32587 ("skb: Do mix page pool and page referenced frags in GRO") b1a78b9b9886 ("net: add support for ipv4 big tcp") https://lore.kernel.org/all/20230203094454.5766f160@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31net: fman: memac: free mdio device if lynx_pcs_create() failsVladimir Oltean
When memory allocation fails in lynx_pcs_create() and it returns NULL, there remains a dangling reference to the mdiodev returned by of_mdio_find_device() which is leaked as soon as memac_pcs_create() returns empty-handed. Fixes: a7c2a32e7f22 ("net: fman: memac: Use lynx pcs driver") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://lore.kernel.org/r/20230130193051.563315-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30fec: convert to gpio descriptorArnd Bergmann
The driver can be trivially converted, as it only triggers the gpio pin briefly to do a reset, and it already only supports DT. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: drivers/net/ethernet/intel/ice/ice_main.c 418e53401e47 ("ice: move devlink port creation/deletion") 643ef23bd9dd ("ice: Introduce local var for readability") https://lore.kernel.org/all/20230127124025.0dacef40@canb.auug.org.au/ https://lore.kernel.org/all/20230124005714.3996270-1-anthony.l.nguyen@intel.com/ drivers/net/ethernet/engleder/tsnep_main.c 3d53aaef4332 ("tsnep: Fix TX queue stop/wake for multiple queues") 25faa6a4c5ca ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/20230127123604.36bb3e99@canb.auug.org.au/ net/netfilter/nf_conntrack_proto_sctp.c 13bd9b31a969 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") a44b7651489f ("netfilter: conntrack: unify established states for SCTP paths") f71cb8f45d09 ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/20230127125052.674281f9@canb.auug.org.au/ https://lore.kernel.org/all/d36076f3-6add-a442-6d4b-ead9f7ffff86@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-27dpaa2-eth: execute xdp_do_flush() before napi_complete_done()Magnus Karlsson
Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: d678be1dc1ec ("dpaa2-eth: add XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-27dpaa_eth: execute xdp_do_flush() before napi_complete_done()Magnus Karlsson
Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: a1e031ffb422 ("dpaa_eth: add XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/ Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-23net: enetc: stop auto-configuring the port pMACVladimir Oltean
The pMAC (ENETC_PFPMR_PMACE) is probably unconditionally enabled in the enetc driver to allow RX of preemptible packets and not see them as error frames. I don't know why TX preemption (ENETC_MMCSR_ME) is enabled though. With no way to say which traffic classes are preemptible (all are express by default), no preemptible frames would be transmitted anyway. Lastly, it may have been believed that the register write lock-step mode (now deleted) needed the pMAC to be enabled at all times. I don't know if that's true. However, I've checked that driver writes to PM1 registers do propagate through to the ENETC IP even when the pMAC is disabled. With such incomplete support for frame preemption, it's best to just remove whatever exists right now and come with something more coherent later. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: implement software lockstep for port MAC registersVladimir Oltean
Currently the enetc driver duplicates its writes to the PM0 registers also to PM1, but it doesn't do this consistently - for example we write to ENETC_PM0_MAXFRM but not to ENETC_PM1_MAXFRM. Create enetc_port_mac_wr() which writes both the PM0 and PM1 register with the same value (if frame preemption is supported on this port). Also create enetc_port_mac_rd() which reads from PM0 - the assumption being that PM1 contains just the same value. This will be necessary when we enable the MAC Merge layer properly, and the pMAC becomes operational. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: stop configuring pMAC in lockstep with eMACVladimir Oltean
The MWLM bit (MAC write lock-step mode) allows register writes to the pMAC to be auto-performed whenever the corresponding eMAC register is written by the driver. This allows their configuration to remain in sync. The driver has set this bit since the initial commit, but it doesn't do anything, since the hardware feature doesn't work (and the bit has been removed from more recent versions of the documentation). The driver does attempt, more or less, to keep those MAC registers in sync by writing the same value once to e.g. ENETC_PM0_CMD_CFG (eMAC) and once to ENETC_PM1_CMD_CFG (pMAC). Because the lockstep feature doesn't work, that's what it will stick to. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: add definition for offset between eMAC and pMAC regsVladimir Oltean
This is a preliminary patch which replaces the hardcoded 0x1000 present in other PM1 (port MAC 1, aka pMAC) register definitions, which is an offset to the PM0 (port MAC 0, aka eMAC) equivalent register. This definition will be used in more places by future code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: detect frame preemption hardware capabilityVladimir Oltean
Similar to other TSN features, query the Station Interface capability register to see whether preemption is supported on this port or not. On LS1028A, preemption is available on ports 0 and 2, but not on 1 and 3. This will allow us in the future to write the pMAC registers only on the ENETC ports where a pMAC actually exists. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: enetc: build common object files into a separate moduleVladimir Oltean
The build system is complaining about the following: enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_cbdr.o is added to multiple modules: fsl-enetc fsl-enetc-vf enetc_ethtool.o is added to multiple modules: fsl-enetc fsl-enetc-vf Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: fec: Use page_pool_put_full_page when freeing rx buffersWei Fang
The page_pool_release_page was used when freeing rx buffers, and this function just unmaps the page (if mapped) and does not recycle the page. So after hundreds of down/up the eth0, the system will out of memory. For more details, please refer to the following reproduce steps and bug logs. To solve this issue and refer to the doc of page pool, the page_pool_put_full_page should be used to replace page_pool_release_page. Because this API will try to recycle the page if the page refcnt equal to 1. After testing 20000 times, the issue can not be reproduced anymore (about testing 391 times the issue will occur on i.MX8MN-EVK before). Reproduce steps: Create the test script and run the script. The script content is as follows: LOOPS=20000 i=1 while [ $i -le $LOOPS ] do echo "TINFO:ENET $curface up and down test $i times" org_macaddr=$(cat /sys/class/net/eth0/address) ifconfig eth0 down ifconfig eth0 hw ether $org_macaddr up i=$(expr $i + 1) done sleep 5 if cat /sys/class/net/eth0/operstate | grep 'up';then echo "TEST PASS" else echo "TEST FAIL" fi Bug detail logs: TINFO:ENET up and down test 391 times [ 850.471205] Qualcomm Atheros AR8031/AR8033 30be0000.ethernet-1:00: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:00, irq=POLL) [ 853.535318] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 853.541694] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 870.590531] page_pool_release_retry() stalled pool shutdown 199 inflight 60 sec [ 931.006557] page_pool_release_retry() stalled pool shutdown 199 inflight 120 sec TINFO:ENET up and down test 392 times [ 991.426544] page_pool_release_retry() stalled pool shutdown 192 inflight 181 sec [ 1051.838531] page_pool_release_retry() stalled pool shutdown 170 inflight 241 sec [ 1093.751217] Qualcomm Atheros AR8031/AR8033 30be0000.ethernet-1:00: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:00, irq=POLL) [ 1096.446520] page_pool_release_retry() stalled pool shutdown 308 inflight 60 sec [ 1096.831245] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 1096.839092] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 1112.254526] page_pool_release_retry() stalled pool shutdown 103 inflight 302 sec [ 1156.862533] page_pool_release_retry() stalled pool shutdown 308 inflight 120 sec [ 1172.674516] page_pool_release_retry() stalled pool shutdown 103 inflight 362 sec [ 1217.278532] page_pool_release_retry() stalled pool shutdown 308 inflight 181 sec TINFO:ENET up and down test 393 times [ 1233.086535] page_pool_release_retry() stalled pool shutdown 103 inflight 422 sec [ 1277.698513] page_pool_release_retry() stalled pool shutdown 308 inflight 241 sec [ 1293.502525] page_pool_release_retry() stalled pool shutdown 86 inflight 483 sec [ 1338.110518] page_pool_release_retry() stalled pool shutdown 308 inflight 302 sec [ 1353.918540] page_pool_release_retry() stalled pool shutdown 32 inflight 543 sec [ 1361.179205] Qualcomm Atheros AR8031/AR8033 30be0000.ethernet-1:00: attached PHY driver (mii_bus:phy_addr=30be0000.ethernet-1:00, irq=POLL) [ 1364.255298] fec 30be0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx [ 1364.263189] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 1371.998532] page_pool_release_retry() stalled pool shutdown 310 inflight 60 sec [ 1398.530542] page_pool_release_retry() stalled pool shutdown 308 inflight 362 sec [ 1414.334539] page_pool_release_retry() stalled pool shutdown 16 inflight 604 sec [ 1432.414520] page_pool_release_retry() stalled pool shutdown 310 inflight 120 sec [ 1458.942523] page_pool_release_retry() stalled pool shutdown 308 inflight 422 sec [ 1474.750521] page_pool_release_retry() stalled pool shutdown 16 inflight 664 sec TINFO:ENET up and down test 394 times [ 1492.830522] page_pool_release_retry() stalled pool shutdown 310 inflight 181 sec [ 1519.358519] page_pool_release_retry() stalled pool shutdown 308 inflight 483 sec [ 1535.166545] page_pool_release_retry() stalled pool shutdown 2 inflight 724 sec [ 1537.090278] eth_test2.sh invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0 [ 1537.101192] CPU: 3 PID: 2379 Comm: eth_test2.sh Tainted: G C 6.1.1+g56321e101aca #1 [ 1537.110249] Hardware name: NXP i.MX8MNano EVK board (DT) [ 1537.115561] Call trace: [ 1537.118005] dump_backtrace.part.0+0xe0/0xf0 [ 1537.122289] show_stack+0x18/0x40 [ 1537.125608] dump_stack_lvl+0x64/0x80 [ 1537.129276] dump_stack+0x18/0x34 [ 1537.132592] dump_header+0x44/0x208 [ 1537.136083] oom_kill_process+0x2b4/0x2c0 [ 1537.140097] out_of_memory+0xe4/0x594 [ 1537.143766] __alloc_pages+0xb68/0xd00 [ 1537.147521] alloc_pages+0xac/0x160 [ 1537.151013] __get_free_pages+0x14/0x40 [ 1537.154851] pgd_alloc+0x1c/0x30 [ 1537.158082] mm_init+0xf8/0x1d0 [ 1537.161228] mm_alloc+0x48/0x60 [ 1537.164368] alloc_bprm+0x7c/0x240 [ 1537.167777] do_execveat_common.isra.0+0x70/0x240 [ 1537.172486] __arm64_sys_execve+0x40/0x54 [ 1537.176502] invoke_syscall+0x48/0x114 [ 1537.180255] el0_svc_common.constprop.0+0xcc/0xec [ 1537.184964] do_el0_svc+0x2c/0xd0 [ 1537.188280] el0_svc+0x2c/0x84 [ 1537.191340] el0t_64_sync_handler+0xf4/0x120 [ 1537.195613] el0t_64_sync+0x18c/0x190 [ 1537.199334] Mem-Info: [ 1537.201620] active_anon:342 inactive_anon:10343 isolated_anon:0 [ 1537.201620] active_file:54 inactive_file:112 isolated_file:0 [ 1537.201620] unevictable:0 dirty:0 writeback:0 [ 1537.201620] slab_reclaimable:2620 slab_unreclaimable:7076 [ 1537.201620] mapped:1489 shmem:2473 pagetables:466 [ 1537.201620] sec_pagetables:0 bounce:0 [ 1537.201620] kernel_misc_reclaimable:0 [ 1537.201620] free:136672 free_pcp:96 free_cma:129241 [ 1537.240419] Node 0 active_anon:1368kB inactive_anon:41372kB active_file:216kB inactive_file:5052kB unevictable:0kB isolated(anon):0kB isolated(file):0kB s [ 1537.271422] Node 0 DMA free:541636kB boost:0kB min:30000kB low:37500kB high:45000kB reserved_highatomic:0KB active_anon:1368kB inactive_anon:41372kB actiB [ 1537.300219] lowmem_reserve[]: 0 0 0 0 [ 1537.303929] Node 0 DMA: 1015*4kB (UMEC) 743*8kB (UMEC) 417*16kB (UMEC) 235*32kB (UMEC) 116*64kB (UMEC) 25*128kB (UMEC) 4*256kB (UC) 2*512kB (UC) 0*1024kBB [ 1537.323938] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [ 1537.332708] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB [ 1537.341292] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 1537.349776] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB [ 1537.358087] 2939 total pagecache pages [ 1537.361876] 0 pages in swap cache [ 1537.365229] Free swap = 0kB [ 1537.368147] Total swap = 0kB [ 1537.371065] 516096 pages RAM [ 1537.373959] 0 pages HighMem/MovableOnly [ 1537.377834] 17302 pages reserved [ 1537.381103] 163840 pages cma reserved [ 1537.384809] 0 pages hwpoisoned [ 1537.387902] Tasks state (memory values in pages): [ 1537.392652] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [ 1537.401356] [ 201] 993 201 1130 72 45056 0 0 rpcbind [ 1537.409772] [ 202] 0 202 4529 1640 77824 0 -250 systemd-journal [ 1537.418861] [ 222] 0 222 4691 801 69632 0 -1000 systemd-udevd [ 1537.427787] [ 248] 994 248 20914 130 65536 0 0 systemd-timesyn [ 1537.436884] [ 497] 0 497 620 31 49152 0 0 atd [ 1537.444938] [ 500] 0 500 854 77 53248 0 0 crond [ 1537.453165] [ 503] 997 503 1470 160 49152 0 -900 dbus-daemon [ 1537.461908] [ 505] 0 505 633 24 40960 0 0 firmwared [ 1537.470491] [ 513] 0 513 2507 180 61440 0 0 ofonod [ 1537.478800] [ 514] 990 514 69640 137 81920 0 0 parsec [ 1537.487120] [ 533] 0 533 599 39 40960 0 0 syslogd [ 1537.495518] [ 534] 0 534 4546 148 65536 0 0 systemd-logind [ 1537.504560] [ 535] 0 535 690 24 45056 0 0 tee-supplicant [ 1537.513564] [ 540] 996 540 2769 168 61440 0 0 systemd-network [ 1537.522680] [ 566] 0 566 3878 228 77824 0 0 connmand [ 1537.531168] [ 645] 998 645 1538 133 57344 0 0 avahi-daemon [ 1537.540004] [ 646] 998 646 1461 64 57344 0 0 avahi-daemon [ 1537.548846] [ 648] 992 648 781 41 45056 0 0 rpc.statd [ 1537.557415] [ 650] 64371 650 590 23 45056 0 0 ninfod [ 1537.565754] [ 653] 61563 653 555 24 45056 0 0 rdisc [ 1537.573971] [ 655] 0 655 374569 2999 290816 0 -999 containerd [ 1537.582621] [ 658] 0 658 1311 20 49152 0 0 agetty [ 1537.590922] [ 663] 0 663 1529 97 49152 0 0 login [ 1537.599138] [ 666] 0 666 3430 202 69632 0 0 wpa_supplicant [ 1537.608147] [ 667] 0 667 2344 96 61440 0 0 systemd-userdbd [ 1537.617240] [ 677] 0 677 2964 314 65536 0 100 systemd [ 1537.625651] [ 679] 0 679 3720 646 73728 0 100 (sd-pam) [ 1537.634138] [ 687] 0 687 1289 403 45056 0 0 sh [ 1537.642108] [ 789] 0 789 970 93 45056 0 0 eth_test2.sh [ 1537.650955] [ 2355] 0 2355 2346 94 61440 0 0 systemd-userwor [ 1537.660046] [ 2356] 0 2356 2346 94 61440 0 0 systemd-userwor [ 1537.669137] [ 2358] 0 2358 2346 95 57344 0 0 systemd-userwor [ 1537.678258] [ 2379] 0 2379 970 93 45056 0 0 eth_test2.sh [ 1537.687098] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/user@0.service,tas0 [ 1537.703009] Out of memory: Killed process 679 ((sd-pam)) total-vm:14880kB, anon-rss:2584kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:72kB oom_score_ad0 [ 1553.246526] page_pool_release_retry() stalled pool shutdown 310 inflight 241 sec Fixes: 95698ff6177b ("net: fec: using page pool to manage RX buffers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: shenwei wang <Shenwei.wang@nxp.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ipa/ipa_interrupt.c drivers/net/ipa/ipa_interrupt.h 9ec9b2a30853 ("net: ipa: disable ipa interrupt during suspend") 8e461e1f092b ("net: ipa: introduce ipa_interrupt_enable()") d50ed3558719 ("net: ipa: enable IPA interrupt handlers separate from registration") https://lore.kernel.org/all/20230119114125.5182c7ab@canb.auug.org.au/ https://lore.kernel.org/all/79e46152-8043-a512-79d9-c3b905462774@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-19net: phy: Remove probe_capabilitiesAndrew Lunn
Deciding if to probe of PHYs using C45 is now determine by if the bus provides the C45 read method. This makes probe_capabilities redundant so remove it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Andrew Jeffery <andrew@aj.id.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-18net: enetc: prioritize ability to go down over packet processingVladimir Oltean
napi_synchronize() from enetc_stop() waits until the softirq has finished execution and no longer wants to be rescheduled. However under high traffic load, this will never happen, and the interface can never be closed. The problem is the fact that the NAPI poll routine is written to update the consumer index which makes the device want to put more buffers in the RX ring, which restarts the madness again. Browsing around, it seems that some drivers like i40e keep a bit (__I40E_VSI_DOWN) which they use as communication between the control path and the data path. But that isn't my first choice, because complications ensue - since the enetc hardirq may trigger while we are in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks it, but enetc_poll() never unmasks it. To prevent a stall in that case, one would need to schedule all NAPI instances when ENETC_DOWN gets cleared, to process what's pending. I find it more desirable for the control path - enetc_stop() - to just quiesce the RX ring and let the softirq finish what remains there, without any explicit communication, just by making hardware not provide any more packets. This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]). I can't seem to find an exact definition of what this bit does, but when the RX ring is disabled, the port seems to no longer update the producer index, and not react to software updates of the consumer index. In fact, the RBaMR[EN] bit is already toggled by the driver, but too late for what we want: enetc_close() -> enetc_stop() -> napi_synchronize() -> enetc_clear_bdrs() -> enetc_clear_rxbdr() The enetc_clear_bdrs() function contains not only logic to disable the RX and TX rings, but also logic to wait for the TX ring stop being busy. We split enetc_clear_bdrs() into enetc_disable_bdrs() and enetc_wait_bdrs(). One needs to run before napi_synchronize() and the other after (NAPI also processes TX completions, so we maximize our chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is stuck for some reason, ofc). We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call this from the mirror position in enetc_start() compared to enetc_stop(), i.e. right before netif_tx_start_all_queues(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: set up XDP program under enetc_reconfigure()Vladimir Oltean
Offloading a BPF program to the RX path of the driver suffers from the same problems as the PTP reconfiguration - improper error checking can leave the driver in an invalid state, and the link on the PHY is lost. Reuse the enetc_reconfigure() procedure, but here, we need to run some code in the middle of the ring reconfiguration procedure - while the interface is still down. Introduce a callback which makes that possible. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: rename "xdp" and "dev" in enetc_setup_bpf()Vladimir Oltean
Follow the convention from this driver, which is to name "struct net_device *" as "ndev", and the convention from other drivers, to name "struct netdev_bpf *" as "bpf". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: implement ring reconfiguration procedure for PTP RX timestampingVladimir Oltean
The crude enetc_stop() -> enetc_open() mechanism suffers from 2 problems: 1. improper error checking 2. it involves phylink_stop() -> phylink_start() which loses the link Right now, the driver is prepared to offer a better alternative: a ring reconfiguration procedure which takes the RX BD size (normal or extended) as argument. It allocates new resources (failing if that fails), stops the traffic, and assigns the new resources to the rings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: move phylink_start/stop out of enetc_start/stopVladimir Oltean
We want to introduce a fast interface reconfiguration procedure, which involves temporarily stopping the rings. But we want enetc_start() and enetc_stop() to not restart PHY autoneg, because that can take a few seconds until it completes again. So we need part of enetc_start() and enetc_stop(), but not all of them. Move phylink_start() right next to phylink_of_phy_connect(), and phylink_stop() right next to phylink_disconnect_phy(), both still in ndo_open() and ndo_stop(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: split ring resource allocation from assignmentVladimir Oltean
We have a few instances in the enetc driver where the ring resources (BD ring iomem, software BD ring, software TSO headers, basically everything except RX buffers) need to be reallocated. For example, when RX timestamping is enabled, the RX BD format changes to an extended one (twice as large). Currently, this is done using a simplistic enetc_close() -> enetc_open() procedure. But this is quite crude, since it also invokes phylink_stop() -> phylink_start(), the link is lost, and a few seconds need to pass for autoneg to complete again. In fact it's bad also due to the improper (yolo) error checking. In case we fail to allocate new resources, we've already freed the old ones, so the interface is more or less stuck. To avoid that, we need a system where reconfiguration is possible in a way in which resources are allocated upfront. This means that there will be a higher memory usage temporarily, but the assignment of resources to rings can be done when both the old and new resources are still available. Introduce a struct enetc_bdr_resource which holds the resources for a ring, be it RX or TX. This structure duplicates a lot of fields from struct enetc_bdr (and access to the same fields in the ring structure was left duplicated, to not change cache characteristics in the fast path). When enetc_alloc_tx_resources() runs, it returns an array of resource elements (one per TX ring), in addition to the existing priv->tx_res. To populate priv->tx_res with that array, one must call enetc_assign_tx_resources(), and this also frees the old resources. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: bring "bool extended" to top-level in enetc_open()Vladimir Oltean
Extended RX buffer descriptors are necessary if they carry RX timestamps, which will be true when PTP timestamping is enabled. Right now, the rx_ring->ext_en is set from the function that allocates ring resources (enetc_alloc_rx_resources() -> enetc_alloc_rxbdr()), and also used later, in enetc_setup_rxbdr(). It is also used in the enetc_rxbd() and enetc_rxbd_next() fast path helpers. We want to decouple resource allocation from BD ring setup, but both procedures depend on BD size (extended or not). Move the "extended" boolean to enetc_open() and pass it both to the RX allocation procedure as well as to the RX ring setup procedure. The latter will set rx_ring->ext_en from now on. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: drop redundant enetc_free_tx_frame() call from enetc_free_txbdr()Vladimir Oltean
The call path in enetc_close() is: enetc_close() -> enetc_free_rxtx_rings() -> enetc_free_tx_ring() -> enetc_free_tx_frame() -> enetc_free_tx_resources() -> enetc_free_txbdr() -> enetc_free_tx_frame() The enetc_free_tx_frame() function is written such that the second call exits without doing anything, but nonetheless, it is completely redundant. Delete it. This makes the TX teardown path more similar to the RX one, where rx_swbd freeing is done in enetc_free_rx_ring(), not in enetc_free_rxbdr(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: rx_swbd and tx_swbd are never NULL in enetc_free_rxtx_rings()Vladimir Oltean
The call path in enetc_close() is: enetc_close() -> enetc_free_rxtx_rings() -> enetc_free_rx_ring() -> tests whether rx_ring->rx_swbd is NULL -> enetc_free_tx_ring() -> tests whether tx_ring->tx_swbd is NULL -> enetc_free_rx_resources() -> enetc_free_rxbdr() -> sets rxr->rx_swbd to NULL -> enetc_free_tx_resources() -> enetc_free_txbdr() -> setx txr->tx_swbd to NULL From the above, it is clear that due to the function ordering, the checks for NULL are redundant, since the software buffer descriptor arrays have not yet been set to NULL. Drop these checks. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: create enetc_dma_free_bdr()Vladimir Oltean
This is a refactoring change which introduces the opposite function of enetc_dma_alloc_bdr(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: set up RX ring indices from enetc_setup_rxbdr()Vladimir Oltean
There is only one place which needs to set up indices in the RX ring. Be consistent with what was done in the TX path and do this in enetc_setup_rxbdr(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18net: enetc: set next_to_clean/next_to_use just from enetc_setup_txbdr()Vladimir Oltean
enetc_alloc_txbdr() deals with allocating resources necessary for a TX ring to work (the array of software BDs and the array of TSO headers). The next_to_clean and next_to_use pointers are overwritten with proper values which are read from hardware here: enetc_open -> enetc_alloc_tx_resources -> enetc_alloc_txbdr -> set to zero -> enetc_setup_bdrs -> enetc_setup_txbdr -> read from hardware So their initialization with zeroes is pointless and confusing. Delete it. Consequently, since enetc_setup_txbdr() has no opposite cleanup function, also delete the resetting of these indices from enetc_free_tx_ring(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-13enetc: Separate C22 and C45 transactionsAndrew Lunn
The enetc MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. This driver is shared with the Felix DSA switch, so update that at the same time. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-13net: enetc: avoid deadlock in enetc_tx_onestep_tstamp()Vladimir Oltean
This lockdep splat says it better than I could: ================================ WARNING: inconsistent lock state 6.2.0-rc2-07010-ga9b9500ffaac-dirty #967 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. kworker/1:3/179 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff3ec4036ce098 (_xmit_ETHER#2){+.?.}-{3:3}, at: netif_freeze_queues+0x5c/0xc0 {IN-SOFTIRQ-W} state was registered at: _raw_spin_lock+0x5c/0xc0 sch_direct_xmit+0x148/0x37c __dev_queue_xmit+0x528/0x111c ip6_finish_output2+0x5ec/0xb7c ip6_finish_output+0x240/0x3f0 ip6_output+0x78/0x360 ndisc_send_skb+0x33c/0x85c ndisc_send_rs+0x54/0x12c addrconf_rs_timer+0x154/0x260 call_timer_fn+0xb8/0x3a0 __run_timers.part.0+0x214/0x26c run_timer_softirq+0x3c/0x74 __do_softirq+0x14c/0x5d8 ____do_softirq+0x10/0x20 call_on_irq_stack+0x2c/0x5c do_softirq_own_stack+0x1c/0x30 __irq_exit_rcu+0x168/0x1a0 irq_exit_rcu+0x10/0x40 el1_interrupt+0x38/0x64 irq event stamp: 7825 hardirqs last enabled at (7825): [<ffffdf1f7200cae4>] exit_to_kernel_mode+0x34/0x130 hardirqs last disabled at (7823): [<ffffdf1f708105f0>] __do_softirq+0x550/0x5d8 softirqs last enabled at (7824): [<ffffdf1f7081050c>] __do_softirq+0x46c/0x5d8 softirqs last disabled at (7811): [<ffffdf1f708166e0>] ____do_softirq+0x10/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(_xmit_ETHER#2); <Interrupt> lock(_xmit_ETHER#2); *** DEADLOCK *** 3 locks held by kworker/1:3/179: #0: ffff3ec400004748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0 #1: ffff80000a0bbdc8 ((work_completion)(&priv->tx_onestep_tstamp)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c0 #2: ffff3ec4036cd438 (&dev->tx_global_lock){+.+.}-{3:3}, at: netif_tx_lock+0x1c/0x34 Workqueue: events enetc_tx_onestep_tstamp Call trace: print_usage_bug.part.0+0x208/0x22c mark_lock+0x7f0/0x8b0 __lock_acquire+0x7c4/0x1ce0 lock_acquire.part.0+0xe0/0x220 lock_acquire+0x68/0x84 _raw_spin_lock+0x5c/0xc0 netif_freeze_queues+0x5c/0xc0 netif_tx_lock+0x24/0x34 enetc_tx_onestep_tstamp+0x20/0x100 process_one_work+0x28c/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x11c but I'll say it anyway: the enetc_tx_onestep_tstamp() work item runs in process context, therefore with softirqs enabled (i.o.w., it can be interrupted by a softirq). If we hold the netif_tx_lock() when there is an interrupt, and the NET_TX softirq then gets scheduled, this will take the netif_tx_lock() a second time and deadlock the kernel. To solve this, use netif_tx_lock_bh(), which blocks softirqs from running. Fixes: 7294380c5211 ("enetc: support PTP Sync packet one-step timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230112105440.1786799-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-12net: remove redundant config PCI dependency for some network driver configsLukas Bulwahn
While reviewing dependencies in some Kconfig files, I noticed the redundant dependency "depends on PCI && PCI_MSI". The config PCI_MSI has always, since its introduction, been dependent on the config PCI. So, it is sufficient to just depend on PCI_MSI, and know that the dependency on PCI is implicitly implied. Reduce the dependencies of some network driver configs. No functional change and effective change of Kconfig dependendencies. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Dimitris Michailidis <dmichail@fungible.com> Link: https://lore.kernel.org/r/20230111125855.19020-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10net: fec: Separate C22 and C45 transactionsAndrew Lunn
The fec MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-10net: mdio: xgmac_mdio: Separate C22 and C45 transactionsAndrew Lunn
The xgmac MDIO bus driver can perform both C22 and C45 transfers. Create separate functions for each and register the C45 versions using the new API calls where appropriate. While at it, remove the misleading comment. According to Vladimir Oltean: - miimcom is a register accessed by fsl_pq_mdio.c, not by xgmac_mdio.c - "device dev" doesn't really refer to anything (maybe "dev_addr"). - I don't understand what is meant by the comment "All PHY configuration has to be done through the TSEC1 MIIM regs". Or rather said, I think I understand, but it is irrelevant to the driver for 2 reasons: * TSEC devices use the fsl_pq_mdio.c driver, not this one * It doesn't matter to this driver whose TSEC registers are used for MDIO access. The driver just works with the registers it's given, which is a concern for the device tree. - barring the above, the rest just describes the MDIO bus API, which is superfluous Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>