From ed872f92fd0946ba30f2acd05fc57e29cac29cd2 Mon Sep 17 00:00:00 2001 From: Lukas Bulwahn Date: Wed, 1 Jun 2022 06:57:38 +0200 Subject: MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal Commit 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support") removes all files in the directory drivers/net/ethernet/mellanox/mlx5/core/accel/, but misses to adjust its reference in MAINTAINERS. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a broken reference. Remove the file entry to the removed directory in MELLANOX ETHERNET INNOVA DRIVERS. Signed-off-by: Lukas Bulwahn Signed-off-by: Saeed Mahameed --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 033a01b07f8f..bab9e131ec9c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12651,7 +12651,6 @@ L: netdev@vger.kernel.org S: Supported W: http://www.mellanox.com Q: https://patchwork.kernel.org/project/netdevbpf/list/ -F: drivers/net/ethernet/mellanox/mlx5/core/accel/* F: drivers/net/ethernet/mellanox/mlx5/core/en_accel/* F: drivers/net/ethernet/mellanox/mlx5/core/fpga/* F: include/linux/mlx5/mlx5_ifc_fpga.h -- cgit v1.2.3-58-ga151 From 4d995c1b9d49ee657e879745aa5e445f031c0dba Mon Sep 17 00:00:00 2001 From: Saeed Mahameed Date: Fri, 3 Jun 2022 14:33:03 -0700 Subject: Revert "net/mlx5e: Allow relaxed ordering over VFs" FW is not ready, fix was sent too soon. This reverts commit f05ec8d9d0d62367b6e1f2cb50d7d2a45e7747cf. Fixes: f05ec8d9d0d6 ("net/mlx5e: Allow relaxed ordering over VFs") Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 3 ++- drivers/net/ethernet/mellanox/mlx5/core/en_common.c | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c index 68364484a435..3c1edfa33aa7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c @@ -565,7 +565,8 @@ static void mlx5e_build_rx_cq_param(struct mlx5_core_dev *mdev, static u8 rq_end_pad_mode(struct mlx5_core_dev *mdev, struct mlx5e_params *params) { bool lro_en = params->packet_merge.type == MLX5E_PACKET_MERGE_LRO; - bool ro = MLX5_CAP_GEN(mdev, relaxed_ordering_write); + bool ro = pcie_relaxed_ordering_enabled(mdev->pdev) && + MLX5_CAP_GEN(mdev, relaxed_ordering_write); return ro && lro_en ? MLX5_WQ_END_PAD_MODE_NONE : MLX5_WQ_END_PAD_MODE_ALIGN; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c index 43a536cb81db..c0f409c195bf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c @@ -38,11 +38,12 @@ void mlx5e_mkey_set_relaxed_ordering(struct mlx5_core_dev *mdev, void *mkc) { + bool ro_pci_enable = pcie_relaxed_ordering_enabled(mdev->pdev); bool ro_write = MLX5_CAP_GEN(mdev, relaxed_ordering_write); bool ro_read = MLX5_CAP_GEN(mdev, relaxed_ordering_read); - MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_read); - MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_write); + MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_pci_enable && ro_read); + MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_pci_enable && ro_write); } static int mlx5e_create_mkey(struct mlx5_core_dev *mdev, u32 pdn, -- cgit v1.2.3-58-ga151 From 15ef9efa855cf405fadd78272e1e5d04e09a1cf3 Mon Sep 17 00:00:00 2001 From: Paul Blakey Date: Tue, 29 Mar 2022 18:37:18 +0300 Subject: net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules CT cleanup assumes that all tc rules were deleted first, and so is free to delete the CT shared resources (e.g the dr_action fwd_action which is shared for all tuples). But currently for uplink, this is happens in reverse, causing the below trace. CT cleanup is called from: mlx5e_cleanup_rep_tx()->mlx5e_cleanup_uplink_rep_tx()-> mlx5e_rep_tc_cleanup()->mlx5e_tc_esw_cleanup()-> mlx5_tc_ct_clean() Only afterwards, tc cleanup is called from: mlx5e_cleanup_rep_tx()->mlx5e_tc_ht_cleanup() which would have deleted all the tc ct rules, and so delete all the offloaded tuples. Fix this reversing the order of init and on cleanup, which will result in tc cleanup then ct cleanup. [ 9443.593347] WARNING: CPU: 2 PID: 206774 at drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1882 mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core] [ 9443.593349] Modules linked in: act_ct nf_flow_table rdma_ucm(O) rdma_cm(O) iw_cm(O) ib_ipoib(O) ib_cm(O) ib_umad(O) mlx5_core(O-) mlxfw(O) mlxdevm(O) auxiliary(O) ib_uverbs(O) psample ib_core(O) mlx_compat(O) ip_gre gre ip_tunnel act_vlan bonding geneve esp6_offload esp6 esp4_offload esp4 act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel act_mirred act_skbedit act_gact cls_flower sch_ingress nfnetlink_cttimeout nfnetlink xfrm_user xfrm_algo 8021q garp stp ipmi_devintf mrp ipmi_msghandler llc openvswitch nsh nf_conncount nf_nat mst_pciconf(O) dm_multipath sbsa_gwdt uio_pdrv_genirq uio mlxbf_pmc mlxbf_pka mlx_trio mlx_bootctl(O) bluefield_edac sch_fq_codel ip_tables ipv6 crc_ccitt btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 crct10dif_ce i2c_mlxbf gpio_mlxbf2 mlxbf_gige aes_neon_bs aes_neon_blk [last unloaded: mlx5_ib] [ 9443.593419] CPU: 2 PID: 206774 Comm: modprobe Tainted: G O 5.4.0-1023.24.gc14613d-bluefield #1 [ 9443.593422] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:143ebaf Jan 11 2022 [ 9443.593424] pstate: 20000005 (nzCv daif -PAN -UAO) [ 9443.593489] pc : mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core] [ 9443.593545] lr : mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core] [ 9443.593546] sp : ffff8000135dbab0 [ 9443.593548] x29: ffff8000135dbab0 x28: ffff0003a6ab8e80 [ 9443.593550] x27: 0000000000000000 x26: ffff0003e07d7000 [ 9443.593552] x25: ffff800009609de0 x24: ffff000397fb2120 [ 9443.593554] x23: ffff0003975c0000 x22: 0000000000000000 [ 9443.593556] x21: ffff0003975f08c0 x20: ffff800009609de0 [ 9443.593558] x19: ffff0003c8a13380 x18: 0000000000000014 [ 9443.593560] x17: 0000000067f5f125 x16: 000000006529c620 [ 9443.593561] x15: 000000000000000b x14: 0000000000000000 [ 9443.593563] x13: 0000000000000002 x12: 0000000000000001 [ 9443.593565] x11: ffff800011108868 x10: 0000000000000000 [ 9443.593567] x9 : 0000000000000000 x8 : ffff8000117fb270 [ 9443.593569] x7 : ffff0003ebc01288 x6 : 0000000000000000 [ 9443.593571] x5 : ffff800009591ab8 x4 : fffffe000f6d9a20 [ 9443.593572] x3 : 0000000080040001 x2 : fffffe000f6d9a20 [ 9443.593574] x1 : ffff8000095901d8 x0 : 0000000000000025 [ 9443.593577] Call trace: [ 9443.593634] mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core] [ 9443.593688] mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core] [ 9443.593743] mlx5_tc_ct_clean+0x34/0xa8 [mlx5_core] [ 9443.593797] mlx5e_tc_esw_cleanup+0x58/0x88 [mlx5_core] [ 9443.593851] mlx5e_rep_tc_cleanup+0x24/0x30 [mlx5_core] [ 9443.593905] mlx5e_cleanup_rep_tx+0x6c/0x78 [mlx5_core] [ 9443.593959] mlx5e_detach_netdev+0x74/0x98 [mlx5_core] [ 9443.594013] mlx5e_netdev_change_profile+0x70/0x180 [mlx5_core] [ 9443.594067] mlx5e_netdev_attach_nic_profile+0x34/0x40 [mlx5_core] [ 9443.594122] mlx5e_vport_rep_unload+0x15c/0x1a8 [mlx5_core] [ 9443.594177] mlx5_eswitch_unregister_vport_reps+0x228/0x298 [mlx5_core] [ 9443.594231] mlx5e_rep_remove+0x2c/0x38 [mlx5_core] [ 9443.594236] auxiliary_bus_remove+0x30/0x50 [auxiliary] [ 9443.594246] device_release_driver_internal+0x108/0x1d0 [ 9443.594248] driver_detach+0x5c/0xe8 [ 9443.594250] bus_remove_driver+0x64/0xd8 [ 9443.594253] driver_unregister+0x38/0x60 [ 9443.594255] auxiliary_driver_unregister+0x24/0x38 [auxiliary] [ 9443.594311] mlx5e_rep_cleanup+0x20/0x38 [mlx5_core] [ 9443.594365] mlx5e_cleanup+0x18/0x30 [mlx5_core] [ 9443.594419] cleanup+0xc/0x20cc [mlx5_core] [ 9443.594424] __arm64_sys_delete_module+0x154/0x2b0 [ 9443.594429] el0_svc_common.constprop.0+0xf4/0x200 [ 9443.594432] el0_svc_handler+0x38/0xa8 [ 9443.594435] el0_svc+0x10/0x26c Fixes: d1a3138f7913 ("net/mlx5e: TC, Move flow hashtable to be per rep") Signed-off-by: Paul Blakey Reviewed-by: Oz Shlomo Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 31 ++++++++++++------------ 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index eb90e79388f1..f797fd97d305 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -950,6 +950,13 @@ err_event_reg: return err; } +static void mlx5e_cleanup_uplink_rep_tx(struct mlx5e_rep_priv *rpriv) +{ + mlx5e_rep_tc_netdevice_event_unregister(rpriv); + mlx5e_rep_bond_cleanup(rpriv); + mlx5e_rep_tc_cleanup(rpriv); +} + static int mlx5e_init_rep_tx(struct mlx5e_priv *priv) { struct mlx5e_rep_priv *rpriv = priv->ppriv; @@ -961,42 +968,36 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv) return err; } - err = mlx5e_tc_ht_init(&rpriv->tc_ht); - if (err) - goto err_ht_init; - if (rpriv->rep->vport == MLX5_VPORT_UPLINK) { err = mlx5e_init_uplink_rep_tx(rpriv); if (err) goto err_init_tx; } + err = mlx5e_tc_ht_init(&rpriv->tc_ht); + if (err) + goto err_ht_init; + return 0; -err_init_tx: - mlx5e_tc_ht_cleanup(&rpriv->tc_ht); err_ht_init: + if (rpriv->rep->vport == MLX5_VPORT_UPLINK) + mlx5e_cleanup_uplink_rep_tx(rpriv); +err_init_tx: mlx5e_destroy_tises(priv); return err; } -static void mlx5e_cleanup_uplink_rep_tx(struct mlx5e_rep_priv *rpriv) -{ - mlx5e_rep_tc_netdevice_event_unregister(rpriv); - mlx5e_rep_bond_cleanup(rpriv); - mlx5e_rep_tc_cleanup(rpriv); -} - static void mlx5e_cleanup_rep_tx(struct mlx5e_priv *priv) { struct mlx5e_rep_priv *rpriv = priv->ppriv; - mlx5e_destroy_tises(priv); + mlx5e_tc_ht_cleanup(&rpriv->tc_ht); if (rpriv->rep->vport == MLX5_VPORT_UPLINK) mlx5e_cleanup_uplink_rep_tx(rpriv); - mlx5e_tc_ht_cleanup(&rpriv->tc_ht); + mlx5e_destroy_tises(priv); } static void mlx5e_rep_enable(struct mlx5e_priv *priv) -- cgit v1.2.3-58-ga151 From 3008e6a0049361e731b803c60fe8f3ab44e1d73f Mon Sep 17 00:00:00 2001 From: Mark Bloch Date: Thu, 26 May 2022 08:15:28 +0300 Subject: net/mlx5: E-Switch, pair only capable devices OFFLOADS paring using devcom is possible only on devices that support LAG. Filter based on lag capabilities. This fixes an issue where mlx5_get_next_phys_dev() was called without holding the interface lock. This issue was found when commit bc4c2f2e0179 ("net/mlx5: Lag, filter non compatible devices") added an assert that verifies the interface lock is held. WARNING: CPU: 9 PID: 1706 at drivers/net/ethernet/mellanox/mlx5/core/dev.c:642 mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core] Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm ib_uverbs ib_core overlay fuse [last unloaded: mlx5_core] CPU: 9 PID: 1706 Comm: devlink Not tainted 5.18.0-rc7+ #11 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core] Code: 02 00 75 48 48 8b 85 80 04 00 00 5d c3 31 c0 5d c3 be ff ff ff ff 48 c7 c7 08 41 5b a0 e8 36 87 28 e3 85 c0 0f 85 6f ff ff ff <0f> 0b e9 68 ff ff ff 48 c7 c7 0c 91 cc 84 e8 cb 36 6f e1 e9 4d ff RSP: 0018:ffff88811bf47458 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88811b398000 RCX: 0000000000000001 RDX: 0000000080000000 RSI: ffffffffa05b4108 RDI: ffff88812daaaa78 RBP: ffff88812d050380 R08: 0000000000000001 R09: ffff88811d6b3437 R10: 0000000000000001 R11: 00000000fddd3581 R12: ffff88815238c000 R13: ffff88812d050380 R14: ffff8881018aa7e0 R15: ffff88811d6b3428 FS: 00007fc82e18ae80(0000) GS:ffff88842e080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f9630d1b421 CR3: 0000000149802004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_esw_offloads_devcom_event+0x99/0x3b0 [mlx5_core] mlx5_devcom_send_event+0x167/0x1d0 [mlx5_core] esw_offloads_enable+0x1153/0x1500 [mlx5_core] ? mlx5_esw_offloads_controller_valid+0x170/0x170 [mlx5_core] ? wait_for_completion_io_timeout+0x20/0x20 ? mlx5_rescan_drivers_locked+0x318/0x810 [mlx5_core] mlx5_eswitch_enable_locked+0x586/0xc50 [mlx5_core] ? mlx5_eswitch_disable_pf_vf_vports+0x1d0/0x1d0 [mlx5_core] ? mlx5_esw_try_lock+0x1b/0xb0 [mlx5_core] ? mlx5_eswitch_enable+0x270/0x270 [mlx5_core] ? __debugfs_create_file+0x260/0x3e0 mlx5_devlink_eswitch_mode_set+0x27e/0x870 [mlx5_core] ? mutex_lock_io_nested+0x12c0/0x12c0 ? esw_offloads_disable+0x250/0x250 [mlx5_core] ? devlink_nl_cmd_trap_get_dumpit+0x470/0x470 ? rcu_read_lock_sched_held+0x3f/0x70 devlink_nl_cmd_eswitch_set_doit+0x217/0x620 Fixes: dd3fddb82780 ("net/mlx5: E-Switch, handle devcom events only for ports on the same device") Signed-off-by: Mark Bloch Reviewed-by: Roi Dayan Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/dev.c | 18 ------------------ .../net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 9 ++++++--- drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 10 ++++++++++ drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h | 1 - 4 files changed, 16 insertions(+), 22 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c index 0eb9d74547f8..50422b56a64d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c @@ -579,17 +579,6 @@ static void *pci_get_other_drvdata(struct device *this, struct device *other) return pci_get_drvdata(to_pci_dev(other)); } -static int next_phys_dev(struct device *dev, const void *data) -{ - struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data; - - mdev = pci_get_other_drvdata(this->device, dev); - if (!mdev) - return 0; - - return _next_phys_dev(mdev, data); -} - static int next_phys_dev_lag(struct device *dev, const void *data) { struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data; @@ -623,13 +612,6 @@ static struct mlx5_core_dev *mlx5_get_next_dev(struct mlx5_core_dev *dev, return pci_get_drvdata(to_pci_dev(next)); } -/* Must be called with intf_mutex held */ -struct mlx5_core_dev *mlx5_get_next_phys_dev(struct mlx5_core_dev *dev) -{ - lockdep_assert_held(&mlx5_intf_mutex); - return mlx5_get_next_dev(dev, &next_phys_dev); -} - /* Must be called with intf_mutex held */ struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 217cac29057f..2ce3728576d1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -2690,9 +2690,6 @@ static int mlx5_esw_offloads_devcom_event(int event, switch (event) { case ESW_OFFLOADS_DEVCOM_PAIR: - if (mlx5_get_next_phys_dev(esw->dev) != peer_esw->dev) - break; - if (mlx5_eswitch_vport_match_metadata_enabled(esw) != mlx5_eswitch_vport_match_metadata_enabled(peer_esw)) break; @@ -2744,6 +2741,9 @@ static void esw_offloads_devcom_init(struct mlx5_eswitch *esw) if (!MLX5_CAP_ESW(esw->dev, merged_eswitch)) return; + if (!mlx5_is_lag_supported(esw->dev)) + return; + mlx5_devcom_register_component(devcom, MLX5_DEVCOM_ESW_OFFLOADS, mlx5_esw_offloads_devcom_event, @@ -2761,6 +2761,9 @@ static void esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw) if (!MLX5_CAP_ESW(esw->dev, merged_eswitch)) return; + if (!mlx5_is_lag_supported(esw->dev)) + return; + mlx5_devcom_send_event(devcom, MLX5_DEVCOM_ESW_OFFLOADS, ESW_OFFLOADS_DEVCOM_UNPAIR, esw); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h index 72f70fad4641..c81b173156d2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h @@ -74,6 +74,16 @@ struct mlx5_lag { struct lag_mpesw lag_mpesw; }; +static inline bool mlx5_is_lag_supported(struct mlx5_core_dev *dev) +{ + if (!MLX5_CAP_GEN(dev, vport_group_manager) || + !MLX5_CAP_GEN(dev, lag_master) || + MLX5_CAP_GEN(dev, num_lag_ports) < 2 || + MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_MAX_PORTS) + return false; + return true; +} + static inline struct mlx5_lag * mlx5_lag_dev(struct mlx5_core_dev *dev) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 484cb1e4fc7f..9cc7afea2758 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -209,7 +209,6 @@ int mlx5_attach_device(struct mlx5_core_dev *dev); void mlx5_detach_device(struct mlx5_core_dev *dev); int mlx5_register_device(struct mlx5_core_dev *dev); void mlx5_unregister_device(struct mlx5_core_dev *dev); -struct mlx5_core_dev *mlx5_get_next_phys_dev(struct mlx5_core_dev *dev); struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev); void mlx5_dev_list_lock(void); void mlx5_dev_list_unlock(void); -- cgit v1.2.3-58-ga151 From 8bf94e6414c9481bfa28269022688ab445d0081d Mon Sep 17 00:00:00 2001 From: Feras Daoud Date: Sat, 19 Mar 2022 21:47:48 +0200 Subject: net/mlx5: Rearm the FW tracer after each tracer event The current design does not arm the tracer if traces are available before the tracer string database is fully loaded, leading to an unfunctional tracer. This fix will rearm the tracer every time the FW triggers tracer event regardless of the tracer strings database status. Fixes: c71ad41ccb0c ("net/mlx5: FW tracer, events handling") Signed-off-by: Feras Daoud Signed-off-by: Roy Novich Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c index eae9aa9c0811..978a2bb8e122 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c @@ -675,6 +675,9 @@ static void mlx5_fw_tracer_handle_traces(struct work_struct *work) if (!tracer->owner) return; + if (unlikely(!tracer->str_db.loaded)) + goto arm; + block_count = tracer->buff.size / TRACER_BLOCK_SIZE_BYTE; start_offset = tracer->buff.consumer_index * TRACER_BLOCK_SIZE_BYTE; @@ -732,6 +735,7 @@ static void mlx5_fw_tracer_handle_traces(struct work_struct *work) &tmp_trace_block[TRACES_PER_BLOCK - 1]); } +arm: mlx5_fw_tracer_arm(dev); } @@ -1136,8 +1140,7 @@ static int fw_tracer_event(struct notifier_block *nb, unsigned long action, void queue_work(tracer->work_queue, &tracer->ownership_change_work); break; case MLX5_TRACER_SUBTYPE_TRACES_AVAILABLE: - if (likely(tracer->str_db.loaded)) - queue_work(tracer->work_queue, &tracer->handle_traces_work); + queue_work(tracer->work_queue, &tracer->handle_traces_work); break; default: mlx5_core_dbg(dev, "FWTracer: Event with unrecognized subtype: sub_type %d\n", -- cgit v1.2.3-58-ga151 From 8fa5e7b20e01042b14f8cd684d2da9b638460c74 Mon Sep 17 00:00:00 2001 From: Mark Bloch Date: Mon, 30 May 2022 10:46:59 +0300 Subject: net/mlx5: fs, fail conflicting actions When combining two steering rules into one check not only do they share the same actions but those actions are also the same. This resolves an issue where when creating two different rules with the same match the actions are overwritten and one of the rules is deleted a FW syndrome can be seen in dmesg. mlx5_core 0000:03:00.0: mlx5_cmd_check:819:(pid 2105): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444) Fixes: 0d235c3fabb7 ("net/mlx5: Add hash table to search FTEs in a flow-group") Signed-off-by: Mark Bloch Reviewed-by: Maor Gottlieb Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 35 +++++++++++++++++++++-- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index fdcf7f529330..21e5c709b2d3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -1574,9 +1574,22 @@ static struct mlx5_flow_rule *find_flow_rule(struct fs_fte *fte, return NULL; } -static bool check_conflicting_actions(u32 action1, u32 action2) +static bool check_conflicting_actions_vlan(const struct mlx5_fs_vlan *vlan0, + const struct mlx5_fs_vlan *vlan1) { - u32 xored_actions = action1 ^ action2; + return vlan0->ethtype != vlan1->ethtype || + vlan0->vid != vlan1->vid || + vlan0->prio != vlan1->prio; +} + +static bool check_conflicting_actions(const struct mlx5_flow_act *act1, + const struct mlx5_flow_act *act2) +{ + u32 action1 = act1->action; + u32 action2 = act2->action; + u32 xored_actions; + + xored_actions = action1 ^ action2; /* if one rule only wants to count, it's ok */ if (action1 == MLX5_FLOW_CONTEXT_ACTION_COUNT || @@ -1593,6 +1606,22 @@ static bool check_conflicting_actions(u32 action1, u32 action2) MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH_2)) return true; + if (action1 & MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT && + act1->pkt_reformat != act2->pkt_reformat) + return true; + + if (action1 & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR && + act1->modify_hdr != act2->modify_hdr) + return true; + + if (action1 & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH && + check_conflicting_actions_vlan(&act1->vlan[0], &act2->vlan[0])) + return true; + + if (action1 & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH_2 && + check_conflicting_actions_vlan(&act1->vlan[1], &act2->vlan[1])) + return true; + return false; } @@ -1600,7 +1629,7 @@ static int check_conflicting_ftes(struct fs_fte *fte, const struct mlx5_flow_context *flow_context, const struct mlx5_flow_act *flow_act) { - if (check_conflicting_actions(flow_act->action, fte->action.action)) { + if (check_conflicting_actions(flow_act, &fte->action)) { mlx5_core_warn(get_dev(&fte->node), "Found two FTEs with conflicting actions\n"); return -EEXIST; -- cgit v1.2.3-58-ga151 From b489a6e5871690735752f8875f411e4d0cd8e5df Mon Sep 17 00:00:00 2001 From: Maxim Mikityanskiy Date: Wed, 8 Jun 2022 18:34:25 +0300 Subject: tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX To embrace possible future optimizations of TLS, rename zerocopy sendfile definitions to more generic ones: * setsockopt: TLS_TX_ZEROCOPY_SENDFILE- > TLS_TX_ZEROCOPY_RO * sock_diag: TLS_INFO_ZC_SENDFILE -> TLS_INFO_ZC_RO_TX RO stands for readonly and emphasizes that the application shouldn't modify the data being transmitted with zerocopy to avoid potential disconnection. Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()") Signed-off-by: Maxim Mikityanskiy Link: https://lore.kernel.org/r/20220608153425.3151146-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski --- include/uapi/linux/tls.h | 4 ++-- net/tls/tls_main.c | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/uapi/linux/tls.h b/include/uapi/linux/tls.h index ac39328eabe7..bb8f80812b0b 100644 --- a/include/uapi/linux/tls.h +++ b/include/uapi/linux/tls.h @@ -39,7 +39,7 @@ /* TLS socket options */ #define TLS_TX 1 /* Set transmit parameters */ #define TLS_RX 2 /* Set receive parameters */ -#define TLS_TX_ZEROCOPY_SENDFILE 3 /* transmit zerocopy sendfile */ +#define TLS_TX_ZEROCOPY_RO 3 /* TX zerocopy (only sendfile now) */ /* Supported versions */ #define TLS_VERSION_MINOR(ver) ((ver) & 0xFF) @@ -161,7 +161,7 @@ enum { TLS_INFO_CIPHER, TLS_INFO_TXCONF, TLS_INFO_RXCONF, - TLS_INFO_ZC_SENDFILE, + TLS_INFO_ZC_RO_TX, __TLS_INFO_MAX, }; #define TLS_INFO_MAX (__TLS_INFO_MAX - 1) diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index b91ddc110786..da176411c1b5 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -544,7 +544,7 @@ static int do_tls_getsockopt(struct sock *sk, int optname, rc = do_tls_getsockopt_conf(sk, optval, optlen, optname == TLS_TX); break; - case TLS_TX_ZEROCOPY_SENDFILE: + case TLS_TX_ZEROCOPY_RO: rc = do_tls_getsockopt_tx_zc(sk, optval, optlen); break; default: @@ -731,7 +731,7 @@ static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval, optname == TLS_TX); release_sock(sk); break; - case TLS_TX_ZEROCOPY_SENDFILE: + case TLS_TX_ZEROCOPY_RO: lock_sock(sk); rc = do_tls_setsockopt_tx_zc(sk, optval, optlen); release_sock(sk); @@ -970,7 +970,7 @@ static int tls_get_info(const struct sock *sk, struct sk_buff *skb) goto nla_failure; if (ctx->tx_conf == TLS_HW && ctx->zerocopy_sendfile) { - err = nla_put_flag(skb, TLS_INFO_ZC_SENDFILE); + err = nla_put_flag(skb, TLS_INFO_ZC_RO_TX); if (err) goto nla_failure; } @@ -994,7 +994,7 @@ static size_t tls_get_info_size(const struct sock *sk) nla_total_size(sizeof(u16)) + /* TLS_INFO_CIPHER */ nla_total_size(sizeof(u16)) + /* TLS_INFO_RXCONF */ nla_total_size(sizeof(u16)) + /* TLS_INFO_TXCONF */ - nla_total_size(0) + /* TLS_INFO_ZC_SENDFILE */ + nla_total_size(0) + /* TLS_INFO_ZC_RO_TX */ 0; return size; -- cgit v1.2.3-58-ga151 From 03d5005ff7356a9952a21da20a3d7828fd559fee Mon Sep 17 00:00:00 2001 From: Fei Qin Date: Wed, 8 Jun 2022 11:29:00 +0200 Subject: nfp: avoid unnecessary check warnings in nfp_app_get_vf_config nfp_net_sriov_check is added in nfp_app_get_vf_config which intends to ensure ivi->vlan_proto and ivi->max_tx_rate/min_tx_rate can be read from VF config table only when firmware supports corresponding capability. However, "nfp_app_get_vf_config" can be called by commands like "ip a", "ip link set $DEV up" and "ip link set $DEV vf $NUM vlan $param" (with VF). When using commands above, many warnings "ndo_set_vf_ not supported" would appear if firmware doesn't support VF rate limit and 802.1ad VLAN assingment. If more VFs are created, things could get worse. Thus, this patch add an extra bool parameter for nfp_net_sriov_check to enable/disable the cap check warning report. Unnecessary warnings in nfp_app_get_vf_config can be avoided. Valid warnings in kinds of vf setting function can be reserved. Fixes: e0d0e1fdf1ed ("nfp: VF rate limit support") Fixes: 59359597b010 ("nfp: support 802.1ad VLAN assingment to VF") Signed-off-by: Fei Qin Signed-off-by: Yinjun Zhang Signed-off-by: Louis Peens Signed-off-by: Simon Horman Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/netronome/nfp/nfp_net_sriov.c | 28 ++++++++++++---------- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_sriov.c b/drivers/net/ethernet/netronome/nfp/nfp_net_sriov.c index 54af30961351..6eeeb0fda91f 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_sriov.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_sriov.c @@ -15,7 +15,7 @@ #include "nfp_net_sriov.h" static int -nfp_net_sriov_check(struct nfp_app *app, int vf, u16 cap, const char *msg) +nfp_net_sriov_check(struct nfp_app *app, int vf, u16 cap, const char *msg, bool warn) { u16 cap_vf; @@ -24,12 +24,14 @@ nfp_net_sriov_check(struct nfp_app *app, int vf, u16 cap, const char *msg) cap_vf = readw(app->pf->vfcfg_tbl2 + NFP_NET_VF_CFG_MB_CAP); if ((cap_vf & cap) != cap) { - nfp_warn(app->pf->cpp, "ndo_set_vf_%s not supported\n", msg); + if (warn) + nfp_warn(app->pf->cpp, "ndo_set_vf_%s not supported\n", msg); return -EOPNOTSUPP; } if (vf < 0 || vf >= app->pf->num_vfs) { - nfp_warn(app->pf->cpp, "invalid VF id %d\n", vf); + if (warn) + nfp_warn(app->pf->cpp, "invalid VF id %d\n", vf); return -EINVAL; } @@ -65,7 +67,7 @@ int nfp_app_set_vf_mac(struct net_device *netdev, int vf, u8 *mac) unsigned int vf_offset; int err; - err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_MAC, "mac"); + err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_MAC, "mac", true); if (err) return err; @@ -101,7 +103,7 @@ int nfp_app_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos, u32 vlan_tag; int err; - err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_VLAN, "vlan"); + err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_VLAN, "vlan", true); if (err) return err; @@ -115,7 +117,7 @@ int nfp_app_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos, } /* Check if fw supports or not */ - err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_VLAN_PROTO, "vlan_proto"); + err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_VLAN_PROTO, "vlan_proto", true); if (err) is_proto_sup = false; @@ -149,7 +151,7 @@ int nfp_app_set_vf_rate(struct net_device *netdev, int vf, u32 vf_offset, ratevalue; int err; - err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_RATE, "rate"); + err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_RATE, "rate", true); if (err) return err; @@ -181,7 +183,7 @@ int nfp_app_set_vf_spoofchk(struct net_device *netdev, int vf, bool enable) int err; err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_SPOOF, - "spoofchk"); + "spoofchk", true); if (err) return err; @@ -205,7 +207,7 @@ int nfp_app_set_vf_trust(struct net_device *netdev, int vf, bool enable) int err; err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_TRUST, - "trust"); + "trust", true); if (err) return err; @@ -230,7 +232,7 @@ int nfp_app_set_vf_link_state(struct net_device *netdev, int vf, int err; err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_LINK_STATE, - "link_state"); + "link_state", true); if (err) return err; @@ -265,7 +267,7 @@ int nfp_app_get_vf_config(struct net_device *netdev, int vf, u8 flags; int err; - err = nfp_net_sriov_check(app, vf, 0, ""); + err = nfp_net_sriov_check(app, vf, 0, "", true); if (err) return err; @@ -285,13 +287,13 @@ int nfp_app_get_vf_config(struct net_device *netdev, int vf, ivi->vlan = FIELD_GET(NFP_NET_VF_CFG_VLAN_VID, vlan_tag); ivi->qos = FIELD_GET(NFP_NET_VF_CFG_VLAN_QOS, vlan_tag); - if (!nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_VLAN_PROTO, "vlan_proto")) + if (!nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_VLAN_PROTO, "vlan_proto", false)) ivi->vlan_proto = htons(FIELD_GET(NFP_NET_VF_CFG_VLAN_PROT, vlan_tag)); ivi->spoofchk = FIELD_GET(NFP_NET_VF_CFG_CTRL_SPOOF, flags); ivi->trusted = FIELD_GET(NFP_NET_VF_CFG_CTRL_TRUST, flags); ivi->linkstate = FIELD_GET(NFP_NET_VF_CFG_CTRL_LINK_STATE, flags); - err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_RATE, "rate"); + err = nfp_net_sriov_check(app, vf, NFP_NET_VF_CFG_MB_CAP_RATE, "rate", false); if (!err) { rate = readl(app->pf->vfcfg_tbl2 + vf_offset + NFP_NET_VF_CFG_RATE); -- cgit v1.2.3-58-ga151 From a0b843340dae704e17c1ddfad0f85c583c36757f Mon Sep 17 00:00:00 2001 From: Etienne van der Linde Date: Wed, 8 Jun 2022 11:29:01 +0200 Subject: nfp: flower: restructure flow-key for gre+vlan combination Swap around the GRE and VLAN parts in the flow-key offloaded by the driver to fit in with other tunnel types and the firmware. Without this change used cases with GRE+VLAN on the outer header does not get offloaded as the flow-key mismatches what the firmware expect. Fixes: 0d630f58989a ("nfp: flower: add support to offload QinQ match") Fixes: 5a2b93041646 ("nfp: flower-ct: compile match sections of flow_payload") Signed-off-by: Etienne van der Linde Signed-off-by: Louis Peens Signed-off-by: Yinjun Zhang Signed-off-by: Simon Horman Signed-off-by: Jakub Kicinski --- .../net/ethernet/netronome/nfp/flower/conntrack.c | 32 +++++++++++----------- drivers/net/ethernet/netronome/nfp/flower/match.c | 16 +++++------ 2 files changed, 24 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/netronome/nfp/flower/conntrack.c b/drivers/net/ethernet/netronome/nfp/flower/conntrack.c index 443a5d6eb57b..7c31a46195b2 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/conntrack.c +++ b/drivers/net/ethernet/netronome/nfp/flower/conntrack.c @@ -507,6 +507,11 @@ nfp_fl_calc_key_layers_sz(struct nfp_fl_key_ls in_key_ls, uint16_t *map) key_size += sizeof(struct nfp_flower_ipv6); } + if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_QINQ) { + map[FLOW_PAY_QINQ] = key_size; + key_size += sizeof(struct nfp_flower_vlan); + } + if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_GRE) { map[FLOW_PAY_GRE] = key_size; if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6) @@ -515,11 +520,6 @@ nfp_fl_calc_key_layers_sz(struct nfp_fl_key_ls in_key_ls, uint16_t *map) key_size += sizeof(struct nfp_flower_ipv4_gre_tun); } - if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_QINQ) { - map[FLOW_PAY_QINQ] = key_size; - key_size += sizeof(struct nfp_flower_vlan); - } - if ((in_key_ls.key_layer & NFP_FLOWER_LAYER_VXLAN) || (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_GENEVE)) { map[FLOW_PAY_UDP_TUN] = key_size; @@ -758,6 +758,17 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) } } + if (NFP_FLOWER_LAYER2_QINQ & key_layer.key_layer_two) { + offset = key_map[FLOW_PAY_QINQ]; + key = kdata + offset; + msk = mdata + offset; + for (i = 0; i < _CT_TYPE_MAX; i++) { + nfp_flower_compile_vlan((struct nfp_flower_vlan *)key, + (struct nfp_flower_vlan *)msk, + rules[i]); + } + } + if (key_layer.key_layer_two & NFP_FLOWER_LAYER2_GRE) { offset = key_map[FLOW_PAY_GRE]; key = kdata + offset; @@ -798,17 +809,6 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) } } - if (NFP_FLOWER_LAYER2_QINQ & key_layer.key_layer_two) { - offset = key_map[FLOW_PAY_QINQ]; - key = kdata + offset; - msk = mdata + offset; - for (i = 0; i < _CT_TYPE_MAX; i++) { - nfp_flower_compile_vlan((struct nfp_flower_vlan *)key, - (struct nfp_flower_vlan *)msk, - rules[i]); - } - } - if (key_layer.key_layer & NFP_FLOWER_LAYER_VXLAN || key_layer.key_layer_two & NFP_FLOWER_LAYER2_GENEVE) { offset = key_map[FLOW_PAY_UDP_TUN]; diff --git a/drivers/net/ethernet/netronome/nfp/flower/match.c b/drivers/net/ethernet/netronome/nfp/flower/match.c index 193a167a6762..e01430139b6d 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/match.c +++ b/drivers/net/ethernet/netronome/nfp/flower/match.c @@ -625,6 +625,14 @@ int nfp_flower_compile_flow_match(struct nfp_app *app, msk += sizeof(struct nfp_flower_ipv6); } + if (NFP_FLOWER_LAYER2_QINQ & key_ls->key_layer_two) { + nfp_flower_compile_vlan((struct nfp_flower_vlan *)ext, + (struct nfp_flower_vlan *)msk, + rule); + ext += sizeof(struct nfp_flower_vlan); + msk += sizeof(struct nfp_flower_vlan); + } + if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_GRE) { if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6) { struct nfp_flower_ipv6_gre_tun *gre_match; @@ -660,14 +668,6 @@ int nfp_flower_compile_flow_match(struct nfp_app *app, } } - if (NFP_FLOWER_LAYER2_QINQ & key_ls->key_layer_two) { - nfp_flower_compile_vlan((struct nfp_flower_vlan *)ext, - (struct nfp_flower_vlan *)msk, - rule); - ext += sizeof(struct nfp_flower_vlan); - msk += sizeof(struct nfp_flower_vlan); - } - if (key_ls->key_layer & NFP_FLOWER_LAYER_VXLAN || key_ls->key_layer_two & NFP_FLOWER_LAYER2_GENEVE) { if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6) { -- cgit v1.2.3-58-ga151 From a3bd2102e464202b58d57390a538d96f57ffc361 Mon Sep 17 00:00:00 2001 From: Andrea Mayer Date: Wed, 8 Jun 2022 11:19:17 +0200 Subject: net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev Commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") adds a new entry (flowi_l3mdev) in the common flow struct used for indicating the l3mdev index for later rule and table matching. The l3mdev_update_flow() has been adapted to properly set the flowi_l3mdev based on the flowi_oif/flowi_iif. In fact, when a valid flowi_iif is supplied to the l3mdev_update_flow(), this function can update the flowi_l3mdev entry only if it has not yet been set (i.e., the flowi_l3mdev entry is equal to 0). The SRv6 End.DT6 behavior in VRF mode leverages a VRF device in order to force the routing lookup into the associated routing table. This routing operation is performed by seg6_lookup_any_nextop() preparing a flowi6 data structure used by ip6_route_input_lookup() which, in turn, (indirectly) invokes l3mdev_update_flow(). However, seg6_lookup_any_nexthop() does not initialize the new flowi_l3mdev entry which is filled with random garbage data. This prevents l3mdev_update_flow() from properly updating the flowi_l3mdev with the VRF index, and thus SRv6 End.DT6 (VRF mode)/DT46 behaviors are broken. This patch correctly initializes the flowi6 instance allocated and used by seg6_lookup_any_nexhtop(). Specifically, the entire flowi6 instance is wiped out: in case new entries are added to flowi/flowi6 (as happened with the flowi_l3mdev entry), we should no longer have incorrectly initialized values. As a result of this operation, the value of flowi_l3mdev is also set to 0. The proposed fix can be tested easily. Starting from the commit referenced in the Fixes, selftests [1],[2] indicate that the SRv6 End.DT6 (VRF mode)/DT46 behaviors no longer work correctly. By applying this patch, those behaviors are back to work properly again. [1] - tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh [2] - tools/testing/selftests/net/srv6_end_dt6_l3vpn_test.sh Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") Reported-by: Anton Makarov Signed-off-by: Andrea Mayer Reviewed-by: David Ahern Link: https://lore.kernel.org/r/20220608091917.20345-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski --- net/ipv6/seg6_local.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index 9fbe243a0e81..98a34287439c 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -218,6 +218,7 @@ seg6_lookup_any_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, struct flowi6 fl6; int dev_flags = 0; + memset(&fl6, 0, sizeof(fl6)); fl6.flowi6_iif = skb->dev->ifindex; fl6.daddr = nhaddr ? *nhaddr : hdr->daddr; fl6.saddr = hdr->saddr; -- cgit v1.2.3-58-ga151