summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-07-26macsec: ensure rx_sa is set when validation is disabledBeniamino Galvani
macsec_decrypt() is not called when validation is disabled and so macsec_skb_cb(skb)->rx_sa is not set; but it is used later in macsec_post_decrypt(), ensure that it's always initialized. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Beniamino Galvani <bgalvani@redhat.com> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26Merge branch 'tipc-netlink-monitor-updates'David S. Miller
Parthasarathy Bhuvaragan says: ==================== tipc: netlink updates for neighbour monitor This series contains the updates to configure and read the attributes for neighbour monitor. v2: rebase on top of net-next ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26tipc: dump monitor attributesParthasarathy Bhuvaragan
In this commit, we dump the monitor attributes when queried. The link monitor attributes are separated into two kinds: 1. general attributes per bearer 2. specific attributes per node/peer This style resembles the socket attributes and the nametable publications per socket. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26tipc: add a function to get the bearer nameParthasarathy Bhuvaragan
Introduce a new function to get the bearer name from its id. This is used in subsequent commit. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26tipc: get monitor threshold for the clusterParthasarathy Bhuvaragan
In this commit, we add support to fetch the configured cluster monitoring threshold. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26tipc: make cluster size threshold for monitoring configurableParthasarathy Bhuvaragan
In this commit, we introduce support to configure the minimum threshold to activate the new link monitoring algorithm. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26tipc: introduce constants for tipc address validationParthasarathy Bhuvaragan
In this commit, we introduce defines for tipc address size, offset and mask specification for Zone.Cluster.Node. There is no functional change in this commit. Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-26net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in ↵He Chunhui
neigh_update() NUD_STALE is used when the caller(e.g. arp_process()) can't guarantee neighbour reachability. If the entry was NUD_VALID and lladdr is unchanged, the entry state should not be changed. Currently the code puts an extra "NUD_CONNECTED" condition. So if old state was NUD_DELAY or NUD_PROBE (they are NUD_VALID but not NUD_CONNECTED), the state can be changed to NUD_STALE. This may cause problem. Because NUD_STALE lladdr doesn't guarantee reachability, when we send traffic, the state will be changed to NUD_DELAY. In normal case, if we get no confirmation (by dst_confirm()), we will change the state to NUD_PROBE and send probe traffic. But now the state may be reset to NUD_STALE again(e.g. by broadcast ARP packets), so the probe traffic will not be sent. This situation may happen again and again, and packets will be sent to an non-reachable lladdr forever. The fix is to remove the "NUD_CONNECTED" condition. After that the "NEIGH_UPDATE_F_WEAK_OVERRIDE" condition (used by IPv6) in that branch will be redundant, so remove it. This change may increase probe traffic, but it's essential since NUD_STALE lladdr is unreliable. To ensure correctness, we prefer to resolve lladdr, when we can't get confirmation, even while remote packets try to set NUD_STALE state. Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn> Signed-off-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'xgene-fix-mod-crash-and-1g-hotplug'David S. Miller
Iyappan Subramanian says: ==================== drivers: net: xgene: Fix module crash and 1G hot-plug This patchset addresses the following issues, 1. Fixes the kernel crash when the driver loaded as an kernel module - by fixing hardware cleanups and rearrange kernel API calls 2. Hot-plug issue on the SGMII 1G interface - by adding a driver for MDIO management Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> --- v7: Address review comments from v6 - fixed kbuild warnings - unmapped DMA memory on xgene_enet_delete_bufpool() - delete descriptor rings and buffer pools on cle_init() failure - fixed error deconstruction path on probe v6: Address review comments from v5 - changed to use devm_ioremap_resource - changed to return PTR_ERR(clk) on failure - cleaned up and removed indirections - exported mdio read/write and phy_register functions - changed mii_bus is to indicate interface instance - changed to call the exported mdio read/write and phy_register functions v5: Address review comments from v4 - Fixed clock reset sequence by adding delay - Fixed clock count by adding clk_unprepare_disable() in port shutdown v4: Address review comments from v3 - Reorganized into smaller patches - Added wrapper functions for sgmii_control_reset and sgmii_tbi_control_reset - Removed clk_get warning info - mdio: Changed the order of 'if' statements and removed the 'else' statement - mdio: Removed the mdio_read(write) indirection wrapper functions - ethtool: Fixed SGMII 1G get_settings and set_settings - Documentation: dtb: Added MDIO node information - MAINTAINERS: Added MDIO driver and documentation path v3: Address review comments from v2 - Add comment about hardware clock reset sequence on xgene_mdio_reset v2: Address review comments from v1 - Fixed patch 1 compilation error - Fixed mdio@1f610000 xge0clk reference - Squashed dtb patches - Added PORT_OFFSET macro v1: - Initial version ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25MAINTAINERS: xgene: Add driver and documentation pathIyappan Subramanian
Added path to the MDIO driver and Documentation file. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Documentation: dtb: xgene: Add MDIO nodeIyappan Subramanian
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25dtb: xgene: Add MDIO nodeIyappan Subramanian
Added mdio node for mdio driver. Also added phy-handle reference to the ethernet nodes. Removed unused clock node from storm sgenet1. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: ethtool: Use phy_ethtool_gset and ssetIyappan Subramanian
Changed SGMII 1G get_settings to use phy_ethtool_gset. Changed SGMII 1G set_settings to use phy_ethtool_sset. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Use exported functionsIyappan Subramanian
This patch reuses the mdio read/write and phy_register functions and removed the local definitions. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Enable MDIO driverIyappan Subramanian
This patch enables MDIO driver by, - Selecting MDIO_XGENE - Changed open and close to use phy_start and phy_stop - Changed to use mac_ops->tx(rx)_enable and tx(rx)_disable Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Add backward compatibilityIyappan Subramanian
This patch adds xgene_enet_check_phy_hanlde() function that checks whether MDIO driver is probed successfully and sets pdata->mdio_driver to true. If MDIO driver is not probed, ethernet driver falls back to backward compatibility mode. Since enum xgene_enet_cmd is used by MDIO driver, removing this from ethernet driver. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: phy: xgene: Add MDIO driverIyappan Subramanian
Currently, SGMII based 1G rely on the hardware registers for link state and sometimes it's not reliable. To get most accurate link state, this interface has to use the MDIO bus to poll the PHY. In X-Gene SoC, MDIO bus is shared across RGMII and SGMII based 1G interfaces, so adding this driver to manage MDIO bus. This driver registers the mdio bus and registers the PHYs connected to it. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Fix module unload crash - clkrst sequenceIyappan Subramanian
This patch fixes clock reset sequence. - Added clock reset sequence for ACPI - Added delay in clock reset sequence to make sure pulse is generated - Added clk_unprepare_disable() in port shutdown to make sure clock increment/decrement counts are matching - Removed MII_MGMT_CONFIG programming, since it is not required - Fixed programming XGENET_CONFIG_REG to enable SGMII mode Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Fix module unload crash - change sw sequenceIyappan Subramanian
When the driver is configured as kernel module and when it gets unloaded and reloaded, kernel crash was observed. This patch addresses the software cleanup by doing the following, - Moved register_netdev call after hardware is ready - Since ndev is not ready, added set_irq_name to set irq name - Since ndev is not ready, changed mdio_bus->parent to pdev->dev - Replaced netif_start(stop)_queue by netif_tx_start(stop)_queues - Removed napi_del call since it's called by free_netdev - Added dev_close call, within remove - Added shutdown callback - Changed to use dmam_ APIs Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Fix module unload crash - hw resource cleanupIyappan Subramanian
When the driver is configured as kernel module and when it gets unloaded and reloaded, kernel crash was observed. This patch address the hardware resource cleanups by doing the following, - Added mac_ops->clear() to do prefetch buffer clean up - Fixed delete freepool buffers logic - Reordered mac_enable and mac_disable - Added Tx completion ring free - Moved down delete_desc_rings after ring cleanup Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Separate set_speed from mac_initIyappan Subramanian
Since mac_init is too heavy to be called when the link changes, moved the speed_set configuration to a new function and added mac_ops->set_speed function pointer. This function will be called from adjust_link callback. Added cases for 10/100 support for SGMII based 1G interface. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'refactor-tc_action-structs'David S. Miller
Cong Wang says: ==================== net_sched: refactor tc action structures These two patches factor out the struct tcf_common. v2: fix a compile warning ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net_sched: get rid of struct tcf_commonWANG Cong
After the previous patch, struct tc_action should be enough to represent the generic tc action, tcf_common is not necessary any more. This patch gets rid of it to make tc action code more readable. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net_sched: move tc_action into tcf_commonWANG Cong
struct tc_action is confusing, currently we use it for two purposes: 1) Pass in arguments and carry out results from helper functions 2) A generic representation for tc actions The first one is error-prone, since we need to make sure we don't miss anything. This patch aims to get rid of this use, by moving tc_action into tcf_common, so that they are allocated together in hashtable and can be cast'ed easily. And together with the following patch, we could really make tc_action a generic representation for all tc actions and each type of action can inherit from it. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25ipvlan: Scrub skb before crossing the namespace boundryMahesh Bandewar
The earlier patch c3aaa06d5a63 (ipvlan: scrub skb before routing in L3 mode.) did this but only for TX path in L3 mode. This patch extends it for both the modes for TX/RX path. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'bnxt_en-improve-ntuple-and-new-IDs'David S. Miller
Michael Chan says: ==================== bnxt_en: Improve ntuple filters and add new IDs. Improve ntuple filters and add some new PCI device IDs. Please review for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bnxt_en: Add new NPAR and dual media device IDs.Michael Chan
Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bnxt_en: Log a message, if enabling NTUPLE filtering fails.Vasundhara Volam
If there are not enough resources to enable ntuple filtering, log a warning message. v2: Use single message and add missing newline. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bnxt_en: Improve ntuple filters by checking destination MAC address.Michael Chan
Include the destination MAC address in the ntuple filter structure. The current code assumes that the destination MAC address is always the MAC address of the NIC. This may not be true if there are macvlans, for example. Add destination MAC address checking and configure the filter correctly using the correct index for the destination MAC address. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25qed: Fix setting/clearing bit in completion bitmapManish Chopra
Slowpath completion handling is incorrectly changing SPQ_RING_SIZE bits instead of a single one. Fixes: 76a9a3642a0b ("qed: fix handling of concurrent ramrods") Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25udp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skbDaniel Borkmann
After a612769774a3 ("udp: prevent bugcheck if filter truncates packet too much"), there followed various other fixes for similar cases such as f4979fcea7fd ("rose: limit sk_filter trim to payload"). Latter introduced a new helper sk_filter_trim_cap(), where we can pass the trim limit directly to the socket filter handling. Make use of it here as well with sizeof(struct udphdr) as lower cap limit and drop the extra skb->len test in UDP's input path. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25caif-hsi: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar
alloc_workqueue replaces deprecated create_singlethread_workqueue(). A dedicated workqueue has been used since the workitems are being used on a packet tx/rx path. Hence, WQ_MEM_RECLAIM has been set to guarantee forward progress under memory pressure. An ordered workqueue has been used since workitems &cfhsi->wake_up_work and &cfhsi->wake_down_work cannot be run concurrently. Calls to flush_workqueue() before destroy_workqueue() have been dropped since destroy_workqueue() itself calls drain_workqueue() which flushes repeatedly till the workqueue becomes empty. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'bpf-probe-write-user'David S. Miller
Sargun Dhillon says: ==================== bpf: add bpf_probe_write_user helper & example This patch series contains two patches that add support for a probe_write helper to BPF programs. This allows them to manipulate user memory during the course of tracing. The second patch in the series has an example that uses it, in one the intended ways to divert execution. Thanks to Alexei Starovoitov, and Daniel Borkmann for being patient, review, and helping me get familiar with the code base. I've made changes based on their recommendations. This helper should be considered for experimental usage and debugging, so we print a warning to dmesg when it is along with the command and pid when someone tries to install a proglet that uses it. A follow-up patchset will contain a mechanism to verify the safety of the probe beyond what was done by hand. ---- v1->v2: restrict writing to user space, as opposed to globally v2->v3: Fixed formatting issues v3->v4: Rename copy_to_user -> bpf_probe_write Simplify checking of whether or not it's safe to write Add warnings to dmesg v4->v5: Raise warning level Cleanup location of warning code Make test fail when helper is broken v5->v6: General formatting cleanup Rename bpf_probe_write -> bpf_probe_write_user v6->v7: More formatting cleanup. Clarifying a few comments Clarified log message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25samples/bpf: Add test/example of using bpf_probe_write_user bpf helperSargun Dhillon
This example shows using a kprobe to act as a dnat mechanism to divert traffic for arbitrary endpoints. It rewrite the arguments to a syscall while they're still in userspace, and before the syscall has a chance to copy the argument into kernel space. Although this is an example, it also acts as a test because the mapped address is 255.255.255.255:555 -> real address, and that's not a legal address to connect to. If the helper is broken, the example will fail on the intermediate steps, as well as the final step to verify the rewrite of userspace memory succeeded. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bpf: Add bpf_probe_write_user BPF helper to be called in tracersSargun Dhillon
This allows user memory to be written to during the course of a kprobe. It shouldn't be used to implement any kind of security mechanism because of TOC-TOU attacks, but rather to debug, divert, and manipulate execution of semi-cooperative processes. Although it uses probe_kernel_write, we limit the address space the probe can write into by checking the space with access_ok. We do this as opposed to calling copy_to_user directly, in order to avoid sleeping. In addition we ensure the threads's current fs / segment is USER_DS and the thread isn't exiting nor a kernel thread. Given this feature is meant for experiments, and it has a risk of crashing the system, and running programs, we print a warning on when a proglet that attempts to use this helper is installed, along with the pid and process name. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/mlx4_core: Check device state before unregistering itAlex Vesker
Verify that the device state is registered before un-registering it. This check is required to prevent an OOPS on flows that do re-registration of the device and its previous state was unregistered. Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25mlxsw: spectrum: Fix compilation error when CLS_ACT isn't setIdo Schimmel
When CONFIG_NET_CLS_ACT isn't set 'struct tcf_exts' has no member named 'actions' and we therefore must not access it. Otherwise compilation fails. Fix this by introducing a new macro similar to tc_no_actions(), which always returns 'false' if CONFIG_NET_CLS_ACT isn't set. Fixes: 763b4b70afcd ("mlxsw: spectrum: Add support in matchall mirror TC offloading") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net: davinci_cpdma: remove excessive dump of register values to kernel logUwe Kleine-König
Such a big dump of register values is hardly useful on a production system. Another downside of the now removed functions is that calling emac_dump_regs resulted in at least 87 calls to dev_info while holding a spinlock and having irqs off which is a big source of latency. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25gtp: #define #define _GTP_H_ and not #define _GTP_HColin Ian King
Fix clang build warning: ./include/net/gtp.h:1:9: warning: '_GTP_H_' is used as a header guard here, followed by #define of a different macro [-Wheader-guard] fix by defining _GTP_H_ and not _GTP_H Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'mlx5-minimum-inline-header-mode'David S. Miller
Saeed Mahameed says: ==================== Mellanox 100G mlx5 minimum inline header mode This small series from Hadar adds the support for minimum inline header mode query in mlx5e NIC driver. Today on TX the driver copies to the HW descriptor only up to L2 header which is the default required mode and sufficient for today's needs. The header in the HW descriptor is used for HW loopback steering decision, without it packets will go directly to the wire with no questions asked. For TX loopback steering according to L2/L3/L4 headers, ConnectX-4 requires to copy the corresponding headers into the send queue(SQ) WQE HW descriptor so it can decide whether to loop it back or to forward to wire. For legacy E-Switch mode only L2 headers copy is required. For advanced steering (E-Switch offloads) more header layers may be required to be copied, the required mode will be advertised by FW to each VF and PF according to the corresponding E-Switch configuration. Changes V2: - Allocate query_nic_vport_context_out on the stack ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/mlx5e: Query minimum required header copy during xmitHadar Hen Zion
Add support for query the minimum inline mode from the Firmware. It is required for correct TX steering according to L3/L4 packet headers. Each send queue (SQ) has inline mode that defines the minimal required headers that needs to be copied into the SQ WQE. The driver asks the Firmware for the wqe_inline_mode device capability value. In case the device capability defined as "vport context" the driver must check the reported min inline mode from the vport context before creating its SQs. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/mlx5e: Check the minimum inline header mode before xmitHadar Hen Zion
Each send queue (SQ) has inline mode that defines the minimal required inline headers in the SQ WQE. Before sending each packet check that the minimum required headers on the WQE are copied. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/sctp: terminate rhashtable walk correctlyVegard Nossum
I was seeing a lot of these: BUG: sleeping function called from invalid context at mm/slab.h:388 in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2 Preemption disabled at:[<ffffffff819bcd46>] rhashtable_walk_start+0x46/0x150 [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280 [<ffffffff83295722>] _raw_spin_lock+0x12/0x40 [<ffffffff811aac87>] console_unlock+0x2f7/0x930 [<ffffffff811ab5bb>] vprintk_emit+0x2fb/0x520 [<ffffffff811aba6a>] vprintk_default+0x1a/0x20 [<ffffffff812c171a>] printk+0x94/0xb0 [<ffffffff811d6ed0>] print_stack_trace+0xe0/0x170 [<ffffffff8115835e>] ___might_sleep+0x3be/0x460 [<ffffffff81158490>] __might_sleep+0x90/0x1a0 [<ffffffff8139b823>] kmem_cache_alloc+0x153/0x1e0 [<ffffffff819bca1e>] rhashtable_walk_init+0xfe/0x2d0 [<ffffffff82ec64de>] sctp_transport_walk_start+0x1e/0x60 [<ffffffff82edd8ad>] sctp_transport_seq_start+0x4d/0x150 [<ffffffff8143a82b>] seq_read+0x27b/0x1180 [<ffffffff814f97fc>] proc_reg_read+0xbc/0x180 [<ffffffff813d471b>] __vfs_read+0xdb/0x610 [<ffffffff813d4d3a>] vfs_read+0xea/0x2d0 [<ffffffff813d615b>] SyS_pread64+0x11b/0x150 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff832960a5>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff Apparently we always need to call rhashtable_walk_stop(), even when rhashtable_walk_start() fails: * rhashtable_walk_start - Start a hash table walk * @iter: Hash table iterator * * Start a hash table walk. Note that we take the RCU lock in all * cases including when we return an error. So you must always call * rhashtable_walk_stop to clean up. otherwise we never call rcu_read_unlock() and we get the splat above. Fixes: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag") See-also: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag") See-also: f2dba9c6 ("rhashtable: Introduce rhashtable_walk_*") Cc: Xin Long <lucien.xin@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2016-07-22 This series contains updates to ixgbe and ixgbevf only. Emil fixes the NACK check in ixgbevf_set_uc_addr_vf() for instances where the index is not equal to zero. Fixes an issue where mac->ops.setup_fc can be NULL for backplanes which can cause the driver to crash on load. Don fixes the second parameter of the LED functions, which is the index to the LED we are interested in affecting. Fixed variable to store register reads to unsigned integer. Adds support for the new x553 hardware into ixgbevf. Fixed a missing rtnl lock around ixgbevf_reinit_locked(). Fixed an issue where in ixgbevf_reset_subtask() was not verifying that the port has been removed. Cleans up the initial crosstalk fix, since the SFP that indicates the presence of a SFP+ module changes between hardware types. Babu Moger fixes typo in freeing IRQ, since the array subscript increments after the execution of the statement. Wei Yongjun adds the missing destroy_workqueue() before returning from ixgbe_init_module() in the error handling case. Tony adds range checking for setting the MTU from the VF, where the PF can return a NACK but this was not passed on to the VF, so propagate the results from the PF to the VF so errors can be reported. Consolidates mailbox read and write functions, since the recent changes to ixgbevf_write_msg_read_ack(), other functions are performing the same operations done here. Colin Ian King removes a redundant check on ret_val, since ret_val has not changed since the previous check. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/irda: fix NULL pointer dereference on memory allocation failureVegard Nossum
I ran into this: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000 RIP: 0010:[<ffffffff82bbf066>] [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710 RSP: 0018:ffff880111747bb8 EFLAGS: 00010286 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358 RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048 RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000 R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000 FS: 00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0 Stack: 0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220 ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232 ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000 Call Trace: [<ffffffff82bca542>] irda_connect+0x562/0x1190 [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0 [<ffffffff825b4489>] SyS_connect+0x9/0x10 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25 Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74 RIP [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710 RSP <ffff880111747bb8> ---[ end trace 4cda2588bc055b30 ]--- The problem is that irda_open_tsap() can fail and leave self->tsap = NULL, and then irttp_connect_request() almost immediately dereferences it. Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25sctp: also point GSO head_skb to the sk when it's availableMarcelo Ricardo Leitner
The head skb for GSO packets won't travel through the inner depths of SCTP stack as it doesn't contain any chunks on it. That means skb->sk doesn't get set and then when sctp_recvmsg() calls sctp_inet6_skb_msgname() on the head_skb it panics, as this last needs to check flags at the socket (sp->v4mapped). The fix is to initialize skb->sk for th head skb once we are able to do it. That is, when the first chunk is processed. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25sctp: fix BH handling on socket backlogMarcelo Ricardo Leitner
Now that the backlog processing is called with BH enabled, we have to disable BH before taking the socket lock via bh_lock_sock() otherwise it may dead lock: sctp_backlog_rcv() bh_lock_sock(sk); if (sock_owned_by_user(sk)) { if (sk_add_backlog(sk, skb, sk->sk_rcvbuf)) sctp_chunk_free(chunk); else backloged = 1; } else sctp_inq_push(inqueue, chunk); bh_unlock_sock(sk); while sctp_inq_push() was disabling/enabling BH, but enabling BH triggers pending softirq, which then may try to re-lock the socket in sctp_rcv(). [ 219.187215] <IRQ> [ 219.187217] [<ffffffff817ca3e0>] _raw_spin_lock+0x20/0x30 [ 219.187223] [<ffffffffa041888c>] sctp_rcv+0x48c/0xba0 [sctp] [ 219.187225] [<ffffffff816e7db2>] ? nf_iterate+0x62/0x80 [ 219.187226] [<ffffffff816f1b14>] ip_local_deliver_finish+0x94/0x1e0 [ 219.187228] [<ffffffff816f1e1f>] ip_local_deliver+0x6f/0xf0 [ 219.187229] [<ffffffff816f1a80>] ? ip_rcv_finish+0x3b0/0x3b0 [ 219.187230] [<ffffffff816f17a8>] ip_rcv_finish+0xd8/0x3b0 [ 219.187232] [<ffffffff816f2122>] ip_rcv+0x282/0x3a0 [ 219.187233] [<ffffffff810d8bb6>] ? update_curr+0x66/0x180 [ 219.187235] [<ffffffff816abac4>] __netif_receive_skb_core+0x524/0xa90 [ 219.187236] [<ffffffff810d8e00>] ? update_cfs_shares+0x30/0xf0 [ 219.187237] [<ffffffff810d557c>] ? __enqueue_entity+0x6c/0x70 [ 219.187239] [<ffffffff810dc454>] ? enqueue_entity+0x204/0xdf0 [ 219.187240] [<ffffffff816ac048>] __netif_receive_skb+0x18/0x60 [ 219.187242] [<ffffffff816ad1ce>] process_backlog+0x9e/0x140 [ 219.187243] [<ffffffff816ac8ec>] net_rx_action+0x22c/0x370 [ 219.187245] [<ffffffff817cd352>] __do_softirq+0x112/0x2e7 [ 219.187247] [<ffffffff817cc3bc>] do_softirq_own_stack+0x1c/0x30 [ 219.187247] <EOI> [ 219.187248] [<ffffffff810aa1c8>] do_softirq.part.14+0x38/0x40 [ 219.187249] [<ffffffff810aa24d>] __local_bh_enable_ip+0x7d/0x80 [ 219.187254] [<ffffffffa0408428>] sctp_inq_push+0x68/0x80 [sctp] [ 219.187258] [<ffffffffa04190f1>] sctp_backlog_rcv+0x151/0x1c0 [sctp] [ 219.187260] [<ffffffff81692b07>] __release_sock+0x87/0xf0 [ 219.187261] [<ffffffff81692ba0>] release_sock+0x30/0xa0 [ 219.187265] [<ffffffffa040e46d>] sctp_accept+0x17d/0x210 [sctp] [ 219.187266] [<ffffffff810e7510>] ? prepare_to_wait_event+0xf0/0xf0 [ 219.187268] [<ffffffff8172d52c>] inet_accept+0x3c/0x130 [ 219.187269] [<ffffffff8168d7a3>] SYSC_accept4+0x103/0x210 [ 219.187271] [<ffffffff817ca2ba>] ? _raw_spin_unlock_bh+0x1a/0x20 [ 219.187272] [<ffffffff81692bfc>] ? release_sock+0x8c/0xa0 [ 219.187276] [<ffffffffa0413e22>] ? sctp_inet_listen+0x62/0x1b0 [sctp] [ 219.187277] [<ffffffff8168f2d0>] SyS_accept+0x10/0x20 Fixes: 860fbbc343bf ("sctp: prepare for socket backlog behavior change") Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25hv_netvsc: Fix VF register on bonding devicesHaiyang Zhang
Added a condition to avoid bonding devices with same MAC registering as VF. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25kcm: remove redundant -ve error check and return pathColin Ian King
The check for a -ve error is redundant, remove it and just immediately return the return value from the call to seq_open_net. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net: ipv6: Always leave anycast and multicast groups on link downMike Manning
Default kernel behavior is to delete IPv6 addresses on link down, which entails deletion of the multicast and the subnet-router anycast addresses. These deletions do not happen with sysctl setting to keep global IPv6 addresses on link down, so every link down/up causes an increment of the anycast and multicast refcounts. These bogus refcounts may stop these addrs from being removed on subsequent calls to delete them. The solution is to leave the groups for the multicast and subnet anycast on link down for the callflow when global IPv6 addresses are kept. Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Mike Manning <mmanning@brocade.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>