summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-12-196lowpan: convert to DEFINE_SHOW_ATTRIBUTEYangtao Li
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-12-17ip6mr: Drop mfc6_cache argument to ip6mr_forward2David Ahern
mfc6_cache is not needed by ip6mr_forward2 so drop it from the input argument list. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17ipmr: Drop mfc_cache argument to ipmr_queue_xmitDavid Ahern
mfc_cache is not needed by ipmr_queue_xmit so drop it from the input argument list. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17net: dccp: initialize (addr,port) listening hashtablePeter Oskolkov
Commit d9fbc7f6431f "net: tcp: prefer listeners bound to an address" removes port-only listener lookups. This caused segfaults in DCCP lookups because DCCP did not initialize the (addr,port) hashtable. This patch adds said initialization. The only non-trivial issue here is the size of the new hashtable. It seemed reasonable to make it match the size of the port-only hashtable (= INET_LHTABLE_SIZE) that was used previously. Other parameters to inet_hashinfo2_init() match those used in TCP. V2 changes: marked inet_hashinfo2_init as an exported symbol so that DCCP compiles when configured as a module. Tested: syzcaller issues fixed; the second patch in the patchset tests that DCCP lookups work correctly. Fixes: d9fbc7f6431f "net: tcp: prefer listeners bound to an address" Reported-by: syzcaller <syzkaller@googlegroups.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17fou: Prevent unbounded recursion in GUE error handlerStefano Brivio
Handling exceptions for direct UDP encapsulation in GUE (that is, UDP-in-UDP) leads to unbounded recursion in the GUE exception handler, syzbot reported. While draft-ietf-intarea-gue-06 doesn't explicitly forbid direct encapsulation of UDP in GUE, it probably doesn't make sense to set up GUE this way, and it's currently not even possible to configure this. Skip exception handling if the GUE proto/ctype field is set to the UDP protocol number. Should we need to handle exceptions for UDP-in-GUE one day, we might need to either explicitly set a bound for recursion, or implement a special iterative handling for these cases. Reported-and-tested-by: syzbot+43f6755d1c2e62743468@syzkaller.appspotmail.com Fixes: b8a51b38e4d4 ("fou, fou6: ICMP error handlers for FoU and GUE") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16bridge: support for ndo_fdb_getRoopa Prabhu
This patch implements ndo_fdb_get for the bridge fdb. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16net: rtnetlink: support for fdb getRoopa Prabhu
This patch adds support for fdb get similar to route get. arguments can be any of the following (similar to fdb add/del/dump): [bridge, mac, vlan] or [bridge_port, mac, vlan, flags=[NTF_MASTER]] or [dev, mac, [vni|vlan], flags=[NTF_SELF]] Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16net: dsa: ksz: Add STP multicast handlingMarek Vasut
In case the destination address is link local, add override bit into the switch tag to let such a packet through the switch even if the port is blocked. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Cc: Woojung Huh <woojung.huh@microchip.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16net: dsa: ksz: Factor out common tag codeTristram Ha
Factor out common code from the tag_ksz , so that the code can be used with other KSZ family switches which use differenly sized tags. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Signed-off-by: Marek Vasut <marex@denx.de> Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Cc: Woojung Huh <woojung.huh@microchip.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16net: dsa: ksz: Rename NET_DSA_TAG_KSZ to _KSZ9477Tristram Ha
Rename the tag Kconfig option and related macros in preparation for addition of new KSZ family switches with different tag formats. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Signed-off-by: Marek Vasut <marex@denx.de> Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Cc: Woojung Huh <woojung.huh@microchip.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16Revert "net: dccp: initialize (addr,port) listening hashtable"David S. Miller
This reverts commit ec49d83f245453515a9b6e88324e27bbcb69fbae. Cause build failures when DCCP is modular. ERROR: "inet_hashinfo2_init" [net/dccp/dccp.ko] undefined! Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16neighbor: Add protocol attributeDavid Ahern
Similar to routes and rules, add protocol attribute to neighbor entries for easier tracking of how each was created. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-16net: dccp: initialize (addr,port) listening hashtablePeter Oskolkov
Commit d9fbc7f6431f "net: tcp: prefer listeners bound to an address" removes port-only listener lookups. This caused segfaults in DCCP lookups because DCCP did not initialize the (addr,port) hashtable. This patch adds said initialization. The only non-trivial issue here is the size of the new hashtable. It seemed reasonable to make it match the size of the port-only hashtable (= INET_LHTABLE_SIZE) that was used previously. Other parameters to inet_hashinfo2_init() match those used in TCP. Tested: syzcaller issues fixed; the second patch in the patchset tests that DCCP lookups work correctly. Fixes: d9fbc7f6431f "net: tcp: prefer listeners bound to an address" Reported-by: syzcaller <syzkaller@googlegroups.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15l2tp: Add protocol field decompressionSam Protsenko
When Protocol Field Compression (PFC) is enabled, the "Protocol" field in PPP packet will be received without leading 0x00. See section 6.5 in RFC 1661 for details. So let's decompress protocol field if needed, the same way it's done in drivers/net/ppp/pptp.c. In case when "nopcomp" pppd option is not enabled, PFC (pcomp) can be negotiated during LCP handshake, and L2TP driver in kernel will receive PPP packets with compressed Protocol field, which in turn leads to next error: Protocol Rejected (unsupported protocol 0x2145) because instead of Protocol=0x0021 in PPP packet there will be Protocol=0x21. This patch unwraps it back to 0x0021, which fixes the issue. Sending the compressed Protocol field will be implemented in subsequent patch, this one is self-sufficient. Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15udp: use indirect call wrappers for GRO socket lookupPaolo Abeni
This avoids another indirect call for UDP GRO. Again, the test for the IPv6 variant is performed first. v1 -> v2: - adapted to INDIRECT_CALL_ changes Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15net: use indirect call wrappers at GRO transport layerPaolo Abeni
This avoids an indirect call in the receive path for TCP and UDP packets. TCP takes precedence on UDP, so that we have a single additional conditional in the common case. When IPV6 is build as module, all gro symbols except UDPv6 are builtin, while the latter belong to the ipv6 module, so we need some special care. v1 -> v2: - adapted to INDIRECT_CALL_ changes v2 -> v3: - fix build issue with CONFIG_IPV6=m Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15net: use indirect call wrappers at GRO network layerPaolo Abeni
This avoids an indirect calls for L3 GRO receive path, both for ipv4 and ipv6, if the latter is not compiled as a module. Note that when IPv6 is compiled as builtin, it will be checked first, so we have a single additional compare for the more common path. v1 -> v2: - adapted to INDIRECT_CALL_ changes Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15net: sched: simplify the qdisc_leaf codeTonghao Zhang
Except for returning, the var leaf is not used in the qdisc_leaf(). For simplicity, remove it. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15ipv6: Fix handling of LLA with VRF and sockets bound to VRFDavid Ahern
A recent commit allows sockets bound to a VRF to receive ipv6 link local packets. However, it only works for UDP and worse TCP connection attempts to the LLA with the only listener bound to the VRF just hang where as before the client gets a reset and connection refused. Fix by adjusting ir_iif for LL addresses and packets received through a device enslaved to a VRF. Fixes: 6f12fa775530 ("vrf: mark skb for multicast or link-local as enslaved to VRF") Reported-by: Donald Sharp <sharpd@cumulusnetworks.com> Cc: Mike Manning <mmanning@vyatta.att-mail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15ipconfig: convert to DEFINE_SHOW_ATTRIBUTEYangtao Li
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14net: tcp6: prefer listeners bound to an addressPeter Oskolkov
A relatively common use case is to have several IPs configured on a host, and have different listeners for each of them. We would like to add a "catch all" listener on addr_any, to match incoming connections not served by any of the listeners bound to a specific address. However, port-only lookups can match addr_any sockets when sockets listening on specific addresses are present if so_reuseport flag is set. This patch eliminates lookups into port-only hashtable, as lookups by (addr,port) tuple are easily available. In addition, compute_score() is tweaked to _not_ match addr_any sockets to specific addresses, as hash collisions could result in the unwanted behavior described above. Tested: the patch compiles; full test in the last patch in this patchset. Existing reuseport_* selftests also pass. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14net: tcp: prefer listeners bound to an addressPeter Oskolkov
A relatively common use case is to have several IPs configured on a host, and have different listeners for each of them. We would like to add a "catch all" listener on addr_any, to match incoming connections not served by any of the listeners bound to a specific address. However, port-only lookups can match addr_any sockets when sockets listening on specific addresses are present if so_reuseport flag is set. This patch eliminates lookups into port-only hashtable, as lookups by (addr,port) tuple are easily available. In addition, compute_score() is tweaked to _not_ match addr_any sockets to specific addresses, as hash collisions could result in the unwanted behavior described above. Tested: the patch compiles; full test in the last patch in this patchset. Existing reuseport_* selftests also pass. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14net: udp6: prefer listeners bound to an addressPeter Oskolkov
A relatively common use case is to have several IPs configured on a host, and have different listeners for each of them. We would like to add a "catch all" listener on addr_any, to match incoming connections not served by any of the listeners bound to a specific address. However, port-only lookups can match addr_any sockets when sockets listening on specific addresses are present if so_reuseport flag is set. This patch eliminates lookups into port-only hashtable, as lookups by (addr,port) tuple are easily available. In addition, compute_score() is tweaked to _not_ match addr_any sockets to specific addresses, as hash collisions could result in the unwanted behavior described above. Tested: the patch compiles; full test in the last patch in this patchset. Existing reuseport_* selftests also pass. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14net: udp: prefer listeners bound to an addressPeter Oskolkov
A relatively common use case is to have several IPs configured on a host, and have different listeners for each of them. We would like to add a "catch all" listener on addr_any, to match incoming connections not served by any of the listeners bound to a specific address. However, port-only lookups can match addr_any sockets when sockets listening on specific addresses are present if so_reuseport flag is set. This patch eliminates lookups into port-only hashtable, as lookups by (addr,port) tuple are easily available. In addition, compute_score() is tweaked to _not_ match addr_any sockets to specific addresses, as hash collisions could result in the unwanted behavior described above. Tested: the patch compiles; full test in the last patch in this patchset. Existing reuseport_* selftests also pass. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14neighbor: Remove externally learned entries from gc_listDavid Ahern
Externally learned entries are similar to PERMANENT entries in the sense they are managed by userspace and can not be garbage collected. As such remove them from the gc_list, remove the flags check from neigh_forced_gc and skip threshold checks in neigh_alloc. As with PERMANENT entries, this allows unlimited number of NTF_EXT_LEARNED entries. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14neighbor: Move neigh_update_ext_learned to core fileDavid Ahern
neigh_update_ext_learned has one caller in neighbour.c so does not need to be defined in the header. Move it and in the process remove the intialization of ndm_flags and just set it based on the flags check. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14neighbor: Remove state and flags arguments to neigh_delDavid Ahern
neigh_del now only has 1 caller, and the state and flags arguments are both 0. Remove them and simplify neigh_del. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14neighbor: Fix state check in neigh_forced_gcDavid Ahern
PERMANENT entries are not on the gc_list so the state check is now redundant. Also, the move to not purge entries until after 5 seconds should not apply to FAILED entries; those can be removed immediately to make way for newer ones. This restores the previous logic prior to the gc_list. Fixes: 58956317c8de ("neighbor: Improve garbage collection") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14neighbor: Fix locking order for gc_list changesDavid Ahern
Lock checker noted an inverted lock order between neigh_change_state (neighbor lock then table lock) and neigh_periodic_work (table lock and then neighbor lock) resulting in: [ 121.057652] ====================================================== [ 121.058740] WARNING: possible circular locking dependency detected [ 121.059861] 4.20.0-rc6+ #43 Not tainted [ 121.060546] ------------------------------------------------------ [ 121.061630] kworker/0:2/65 is trying to acquire lock: [ 121.062519] (____ptrval____) (&n->lock){++--}, at: neigh_periodic_work+0x237/0x324 [ 121.063894] [ 121.063894] but task is already holding lock: [ 121.064920] (____ptrval____) (&tbl->lock){+.-.}, at: neigh_periodic_work+0x194/0x324 [ 121.066274] [ 121.066274] which lock already depends on the new lock. [ 121.066274] [ 121.067693] [ 121.067693] the existing dependency chain (in reverse order) is: ... Fix by renaming neigh_change_state to neigh_update_gc_list, changing it to only manage whether an entry should be on the gc_list and taking locks in the same order as neigh_periodic_work. Invoke at the end of neigh_update only if diff between old or new states has the PERMANENT flag set. Fixes: 8cc196d6ef86 ("neighbor: gc_list changes should be protected by table lock") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-14net_sched: fold tcf_block_cb_call() into tc_setup_cb_call()Cong Wang
After commit 69bd48404f25 ("net/sched: Remove egdev mechanism"), tc_setup_cb_call() is nearly identical to tcf_block_cb_call(), so we can just fold tcf_block_cb_call() into tc_setup_cb_call() and remove its unused parameter 'exts'. Fixes: 69bd48404f25 ("net/sched: Remove egdev mechanism") Cc: Oz Shlomo <ozsh@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13net: bridge: Handle NETDEV_PRE_CHANGEADDR from portsPetr Machata
When a port device seeks approval of a potential new MAC address, make sure that should the bridge device end up using this address, all interested parties would agree with it. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13net: bridge: Issue NETDEV_PRE_CHANGEADDRPetr Machata
When a port is attached to a bridge, the address of the bridge in question may change as well. Even if it would not change at this point (because the current bridge address is lower), it might end up changing later as a result of detach of another port, which can't be vetoed. Therefore issue NETDEV_PRE_CHANGEADDR regardless of whether the address will be used at this point or not, and make sure all involved parties would agree with the change. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13net: dev: Issue NETDEV_PRE_CHANGEADDRPetr Machata
When a device address is about to be changed, or an address added to the list of device HW addresses, it is necessary to ensure that all interested parties can support the address. Therefore, send the NETDEV_PRE_CHANGEADDR notification, and if anyone bails on it, do not change the address. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13net: dev: Add NETDEV_PRE_CHANGEADDRPetr Machata
The NETDEV_CHANGEADDR notification is emitted after a device address changes. Extending this message to allow vetoing is certainly possible, but several other notification types have instead adopted a simple two-stage approach: first a "pre" notification is sent to make sure all interested parties are OK with a change that's about to be done. Then the change is done, and afterwards a "post" notification is sent. This dual approach is easier to use: when the change is vetoed, nothing has changed yet, and it's therefore unnecessary to roll anything back. Therefore adopt it for NETDEV_CHANGEADDR as well. To that end, add NETDEV_PRE_CHANGEADDR and an info structure to go along with it. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-13net: dev: Add extack argument to dev_set_mac_address()Petr Machata
A follow-up patch will add a notifier type NETDEV_PRE_CHANGEADDR, which allows vetoing of MAC address changes. One prominent path to that notification is through dev_set_mac_address(). Therefore give this function an extack argument, so that it can be packed together with the notification. Thus a textual reason for rejection (or a warning) can be communicated back to the user. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-12net: switchdev: Add extack to switchdev_handle_port_obj_add() callbackPetr Machata
Drivers use switchdev_handle_port_obj_add() to handle recursive descent through lower devices. Change this function prototype to take add_cb that itself takes an extack argument. Decode extack from switchdev_notifier_port_obj_info and pass it to add_cb. Update mlxsw and ocelot drivers which use this helper. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-12net: switchdev: Add extack to struct switchdev_notifier_infoPetr Machata
In order to pass extack to the drivers that need it, add an extack field to struct switchdev_notifier_info, and an extack argument to the function call_switchdev_blocking_notifiers(). Also add a helper function switchdev_notifier_info_to_extack(). Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-12net: switchdev: Add extack argument to switchdev_port_obj_add()Petr Machata
After the previous patch, bridge driver has extack argument available to pass to switchdev. Therefore extend switchdev_port_obj_add() with this argument, updating all callers, and passing the argument through to switchdev_port_obj_notify(). Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-12net: bridge: Propagate extack to switchdevPetr Machata
ndo_bridge_setlink has been updated in the previous patch to have extack available, and changelink RTNL op has had this argument since the time extack was added. Propagate both through the bridge driver to eventually reach br_switchdev_port_vlan_add(), where it will be used by subsequent patches. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-12net: ndo_bridge_setlink: Add extackPetr Machata
Drivers may not be able to implement a VLAN addition or reconfiguration. In those cases it's desirable to explain to the user that it was rejected (and why). To that end, add extack argument to ndo_bridge_setlink. Adapt all users to that change. Following patches will use the new argument in the bridge driver. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-11 The following pull-request contains BPF updates for your *net-next* tree. It has three minor merge conflicts, resolutions: 1) tools/testing/selftests/bpf/test_verifier.c Take first chunk with alignment_prevented_execution. 2) net/core/filter.c [...] case bpf_ctx_range_ptr(struct __sk_buff, flow_keys): case bpf_ctx_range(struct __sk_buff, wire_len): return false; [...] 3) include/uapi/linux/bpf.h Take the second chunk for the two cases each. The main changes are: 1) Add support for BPF line info via BTF and extend libbpf as well as bpftool's program dump to annotate output with BPF C code to facilitate debugging and introspection, from Martin. 2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter and all JIT backends, from Jiong. 3) Improve BPF test coverage on archs with no efficient unaligned access by adding an "any alignment" flag to the BPF program load to forcefully disable verifier alignment checks, from David. 4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz. 5) Extend tc BPF programs to use a new __sk_buff field called wire_len for more accurate accounting of packets going to wire, from Petar. 6) Improve bpftool to allow dumping the trace pipe from it and add several improvements in bash completion and map/prog dump, from Quentin. 7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for kernel addresses and add a dedicated BPF JIT backend allocator, from Ard. 8) Add a BPF helper function for IR remotes to report mouse movements, from Sean. 9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info member naming consistent with existing conventions, from Yonghong and Song. 10) Misc cleanups and improvements in allowing to pass interface name via cmdline for xdp1 BPF example, from Matteo. 11) Fix a potential segfault in BPF sample loader's kprobes handling, from Daniel T. 12) Fix SPDX license in libbpf's README.rst, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10neighbor: gc_list changes should be protected by table lockDavid Ahern
Adding and removing neighbor entries to / from the gc_list need to be done while holding the table lock; a couple of places were missed in the original patch. Move the list_add_tail in neigh_alloc to ___neigh_create where the lock is already obtained. Since neighbor entries should rarely be moved to/from PERMANENT state, add lock/unlock around the gc_list changes in neigh_change_state rather than extending the lock hold around all neighbor updates. Fixes: 58956317c8de ("neighbor: Improve garbage collection") Reported-by: Andrei Vagin <avagin@gmail.com> Reported-by: syzbot+6cc2fd1d3bdd2e007363@syzkaller.appspotmail.com Reported-by: syzbot+35e87b87c00f386b041f@syzkaller.appspotmail.com Reported-by: syzbot+b354d1fb59091ea73c37@syzkaller.appspotmail.com Reported-by: syzbot+3ddead5619658537909b@syzkaller.appspotmail.com Reported-by: syzbot+424d47d5c456ce8b2bbe@syzkaller.appspotmail.com Reported-by: syzbot+e4d42eb35f6a27b0a628@syzkaller.appspotmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10net/sched: Remove egdev mechanismOz Shlomo
The egdev mechanism was replaced by the TC indirect block notifications platform. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Cc: John Hurley <john.hurley@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net: Add netif_is_gretap()/netif_is_ip6gretap()Oz Shlomo
Changed the is_gretap_dev and is_ip6gretap_dev logic from structure comparison to string comparison of the rtnl_link_ops kind field. This approach aligns with the current identification methods and function names of vxlan and geneve network devices. Convert mlxsw to use these helpers and use them in downstream mlx5 patch. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10tcp: handle EOR and FIN conditions the same in tcp_tso_should_defer()Eric Dumazet
In commit f9bfe4e6a9d0 ("tcp: lack of available data can also cause TSO defer") we moved the test in tcp_tso_should_defer() for packets with a FIN flag, and we mentioned that the same would be done later for EOR flag. Both flags should be handled at the same time, after all other heuristics have been considered. They both mean that no more bytes can be added to this skb by an application. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Several conflicts, seemingly all over the place. I used Stephen Rothwell's sample resolutions for many of these, if not just to double check my own work, so definitely the credit largely goes to him. The NFP conflict consisted of a bug fix (moving operations past the rhashtable operation) while chaning the initial argument in the function call in the moved code. The net/dsa/master.c conflict had to do with a bug fix intermixing of making dsa_master_set_mtu() static with the fixing of the tagging attribute location. cls_flower had a conflict because the dup reject fix from Or overlapped with the addition of port range classifiction. __set_phy_supported()'s conflict was relatively easy to resolve because Andrew fixed it in both trees, so it was just a matter of taking the net-next copy. Or at least I think it was :-) Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup() intermixed with changes on how the sdif and caller_net are calculated in these code paths in net-next. The remaining BPF conflicts were largely about the addition of the __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions to the relevant data structure where the MD pointer macros are used. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "A decent batch of fixes here. I'd say about half are for problems that have existed for a while, and half are for new regressions added in the 4.20 merge window. 1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach. 2) Revert bogus emac driver change, from Benjamin Herrenschmidt. 3) Handle BPF exported data structure with pointers when building 32-bit userland, from Daniel Borkmann. 4) Memory leak fix in act_police, from Davide Caratti. 5) Check RX checksum offload in RX descriptors properly in aquantia driver, from Dmitry Bogdanov. 6) SKB unlink fix in various spots, from Edward Cree. 7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from Eric Dumazet. 8) Fix FID leak in mlxsw driver, from Ido Schimmel. 9) IOTLB locking fix in vhost, from Jean-Philippe Brucker. 10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory limits otherwise namespace exit can hang. From Jiri Wiesner. 11) Address block parsing length fixes in x25 from Martin Schiller. 12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan. 13) For tun interfaces, only iface delete works with rtnl ops, enforce this by disallowing add. From Nicolas Dichtel. 14) Use after free in liquidio, from Pan Bian. 15) Fix SKB use after passing to netif_receive_skb(), from Prashant Bhole. 16) Static key accounting and other fixes in XPS from Sabrina Dubroca. 17) Partially initialized flow key passed to ip6_route_output(), from Shmulik Ladkani. 18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas Falcon. 19) Several small TCP fixes (off-by-one on window probe abort, NULL deref in tail loss probe, SNMP mis-estimations) from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) net/sched: cls_flower: Reject duplicated rules also under skip_sw bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips. bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips. bnxt_en: Keep track of reserved IRQs. bnxt_en: Fix CNP CoS queue regression. net/mlx4_core: Correctly set PFC param if global pause is turned off. Revert "net/ibm/emac: wrong bit is used for STA control" neighbour: Avoid writing before skb->head in neigh_hh_output() ipv6: Check available headroom in ip6_xmit() even without options tcp: lack of available data can also cause TSO defer ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl mlxsw: spectrum_router: Relax GRE decap matching check mlxsw: spectrum_switchdev: Avoid leaking FID's reference count mlxsw: spectrum_nve: Remove easily triggerable warnings ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes sctp: frag_point sanity check tcp: fix NULL ref in tail loss probe tcp: Do not underestimate rwnd_limited net: use skb_list_del_init() to remove from RX sublists ...
2018-12-09net/sched: cls_flower: Reject duplicated rules also under skip_swOr Gerlitz
Currently, duplicated rules are rejected only for skip_hw or "none", hence allowing users to push duplicates into HW for no reason. Use the flower tables to protect for that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reported-by: Chris Mi <chrism@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08net: dsa: Make dsa_master_set_mtu() staticAndrew Lunn
Add the missing static keyword. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08net: dsa: Restore MTU on master device on unloadAndrew Lunn
A previous change tries to set the MTU on the master device to take into account the DSA overheads. This patch tries to reset the master device back to the default MTU. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>