summaryrefslogtreecommitdiff
path: root/net/mctp
AgeCommit message (Collapse)Author
2024-02-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: net/ipv4/udp.c f796feabb9f5 ("udp: add local "peek offset enabled" flag") 56667da7399e ("net: implement lockless setsockopt(SO_PEEK_OFF)") Adjacent changes: net/unix/garbage.c aa82ac51d633 ("af_unix: Drop oob_skb ref before purging queue in GC.") 11498715f266 ("af_unix: Remove io_uring code for GC.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-22net: mctp: tests: Add a test for proper tag creation on local outputJeremy Kerr
Ensure we have the correct key parameters on sending a message. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: tests: Test that outgoing skbs have flow data populatedJeremy Kerr
When CONFIG_MCTP_FLOWS is enabled, outgoing skbs should have their SKB_EXT_MCTP extension set for drivers to consume. Add two tests for local-to-output routing that check for the flow extensions: one for the simple single-packet case, and one for fragmentation. We now make MCTP_TEST select MCTP_FLOWS, so we always get coverage of these flow tests. The tests are skippable if MCTP_FLOWS is (otherwise) disabled, but that would need manual config tweaking. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: copy skb ext data when fragmentingJeremy Kerr
If we're fragmenting on local output, the original packet may contain ext data for the MCTP flows. We'll want this in the resulting fragment skbs too. So, do a skb_ext_copy() in the fragmentation path, and implement the MCTP-specific parts of an ext copy operation. Fixes: 67737c457281 ("mctp: Pass flow data & flow release events to drivers") Reported-by: Jian Zhang <zhangjian.3032@bytedance.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: tests: Add MCTP net isolation testsJeremy Kerr
Add a couple of tests that excersise the new net-specific sk_key and bind lookups Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: tests: Add netid argument to __mctp_route_test_initJeremy Kerr
We'll want to create net-specific test setups in an upcoming change, so allow the caller to provide a non-default netid. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: provide a more specific tag allocation ioctlJeremy Kerr
Now that we have net-specific tags, extend the tag allocation ioctls (SIOCMCTPALLOCTAG / SIOCMCTPDROPTAG) to allow a network parameter to be passed to the tag allocation. We also add a local_addr member to the ioc struct, to allow for a future finer-grained tag allocation using local EIDs too. We don't add any specific support for that now though, so require MCTP_ADDR_ANY or MCTP_ADDR_NULL for those at present. The old ioctls will still work, but allocate for the default MCTP net. These are now marked as deprecated in the header. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: separate key correlation across netsJeremy Kerr
Currently, we lookup sk_keys from the entire struct net_namespace, which may contain multiple MCTP net IDs. In those cases we want to distinguish between endpoints with the same EID but different net ID. Add the net ID data to the struct mctp_sk_key, populate on add and filter on this during route lookup. For the ioctl interface, we use a default net of MCTP_INITIAL_DEFAULT_NET (ie., what will be in use for single-net configurations), but we'll extend the ioctl interface to provide net-specific tag allocation in an upcoming change. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: tests: create test skbs with the correct net and deviceJeremy Kerr
In our test skb creation functions, we're not setting up the net and device data. This doesn't matter at the moment, but we will want to add support for distinct net IDs in future. Set the ->net identifier on the test MCTP device, and ensure that test skbs are set up with the correct device-related data on creation. Create a helper for setting skb->dev and mctp_skb_cb->net. We have a few cases where we're calling __mctp_cb() to initialise the cb (which we need for the above) separately, so integrate this into the skb creation helpers. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: make key lookups match the ANY address on either local or peerJeremy Kerr
We may have an ANY address in either the local or peer address of a sk_key, and may want to match on an incoming daddr or saddr being ANY. Do this by altering the conflicting-tag lookup to also accept ANY as the local/peer address. We don't want mctp_address_matches to match on the requested EID being ANY, as that is a specific lookup case on packet input. Reported-by: Eric Chuang <echuang@google.com> Reported-by: Anthony <anthonyhkf@google.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: Add some detail on the key allocation implementationJeremy Kerr
We could do with a little more comment on where MCTP_ADDR_ANY will match in the key allocations. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-22net: mctp: avoid confusion over local/peer dest/source addressesJeremy Kerr
We have a double-swap of local and peer addresses in mctp_alloc_local_tag; the arguments in both call sites are swapped, but there is also a swap in the implementation of alloc_local_tag. This is opaque because we're using source/dest address references, which don't match the local/peer semantics. Avoid this confusion by naming the arguments as 'local' and 'peer', and remove the double swap. The calling order now matches mctp_key_alloc. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-21net: mctp: put sock on tag allocation failureJeremy Kerr
We may hold an extra reference on a socket if a tag allocation fails: we optimistically allocate the sk_key, and take a ref there, but do not drop if we end up not using the allocated key. Ensure we're dropping the sock on this failure by doing a proper unref rather than directly kfree()ing. Fixes: de8a6b15d965 ("net: mctp: add an explicit reference from a mctp_sk_key to sock") Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/ce9b61e44d1cdae7797be0c5e3141baf582d23a0.1707983487.git.jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-10mctp: perform route lookups under a RCU read-side lockJeremy Kerr
Our current route lookups (mctp_route_lookup and mctp_route_lookup_null) traverse the net's route list without the RCU read lock held. This means the route lookup is subject to preemption, resulting in an potential grace period expiry, and so an eventual kfree() while we still have the route pointer. Add the proper read-side critical section locks around the route lookups, preventing premption and a possible parallel kfree. The remaining net->mctp.routes accesses are already under a rcu_read_lock, or protected by the RTNL for updates. Based on an analysis from Sili Luo <rootlab@huawei.com>, where introducing a delay in the route lookup could cause a UAF on simultaneous sendmsg() and route deletion. Reported-by: Sili Luo <rootlab@huawei.com> Fixes: 889b7da23abf ("mctp: Add initial routing framework") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/29c4b0e67dc1bf3571df3982de87df90cae9b631.1696837310.git.jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-24sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)David Howells
Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: mptcp@lists.linux.dev cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17net: mctp: remove redundant RTN_UNICAST checkLin Ma
Current mctp_newroute() contains two exactly same check against rtm->rtm_type static int mctp_newroute(...) { ... if (rtm->rtm_type != RTN_UNICAST) { // (1) NL_SET_ERR_MSG(extack, "rtm_type must be RTN_UNICAST"); return -EINVAL; } ... if (rtm->rtm_type != RTN_UNICAST) // (2) return -EINVAL; ... } This commits removes the (2) check as it is redundant. Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://lore.kernel.org/r/20230615152240.1749428-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09mctp: remove MODULE_LICENSE in non-modulesNick Alcock
Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Jeremy Kerr <jk@codeconstruct.com.au> Cc: Matt Johnston <matt@codeconstruct.com.au> Link: https://lore.kernel.org/r/20230308121230.5354-2-nick.alcock@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-28net: mctp: purge receive queues on sk destructionJeremy Kerr
We may have pending skbs in the receive queue when the sk is being destroyed; add a destructor to purge the queue. MCTP doesn't use the error queue, so only the receive_queue is purged. Fixes: 833ef3b91de6 ("mctp: Populate socket implementation") Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://lore.kernel.org/r/20230126064551.464468-1-jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-25net: mctp: mark socks as dead on unhash, prevent re-addJeremy Kerr
Once a socket has been unhashed, we want to prevent it from being re-used in a sk_key entry as part of a routing operation. This change marks the sk as SOCK_DEAD on unhash, which prevents addition into the net's key list. We need to do this during the key add path, rather than key lookup, as we release the net keys_lock between those operations. Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly") Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net: mctp: hold key reference when looking up a general keyPaolo Abeni
Currently, we have a race where we look up a sock through a "general" (ie, not directly associated with the (src,dest,tag) tuple) key, then drop the key reference while still holding the key's sock. This change expands the key reference until we've finished using the sock, and hence the sock reference too. Commit message changes from Jeremy Kerr <jk@codeconstruct.com.au>. Reported-by: Noam Rathaus <noamr@ssd-disclosure.com> Fixes: 73c618456dc5 ("mctp: locking, lifetime and validity changes for sk_keys") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net: mctp: move expiry timer delete to unhashJeremy Kerr
Currently, we delete the key expiry timer (in sk->close) before unhashing the sk. This means that another thread may find the sk through its presence on the key list, and re-queue the timer. This change moves the timer deletion to the unhash, after we have made the key no longer observable, so the timer cannot be re-queued. Fixes: 7b14e15ae6f4 ("mctp: Implement a timeout for tags") Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net: mctp: add an explicit reference from a mctp_sk_key to sockJeremy Kerr
Currently, we correlate the mctp_sk_key lifetime to the sock lifetime through the sock hash/unhash operations, but this is pretty tenuous, and there are cases where we may have a temporary reference to an unhashed sk. This change makes the reference more explicit, by adding a hold on the sock when it's associated with a mctp_sk_key, released on final key unref. Fixes: 73c618456dc5 ("mctp: locking, lifetime and validity changes for sk_keys") Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-19mctp: Remove device type check at unregisterMatt Johnston
The unregister check could be incorrectly triggered if a netdev changes its type after register. That is possible for a tun device using TUNSETLINK ioctl, resulting in mctp unregister failing and the netdev unregister waiting forever. This was encountered by https://github.com/openthread/openthread/issues/8523 Neither check at register or unregister is required. They were added in an attempt to track down mctp_ptr being set unexpectedly, which should not happen in normal operation. Fixes: 7b1871af75f3 ("mctp: Warn if pointer is set for a wrong dev type") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://lore.kernel.org/r/20221215054933.2403401-1-matt@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mctp: Fix an error handling path in mctp_init()Wei Yongjun
If mctp_neigh_init() return error, the routes resources should be released in the error handling path. Otherwise some resources leak. Fixes: 4d8b9319282a ("mctp: Add neighbour implementation") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://lore.kernel.org/r/20221108095517.620115-1-weiyongjun@huaweicloud.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-12mctp: prevent double key removal and unrefJeremy Kerr
Currently, we have a bug where a simultaneous DROPTAG ioctl and socket close may race, as we attempt to remove a key from lists twice, and perform an unref for each removal operation. This may result in a uaf when we attempt the second unref. This change fixes the race by making __mctp_key_remove tolerant to being called on a key that has already been removed from the socket/net lists, and only performs the unref when we do the actual remove. We also need to hold the list lock on the ioctl cleanup path. This fix is based on a bug report and comprehensive analysis from butt3rflyh4ck <butterflyhuangxx@gmail.com>, found via syzkaller. Cc: stable@vger.kernel.org Fixes: 63ed1aab3d40 ("mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control") Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-25Merge tag 'net-next-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-25Merge tag 'linux-kselftest-kunit-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit updates from Shuah Khan: "Several fixes, cleanups, and enhancements to tests and framework: - introduce _NULL and _NOT_NULL macros to pointer error checks - rework kunit_resource allocation policy to fix memory leaks when caller doesn't specify free() function to be used when allocating memory using kunit_add_resource() and kunit_alloc_resource() funcs. - add ability to specify suite-level init and exit functions" * tag 'linux-kselftest-kunit-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (41 commits) kunit: tool: Use qemu-system-i386 for i386 runs kunit: fix executor OOM error handling logic on non-UML kunit: tool: update riscv QEMU config with new serial dependency kcsan: test: use new suite_{init,exit} support kunit: tool: Add list of all valid test configs on UML kunit: take `kunit_assert` as `const` kunit: tool: misc cleanups kunit: tool: minor cosmetic cleanups in kunit_parser.py kunit: tool: make parser stop overwriting status of suites w/ no_tests kunit: tool: remove dead parse_crash_in_log() logic kunit: tool: print clearer error message when there's no TAP output kunit: tool: stop using a shell to run kernel under QEMU kunit: tool: update test counts summary line format kunit: bail out of test filtering logic quicker if OOM lib/Kconfig.debug: change KUnit tests to default to KUNIT_ALL_TESTS kunit: Rework kunit_resource allocation policy kunit: fix debugfs code to use enum kunit_status, not bool kfence: test: use new suite_{init/exit} support, add .kunitconfig kunit: add ability to specify suite-level init and exit functions kunit: rename print_subtest_{start,end} for clarity (s/subtest/suite) ...
2022-04-28net: SO_RCVMARK socket option for SO_MARK with recvmsg()Erin MacNeil
Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK should be included in the ancillary data returned by recvmsg(). Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs(). Signed-off-by: Erin MacNeil <lnx.erin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
include/linux/netdevice.h net/core/dev.c 6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats") 794c24e9921f ("net-core: rx_otherhost_dropped to core_stats") https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/ drivers/net/wan/cosa.c d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()") 89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards") https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26mctp: defer the kfree of object mdev->addrsLin Ma
The function mctp_unregister() reclaims the device's relevant resource when a netcard detaches. However, a running routine may be unaware of this and cause the use-after-free of the mdev->addrs object. The race condition can be demonstrated below cleanup thread another thread | unregister_netdev() | mctp_sendmsg() ... | ... mctp_unregister() | rt = mctp_route_lookup() ... | mctl_local_output() kfree(mdev->addrs) | ... | saddr = rt->dev->addrs[0]; | An attacker can adopt the (recent provided) mtcpserial driver with pty to fake the device detaching and use the userfaultfd to increase the race success chance (in mctp_sendmsg). The KASan report for such a POC is shown below: [ 86.051955] ================================================================== [ 86.051955] BUG: KASAN: use-after-free in mctp_local_output+0x4e9/0xb7d [ 86.051955] Read of size 1 at addr ffff888005f298c0 by task poc/295 [ 86.051955] [ 86.051955] Call Trace: [ 86.051955] <TASK> [ 86.051955] dump_stack_lvl+0x33/0x42 [ 86.051955] print_report.cold.13+0xb2/0x6b3 [ 86.051955] ? preempt_schedule_irq+0x57/0x80 [ 86.051955] ? mctp_local_output+0x4e9/0xb7d [ 86.051955] kasan_report+0xa5/0x120 [ 86.051955] ? mctp_local_output+0x4e9/0xb7d [ 86.051955] mctp_local_output+0x4e9/0xb7d [ 86.051955] ? mctp_dev_set_key+0x79/0x79 [ 86.051955] ? copyin+0x38/0x50 [ 86.051955] ? _copy_from_iter+0x1b6/0xf20 [ 86.051955] ? sysvec_apic_timer_interrupt+0x97/0xb0 [ 86.051955] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 86.051955] ? mctp_local_output+0x1/0xb7d [ 86.051955] mctp_sendmsg+0x64d/0xdb0 [ 86.051955] ? mctp_sk_close+0x20/0x20 [ 86.051955] ? __fget_light+0x2fd/0x4f0 [ 86.051955] ? mctp_sk_close+0x20/0x20 [ 86.051955] sock_sendmsg+0xdd/0x110 [ 86.051955] __sys_sendto+0x1cc/0x2a0 [ 86.051955] ? __ia32_sys_getpeername+0xa0/0xa0 [ 86.051955] ? new_sync_write+0x335/0x550 [ 86.051955] ? alloc_file+0x22f/0x500 [ 86.051955] ? __ip_do_redirect+0x820/0x1820 [ 86.051955] ? vfs_write+0x44d/0x7b0 [ 86.051955] ? vfs_write+0x44d/0x7b0 [ 86.051955] ? fput_many+0x15/0x120 [ 86.051955] ? ksys_write+0x155/0x1b0 [ 86.051955] ? __ia32_sys_read+0xa0/0xa0 [ 86.051955] __x64_sys_sendto+0xd8/0x1b0 [ 86.051955] ? exit_to_user_mode_prepare+0x2f/0x120 [ 86.051955] ? syscall_exit_to_user_mode+0x12/0x20 [ 86.051955] do_syscall_64+0x3a/0x80 [ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 86.051955] RIP: 0033:0x7f82118a56b3 [ 86.051955] RSP: 002b:00007ffdb154b110 EFLAGS: 00000293 ORIG_RAX: 000000000000002c [ 86.051955] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f82118a56b3 [ 86.051955] RDX: 0000000000000010 RSI: 00007f8211cd4000 RDI: 0000000000000007 [ 86.051955] RBP: 00007ffdb154c1d0 R08: 00007ffdb154b164 R09: 000000000000000c [ 86.051955] R10: 0000000000000000 R11: 0000000000000293 R12: 000055d779800db0 [ 86.051955] R13: 00007ffdb154c2b0 R14: 0000000000000000 R15: 0000000000000000 [ 86.051955] </TASK> [ 86.051955] [ 86.051955] Allocated by task 295: [ 86.051955] kasan_save_stack+0x1c/0x40 [ 86.051955] __kasan_kmalloc+0x84/0xa0 [ 86.051955] mctp_rtm_newaddr+0x242/0x610 [ 86.051955] rtnetlink_rcv_msg+0x2fd/0x8b0 [ 86.051955] netlink_rcv_skb+0x11c/0x340 [ 86.051955] netlink_unicast+0x439/0x630 [ 86.051955] netlink_sendmsg+0x752/0xc00 [ 86.051955] sock_sendmsg+0xdd/0x110 [ 86.051955] __sys_sendto+0x1cc/0x2a0 [ 86.051955] __x64_sys_sendto+0xd8/0x1b0 [ 86.051955] do_syscall_64+0x3a/0x80 [ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 86.051955] [ 86.051955] Freed by task 301: [ 86.051955] kasan_save_stack+0x1c/0x40 [ 86.051955] kasan_set_track+0x21/0x30 [ 86.051955] kasan_set_free_info+0x20/0x30 [ 86.051955] __kasan_slab_free+0x104/0x170 [ 86.051955] kfree+0x8c/0x290 [ 86.051955] mctp_dev_notify+0x161/0x2c0 [ 86.051955] raw_notifier_call_chain+0x8b/0xc0 [ 86.051955] unregister_netdevice_many+0x299/0x1180 [ 86.051955] unregister_netdevice_queue+0x210/0x2f0 [ 86.051955] unregister_netdev+0x13/0x20 [ 86.051955] mctp_serial_close+0x6d/0xa0 [ 86.051955] tty_ldisc_kill+0x31/0xa0 [ 86.051955] tty_ldisc_hangup+0x24f/0x560 [ 86.051955] __tty_hangup.part.28+0x2ce/0x6b0 [ 86.051955] tty_release+0x327/0xc70 [ 86.051955] __fput+0x1df/0x8b0 [ 86.051955] task_work_run+0xca/0x150 [ 86.051955] exit_to_user_mode_prepare+0x114/0x120 [ 86.051955] syscall_exit_to_user_mode+0x12/0x20 [ 86.051955] do_syscall_64+0x46/0x80 [ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 86.051955] [ 86.051955] The buggy address belongs to the object at ffff888005f298c0 [ 86.051955] which belongs to the cache kmalloc-8 of size 8 [ 86.051955] The buggy address is located 0 bytes inside of [ 86.051955] 8-byte region [ffff888005f298c0, ffff888005f298c8) [ 86.051955] [ 86.051955] The buggy address belongs to the physical page: [ 86.051955] flags: 0x100000000000200(slab|node=0|zone=1) [ 86.051955] raw: 0100000000000200 dead000000000100 dead000000000122 ffff888005c42280 [ 86.051955] raw: 0000000000000000 0000000080660066 00000001ffffffff 0000000000000000 [ 86.051955] page dumped because: kasan: bad access detected [ 86.051955] [ 86.051955] Memory state around the buggy address: [ 86.051955] ffff888005f29780: 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 [ 86.051955] ffff888005f29800: fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc [ 86.051955] >ffff888005f29880: fc fc fc fb fc fc fc fc fa fc fc fc fc fa fc fc [ 86.051955] ^ [ 86.051955] ffff888005f29900: fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc [ 86.051955] ffff888005f29980: fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc [ 86.051955] ================================================================== To this end, just like the commit e04480920d1e ("Bluetooth: defer cleanup of resources in hci_unregister_dev()") this patch defers the destructive kfree(mdev->addrs) in mctp_unregister to the mctp_dev_put, where the refcount of mdev is zero and the entire device is reclaimed. This prevents the use-after-free because the sendmsg thread holds the reference of mdev in the mctp_route object. Fixes: 583be982d934 (mctp: Add device handling and netlink interface) Signed-off-by: Lin Ma <linma@zju.edu.cn> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://lore.kernel.org/r/20220422114340.32346-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-06net: remove noblock parameter from skb_recv_datagram()Oliver Hartkopp
skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I missed to enable kunit tests to build the mctp code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-04mctp: test: Use NULL macrosRicardo Ribalda
Replace the PTR_EQ NULL checks wit the NULL macros. More idiomatic and specific. Acked-by: Daniel Latypov <dlatypov@google.com> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Acked-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-01mctp: Use output netdev to allocate skb headroomMatt Johnston
Previously the skb was allocated with headroom MCTP_HEADER_MAXLEN, but that isn't sufficient if we are using devs that are not MCTP specific. This also adds a check that the smctp_halen provided to sendmsg for extended addressing is the correct size for the netdev. Fixes: 833ef3b91de6 ("mctp: Populate socket implementation") Reported-by: Matthew Rinaldi <mjrinal@g.clemson.edu> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01mctp: Fix check for dev_hard_header() resultMatt Johnston
dev_hard_header() returns the length of the header, so we need to test for negative errors rather than non-zero. Fixes: 889b7da23abf ("mctp: Add initial routing framework") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25mctp: Avoid warning if unregister notifies twiceMatt Johnston
Previously if an unregister notify handler ran twice (waiting for netdev to be released) it would print a warning in mctp_unregister() every subsequent time the unregister notify occured. Instead we only need to worry about the case where a mctp_ptr is set on an unknown device type. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23mctp: Fix warnings reported by clang-analyzerMatt Johnston
net/mctp/device.c:140:11: warning: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign] mcb->idx = idx; - Not a real problem due to how the callback runs, fix the warning. net/mctp/route.c:458:4: warning: Value stored to 'msk' is never read [clang-analyzer-deadcode.DeadStores] msk = container_of(key->sk, struct mctp_sock, sk); - 'msk' dead assignment can be removed here. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23mctp: Fix incorrect netdev unref for extended addrMatt Johnston
In the extended addressing local route output codepath dev_get_by_index_rcu() doesn't take a dev_hold() so we shouldn't dev_put(). Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23mctp: make __mctp_dev_get() take a refcount holdMatt Johnston
Previously there was a race that could allow the mctp_dev refcount to hit zero: rcu_read_lock(); mdev = __mctp_dev_get(dev); // mctp_unregister() happens here, mdev->refs hits zero mctp_dev_hold(dev); rcu_read_unlock(); Now we make __mctp_dev_get() take the hold itself. It is safe to test against the zero refcount because __mctp_dev_get() is called holding rcu_read_lock and mctp_dev uses kfree_rcu(). Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18mctp: add address validity checking for packet receiveJeremy Kerr
This change adds some basic sanity checks for the source and dest headers of packets on initial receive. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18mctp: replace mctp_address_ok with more fine-grained helpersJeremy Kerr
Currently, we have mctp_address_ok(), which checks if an EID is in the "valid" range of 8-254 inclusive. However, 0 and 255 may also be valid addresses, depending on context. 0 is the NULL EID, which may be set when physical addressing is used. 255 is valid as a destination address for broadcasts. This change renames mctp_address_ok to mctp_address_unicast, and adds similar helpers for broadcast and null EIDs, which will be used in an upcoming commit. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-15mctp: fix use after freeTom Rix
Clang static analysis reports this problem route.c:425:4: warning: Use of memory after it is freed trace_mctp_key_acquire(key); ^~~~~~~~~~~~~~~~~~~~~~~~~~~ When mctp_key_add() fails, key is freed but then is later used in trace_mctp_key_acquire(). Add an else statement to use the key only when mctp_key_add() is successful. Fixes: 4f9e1ba6de45 ("mctp: Add tracepoints for tag/key handling") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag controlMatt Johnston
This change adds a couple of new ioctls for mctp sockets: SIOCMCTPALLOCTAG and SIOCMCTPDROPTAG. These ioctls provide facilities for explicit allocation / release of tags, overriding the automatic allocate-on-send/release-on-reply and timeout behaviours. This allows userspace more control over messages that may not fit a simple request/response model. In order to indicate a pre-allocated tag to the sendmsg() syscall, we introduce a new flag to the struct sockaddr_mctp.smctp_tag value: MCTP_TAG_PREALLOC. Additional changes from Jeremy Kerr <jk@codeconstruct.com.au>. Contains a fix that was: Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09mctp: Allow keys matching any local addressJeremy Kerr
Currently, we require an exact match on an incoming packet's dest address, and the key's local_addr field. In a future change, we may want to set up a key before packets are routed, meaning we have no local address to match on. This change allows key lookups to match on local_addr = MCTP_ADDR_ANY. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09mctp: Add helper for address match checkingJeremy Kerr
Currently, we have a couple of paths that check that an EID matches, or the match value is MCTP_ADDR_ANY. Rather than open coding this, add a little helper. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09mctp: tests: Add key state testsJeremy Kerr
This change adds a few more tests to check the key/tag lookups on route input. We add a specific entry to the keys lists, route a packet with specific header values, and check for key match/mismatch. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09mctp: tests: Rename FL_T macro to FL_TOJeremy Kerr
This is a definition for the tag-owner flag, which has TO as a standard abbreviation. We'll want to add a helper for the actual tag value in a future change. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-04net: don't include ndisc.h from ipv6.hJakub Kicinski
Nothing in ipv6.h needs ndisc.h, drop it. Link: https://lore.kernel.org/r/20220203043457.2222388-1-kuba@kernel.org Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Link: https://lore.kernel.org/r/20220203231240.2297588-1-kuba@kernel.org Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-11mctp: test: zero out sockaddrMatt Johnston
MCTP now requires that padding bytes are zero. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Fixes: 1e4b50f06d97 ("mctp: handle the struct sockaddr_mctp padding fields") Link: https://lore.kernel.org/r/20220110021806.2343023-1-matt@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>