summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2022-07-14ubsan: disable UBSAN_DIV_ZERO for clangNick Desaulniers
Building with UBSAN_DIV_ZERO with clang produces numerous fallthrough warnings from objtool. In the case of uncheck division, UBSAN_DIV_ZERO may introduce new control flow to check for division by zero. Because the result of the division is undefined, LLVM may optimize the control flow such that after the call to __ubsan_handle_divrem_overflow doesn't matter. If panic_on_warn was set, __ubsan_handle_divrem_overflow would panic. The problem is is that panic_on_warn is run time configurable. If it's disabled, then we cannot guarantee that we will be able to recover safely. Disable this config for clang until we can come up with a solution in LLVM. Link: https://github.com/ClangBuiltLinux/linux/issues/1657 Link: https://github.com/llvm/llvm-project/issues/56289 Link: https://lore.kernel.org/lkml/CAHk-=wj1qhf7y3VNACEexyp5EbkNpdcu_542k-xZpzmYLOjiCg@mail.gmail.com/ Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-10ida: don't use BUG_ON() for debuggingLinus Torvalds
This is another old BUG_ON() that just shouldn't exist (see also commit a382f8fee42c: "signal handling: don't use BUG_ON() for debugging"). In fact, as Matthew Wilcox points out, this condition shouldn't really even result in a warning, since a negative id allocation result is just a normal allocation failure: "I wonder if we should even warn here -- sure, the caller is trying to free something that wasn't allocated, but we don't warn for kfree(NULL)" and goes on to point out how that current error check is only causing people to unnecessarily do their own index range checking before freeing it. This was noted by Itay Iellin, because the bluetooth HCI socket cookie code does *not* do that range checking, and ends up just freeing the error case too, triggering the BUG_ON(). The HCI code requires CAP_NET_RAW, and seems to just result in an ugly splat, but there really is no reason to BUG_ON() here, and we have generally striven for allocation models where it's always ok to just do free(alloc()); even if the allocation were to fail for some random reason (usually obviously that "random" reason being some resource limit). Fixes: 88eca0207cf1 ("ida: simplified functions for id allocation") Reported-by: Itay Iellin <ieitayie@gmail.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-03lockref: remove unused 'lockref_get_or_lock()' functionLinus Torvalds
Looking at the conditional lock acquire functions in the kernel due to the new sparse support (see commit 4a557a5d1a61 "sparse: introduce conditional lock acquire function attribute"), it became obvious that the lockref code has a couple of them, but they don't match the usual naming convention for the other ones, and their return value logic is also reversed. In the other very similar places, the naming pattern is '*_and_lock()' (eg 'atomic_put_and_lock()' and 'refcount_dec_and_lock()'), and the function returns true when the lock is taken. The lockref code is superficially very similar to the refcount code, only with the special "atomic wrt the embedded lock" semantics. But instead of the '*_and_lock()' naming it uses '*_or_lock()'. And instead of returning true in case it took the lock, it returns true if it *didn't* take the lock. Now, arguably the reflock code is quite logical: it really is a "either decrement _or_ lock" kind of situation - and the return value is about whether the operation succeeded without any special care needed. So despite the similarities, the differences do make some sense, and maybe it's not worth trying to unify the different conditional locking primitives in this area. But while looking at this all, it did become obvious that the 'lockref_get_or_lock()' function hasn't actually had any users for almost a decade. The only user it ever had was the shortlived 'd_rcu_to_refcount()' function, and it got removed and replaced with 'lockref_get_not_dead()' back in 2013 in commits 0d98439ea3c6 ("vfs: use lockred 'dead' flag to mark unrecoverably dead dentries") and e5c832d55588 ("vfs: fix dentry RCU to refcounting possibly sleeping dput()") In fact, that single use was removed less than a week after the whole function was introduced in commit b3abd80250c1 ("lockref: add 'lockref_get_or_lock() helper") so this function has been around for a decade, but only had a user for six days. Let's just put this mis-designed and unused function out of its misery. We can think about the naming and semantic oddities of the remaining 'lockref_put_or_lock()' later, but at least that function has users. And while the naming is different and the return value doesn't match, that function matches the whole '{atomic,refcount}_dec_and_test()' pattern much better (ie the magic happens when the count goes down to zero, not when it is incremented from zero). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-01Merge tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for batch getting of tags in sbitmap (wuchi) - NVMe pull request via Christoph: - More quirks (Lamarque Vieira Souza, Pablo Greco) - Fix a fabrics disconnect regression (Ruozhu Li) - Fix a nvmet-tcp data_digest calculation regression (Sagi Grimberg) - Fix nvme-tcp send failure handling (Sagi Grimberg) - Fix a regression with nvmet-loop and passthrough controllers (Alan Adamson) * tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block: nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1 nvmet: add a clear_ids attribute for passthru targets nvme: fix regression when disconnect a recovering ctrl nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G) nvme-tcp: always fail a request when sending it failed nvmet-tcp: fix regression in data_digest calculation lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()
2022-06-25lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()wuchi
1. Getting next index before continue branch. 2. Checking free bits when setting the target bits. Otherwise, it may reuse the busying bits. Signed-off-by: wuchi <wuchi.zero@gmail.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Link: https://lore.kernel.org/r/20220605145835.26916-1-wuchi.zero@gmail.com Fixes: 9672b0d43782 ("sbitmap: add __sbitmap_queue_get_batch()") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-19Merge tag 'objtool-urgent-2022-06-19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull build tooling updates from Thomas Gleixner: - Remove obsolete CONFIG_X86_SMAP reference from objtool - Fix overlapping text section failures in faddr2line for real - Remove OBJECT_FILES_NON_STANDARD usage from x86 ftrace and replace it with finegrained annotations so objtool can validate that code correctly. * tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usage faddr2line: Fix overlapping text section failures, the sequel objtool: Fix obsolete reference to CONFIG_X86_SMAP
2022-06-17Merge tag 'v5.19-p2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a potential build failure when CRYPTO=m" * tag 'v5.19-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: memneq - move into lib/
2022-06-12Merge tag 'random-5.19-rc2-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fixes from Jason Donenfeld: - A fix for a 5.19 regression for a case in which early device tree initializes the RNG, which flips a static branch. On most plaforms, jump labels aren't initialized until much later, so this caused splats. On a few mailing list threads, we cooked up easy fixes for arm64, arm32, and risc-v. But then things looked slightly more involved for xtensa, powerpc, arc, and mips. And at that point, when we're patching 7 architectures in a place before the console is even available, it seems like the cost/risk just wasn't worth it. So random.c works around it now by checking the already exported `static_key_initialized` boolean, as though somebody already ran into this issue in the past. I'm not super jazzed about that; it'd be prettier to not have to complicate downstream code. But I suppose it's practical. - A few small code nits and adding a missing __init annotation. - A change to the default config values to use the cpu and bootloader's seeds for initializing the RNG earlier. This brings them into line with what all the distros do (Fedora/RHEL, Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void... at least), and moreover will now give us test coverage in various test beds that might have caught the above device tree bug earlier. - A change to WireGuard CI's configuration to increase test coverage around the RNG. - A documentation comment fix to unrelated maintainerless CRC code that I was asked to take, I guess because it has to do with polynomials (which the RNG thankfully no longer uses). * tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: wireguard: selftests: use maximum cpu features and allow rng seeding random: remove rng_has_arch_random() random: credit cpu and bootloader seeds by default random: do not use jump labels before they are initialized random: account for arch randomness in bits random: mark bootloader randomness code as __init random: avoid checking crng_ready() twice in random_init() crc-itu-t: fix typo in CRC ITU-T polynomial comment
2022-06-12crypto: memneq - move into lib/Jason A. Donenfeld
This is used by code that doesn't need CONFIG_CRYPTO, so move this into lib/ with a Kconfig option so that it can be selected by whatever needs it. This fixes a linker error Zheng pointed out when CRYPTO_MANAGER_DISABLE_TESTS!=y and CRYPTO=m: lib/crypto/curve25519-selftest.o: In function `curve25519_selftest': curve25519-selftest.c:(.init.text+0x60): undefined reference to `__crypto_memneq' curve25519-selftest.c:(.init.text+0xec): undefined reference to `__crypto_memneq' curve25519-selftest.c:(.init.text+0x114): undefined reference to `__crypto_memneq' curve25519-selftest.c:(.init.text+0x154): undefined reference to `__crypto_memneq' Reported-by: Zheng Bin <zhengbin13@huawei.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: stable@vger.kernel.org Fixes: aa127963f1ca ("crypto: lib/curve25519 - re-add selftests") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-11iov_iter: fix build issue due to possible type mis-matchLinus Torvalds
Commit 6c77676645ad ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()") introduced a problem on some 32-bit architectures (at least arm, xtensa, csky,sparc and mips), that have a 'size_t' that is 'unsigned int'. The reason is that we now do min(nr * PAGE_SIZE - offset, maxsize); where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is 'unsigned long'. As a result, the normal C type rules means that the first argument to 'min()' ends up being 'unsigned long'. In contrast, 'maxsize' is of type 'size_t'. Now, 'size_t' and 'unsigned long' are always the same physical type in the kernel, so you'd think this doesn't matter, and from an actual arithmetic standpoint it doesn't. But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if it could also be 'unsigned long'. In that situation, both are unsigned 32-bit types, but they are not the *same* type. And as a result 'min()' will complain about the distinct types (ignore the "pointer types" part of the error message: that's an artifact of the way we have made 'min()' check types for being the same): lib/iov_iter.c: In function 'iter_xarray_get_pages': include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror] 20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) | ^~ lib/iov_iter.c:1464:16: note: in expansion of macro 'min' 1464 | return min(nr * PAGE_SIZE - offset, maxsize); | ^~~ This was not visible on 64-bit architectures (where we always define 'size_t' to be 'unsigned long'). Force these cases to use 'min_t(size_t, x, y)' to make the type explicit and avoid the issue. [ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned long' arithmetically. We've certainly historically seen environments with 16-bit address spaces and 32-bit 'unsigned long'. Similarly, even in 64-bit modern environments, 'size_t' could be its own type distinct from 'unsigned long', even if it were arithmetically identical. So the above type commentary is only really descriptive of the kernel environment, not some kind of universal truth for the kinds of wild and crazy situations that are allowed by the C standard ] Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/ Cc: Jeff Layton <jlayton@kernel.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-10Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull iov_iter fix from Al Viro: "ITER_XARRAY get_pages fix; now the return value is a lot saner (and more similar to logics for other flavours)" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: iov_iter: Fix iter_xarray_get_pages{,_alloc}()
2022-06-10iov_iter: Fix iter_xarray_get_pages{,_alloc}()David Howells
The maths at the end of iter_xarray_get_pages() to calculate the actual size doesn't work under some circumstances, such as when it's been asked to extract a partial single page. Various terms of the equation cancel out and you end up with actual == offset. The same issue exists in iter_xarray_get_pages_alloc(). Fix these to just use min() to select the lesser amount from between the amount of page content transcribed into the buffer, minus the offset, and the size limit specified. This doesn't appear to have caused a problem yet upstream because network filesystems aren't getting the pages from an xarray iterator, but rather passing it directly to the socket, which just iterates over it. Cachefiles *does* do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for whole pages to be written or read. Fixes: 7ff5062079ef ("iov_iter: Add ITER_XARRAY") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Mike Marshall <hubcap@omnibond.com> cc: Gao Xiang <xiang@kernel.org> cc: linux-afs@lists.infradead.org cc: v9fs-developer@lists.sourceforge.net cc: devel@lists.orangefs.org cc: linux-erofs@lists.ozlabs.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-10random: remove rng_has_arch_random()Jason A. Donenfeld
With arch randomness being used by every distro and enabled in defconfigs, the distinction between rng_has_arch_random() and rng_is_initialized() is now rather small. In fact, the places where they differ are now places where paranoid users and system builders really don't want arch randomness to be used, in which case we should respect that choice, or places where arch randomness is known to be broken, in which case that choice is all the more important. So this commit just removes the function and its one user. Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-09mm/huge_memory: Fix xarray node memory leakMatthew Wilcox (Oracle)
If xas_split_alloc() fails to allocate the necessary nodes to complete the xarray entry split, it sets the xa_state to -ENOMEM, which xas_nomem() then interprets as "Please allocate more memory", not as "Please free any unnecessary memory" (which was the intended outcome). It's confusing to use xas_nomem() to free memory in this context, so call xas_destroy() instead. Reported-by: syzbot+9e27a75a8c24f3fe75c1@syzkaller.appspotmail.com Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-07crc-itu-t: fix typo in CRC ITU-T polynomial commentRoger Knecht
The code comment says that the polynomial is x^16 + x^12 + x^15 + 1, but the correct polynomial is x^16 + x^12 + x^5 + 1. Quoting from page 2 in the ITU-T V.41 specification [1]: 2 Encoding and checking process The service bits and information bits, taken in conjunction, correspond to the coefficients of a message polynomial having terms from x^(n-1) (n = total number of bits in a block or sequence) down to x^16. This polynomial is divided, modulo 2, by the generating polynomial x^16 + x^12 + x^5 + 1. The hex (truncated) polynomial 0x1021 and CRC code implementation are correct, however. [1] https://www.itu.int/rec/T-REC-V.41-198811-I/en Signed-off-by: Roger Knecht <roger@norberthealth.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-06objtool: Fix obsolete reference to CONFIG_X86_SMAPJosh Poimboeuf
CONFIG_X86_SMAP no longer exists. For objtool's purposes it has been replaced with CONFIG_HAVE_UACCESS_VALIDATION. Fixes: 03f16cd020eb ("objtool: Add CONFIG_OBJTOOL") Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/44c57668768c1ba1b4ba1ff541ec54781636e07c.1654101721.git.jpoimboe@kernel.org
2022-06-04Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - bitmap: optimize bitmap_weight() usage, from me - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro Carvalho Chehab - include/linux/find: Fix documentation, from Anna-Maria Behnsen - bitmap: fix conversion from/to fix-sized arrays, from me - bitmap: Fix return values to be unsigned, from Kees Cook It has been in linux-next for at least a week with no problems. * tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits) nodemask: Fix return values to be unsigned bitmap: Fix return values to be unsigned KVM: x86: hyper-v: replace bitmap_weight() with hweight64() KVM: x86: hyper-v: fix type of valid_bank_mask ia64: cleanup remove_siblinginfo() drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate lib/bitmap: add test for bitmap_{from,to}_arr64 lib: add bitmap_{from,to}_arr64 lib/bitmap: extend comment for bitmap_(from,to)_arr32() include/linux/find: Fix documentation lib/bitmap.c make bitmap_print_bitmask_to_buf parseable MAINTAINERS: add cpumask and nodemask files to BITMAP_API arch/x86: replace nodes_weight with nodes_empty where appropriate mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate clocksource: replace cpumask_weight with cpumask_empty in clocksource.c genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate irq: mips: replace cpumask_weight with cpumask_empty where appropriate drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate arch/x86: replace cpumask_weight with cpumask_empty where appropriate ...
2022-06-03Merge tag 'driver-core-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info *pld driver core: Add "*" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...
2022-06-03Merge tag 'spdx-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here are some SPDX license marker changes. The SPDX-labeling effort has started to pick up again, so here are some changes for various parts of the tree that are related to this effort. Included in here are: - freevxfs license updates - spihash.c license cleanups - spdxcheck script updates to make things easier to work with going forward All of the license updates came from the original authors/copyright holders of the code involved. All of these have been in linux-next for weeks with no reported issues" * tag 'spdx-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: siphash: add SPDX tags as sole licensing authority scripts/spdxcheck: Exclude top-level README scripts/spdxcheck: Exclude MAINTAINERS/CREDITS scripts/spdxcheck: Exclude config directories scripts/spdxcheck: Put excluded files and directories into a separate file scripts/spdxcheck: Add option to display files without SPDX scripts/spdxcheck: Add [sub]directory statistics scripts/spdxcheck: Add directory statistics scripts/spdxcheck: Add percentage to statistics freevxfs: relicense to GPLv2 only
2022-06-03nodemask: Fix return values to be unsignedKees Cook
The nodemask routines had mixed return values that provided potentially signed return values that could never happen. This was leading to the compiler getting confusing about the range of possible return values (it was thinking things could be negative where they could not be). Fix all the nodemask routines that should be returning unsigned (or bool) values. Silences: mm/swapfile.c: In function ‘setup_swap_info’: mm/swapfile.c:2291:47: error: array subscript -1 is below array bounds of ‘struct plist_node[]’ [-Werror=array-bounds] 2291 | p->avail_lists[i].prio = 1; | ~~~~~~~~~~~~~~^~~ In file included from mm/swapfile.c:16: ./include/linux/swap.h:292:27: note: while referencing ‘avail_lists’ 292 | struct plist_node avail_lists[]; /* | ^~~~~~~~~~~ Reported-by: Christophe de Dinechin <dinechin@redhat.com> Link: https://lore.kernel.org/lkml/20220414150855.2407137-3-dinechin@redhat.com/ Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03bitmap: Fix return values to be unsignedKees Cook
Both nodemask and bitmap routines had mixed return values that provided potentially signed return values that could never happen. This was leading to the compiler getting confusing about the range of possible return values (it was thinking things could be negative where they could not be). In preparation for fixing nodemask, fix all the bitmap routines that should be returning unsigned (or bool) values. Cc: Yury Norov <yury.norov@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Christophe de Dinechin <dinechin@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03lib/bitmap: add test for bitmap_{from,to}_arr64Yury Norov
Test newly added bitmap_{from,to}_arr64() functions similarly to already existing bitmap_{from,to}_arr32() tests. CC: Alexander Gordeev <agordeev@linux.ibm.com> CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: Christian Borntraeger <borntraeger@linux.ibm.com> CC: Claudio Imbrenda <imbrenda@linux.ibm.com> CC: David Hildenbrand <david@redhat.com> CC: Heiko Carstens <hca@linux.ibm.com> CC: Janosch Frank <frankja@linux.ibm.com> CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> CC: Sven Schnelle <svens@linux.ibm.com> CC: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03lib: add bitmap_{from,to}_arr64Yury Norov
Manipulating 64-bit arrays with bitmap functions is potentially dangerous because on 32-bit BE machines the order of halfwords doesn't match. Another issue is that compiler may throw a warning about out-of-boundary access. This patch adds bitmap_{from,to}_arr64 functions in addition to existing bitmap_{from,to}_arr32. CC: Alexander Gordeev <agordeev@linux.ibm.com> CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: Christian Borntraeger <borntraeger@linux.ibm.com> CC: Claudio Imbrenda <imbrenda@linux.ibm.com> CC: David Hildenbrand <david@redhat.com> CC: Heiko Carstens <hca@linux.ibm.com> CC: Janosch Frank <frankja@linux.ibm.com> CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> CC: Sven Schnelle <svens@linux.ibm.com> CC: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03lib/bitmap.c make bitmap_print_bitmask_to_buf parseableMauro Carvalho Chehab
The documentation of such function is not on a proper ReST format, as reported by Sphinx: Documentation/core-api/kernel-api:81: ./lib/bitmap.c:532: WARNING: Unexpected indentation. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:526: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:532: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:532: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:533: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:536: WARNING: Definition list ends without a blank line; unexpected unindent. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:542: WARNING: Unexpected indentation. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:536: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:536: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:543: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:552: WARNING: Unexpected indentation. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:545: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:545: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:552: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:552: WARNING: Inline emphasis start-string without end-string. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:554: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:556: WARNING: Definition list ends without a blank line; unexpected unindent. Documentation/core-api/kernel-api:81: ./lib/bitmap.c:580: WARNING: Unexpected indentation. So, the produced output at: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html?#c.bitmap_print_bitmask_to_buf is broken. Fix it by adding spaces and marking the literal blocks. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-01assoc_array: Fix BUG_ON during garbage collectStephen Brennan
A rare BUG_ON triggered in assoc_array_gc: [3430308.818153] kernel BUG at lib/assoc_array.c:1609! Which corresponded to the statement currently at line 1593 upstream: BUG_ON(assoc_array_ptr_is_meta(p)); Using the data from the core dump, I was able to generate a userspace reproducer[1] and determine the cause of the bug. [1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc After running the iterator on the entire branch, an internal tree node looked like the following: NODE (nr_leaves_on_branch: 3) SLOT [0] NODE (2 leaves) SLOT [1] NODE (1 leaf) SLOT [2..f] NODE (empty) In the userspace reproducer, the pr_devel output when compressing this node was: -- compress node 0x5607cc089380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] after: 3 At slot 0, an internal node with 2 leaves could not be folded into the node, because there was only one available slot (slot 0). Thus, the internal node was retained. At slot 1, the node had one leaf, and was able to be folded in successfully. The remaining nodes had no leaves, and so were removed. By the end of the compression stage, there were 14 free slots, and only 3 leaf nodes. The tree was ascended and then its parent node was compressed. When this node was seen, it could not be folded, due to the internal node it contained. The invariant for compression in this function is: whenever nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all leaf nodes. The compression step currently cannot guarantee this, given the corner case shown above. To fix this issue, retry compression whenever we have retained a node, and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second compression will then allow the node in slot 1 to be folded in, satisfying the invariant. Below is the output of the reproducer once the fix is applied: -- compress node 0x560e9c562380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] internal nodes remain despite enough space, retrying -- compress node 0x560e9c562380 -- free=14, leaves=1 [0] fold node 2/15 [nx 0] after: 3 Changes ======= DH: - Use false instead of 0. - Reorder the inserted lines in a couple of places to put retained before next_slot. ver #2) - Fix typo in pr_devel, correct comparison to "<=" Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Cc: <stable@vger.kernel.org> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.com/ # v1 Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.com/ # v2 Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-29Merge tag 'trace-v5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The majority of the changes are for fixes and clean ups. Notable changes: - Rework trace event triggers code to be easier to interact with. - Support for embedding bootconfig with the kernel (as suppose to having it embedded in initram). This is useful for embedded boards without initram disks. - Speed up boot by parallelizing the creation of tracefs files. - Allow absolute ring buffer timestamps handle timestamps that use more than 59 bits. - Added new tracing clock "TAI" (International Atomic Time) - Have weak functions show up in available_filter_function list as: __ftrace_invalid_address___<invalid-offset> instead of using the name of the function before it" * tag 'trace-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (52 commits) ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function tracing: Fix comments for event_trigger_separate_filter() x86/traceponit: Fix comment about irq vector tracepoints x86,tracing: Remove unused headers ftrace: Clean up hash direct_functions on register failures tracing: Fix comments of create_filter() tracing: Disable kcov on trace_preemptirq.c tracing: Initialize integer variable to prevent garbage return value ftrace: Fix typo in comment ftrace: Remove return value of ftrace_arch_modify_*() tracing: Cleanup code by removing init "char *name" tracing: Change "char *" string form to "char []" tracing/timerlat: Do not wakeup the thread if the trace stops at the IRQ tracing/timerlat: Print stacktrace in the IRQ handler if needed tracing/timerlat: Notify IRQ new max latency only if stop tracing is set kprobes: Fix build errors with CONFIG_KRETPROBES=n tracing: Fix return value of trace_pid_write() tracing: Fix potential double free in create_var_ref() tracing: Use strim() to remove whitespace instead of doing it manually ftrace: Deal with error return code of the ftrace_process_locs() function ...
2022-05-28Revert "crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE"Jason A. Donenfeld
This reverts commit 8bdc2a190105e862dfe7a4033f2fd385b7e58ae8. It got merged a bit prematurely and shortly after the kernel test robot and Sudip pointed out build failures: arm: imx_v6_v7_defconfig and multi_v7_defconfig mips: decstation_64_defconfig, decstation_defconfig, decstation_r4k_defconfig In file included from crypto/chacha20poly1305.c:13: include/crypto/poly1305.h:56:46: error: 'CONFIG_CRYPTO_LIB_POLY1305_RSIZE' undeclared here (not in a function); did you mean 'CONFIG_CRYPTO_POLY1305_MODULE'? 56 | struct poly1305_key opaque_r[CONFIG_CRYPTO_LIB_POLY1305_RSIZE]; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We could attempt to fix this by listing the dependencies piecemeal, but it's not as obvious as it looks: drivers like caam use this macro in headers even if there's no .o compiled in that makes use of it. So actually fixing this might require a bit more of a comprehensive approach, rather than whack-a-mole with hunting down which drivers use which headers which use this macro. Therefore, this commit just reverts the change, and maybe the problem can be visited on the next rainy day. Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Fixes: 8bdc2a190105 ("crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-27Merge tag 'cxl-for-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl updates from Dan Williams: "Compute Express Link (CXL) updates for this cycle. The highlight is new driver-core infrastructure and CXL subsystem changes for allowing lockdep to validate device_lock() usage. Thanks to PeterZ for setting me straight on the current capabilities of the lockdep API, and Greg acked it as well. On the CXL ACPI side this update adds support for CXL _OSC so that platform firmware knows that it is safe to still grant Linux native control of PCIe hotplug and error handling in the presence of CXL devices. A circular dependency problem was discovered between suspend and CXL memory for cases where the suspend image might be stored in CXL memory where that image also contains the PCI register state to restore to re-enable the device. Disable suspend for now until an architecture is defined to clarify that conflict. Lastly a collection of reworks, fixes, and cleanups to the CXL subsystem where support for snooping mailbox commands and properly handling the "mem_enable" flow are the highlights. Summary: - Add driver-core infrastructure for lockdep validation of device_lock(), and fixup a deadlock report that was previously hidden behind the 'lockdep no validate' policy. - Add CXL _OSC support for claiming native control of CXL hotplug and error handling. - Disable suspend in the presence of CXL memory unless and until a protocol is identified for restoring PCI device context from memory hosted on CXL PCI devices. - Add support for snooping CXL mailbox commands to protect against inopportune changes, like set-partition with the 'immediate' flag set. - Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC / 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL HDM Capability). - Miscellaneous cleanups and fixes from -next exposure" * tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits) cxl/port: Enable HDM Capability after validating DVSEC Ranges cxl/port: Reuse 'struct cxl_hdm' context for hdm init cxl/port: Move endpoint HDM Decoder Capability init to port driver cxl/pci: Drop @info argument to cxl_hdm_decode_init() cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init() cxl/mem: Skip range enumeration if mem_enable clear cxl/mem: Consolidate CXL DVSEC Range enumeration in the core cxl/pci: Move cxl_await_media_ready() to the core cxl/mem: Validate port connectivity before dvsec ranges cxl/mem: Fix cxl_mem_probe() error exit cxl/pci: Drop wait_for_valid() from cxl_await_media_ready() cxl/pci: Consolidate wait_for_media() and wait_for_media_ready() cxl/mem: Drop mem_enabled check from wait_for_media() nvdimm: Fix firmware activation deadlock scenarios device-core: Kill the lockdep_mutex nvdimm: Drop nd_device_lock() ACPI: NFIT: Drop nfit_device_lock() nvdimm: Replace lockdep_mutex with local lock classes cxl: Drop cxl_device_lock() cxl/acpi: Add root device lockdep validation ...
2022-05-27Merge tag 'v5.19-p1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Test in-place en/decryption with two sglists in testmgr - Fix process vs softirq race in cryptd Algorithms: - Add arm64 acceleration for sm4 - Add s390 acceleration for chacha20 Drivers: - Add polarfire soc hwrng support in mpsf - Add support for TI SoC AM62x in sa2ul - Add support for ATSHA204 cryptochip in atmel-sha204a - Add support for PRNG in caam - Restore support for storage encryption in qat - Restore support for storage encryption in hisilicon/sec" * tag 'v5.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) hwrng: omap3-rom - fix using wrong clk_disable() in omap_rom_rng_runtime_resume() crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY crypto: qat - add support for 401xx devices crypto: qat - re-enable registration of algorithms crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag crypto: qat - add param check for DH crypto: qat - add param check for RSA crypto: qat - remove dma_free_coherent() for DH crypto: qat - remove dma_free_coherent() for RSA crypto: qat - fix memory leak in RSA crypto: qat - add backlog mechanism crypto: qat - refactor submission logic crypto: qat - use pre-allocated buffers in datapath crypto: qat - set to zero DH parameters before free crypto: s390 - add crypto library interface for ChaCha20 crypto: talitos - Uniform coding style with defined variable crypto: octeontx2 - simplify the return expression of otx2_cpt_aead_cbc_aes_sha_setkey() crypto: cryptd - Protect per-CPU resource by disabling BH. crypto: sun8i-ce - do not fallback if cryptlen is less than sg length crypto: sun8i-ce - rework debugging ...
2022-05-27Merge tag 'mm-stable-2022-05-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - Two follow-on fixes for the post-5.19 series "Use pageblock_order for cma and alloc_contig_range alignment", from Zi Yan. - A series of z3fold cleanups and fixes from Miaohe Lin. - Some memcg selftests work from Michal Koutný <mkoutny@suse.com> - Some swap fixes and cleanups from Miaohe Lin - Several individual minor fixups * tag 'mm-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits) mm/shmem.c: suppress shift warning mm: Kconfig: reorganize misplaced mm options mm: kasan: fix input of vmalloc_to_page() mm: fix is_pinnable_page against a cma page mm: filter out swapin error entry in shmem mapping mm/shmem: fix infinite loop when swap in shmem error at swapoff time mm/madvise: free hwpoison and swapin error entry in madvise_free_pte_range mm/swapfile: fix lost swap bits in unuse_pte() mm/swapfile: unuse_pte can map random data if swap read fails selftests: memcg: factor out common parts of memory.{low,min} tests selftests: memcg: remove protection from top level memcg selftests: memcg: adjust expected reclaim values of protected cgroups selftests: memcg: expect no low events in unprotected sibling selftests: memcg: fix compilation mm/z3fold: fix z3fold_page_migrate races with z3fold_map mm/z3fold: fix z3fold_reclaim_page races with z3fold_free mm/z3fold: always clear PAGE_CLAIMED under z3fold page lock mm/z3fold: put z3fold page back into unbuddied list when reclaim or migration fails revert "mm/z3fold.c: allow __GFP_HIGHMEM in z3fold_alloc" mm/z3fold: throw warning on failure of trylock_page in z3fold_alloc ...
2022-05-27Merge tag 'mm-nonmm-stable-2022-05-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc updates from Andrew Morton: "The non-MM patch queue for this merge window. Not a lot of material this cycle. Many singleton patches against various subsystems. Most notably some maintenance work in ocfs2 and initramfs" * tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits) kcov: update pos before writing pc in trace function ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock fs/ntfs: remove redundant variable idx fat: remove time truncations in vfat_create/vfat_mkdir fat: report creation time in statx fat: ignore ctime updates, and keep ctime identical to mtime in memory fat: split fat_truncate_time() into separate functions MAINTAINERS: add Muchun as a memcg reviewer proc/sysctl: make protected_* world readable ia64: mca: drop redundant spinlock initialization tty: fix deadlock caused by calling printk() under tty_port->lock relay: remove redundant assignment to pointer buf fs/ntfs3: validate BOOT sectors_per_clusters lib/string_helpers: fix not adding strarray to device's resource list kernel/crash_core.c: remove redundant check of ck_cmdline ELF, uapi: fixup ELF_ST_TYPE definition ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() ipc: update semtimedop() to use hrtimer ipc/sem: remove redundant assignments ...
2022-05-27crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZEJason A. Donenfeld
When CRYPTO_LIB_POLY1305 is unset, CRYPTO_LIB_POLY1305_RSIZE is still set in the Kconfig, cluttering things. Fix this by making CRYPTO_LIB_POLY1305_RSIZE depend on CRYPTO_LIB_POLY1305. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-27mm: Kconfig: reorganize misplaced mm optionsVlastimil Babka
After commits 7b42f1041c98 ("mm: Kconfig: move swap and slab config options to the MM section") and 519bcb797907 ("mm: Kconfig: group swap, slab, hotplug and thp options into submenus") we now have nicely organized mm related config options. I have noticed some that were still misplaced, so this moves them from various places into the new structure: VM_EVENT_COUNTERS, COMPAT_BRK, MMAP_ALLOW_UNINITIALIZED to mm/Kconfig and general MM section. SLUB_STATS to mm/Kconfig and the slab submenu. DEBUG_SLAB, SLUB_DEBUG, SLUB_DEBUG_ON to mm/Kconfig.debug and the Kernel hacking / Memory Debugging submenu. Link: https://lkml.kernel.org/r/20220525112559.1139-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-26Merge tag 'mm-stable-2022-05-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ...
2022-05-26locking/lockref: Use try_cmpxchg64 in CMPXCHG_LOOP macroUros Bizjak
Use try_cmpxchg64 instead of cmpxchg64 in CMPXCHG_LOOP macro. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). The main loop of lockref_get improves from: 13: 48 89 c1 mov %rax,%rcx 16: 48 c1 f9 20 sar $0x20,%rcx 1a: 83 c1 01 add $0x1,%ecx 1d: 48 89 ce mov %rcx,%rsi 20: 89 c1 mov %eax,%ecx 22: 48 89 d0 mov %rdx,%rax 25: 48 c1 e6 20 shl $0x20,%rsi 29: 48 09 f1 or %rsi,%rcx 2c: f0 48 0f b1 4d 00 lock cmpxchg %rcx,0x0(%rbp) 32: 48 39 d0 cmp %rdx,%rax 35: 75 17 jne 4e <lockref_get+0x4e> to: 13: 48 89 ca mov %rcx,%rdx 16: 48 c1 fa 20 sar $0x20,%rdx 1a: 83 c2 01 add $0x1,%edx 1d: 48 89 d6 mov %rdx,%rsi 20: 89 ca mov %ecx,%edx 22: 48 c1 e6 20 shl $0x20,%rsi 26: 48 09 f2 or %rsi,%rdx 29: f0 48 0f b1 55 00 lock cmpxchg %rdx,0x0(%rbp) 2f: 75 02 jne 33 <lockref_get+0x33> [ Michael Ellerman and Mark Rutland confirm that code generation on powerpc and arm64 respectively is also ok, even though they do not have a native arch_try_cmpxchg() implementation, and rely on the default fallback case - Linus ] Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman.Long@hp.com Cc: paulmck@linux.vnet.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-25Merge tag 'net-next-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-25Merge tag 'linux-kselftest-kunit-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit updates from Shuah Khan: "Several fixes, cleanups, and enhancements to tests and framework: - introduce _NULL and _NOT_NULL macros to pointer error checks - rework kunit_resource allocation policy to fix memory leaks when caller doesn't specify free() function to be used when allocating memory using kunit_add_resource() and kunit_alloc_resource() funcs. - add ability to specify suite-level init and exit functions" * tag 'linux-kselftest-kunit-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (41 commits) kunit: tool: Use qemu-system-i386 for i386 runs kunit: fix executor OOM error handling logic on non-UML kunit: tool: update riscv QEMU config with new serial dependency kcsan: test: use new suite_{init,exit} support kunit: tool: Add list of all valid test configs on UML kunit: take `kunit_assert` as `const` kunit: tool: misc cleanups kunit: tool: minor cosmetic cleanups in kunit_parser.py kunit: tool: make parser stop overwriting status of suites w/ no_tests kunit: tool: remove dead parse_crash_in_log() logic kunit: tool: print clearer error message when there's no TAP output kunit: tool: stop using a shell to run kernel under QEMU kunit: tool: update test counts summary line format kunit: bail out of test filtering logic quicker if OOM lib/Kconfig.debug: change KUnit tests to default to KUNIT_ALL_TESTS kunit: Rework kunit_resource allocation policy kunit: fix debugfs code to use enum kunit_status, not bool kfence: test: use new suite_{init/exit} support, add .kunitconfig kunit: add ability to specify suite-level init and exit functions kunit: rename print_subtest_{start,end} for clarity (s/subtest/suite) ...
2022-05-25Merge tag 'printk-for-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Offload writing printk() messages on consoles to per-console kthreads. It prevents soft-lockups when an extensive amount of messages is printed. It was observed, for example, during boot of large systems with a lot of peripherals like disks or network interfaces. It prevents live-lockups that were observed, for example, when messages about allocation failures were reported and a CPU handled consoles instead of reclaiming the memory. It was hard to solve even with rate limiting because it would need to take into account the amount of messages and the speed of all consoles. It is a must to have for real time. Otherwise, any printk() might break latency guarantees. The per-console kthreads allow to handle each console on its own speed. Slow consoles do not longer slow down faster ones. And printk() does not longer unpredictably slows down various code paths. There are situations when the kthreads are either not available or not reliable, for example, early boot, suspend, or panic. In these situations, printk() uses the legacy mode and tries to handle consoles immediately. - Add documentation for the printk index. * tag 'printk-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk, tracing: fix console tracepoint printk: remove @console_locked printk: extend console_lock for per-console locking printk: add kthread console printers printk: add functions to prefer direct printing printk: add pr_flush() printk: move buffer definitions into console_emit_next_record() caller printk: refactor and rework printing logic printk: add con_printk() macro for console details printk: call boot_delay_msec() in printk_delay() printk: get caller_id/timestamp after migration disable printk: wake waiters for safe and NMI contexts printk: wake up all waiters printk: add missing memory barrier to wake_up_klogd() printk: cpu sync always disable interrupts printk: rename cpulock functions printk/index: Printk index feature documentation MAINTAINERS: Add printk indexing maintainers on mention of printk_index
2022-05-25Merge tag 'slab-for-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - Conversion of slub_debug stack traces to stackdepot, allowing more useful debugfs-based inspection for e.g. memory leak debugging. Allocation and free debugfs info now includes full traces and is sorted by the unique trace frequency. The stackdepot conversion was already attempted last year but reverted by ae14c63a9f20. The memory overhead (while not actually enabled on boot) has been meanwhile solved by making the large stackdepot allocation dynamic. The xfstest issues haven't been reproduced on current kernel locally nor in -next, so the slab cache layout changes that originally made that bug manifest were probably not the root cause. - Refactoring of dma-kmalloc caches creation. - Trivial cleanups such as removal of unused parameters, fixes and clarifications of comments. - Hyeonggon Yoo joins as a reviewer. * tag 'slab-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: MAINTAINERS: add myself as reviewer for slab mm/slub: remove unused kmem_cache_order_objects max mm: slab: fix comment for __assume_kmalloc_alignment mm: slab: fix comment for ARCH_KMALLOC_MINALIGN mm/slub: remove unneeded return value of slab_pad_check mm/slab_common: move dma-kmalloc caches creation into new_kmalloc_cache() mm/slub: remove meaningless node check in ___slab_alloc() mm/slub: remove duplicate flag in allocate_slab() mm/slub: remove unused parameter in setup_object*() mm/slab.c: fix comments slab, documentation: add description of debugfs files for SLUB caches mm/slub: sort debugfs output by frequency of stack traces mm/slub: distinguish and print stack traces in debugfs files mm/slub: use stackdepot to save stack trace in objects mm/slub: move struct track init out of set_track() lib/stackdepot: allow requesting early initialization dynamically mm/slub, kunit: Make slub_kunit unaffected by user specified flags mm/slab: remove some unused functions
2022-05-24Merge tag 'hwmon-for-v5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon updates from Guenter Roeck: "New drivers: - Driver for the Microchip LAN966x SoC - PMBus driver for Infineon Digital Multi-phase xdp152 family controllers Chip support added to existing drivers: - asus-ec-sensors: - Support for ROG STRIX X570-E GAMING WIFI II, PRIME X470-PRO, and ProArt X570 Creator WIFI - External temperature sensor support for ASUS WS X570-ACE - nct6775: - Support for I2C driver - Support for ASUS PRO H410T / PRIME H410M-R / ROG X570-E GAMING WIFI II - lm75: - Support for - Atmel AT30TS74 - pmbus/max16601: - Support for MAX16602 - aquacomputer_d5next: - Support for Aquacomputer Farbwerk - Support for Aquacomputer Octo - jc42: - Support for S-34TS04A Kernel API changes / clarifications: - The chip parameter of with_info API is now mandatory - New hwmon_device_register_for_thermal API call for use by the thermal subsystem Improvements: - PMBus and JC42 drivers now register with thermal subsystem - PMBus drivers now support get_voltage/set_voltage power operations - The adt7475 driver now supports pin configuration - The lm90 driver now supports setting extended range temperatures configuration with a devicetree property - The dell-smm driver now registers as cooling device - The OCC driver delays hwmon registration until requested by userspace ... and various other minor fixes and improvements" * tag 'hwmon-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (71 commits) hwmon: (aquacomputer_d5next) Fix an error handling path in aqc_probe() hwmon: (sl28cpld) Fix typo in comment hwmon: (pmbus) Check PEC support before reading other registers hwmon: (dimmtemp) Fix bitmap handling hwmon: (lm90) enable extended range according to DTS node dt-bindings: hwmon: lm90: add ti,extended-range-enable property dt-bindings: hwmon: lm90: add missing ti,tmp461 hwmon: (ibmaem) Directly use ida_alloc()/free() hwmon: Directly use ida_alloc()/free() hwmon: (asus-ec-sensors) fix Formula VIII definition dt-bindings: trivial-devices: Add xdp152 hwmon: (sl28cpld-hwmon) Use HWMON_CHANNEL_INFO macro hwmon: (pwm-fan) Use HWMON_CHANNEL_INFO macro hwmon: (peci/dimmtemp) Use HWMON_CHANNEL_INFO macro hwmon: (peci/cputemp) Use HWMON_CHANNEL_INFO macro hwmon: (mr75203) Use HWMON_CHANNEL_INFO macro hwmon: (ltc2992) Use HWMON_CHANNEL_INFO macro hwmon: (as370-hwmon) Use HWMON_CHANNEL_INFO macro hwmon: Make chip parameter for with_info API mandatory thermal/drivers/thermal_hwmon: Use hwmon_device_register_for_thermal() ...
2022-05-24Merge tag 'random-5.19-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "These updates continue to refine the work began in 5.17 and 5.18 of modernizing the RNG's crypto and streamlining and documenting its code. New for 5.19, the updates aim to improve entropy collection methods and make some initial decisions regarding the "premature next" problem and our threat model. The cloc utility now reports that random.c is 931 lines of code and 466 lines of comments, not that basic metrics like that mean all that much, but at the very least it tells you that this is very much a manageable driver now. Here's a summary of the various updates: - The random_get_entropy() function now always returns something at least minimally useful. This is the primary entropy source in most collectors, which in the best case expands to something like RDTSC, but prior to this change, in the worst case it would just return 0, contributing nothing. For 5.19, additional architectures are wired up, and architectures that are entirely missing a cycle counter now have a generic fallback path, which uses the highest resolution clock available from the timekeeping subsystem. Some of those clocks can actually be quite good, despite the CPU not having a cycle counter of its own, and going off-core for a stamp is generally thought to increase jitter, something positive from the perspective of entropy gathering. Done very early on in the development cycle, this has been sitting in next getting some testing for a while now and has relevant acks from the archs, so it should be pretty well tested and fine, but is nonetheless the thing I'll be keeping my eye on most closely. - Of particular note with the random_get_entropy() improvements is MIPS, which, on CPUs that lack the c0 count register, will now combine the high-speed but short-cycle c0 random register with the lower-speed but long-cycle generic fallback path. - With random_get_entropy() now always returning something useful, the interrupt handler now collects entropy in a consistent construction. - Rather than comparing two samples of random_get_entropy() for the jitter dance, the algorithm now tests many samples, and uses the amount of differing ones to determine whether or not jitter entropy is usable and how laborious it must be. The problem with comparing only two samples was that if the cycle counter was extremely slow, but just so happened to be on the cusp of a change, the slowness wouldn't be detected. Taking many samples fixes that to some degree. This, combined with the other improvements to random_get_entropy(), should make future unification of /dev/random and /dev/urandom maybe more possible. At the very least, were we to attempt it again today (we're not), it wouldn't break any of Guenter's test rigs that broke when we tried it with 5.18. So, not today, but perhaps down the road, that's something we can revisit. - We attempt to reseed the RNG immediately upon waking up from system suspend or hibernation, making use of the various timestamps about suspend time and such available, as well as the usual inputs such as RDRAND when available. - Batched randomness now falls back to ordinary randomness before the RNG is initialized. This provides more consistent guarantees to the types of random numbers being returned by the various accessors. - The "pre-init injection" code is now gone for good. I suspect you in particular will be happy to read that, as I recall you expressing your distaste for it a few months ago. Instead, to avoid a "premature first" issue, while still allowing for maximal amount of entropy availability during system boot, the first 128 bits of estimated entropy are used immediately as it arrives, with the next 128 bits being buffered. And, as before, after the RNG has been fully initialized, it winds up reseeding anyway a few seconds later in most cases. This resulted in a pretty big simplification of the initialization code and let us remove various ad-hoc mechanisms like the ugly crng_pre_init_inject(). - The RNG no longer pretends to handle the "premature next" security model, something that various academics and other RNG designs have tried to care about in the past. After an interesting mailing list thread, these issues are thought to be a) mainly academic and not practical at all, and b) actively harming the real security of the RNG by delaying new entropy additions after a potential compromise, making a potentially bad situation even worse. As well, in the first place, our RNG never even properly handled the premature next issue, so removing an incomplete solution to a fake problem was particularly nice. This allowed for numerous other simplifications in the code, which is a lot cleaner as a consequence. If you didn't see it before, https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a thread worth skimming through. - While the interrupt handler received a separate code path years ago that avoids locks by using per-cpu data structures and a faster mixing algorithm, in order to reduce interrupt latency, input and disk events that are triggered in hardirq handlers were still hitting locks and more expensive algorithms. Those are now redirected to use the faster per-cpu data structures. - Rather than having the fake-crypto almost-siphash-based random32 implementation be used right and left, and in many places where cryptographically secure randomness is desirable, the batched entropy code is now fast enough to replace that. - As usual, numerous code quality and documentation cleanups. For example, the initialization state machine now uses enum symbolic constants instead of just hard coding numbers everywhere. - Since the RNG initializes once, and then is always initialized thereafter, a pretty heavy amount of code used during that initialization is never used again. It is now completely cordoned off using static branches and it winds up in the .text.unlikely section so that it doesn't reduce cache compactness after the RNG is ready. - A variety of functions meant for waiting on the RNG to be initialized were only used by vsprintf, and in not a particularly optimal way. Replacing that usage with a more ordinary setup made it possible to remove those functions. - A cleanup of how we warn userspace about the use of uninitialized /dev/urandom and uninitialized get_random_bytes() usage. Interestingly, with the change you merged for 5.18 that attempts to use jitter (but does not block if it can't), the majority of users should never see those warnings for /dev/urandom at all now, and the one for in-kernel usage is mainly a debug thing. - The file_operations struct for /dev/[u]random now implements .read_iter and .write_iter instead of .read and .write, allowing it to also implement .splice_read and .splice_write, which makes splice(2) work again after it was broken here (and in many other places in the tree) during the set_fs() removal. This was a bit of a last minute arrival from Jens that hasn't had as much time to bake, so I'll be keeping my eye on this as well, but it seems fairly ordinary. Unfortunately, read_iter() is around 3% slower than read() in my tests, which I'm not thrilled about. But Jens and Al, spurred by this observation, seem to be making progress in removing the bottlenecks on the iter paths in the VFS layer in general, which should remove the performance gap for all drivers. - Assorted other bug fixes, cleanups, and optimizations. - A small SipHash cleanup" * tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits) random: check for signals after page of pool writes random: wire up fops->splice_{read,write}_iter() random: convert to using fops->write_iter() random: convert to using fops->read_iter() random: unify batched entropy implementations random: move randomize_page() into mm where it belongs random: remove mostly unused async readiness notifier random: remove get_random_bytes_arch() and add rng_has_arch_random() random: move initialization functions out of hot pages random: make consistent use of buf and len random: use proper return types on get_random_{int,long}_wait() random: remove extern from functions in header random: use static branch for crng_ready() random: credit architectural init the exact amount random: handle latent entropy and command line from random_init() random: use proper jiffies comparison macro random: remove ratelimiting for in-kernel unseeded randomness random: move initialization out of reseeding hot path random: avoid initializing twice in credit race random: use symbolic constants for crng_init states ...
2022-05-24Merge tag 'objtool-core-2022-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Comprehensive interface overhaul: ================================= Objtool's interface has some issues: - Several features are done unconditionally, without any way to turn them off. Some of them might be surprising. This makes objtool tricky to use, and prevents porting individual features to other arches. - The config dependencies are too coarse-grained. Objtool enablement is tied to CONFIG_STACK_VALIDATION, but it has several other features independent of that. - The objtool subcmds ("check" and "orc") are clumsy: "check" is really a subset of "orc", so it has all the same options. The subcmd model has never really worked for objtool, as it only has a single purpose: "do some combination of things on an object file". - The '--lto' and '--vmlinux' options are nonsensical and have surprising behavior. Overhaul the interface: - get rid of subcmds - make all features individually selectable - remove and/or clarify confusing/obsolete options - update the documentation - fix some bugs found along the way - Fix x32 regression - Fix Kbuild cleanup bugs - Add scripts/objdump-func helper script to disassemble a single function from an object file. - Rewrite scripts/faddr2line to be section-aware, by basing it on 'readelf', moving it away from 'nm', which doesn't handle multiple sections well, which can result in decoding failure. - Rewrite & fix symbol handling - which had a number of bugs wrt. object files that don't have global symbols - which is rare but possible. Also fix a bunch of symbol handling bugs found along the way. * tag 'objtool-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) objtool: Fix objtool regression on x32 systems objtool: Fix symbol creation scripts/faddr2line: Fix overlapping text section failures scripts: Create objdump-func helper script objtool: Remove libsubcmd.a when make clean objtool: Remove inat-tables.c when make clean objtool: Update documentation objtool: Remove --lto and --vmlinux in favor of --link objtool: Add HAVE_NOINSTR_VALIDATION objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION" objtool: Make noinstr hacks optional objtool: Make jump label hack optional objtool: Make static call annotation optional objtool: Make stack validation frame-pointer-specific objtool: Add CONFIG_OBJTOOL objtool: Extricate sls from stack validation objtool: Rework ibt and extricate from stack validation objtool: Make stack validation optional objtool: Add option to print section addresses objtool: Don't print parentheses in function addresses ...
2022-05-23Merge tag 'x86_core_for_v5.19_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Borislav Petkov: - Remove all the code around GS switching on 32-bit now that it is not needed anymore - Other misc improvements * tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: bug: Use normal relative pointers in 'struct bug_entry' x86/nmi: Make register_nmi_handler() more robust x86/asm: Merge load_gs_index() x86/32: Remove lazy GS macros ELF: Remove elf_core_copy_kernel_regs() x86/32: Simplify ELF_CORE_COPY_REGS
2022-05-23Merge tag 'core-debugobjects-2022-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debugobjects fixlet from Thomas Gleixner: "Trivial licensing cleanup in debugobjects" * tag 'core-debugobjects-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: debugobjects: Convert to SPDX license identifier
2022-05-23Merge tag 'core-core-2022-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irqpoll update from Thomas Gleixner: "A single update for irqpoll: Ensure that a raised soft interrupt is handled after pulling the blk_cpu_iopoll backlog from a unplugged CPU. This prevents that the CPU which runs that code reaches idle with soft interrupts pending" * tag 'core-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lib/irq_poll: Prevent softirq pending leak in irq_poll_cpu_dead()
2022-05-23Merge branches 'slab/for-5.19/stackdepot' and 'slab/for-5.19/refactor' into ↵Vlastimil Babka
slab/for-linus
2022-05-22lib: add generic polynomial calculationMichael Walle
Some temperature and voltage sensors use a polynomial to convert between raw data points and actual temperature or voltage. The polynomial is usually the result of a curve fitting of the diode characteristic. The BT1 PVT hwmon driver already uses such a polynonmial calculation which is rather generic. Move it to lib/ so other drivers can reuse it. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220401214032.3738095-2-michael@walle.cc Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-05-19bug: Use normal relative pointers in 'struct bug_entry'Josh Poimboeuf
With CONFIG_GENERIC_BUG_RELATIVE_POINTERS, the addr/file relative pointers are calculated weirdly: based on the beginning of the bug_entry struct address, rather than their respective pointer addresses. Make the relative pointers less surprising to both humans and tools by calculating them the normal way. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Sven Schnelle <svens@linux.ibm.com> # s390 Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Link: https://lkml.kernel.org/r/f0e05be797a16f4fc2401eeb88c8450dcbe61df6.1652362951.git.jpoimboe@kernel.org
2022-05-19mm: fix missing handler for __GFP_NOWARNQi Zheng
We expect no warnings to be issued when we specify __GFP_NOWARN, but currently in paths like alloc_pages() and kmalloc(), there are still some warnings printed, fix it. But for some warnings that report usage problems, we don't deal with them. If such warnings are printed, then we should fix the usage problems. Such as the following case: WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); [zhengqi.arch@bytedance.com: v2] Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ethernet/mellanox/mlx5/core/main.c b33886971dbc ("net/mlx5: Initialize flow steering during driver probe") 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support") f2b41b32cde8 ("net/mlx5: Remove ipsec_ops function table") https://lore.kernel.org/all/20220519040345.6yrjromcdistu7vh@sx1/ 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") 8324a02c342a ("net/mlx5: Add exit route when waiting for FW") https://lore.kernel.org/all/20220519114119.060ce014@canb.auug.org.au/ tools/testing/selftests/net/mptcp/mptcp_join.sh e274f7154008 ("selftests: mptcp: add subflow limits test-cases") b6e074e171bc ("selftests: mptcp: add infinite map testcase") 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type") https://lore.kernel.org/all/20220516111918.366d747f@canb.auug.org.au/ net/mptcp/options.c ba2c89e0ea74 ("mptcp: fix checksum byte order") 1e39e5a32ad7 ("mptcp: infinite mapping sending") ea66758c1795 ("tcp: allow MPTCP to update the announced window") https://lore.kernel.org/all/20220519115146.751c3a37@canb.auug.org.au/ net/mptcp/pm.c 95d686517884 ("mptcp: fix subflow accounting on close") 4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") https://lore.kernel.org/all/20220516111435.72f35dca@canb.auug.org.au/ net/mptcp/subflow.c ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure") 0348c690ed37 ("mptcp: add the fallback check") f8d4bcacff3b ("mptcp: infinite mapping receiving") https://lore.kernel.org/all/20220519115837.380bb8d4@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>