summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2024-09-01maple_tree: simplify mas_commit_b_node()Sidhartha Kumar
The only callers of mas_commit_b_node() are those with store type of wr_rebalance and wr_split_store. Use mas->store_type to dispatch to the correct helper function. This allows the removal of mas_reuse_node() as it is no longer used. Link: https://lkml.kernel.org/r/20240814161944.55347-12-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: convert mas_insert() to preallocate nodesSidhartha Kumar
By setting the store type in mas_insert(), we no longer need to use mas_wr_modify() to determine the correct store function to use. Instead, set the store type and call mas_wr_store_entry(). Also, pass in the requested gfp flags to mas_insert() so they can be passed to the call to mas_wr_preallocate(). Link: https://lkml.kernel.org/r/20240814161944.55347-11-sidhartha.kumar@oracle.com Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: use store type in mas_wr_store_entry()Sidhartha Kumar
When storing an entry, we can read the store type that was set from a previous partial walk of the tree. Now that the type of store is known, select the correct write helper function to use to complete the store. Also noinline mas_wr_spanning_store() to limit stack frame usage in mas_wr_store_entry() as it allocates a maple_big_node on the stack. Link: https://lkml.kernel.org/r/20240814161944.55347-10-sidhartha.kumar@oracle.com Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: print store type in mas_dump()Sidhartha Kumar
Knowing the store type of the maple state could be helpful for debugging. Have mas_dump() print mas->store_type. Link: https://lkml.kernel.org/r/20240814161944.55347-9-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: use mas_store_gfp() in mtree_store_range()Sidhartha Kumar
Refactor mtree_store_range() to use mas_store_gfp() which will abstract the store, memory allocation, and error handling. Link: https://lkml.kernel.org/r/20240814161944.55347-8-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: preallocate nodes in mas_erase()Sidhartha Kumar
Use mas_wr_preallocate() in mas_erase() to preallocate enough nodes to complete the erase. Add error handling by skipping the store if the preallocation lead to some error besides no memory. Link: https://lkml.kernel.org/r/20240814161944.55347-7-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: remove mas_destroy() from mas_nomem()Sidhartha Kumar
Separate call to mas_destroy() from mas_nomem() so we can check for no memory errors without destroying the current maple state in mas_store_gfp(). We then add calls to mas_destroy() to callers of mas_nomem(). Link: https://lkml.kernel.org/r/20240814161944.55347-6-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: introduce mas_wr_store_type()Sidhartha Kumar
Introduce mas_wr_store_type() which will set the correct store type based on a walk of the tree. In mas_wr_node_store() the <= min_slots condition is changed to < as if new_end is = to mt_min_slots then there is not enough room. mas_prealloc_calc() is also introduced to abstract the calculation used to determine the number of nodes needed for a store operation. In this change a call to mas_reset() is removed in the error case of mas_prealloc(). This is only needed in the MA_STATE_REBALANCE case of mas_destroy(). We can move the call to mas_reset() directly to mas_destroy(). Also, add a test case to validate the order that we check the store type in is correct. This test models a vma expanding and then shrinking which is part of the boot process. Link: https://lkml.kernel.org/r/20240814161944.55347-5-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: move up mas_wr_store_setup() and mas_wr_prealloc_setup()Sidhartha Kumar
Subsequent patches require these definitions to be higher, no functional changes intended. Link: https://lkml.kernel.org/r/20240814161944.55347-4-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: introduce mas_wr_prealloc_setup()Sidhartha Kumar
Introduce a helper function, mas_wr_prealoc_setup(), that will set up a maple write state in order to start a walk of a maple tree. Link: https://lkml.kernel.org/r/20240814161944.55347-3-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: fix comment typo with corresponding maple_statusWei Yang
In comment of function mas_start(), we list the return value of different cases. According to the comment context, tell the maple_status here is more consistent with others. Let's correct it with ma_active in the case it's a tree. Link: https://lkml.kernel.org/r/20240812150925.31551-2-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: fix comment typo of ma_rootWei Yang
In comment of mas_start(), we lists the return value for different cases. In case of a single entry, we set mas->status to ma_root, while the comment uses mas_root, which is not a maple_status. Fix the typo according to the code. Link: https://lkml.kernel.org/r/20240812150925.31551-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: add test to replicate low memory race conditionsSidhartha Kumar
Add new callback fields to the userspace implementation of struct kmem_cache. This allows for executing callback functions in order to further test low memory scenarios where node allocation is retried. This callback can help test race conditions by calling a function when a low memory event is tested. Link: https://lkml.kernel.org/r/20240812190543.71967-2-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: reset mas->index and mas->last on write retriesSidhartha Kumar
The following scenario can result in a race condition: Consider a node with the following indices and values a<------->b<----------->c<--------->d 0xA NULL 0xB CPU 1 CPU 2 --------- --------- mas_set_range(a,b) mas_erase() -> range is expanded (a,c) because of null expansion mas_nomem() mas_unlock() mas_store_range(b,c,0xC) The node now looks like: a<------->b<----------->c<--------->d 0xA 0xC 0xB mas_lock() mas_erase() <------ range of erase is still (a,c) The node is now NULL from (a,c) but the write from CPU 2 should have been retained and range (b,c) should still have 0xC as its value. We can fix this by re-intializing to the original index and last. This does not need a cc: Stable as there are no users of the maple tree which use internal locking and this condition is only possible with internal locking. Link: https://lkml.kernel.org/r/20240812190543.71967-1-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01lib: test_hmm: use min() to improve dmirror_exclusive()Thorsten Blum
Use min() to simplify the dmirror_exclusive() function and improve its readability. Link: https://lkml.kernel.org/r/20240726131245.161695-1-thorsten.blum@toblux.com Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mm: kvmalloc: align kvrealloc() with krealloc()Danilo Krummrich
Besides the obvious (and desired) difference between krealloc() and kvrealloc(), there is some inconsistency in their function signatures and behavior: - krealloc() frees the memory when the requested size is zero, whereas kvrealloc() simply returns a pointer to the existing allocation. - krealloc() behaves like kmalloc() if a NULL pointer is passed, whereas kvrealloc() does not accept a NULL pointer at all and, if passed, would fault instead. - krealloc() is self-contained, whereas kvrealloc() relies on the caller to provide the size of the previous allocation. Inconsistent behavior throughout allocation APIs is error prone, hence make kvrealloc() behave like krealloc(), which seems superior in all mentioned aspects. Besides that, implementing kvrealloc() by making use of krealloc() and vrealloc() provides oppertunities to grow (and shrink) allocations more efficiently. For instance, vrealloc() can be optimized to allocate and map additional pages to grow the allocation or unmap and free unused pages to shrink the allocation. [dakr@kernel.org: document concurrency restrictions] Link: https://lkml.kernel.org/r/20240725125442.4957-1-dakr@kernel.org [dakr@kernel.org: disable KASAN when switching to vmalloc] Link: https://lkml.kernel.org/r/20240730185049.6244-2-dakr@kernel.org [dakr@kernel.org: properly document __GFP_ZERO behavior] Link: https://lkml.kernel.org/r/20240730185049.6244-5-dakr@kernel.org Link: https://lkml.kernel.org/r/20240722163111.4766-3-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Christian König <christian.koenig@amd.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <kees@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01alloc_tag: fix allocation tag reporting when CONFIG_MODULES=nSuren Baghdasaryan
codetag_module_init() is used to initialize sections containing allocation tags. This function is used to initialize module sections as well as core kernel sections, in which case the module parameter is set to NULL. This function has to be called even when CONFIG_MODULES=n to initialize core kernel allocation tag sections. When CONFIG_MODULES=n, this function is a NOP, which is wrong. This leads to /proc/allocinfo reported as empty. Fix this by making it independent of CONFIG_MODULES. Link: https://lkml.kernel.org/r/20240828231536.1770519-1-surenb@google.com Fixes: 916cc5167cc6 ("lib: code tagging framework") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Sourav Panda <souravpanda@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [6.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: remove rcu_read_lock() from mt_validate()Liam R. Howlett
The write lock should be held when validating the tree to avoid updates racing with checks. Holding the rcu read lock during a large tree validation may also cause a prolonged rcu read window and "rcu_preempt detected stalls" warnings. Link: https://lore.kernel.org/all/0000000000001d12d4062005aea1@google.com/ Link: https://lkml.kernel.org/r/20240820175417.2782532-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reported-by: syzbot+036af2f0c7338a33b0cd@syzkaller.appspotmail.com Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-08-29Merge tag 'random-6.11-rc6-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fix from Jason Donenfeld: "Reject invalid flags passed to vgetrandom() in the same way that getrandom() does, so that the behavior is the same, from Yann. The flags argument to getrandom() only has a behavioral effect on the function if the RNG isn't initialized yet, so vgetrandom() falls back to the syscall in that case. But if the RNG is initialized, all of the flags behave the same way, so vgetrandom() didn't bother checking them, and just ignored them entirely. But that doesn't account for invalid flags passed in, which need to be rejected so we can use them later" * tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: vDSO: reject unknown getrandom() flags
2024-08-28lib/test_bits.c: Add tests for GENMASK_U128()Anshuman Khandual
This adds GENMASK_U128() tests although currently only 64 bit wide masks are being tested. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-08-27kunit, slub: add test_kfree_rcu() and test_leak_destroy()Vlastimil Babka
Add a test that will create cache, allocate one object, kfree_rcu() it and attempt to destroy it. As long as the usage of kvfree_rcu_barrier() in kmem_cache_destroy() works correctly, there should be no warnings in dmesg and the test should pass. Additionally add a test_leak_destroy() test that leaks an object on purpose and verifies that kmem_cache_destroy() catches it. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-08-26kunit: Device wrappers should also manage driver nameDavid Gow
kunit_driver_create() accepts a name for the driver, but does not copy it, so if that name is either on the stack, or otherwise freed, we end up with a use-after-free when the driver is cleaned up. Instead, strdup() the name, and manage it as another KUnit allocation. As there was no existing kunit_kstrdup(), we add one. Further, add a kunit_ variant of strdup_const() and kfree_const(), so we don't need to allocate and manage the string in the majority of cases where it's a constant. However, these are inline functions, and is_kernel_rodata() only works for built-in code. This causes problems in two cases: - If kunit is built as a module, __{start,end}_rodata is not defined. - If a kunit test using these functions is built as a module, it will suffer the same fate. This fixes a KASAN splat with overflow.overflow_allocation_test, when built as a module. Restrict the is_kernel_rodata() case to when KUnit is built as a module, which fixes the first case, at the cost of losing the optimisation. Also, make kunit_{kstrdup,kfree}_const non-inline, so that other modules using them will not accidentally depend on is_kernel_rodata(). If KUnit is built-in, they'll benefit from the optimisation, if KUnit is not, they won't, but the string will be properly duplicated. Fixes: d03c720e03bd ("kunit: Add APIs for managing devices") Reported-by: Nico Pache <npache@redhat.com> Closes: https://groups.google.com/g/kunit-dev/c/81V9b9QYON0 Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: David Gow <davidgow@google.com> Tested-by: Rae Moar <rmoar@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-08-26random: vDSO: reject unknown getrandom() flagsYann Droneaud
Like the getrandom() syscall, vDSO getrandom() must also reject unknown flags. [1] It would be possible to return -EINVAL from vDSO itself, but in the possible case that a new flag is added to getrandom() syscall in the future, it would be easier to get the behavior from the syscall, instead of erroring until the vDSO is extended to support the new flag or explicitly falling back. [1] Designing the API: Planning for Extension https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension Signed-off-by: Yann Droneaud <yann@droneaud.fr> [Jason: reworded commit message] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-08-20softirq: Remove unused 'action' parameter from action callbackCaleb Sander Mateos
When soft interrupt actions are called, they are passed a pointer to the struct softirq action which contains the action's function pointer. This pointer isn't useful, as the action callback already knows what function it is. And since each callback handles a specific soft interrupt, the callback also knows which soft interrupt number is running. No soft interrupt action callback actually uses this parameter, so remove it from the function pointer signature. This clarifies that soft interrupt actions are global routines and makes it slightly cheaper to call them. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/all/20240815171549.3260003-1-csander@purestorage.com
2024-08-19x86: support user address masking instead of non-speculative conditionalLinus Torvalds
The Spectre-v1 mitigations made "access_ok()" much more expensive, since it has to serialize execution with the test for a valid user address. All the normal user copy routines avoid this by just masking the user address with a data-dependent mask instead, but the fast "unsafe_user_read()" kind of patterms that were supposed to be a fast case got slowed down. This introduces a notion of using src = masked_user_access_begin(src); to do the user address sanity using a data-dependent mask instead of the more traditional conditional if (user_read_access_begin(src, len)) { model. This model only works for dense accesses that start at 'src' and on architectures that have a guard region that is guaranteed to fault in between the user space and the kernel space area. With this, the user access doesn't need to be manually checked, because a bad address is guaranteed to fault (by some architecture masking trick: on x86-64 this involves just turning an invalid user address into all ones, since we don't map the top of address space). This only converts a couple of examples for now. Example x86-64 code generation for loading two words from user space: stac mov %rax,%rcx sar $0x3f,%rcx or %rax,%rcx mov (%rcx),%r13 mov 0x8(%rcx),%r14 clac where all the error handling and -EFAULT is now purely handled out of line by the exception path. Of course, if the micro-architecture does badly at 'clac' and 'stac', the above is still pitifully slow. But at least we did as well as we could. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-17Merge tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent OverstreetL - New on disk format version, bcachefs_metadata_version_disk_accounting_inum This adds one more disk accounting counter, which counts disk usage and number of extents per inode number. This lets us track fragmentation, for implementing defragmentation later, and it also counts disk usage per inode in all snapshots, which will be a useful thing to expose to users. - One performance issue we've observed is threads spinning when they should be waiting for dirty keys in the key cache to be flushed by journal reclaim, so we now have hysteresis for the waiting thread, as well as improving the tracepoint and a new time_stat, for tracking time blocked waiting on key cache flushing. ... and various assorted smaller fixes. * tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs: bcachefs: Fix locking in __bch2_trans_mark_dev_sb() bcachefs: fix incorrect i_state usage bcachefs: avoid overflowing LRU_TIME_BITS for cached data lru bcachefs: Fix forgetting to pass trans to fsck_err() bcachefs: Increase size of cuckoo hash table on too many rehashes bcachefs: bcachefs_metadata_version_disk_accounting_inum bcachefs: Kill __bch2_accounting_mem_mod() bcachefs: Make bkey_fsck_err() a wrapper around fsck_err() bcachefs: Fix warning in __bch2_fsck_err() for trans not passed in bcachefs: Add a time_stat for blocked on key cache flush bcachefs: Improve trans_blocked_journal_reclaim tracepoint bcachefs: Add hysteresis to waiting on btree key cache flush lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc() bcachefs: Convert for_each_btree_node() to lockrestart_do() bcachefs: Add missing downgrade table entry bcachefs: disk accounting: ignore unknown types bcachefs: bch2_accounting_invalid() fixup bcachefs: Fix bch2_trigger_alloc when upgrading from old versions bcachefs: delete faulty fastpath in bch2_btree_path_traverse_cached()
2024-08-17crypto: lib/mpi - Add error checks to extensionHerbert Xu
The remaining functions added by commit a8ea8bdd9df92a0e5db5b43900abb7a288b8a53e did not check for memory allocation errors. Add the checks and change the API to allow errors to be returned. Fixes: a8ea8bdd9df9 ("lib/mpi: Extend the MPI library") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-17Revert "lib/mpi: Extend the MPI library"Herbert Xu
This partially reverts commit a8ea8bdd9df92a0e5db5b43900abb7a288b8a53e. Most of it is no longer needed since sm2 has been removed. However, the following functions have been kept as they have developed other uses: mpi_copy mpi_mod mpi_test_bit mpi_set_bit mpi_rshift mpi_add mpi_sub mpi_addm mpi_subm mpi_mul mpi_mulm mpi_tdiv_r mpi_fdiv_r Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-15lib/string_helpers: rework overflow-dependent codeJustin Stitt
When @size is 0, the desired behavior is to allow unlimited bytes to be parsed. Currently, this relies on some intentional arithmetic overflow where --size gives us SIZE_MAX when size is 0. Explicitly spell out the desired behavior without relying on intentional overflow/underflow. Signed-off-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240808-b4-string_helpers_caa133-v1-1-686a455167c4@google.com Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15fortify: use if_changed_dep to record header dependency in *.cmd filesMasahiro Yamada
After building with CONFIG_FORTIFY_SOURCE=y, many .*.d files are left in lib/test_fortify/ because the compiler outputs header dependencies into *.d without fixdep being invoked. When compiling C files, if_changed_dep should be used so that the auto-generated header dependencies are recorded in .*.cmd files. Currently, if_changed is incorrectly used, and only two headers are hard-coded in lib/Makefile. In the previous patch version, the kbuild test robot detected new errors on GCC 7. GCC 7 or older does not produce test.d with the following test code: $ echo 'void b(void) __attribute__((__error__(""))); void a(void) { b(); }' | gcc -Wp,-MMD,test.d -c -o /dev/null -x c - Perhaps, this was a bug that existed in older GCC versions. Skip the tests for GCC<=7 for now, as this will be eventually solved when we bump the minimal supported GCC version. Link: https://lore.kernel.org/oe-kbuild-all/CAK7LNARmJcyyzL-jVJfBPi3W684LTDmuhMf1koF0TXoCpKTmcw@mail.gmail.com/T/#m13771bf78ae21adff22efc4d310c973fb4bcaf67 Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240727150302.1823750-4-masahiroy@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15fortify: move test_fortify.sh to lib/test_fortify/Masahiro Yamada
This script is only used in lib/test_fortify/. There is no reason to keep it in scripts/. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240727150302.1823750-3-masahiroy@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15fortify: refactor test_fortify Makefile to fix some build problemsMasahiro Yamada
There are some issues in the test_fortify Makefile code. Problem 1: cc-disable-warning invokes compiler dozens of times To see how many times the cc-disable-warning is evaluated, change this code: $(call cc-disable-warning,fortify-source) to: $(call cc-disable-warning,$(shell touch /tmp/fortify-$$$$)fortify-source) Then, build the kernel with CONFIG_FORTIFY_SOURCE=y. You will see a large number of '/tmp/fortify-<PID>' files created: $ ls -1 /tmp/fortify-* | wc 80 80 1600 This means the compiler was invoked 80 times just for checking the -Wno-fortify-source flag support. $(call cc-disable-warning,fortify-source) should be added to a simple variable instead of a recursive variable. Problem 2: do not recompile string.o when the test code is updated The test cases are independent of the kernel. However, when the test code is updated, $(obj)/string.o is rebuilt and vmlinux is relinked due to this dependency: $(obj)/string.o: $(obj)/$(TEST_FORTIFY_LOG) always-y is suitable for building the log files. Problem 3: redundant code clean-files += $(addsuffix .o, $(TEST_FORTIFY_LOGS)) ... is unneeded because the top Makefile globally cleans *.o files. This commit fixes these issues and makes the code readable. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240727150302.1823750-2-masahiroy@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15kunit/overflow: Fix UB in overflow_allocation_testIvan Orlov
The 'device_name' array doesn't exist out of the 'overflow_allocation_test' function scope. However, it is being used as a driver name when calling 'kunit_driver_create' from 'kunit_device_register'. It produces the kernel panic with KASAN enabled. Since this variable is used in one place only, remove it and pass the device name into kunit_device_register directly as an ascii string. Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Reviewed-by: David Gow <davidgow@google.com> Link: https://lore.kernel.org/r/20240815000431.401869-1-ivan.orlov0322@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-15locking/csd_lock: Provide an indication of ongoing CSD-lock stallPaul E. McKenney
If a CSD-lock stall goes on long enough, it will cause an RCU CPU stall warning. This additional warning provides much additional console-log traffic and little additional information. Therefore, provide a new csd_lock_is_stuck() function that returns true if there is an ongoing CSD-lock stall. This function will be used by the RCU CPU stall warnings to provide a one-line indication of the stall when this function returns true. [ neeraj.upadhyay: Apply Rik van Riel feedback. ] [ neeraj.upadhyay: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Leonardo Bras <leobras@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
2024-08-13lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()Kent Overstreet
If we need to increase the tree depth, allocate a new node, and then race with another thread that increased the tree depth before us, we'll still have a preallocated node that might be used later. If we then use that node for a new non-root node, it'll still have a pointer to the old root instead of being zeroed - fix this by zeroing it in the cmpxchg failure path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-10Revert "lib/mpi: Introduce ec implementation to MPI library"Herbert Xu
This reverts commit d58bb7e55a8a65894cc02f27c3e2bf9403e7c40f. It's no longer needed since sm2 has been removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-08-08kcov: Add interrupt handling self testDmitry Vyukov
Add a boot self test that can catch sprious coverage from interrupts. The coverage callback filters out interrupt code, but only after the handler updates preempt count. Some code periodically leaks out of that section and leads to spurious coverage. Add a best-effort (but simple) test that is likely to catch such bugs. If the test is enabled on CI systems that use KCOV, they should catch any issues fast. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/all/7662127c97e29da1a748ad1c1539dd7b65b737b2.1718092070.git.dvyukov@google.com
2024-08-06x86/traps: Enable UBSAN traps on x86Gatlin Newhouse
Currently ARM64 extracts which specific sanitizer has caused a trap via encoded data in the trap instruction. Clang on x86 currently encodes the same data in the UD1 instruction but x86 handle_bug() and is_valid_bugaddr() currently only look at UD2. Bring x86 to parity with ARM64, similar to commit 25b84002afb9 ("arm64: Support Clang UBSAN trap codes for better reporting"). See the llvm links for information about the code generation. Enable the reporting of UBSAN sanitizer details on x86 compiled with clang when CONFIG_UBSAN_TRAP=y by analysing UD1 and retrieving the type immediate which is encoded by the compiler after the UD1. [ tglx: Simplified it by moving the printk() into handle_bug() ] Signed-off-by: Gatlin Newhouse <gatlin.newhouse@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/20240724000206.451425-1-gatlin.newhouse@gmail.com Link: https://github.com/llvm/llvm-project/commit/c5978f42ec8e9#diff-bb68d7cd885f41cfc35843998b0f9f534adb60b415f647109e597ce448e92d9f Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InstrSystem.td#L27
2024-07-30Union-Find: add a new module in kernel libraryXavier
This patch implements a union-find data structure in the kernel library, which includes operations for allocating nodes, freeing nodes, finding the root of a node, and merging two nodes. Signed-off-by: Xavier <xavier_qy@163.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30Merge tag 'v6.11-rc1' into for-6.12Tejun Heo
Linux 6.11-rc1
2024-07-29platform: Add test managed platform_device/driver APIsStephen Boyd
Introduce KUnit resource wrappers around platform_driver_register(), platform_device_alloc(), and platform_device_add() so that test authors can register platform drivers/devices from their tests and have the drivers/devices automatically be unregistered when the test is done. This makes test setup code simpler when a platform driver or platform device is needed. Add a few test cases at the same time to make sure the APIs work as intended. Cc: Brendan Higgins <brendan.higgins@linux.dev> Reviewed-by: David Gow <davidgow@google.com> Cc: Rae Moar <rmoar@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240718210513.3801024-6-sboyd@kernel.org
2024-07-28minmax: don't use max() in situations that want a C constant expressionLinus Torvalds
We only had a couple of array[] declarations, and changing them to just use 'MAX()' instead of 'max()' fixes the issue. This will allow us to simplify our min/max macros enormously, since they can now unconditionally use temporary variables to avoid using the argument values multiple times. Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28minmax: make generic MIN() and MAX() macros available everywhereLinus Torvalds
This just standardizes the use of MIN() and MAX() macros, with the very traditional semantics. The goal is to use these for C constant expressions and for top-level / static initializers, and so be able to simplify the min()/max() macros. These macro names were used by various kernel code - they are very traditional, after all - and all such users have been fixed up, with a few different approaches: - trivial duplicated macro definitions have been removed Note that 'trivial' here means that it's obviously kernel code that already included all the major kernel headers, and thus gets the new generic MIN/MAX macros automatically. - non-trivial duplicated macro definitions are guarded with #ifndef This is the "yes, they define their own versions, but no, the include situation is not entirely obvious, and maybe they don't get the generic version automatically" case. - strange use case #1 A couple of drivers decided that the way they want to describe their versioning is with #define MAJ 1 #define MIN 2 #define DRV_VERSION __stringify(MAJ) "." __stringify(MIN) which adds zero value and I just did my Alexander the Great impersonation, and rewrote that pointless Gordian knot as #define DRV_VERSION "1.2" instead. - strange use case #2 A couple of drivers thought that it's a good idea to have a random 'MIN' or 'MAX' define for a value or index into a table, rather than the traditional macro that takes arguments. These values were re-written as C enum's instead. The new function-line macros only expand when followed by an open parenthesis, and thus don't clash with enum use. Happily, there weren't really all that many of these cases, and a lot of users already had the pattern of using '#ifndef' guarding (or in one case just using '#undef MIN') before defining their own private version that does the same thing. I left such cases alone. Cc: David Laight <David.Laight@aculab.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-27Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull Rust updates from Miguel Ojeda: "The highlight is the establishment of a minimum version for the Rust toolchain, including 'rustc' (and bundled tools) and 'bindgen'. The initial minimum will be the pinned version we currently have, i.e. we are just widening the allowed versions. That covers three stable Rust releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow), plus beta, plus nightly. This should already be enough for kernel developers in distributions that provide recent Rust compiler versions routinely, such as Arch Linux, Debian Unstable (outside the freeze period), Fedora Linux, Gentoo Linux (especially the testing channel), Nix (unstable) and openSUSE Slowroll and Tumbleweed. In addition, the kernel is now being built-tested by Rust's pre-merge CI. That is, every change that is attempting to land into the Rust compiler is tested against the kernel, and it is merged only if it passes. Similarly, the bindgen tool has agreed to build the kernel in their CI too. Thus, with the pre-merge CI in place, both projects hope to avoid unintentional changes to Rust that break the kernel. This means that, in general, apart from intentional changes on their side (that we will need to workaround conditionally on our side), the upcoming Rust compiler versions should generally work. In addition, the Rust project has proposed getting the kernel into stable Rust (at least solving the main blockers) as one of its three flagship goals for 2024H2 [1]. I would like to thank Niko, Sid, Emilio et al. for their help promoting the collaboration between Rust and the kernel. Toolchain and infrastructure: - Support several Rust toolchain versions. - Support several bindgen versions. - Remove 'cargo' requirement and simplify 'rusttest', thanks to 'alloc' having been dropped last cycle. - Provide proper error reporting for the 'rust-analyzer' target. 'kernel' crate: - Add 'uaccess' module with a safe userspace pointers abstraction. - Add 'page' module with a 'struct page' abstraction. - Support more complex generics in workqueue's 'impl_has_work!' macro. 'macros' crate: - Add 'firmware' field support to the 'module!' macro. - Improve 'module!' macro documentation. Documentation: - Provide instructions on what packages should be installed to build the kernel in some popular Linux distributions. - Introduce the new kernel.org LLVM+Rust toolchains. - Explain '#[no_std]'. And a few other small bits" Link: https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals [1] * tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux: (26 commits) docs: rust: quick-start: add section on Linux distributions rust: warn about `bindgen` versions 0.66.0 and 0.66.1 rust: start supporting several `bindgen` versions rust: work around `bindgen` 0.69.0 issue rust: avoid assuming a particular `bindgen` build rust: start supporting several compiler versions rust: simplify Clippy warning flags set rust: relax most deny-level lints to warnings rust: allow `dead_code` for never constructed bindings rust: init: simplify from `map_err` to `inspect_err` rust: macros: indent list item in `paste!`'s docs rust: add abstraction for `struct page` rust: uaccess: add typed accessors for userspace pointers uaccess: always export _copy_[from|to]_user with CONFIG_RUST rust: uaccess: add userspace pointers kbuild: rust-analyzer: improve comment documentation kbuild: rust-analyzer: better error handling docs: rust: no_std is used rust: alloc: add __GFP_HIGHMEM flag rust: alloc: fix typo in docs for GFP_NOWAIT ...
2024-07-27Merge tag 'mm-hotfixes-stable-2024-07-26-14-33' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "11 hotfixes, 7 of which are cc:stable. 7 are MM, 4 are other" * tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: nilfs2: handle inconsistent state in nilfs_btnode_create_block() selftests/mm: skip test for non-LPA2 and non-LVA systems mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist() mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node alloc_tag: outline and export free_reserved_page() decompress_bunzip2: fix rare decompression failure mm/huge_memory: avoid PMD-size page cache if needed mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines mm: fix old/young bit handling in the faulting path dt-bindings: arm: update James Clark's email address MAINTAINERS: mailmap: update James Clark's email address
2024-07-26decompress_bunzip2: fix rare decompression failureRoss Lagerwall
The decompression code parses a huffman tree and counts the number of symbols for a given bit length. In rare cases, there may be >= 256 symbols with a given bit length, causing the unsigned char to overflow. This causes a decompression failure later when the code tries and fails to find the bit length for a given symbol. Since the maximum number of symbols is 258, use unsigned short instead. Link: https://lkml.kernel.org/r/20240717162016.1514077-1-ross.lagerwall@citrix.com Fixes: bc22c17e12c1 ("bzip2/lzma: library support for gzip, bzip2 and lzma decompression") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: Alain Knaff <alain@knaff.lu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-26Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: "Random fixes" * tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux: riscv: Remove unnecessary int cast in variable_fls() radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c bitops: Add a comment explaining the double underscore macros lib: bitmap: add missing MODULE_DESCRIPTION() macros cpumask: introduce assign_cpu() macro
2024-07-25Merge tag 'printk-for-6.11-trivial' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - trivial printk changes The bigger "real" printk work is still being discussed. * tag 'printk-for-6.11-trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: vsprintf: add missing MODULE_DESCRIPTION() macro printk: Rename console_replay_all() and update context
2024-07-25Merge tag 'driver-core-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) ARM: sa1100: make match function take a const pointer sysfs/cpu: Make crash_hotplug attribute world-readable dio: Have dio_bus_match() callback take a const * zorro: make match function take a const pointer driver core: module: make module_[add|remove]_driver take a const * driver core: make driver_find_device() take a const * driver core: make driver_[create|remove]_file take a const * firmware_loader: fix soundness issue in `request_internal` firmware_loader: annotate doctests as `no_run` devres: Correct code style for functions that return a pointer type devres: Initialize an uninitialized struct member devres: Fix memory leakage caused by driver API devm_free_percpu() devres: Fix devm_krealloc() wasting memory driver core: platform: Switch to use kmemdup_array() driver core: have match() callback in struct bus_type take a const * MAINTAINERS: add Rust device abstractions to DRIVER CORE device: rust: improve safety comments MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER firmware: rust: improve safety comments ...
2024-07-24Merge tag 'random-6.11-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "This adds getrandom() support to the vDSO. First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which lets the kernel zero out pages anytime under memory pressure, which enables allocating memory that never gets swapped to disk but also doesn't count as being mlocked. Then, the vDSO implementation of getrandom() is introduced in a generic manner and hooked into random.c. Next, this is implemented on x86. (Also, though it's not ready for this pull, somebody has begun an arm64 implementation already) Finally, two vDSO selftests are added. There are also two housekeeping cleanup commits" * tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: MAINTAINERS: add random.h headers to RNG subsection random: note that RNDGETPOOL was removed in 2.6.9-rc2 selftests/vDSO: add tests for vgetrandom x86: vdso: Wire up getrandom() vDSO implementation random: introduce generic vDSO getrandom() implementation mm: add MAP_DROPPABLE for designating always lazily freeable mappings