summaryrefslogtreecommitdiff
path: root/lib/list_debug.c
AgeCommit message (Collapse)Author
2022-06-16lib/list_debug.c: Detect uninitialized listsGuenter Roeck
In some circumstances, attempts are made to add entries to or to remove entries from an uninitialized list. A prime example is amdgpu_bo_vm_destroy(): It is indirectly called from ttm_bo_init_reserved() if that function fails, and tries to remove an entry from a list. However, that list is only initialized in amdgpu_bo_create_vm() after the call to ttm_bo_init_reserved() returned success. This results in crashes such as BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 1479 Comm: chrome Not tainted 5.10.110-15768-g29a72e65dae5 Hardware name: Google Grunt/Grunt, BIOS Google_Grunt.11031.149.0 07/15/2020 RIP: 0010:__list_del_entry_valid+0x26/0x7d ... Call Trace: amdgpu_bo_vm_destroy+0x48/0x8b ttm_bo_init_reserved+0x1d7/0x1e0 amdgpu_bo_create+0x212/0x476 ? amdgpu_bo_user_destroy+0x23/0x23 ? kmem_cache_alloc+0x60/0x271 amdgpu_bo_create_vm+0x40/0x7d amdgpu_vm_pt_create+0xe8/0x24b ... Check if the list's prev and next pointers are NULL to catch such problems. Link: https://lkml.kernel.org/r/20220531222951.92073-1-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-01-20lib/list_debug.c: print more list debugging context in __list_del_entry_valid()Zhen Lei
Currently, the entry->prev and entry->next are considered to be valid as long as they are not LIST_POISON{1|2}. However, the memory may be corrupted. The prev->next is invalid probably because 'prev' is invalid, not because prev->next's content is illegal. Unfortunately, the printk and its subfunctions will modify the registers that hold the 'prev' and 'next', and we don't see this valuable information in the BUG context. So print the contents of 'entry->prev' and 'entry->next'. Here's an example: list_del corruption. prev->next should be c0ecbf74, but was c08410dc kernel BUG at lib/list_debug.c:53! ... ... PC is at __list_del_entry_valid+0x58/0x98 LR is at __list_del_entry_valid+0x58/0x98 psr: 60000093 sp : c0ecbf30 ip : 00000000 fp : 00000001 r10: c08410d0 r9 : 00000001 r8 : c0825e0c r7 : 20000013 r6 : c08410d0 r5 : c0ecbf74 r4 : c0ecbf74 r3 : c0825d08 r2 : 00000000 r1 : df7ce6f4 r0 : 00000044 ... ... Stack: (0xc0ecbf30 to 0xc0ecc000) bf20: c0ecbf74 c0164fd0 c0ecbf70 c0165170 bf40: c0eca000 c0840c00 c0840c00 c0824500 c0825e0c c0189bbc c088f404 60000013 bf60: 60000013 c0e85100 000004ec 00000000 c0ebcdc0 c0ecbf74 c0ecbf74 c0825d08 bf80: c0e807c0 c018965c 00000000 c013f2a0 c0e807c0 c013f154 00000000 00000000 bfa0: 00000000 00000000 00000000 c01001b0 00000000 00000000 00000000 00000000 bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 (__list_del_entry_valid) from (__list_del_entry+0xc/0x20) (__list_del_entry) from (finish_swait+0x60/0x7c) (finish_swait) from (rcu_gp_kthread+0x560/0xa20) (rcu_gp_kthread) from (kthread+0x14c/0x15c) (kthread) from (ret_from_fork+0x14/0x24) At first, I thought prev->next was overwritten. Later, I carefully analyzed the RCU code and the disassembly code. The error occurred when deleting a node from the list rcu_state.gp_wq. The System.map shows that the address of rcu_state is c0840c00. Then I use gdb to obtain the offset of rcu_state.gp_wq.task_list. (gdb) p &((struct rcu_state *)0)->gp_wq.task_list $1 = (struct list_head *) 0x4dc Again: list_del corruption. prev->next should be c0ecbf74, but was c08410dc c08410dc = c0840c00 + 0x4dc = &rcu_state.gp_wq.task_list Because rcu_state.gp_wq has at most one node, so I can guess that "prev = &rcu_state.gp_wq.task_list". But for other scenes, maybe I wasn't so lucky, I cannot figure out the value of 'prev'. Link: https://lkml.kernel.org/r/20211207025835.1909-1-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11lib/list_debug.c: print unmangled addressesMatthew Wilcox
The entire point of printing the pointers in list_debug is to see if there's any useful information in them (eg poison values, ASCII, etc); obscuring them to see if they compare equal makes them much less useful. If an attacker can force this message to be printed, we've already lost. Link: http://lkml.kernel.org/r/20180401223237.GV13332@bombadil.infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Tobin C. Harding <me@tobin.cc> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24bug: switch data corruption check to __must_checkKees Cook
The CHECK_DATA_CORRUPTION() macro was designed to have callers do something meaningful/protective on failure. However, using "return false" in the macro too strictly limits the design patterns of callers. Instead, let callers handle the logic test directly, but make sure that the result IS checked by forcing __must_check (which appears to not be able to be used directly on macro expressions). Link: http://lkml.kernel.org/r/20170206204547.GA125312@beast Signed-off-by: Kees Cook <keescook@chromium.org> Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-31bug: Provide toggle for BUG on data corruptionKees Cook
The kernel checks for cases of data structure corruption under some CONFIGs (e.g. CONFIG_DEBUG_LIST). When corruption is detected, some systems may want to BUG() immediately instead of letting the system run with known corruption. Usually these kinds of manipulation primitives can be used by security flaws to gain arbitrary memory write control. This provides a new config CONFIG_BUG_ON_DATA_CORRUPTION and a corresponding macro CHECK_DATA_CORRUPTION for handling these situations. Notably, even if not BUGing, the kernel should not continue processing the corrupted structure. This is inspired by similar hardening by Syed Rameez Mustafa in MSM kernels, and in PaX and Grsecurity, which is likely in response to earlier removal of the BUG calls in commit 924d9addb9b1 ("list debugging: use WARN() instead of BUG()"). Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com>
2016-10-31list: Split list_del() debug checking into separate functionKees Cook
Similar to the list_add() debug consolidation, this commit consolidates the debug checking performed during CONFIG_DEBUG_LIST into a new __list_del_entry_valid() function, and stops list updates when corruption is found. Refactored from same hardening in PaX and Grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com>
2016-10-31rculist: Consolidate DEBUG_LIST for list_add_rcu()Kees Cook
This commit consolidates the debug checking for list_add_rcu() into the new single __list_add_valid() debug function. Notably, this commit fixes the sanity check that was added in commit 17a801f4bfeb ("list_debug: WARN for adding something already in the list"), which wasn't checking RCU-protected lists. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com>
2016-10-31list: Split list_add() debug checking into separate functionKees Cook
Right now, __list_add() code is repeated either in list.h or in list_debug.c, but the only differences between the two versions are the debug checks. This commit therefore extracts these debug checks into a separate __list_add_valid() function and consolidates __list_add(). Additionally this new __list_add_valid() function will stop list manipulations if a corruption is detected, instead of allowing for further corruption that may lead to even worse conditions. This is slight refactoring of the same hardening done in PaX and Grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com>
2016-03-09list: kill list_force_poison()Dan Williams
Given we have uninitialized list_heads being passed to list_add() it will always be the case that those uninitialized values randomly trigger the poison value. Especially since a list_add() operation will seed the stack with the poison value for later stack allocations to trip over. For example, see these two false positive reports: list_add attempted on force-poisoned entry WARNING: at lib/list_debug.c:34 [..] NIP [c00000000043c390] __list_add+0xb0/0x150 LR [c00000000043c38c] __list_add+0xac/0x150 Call Trace: __list_add+0xac/0x150 (unreliable) __down+0x4c/0xf8 down+0x68/0x70 xfs_buf_lock+0x4c/0x150 [xfs] list_add attempted on force-poisoned entry(0000000000000500), new->next == d0000000059ecdb0, new->prev == 0000000000000500 WARNING: at lib/list_debug.c:33 [..] NIP [c00000000042db78] __list_add+0xa8/0x140 LR [c00000000042db74] __list_add+0xa4/0x140 Call Trace: __list_add+0xa4/0x140 (unreliable) rwsem_down_read_failed+0x6c/0x1a0 down_read+0x58/0x60 xfs_log_commit_cil+0x7c/0x600 [xfs] Fixes: commit 5c2c2587b132 ("mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Eryu Guan <eguan@redhat.com> Tested-by: Eryu Guan <eguan@redhat.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gupDan Williams
get_dev_page() enables paths like get_user_pages() to pin a dynamically mapped pfn-range (devm_memremap_pages()) while the resulting struct page objects are in use. Unlike get_page() it may fail if the device is, or is in the process of being, disabled. While the initial lookup of the range may be an expensive list walk, the result is cached to speed up subsequent lookups which are likely to be in the same mapped range. devm_memremap_pages() now requires a reference counter to be specified at init time. For pmem this means moving request_queue allocation into pmem_alloc() so the existing queue usage counter can track "device pages". ZONE_DEVICE pages always have an elevated count and will never be on an lru reclaim list. That space in 'struct page' can be redirected for other uses, but for safety introduce a poison value that will always trip __list_add() to assert. This allows half of the struct list_head storage to be reclaimed with some assurance to back up the assumption that the page count never goes to zero and a list_add() is never attempted. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Cc: Dave Hansen <dave@sr71.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-23list: Use WRITE_ONCE() when adding to lists and hlistsPaul E. McKenney
Code that does lockless emptiness testing of non-RCU lists is relying on the list-addition code to write the list head's ->next pointer atomically. This commit therefore adds WRITE_ONCE() to list-addition pointer stores that could affect the head's ->next pointer. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-06rcu: Fix broken strings in RCU's source code.Paul E. McKenney
Although the C language allows you to break strings across lines, doing this makes it hard for people to find the Linux kernel code corresponding to a given console message. This commit therefore fixes broken strings throughout RCU's source code. Suggested-by: Josh Triplett <josh@joshtriplett.org> Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-05-29list_debug: WARN for adding something already in the listChris Metcalf
We were bitten by this at one point and added an additional sanity test for DEBUG_LIST. You can't validly add a list_head to a list where either prev or next is the same as the thing you're adding. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-24rcu: List-debug variants of rcu list routines.Dave Jones
* Make __list_add_rcu check the next->prev and prev->next pointers just like __list_add does. * Make list_del_rcu use __list_del_entry, which does the same checking at deletion time. Has been running for a week here without anything being tripped up, but it seems worth adding for completeness just in case something ever does corrupt those lists. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-03-24Merge tag 'module-for-3.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull cleanup of fs/ and lib/ users of module.h from Paul Gortmaker: "Fix up files in fs/ and lib/ dirs to only use module.h if they really need it. These are trivial in scope vs the work done previously. We now have things where any few remaining cleanups can be farmed out to arch or subsystem maintainers, and I have done so when possible. What is remaining here represents the bits that don't clearly lie within a single arch/subsystem boundary, like the fs dir and the lib dir. Some duplicate includes arising from overlapping fixes from independent subsystem maintainer submissions are also quashed." Fix up trivial conflicts due to clashes with other include file cleanups (including some due to the previous bug.h cleanup pull). * tag 'module-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: lib: reduce the use of module.h wherever possible fs: reduce the use of module.h wherever possible includecheck: delete any duplicate instances of module.h
2012-03-07lib: reduce the use of module.h wherever possiblePaul Gortmaker
For files only using THIS_MODULE and/or EXPORT_SYMBOL, map them onto including export.h -- or if the file isn't even using those, then just delete the include. Fix up any implicit include dependencies that were being masked by module.h along the way. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-29bug.h: add include of it to various implicit C usersPaul Gortmaker
With bug.h currently living right in linux/kernel.h there are files that use BUG_ON and friends but are not including the header explicitly. Fix them up so we can remove the presence in kernel.h file. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-28lib: fix implicit users of kernel.h for TAINT_WARNPaul Gortmaker
A pending header cleanup will cause this to show up as: lib/average.c:38: error: 'TAINT_WARN' undeclared (first use in this function) lib/list_debug.c:24: error: 'TAINT_WARN' undeclared (first use in this function) and TAINT_WARN comes from include/linux/kernel.h file. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-02-18Expand CONFIG_DEBUG_LIST to several other list operationsLinus Torvalds
When list debugging is enabled, we aim to readably show list corruption errors, and the basic list_add/list_del operations end up having extra debugging code in them to do some basic validation of the list entries. However, "list_del_init()" and "list_move[_tail]()" ended up avoiding the debug code due to how they were written. This fixes that. So the _next_ time we have list_move() problems with stale list entries, we'll hopefully have an easier time finding them.. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09list debugging: warn when deleting a deleted entryBaruch Siach
Use the magic LIST_POISON* values to detect an incorrect use of list_del on a deleted entry. This DEBUG_LIST specific warning is easier to understand than the generic Oops message caused by LIST_POISON dereference. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25list debugging: use WARN() instead of BUG()Dave Jones
Arjan noted that the list_head debugging is BUG'ing when it detects corruption. By causing the box to panic immediately, we're possibly losing some bug reports. Changing this to a WARN() should mean we at the least start seeing reports collected at kerneloops.org Signed-off-by: Dave Jones <davej@redhat.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25lists: remove a redundant conditional definition of list_add()Robert P. J. Day
Remove the conditional surrounding the definition of list_add() from list.h since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently pick up from lib/list_debug.c will be absolutely identical, at which point you can remove that redundant definition from list_debug.c as well. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-12-07[PATCH] More list debugging contextDave Jones
Print the other (hopefully) known good pointer when list_head debugging too, which may yield additional clues. Also fix for 80-columns to win akpm brownie points. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] list_del-debug fixAndrew Morton
These two BUG_ON()s are redundant and undesired: we're checking for this condition further on in the function, only better. Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] list_del debug checkManfred Spraul
A list_del() debugging check. Has been in -mm for years. Dave moved list_del() out-of-line in the debug case, so this is now suitable for mainline. Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] Debug variants of linked list macrosDave Jones
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>