linux.git - dakr's fork of kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Age	Commit message (Collapse)	Author
2023-10-22	bcachefs: trace_read_nopromote()	Kent Overstreet
	Add a tracepoint to print the reason a read wasn't promoted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Break up io.c	Kent Overstreet
	More reorganization, this splits up io.c into - io_read.c - io_misc.c - fallocate, fpunch, truncate - io_write.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Convert btree_err_type to normal error codes	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Make topology repair a normal recovery pass	Kent Overstreet
	This adds bch2_run_explicit_recovery_pass(), for rewinding recovery and explicitly running a specific recovery pass - this is a more general replacement for how we were running topology repair before. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: bch2_run_explicit_recovery_pass()	Kent Overstreet
	This introduces bch2_run_explicit_recovery_pass() and uses it for when fsck detects that we need to re-run dead snaphots cleanup, and makes dead snapshot cleanup more like a normal recovery pass. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: overlapping_extents_found()	Kent Overstreet
	This improves the repair path for overlapping extents - we now verify that we find in the btree the overlapping extents that the algorithm detected, and fail the fsck run with a more useful error if it doesn't match. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Convert more -EROFS to private error codes	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: BCH_ERR_fsck -> EINVAL	Kent Overstreet
	When we return errors outside of bcachefs, we need to return a standard error code - fix this for BCH_ERR_fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix more lockdep splats in debug.c	Kent Overstreet
	Similar to previous fixes, we can't incur page faults while holding btree locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Use unlikely() in bch2_err_matches()	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Convert -ENOENT to private error codes	Kent Overstreet
	As with previous conversions, replace -ENOENT uses with more informative private error codes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: BTREE_ID_snapshot_tree	Kent Overstreet
	This adds a new btree which gets us a persistent per-snapshot-tree identifier. - BTREE_ID_snapshot_trees - KEY_TYPE_snapshot_tree - bch_snapshot now has a field that points to a snapshot_tree This is going to be used to designate one snapshot ID/subvolume out of a given tree of snapshots as the "main" subvolume, so that we can do quota accounting in that subvolume and not the rest. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: bch2_bkey_get_empty_slot()	Kent Overstreet
	Add a new helper for allocating a new slot in a btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Private error codes: ENOMEM	Kent Overstreet
	This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix stripe reuse path	Kent Overstreet
	It's possible that we reuse a stripe that doesn't have quite the same configuration as the stripe_head we're allocating from. In that case, we have to make sure that the new stripe uses the settings from the stripe we resue, not the stripe head, and make sure the buffer is allocated correctly. This fixes the ec_mixed_tiers test. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: RESERVE_stripe	Kent Overstreet
	Rework stripe creation path - new algorithm for deciding when to create new stripes or reuse existing stripes. We add a new allocation watermark, RESERVE_stripe, above RESERVE_none. Then we always try to create a new stripe by doing RESERVE_stripe allocations; if this fails, we reuse an existing stripe and allocate buckets for it with the reserve watermark for the given write (RESERVE_none or RESERVE_movinggc). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: don't block reads if we're promoting	Daniel Hill
	The promote path calls data_update_init() and now that we take locks here, there's potential for promote to block our read path, just error when we can't take the lock instead of blocking. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Data update support for unwritten extents	Kent Overstreet
	The data update path requires special support for unwritten extents - we still need to be able to move them, but there's no need to read or write anything. This patch adds a new error code to tell bch2_move_extent() that we're short circuiting the read, and adds bch2_update_unwritten_extent() to create a reservation then call __bch2_data_update_index_update(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: New on disk format: Backpointers	Kent Overstreet
	This patch adds backpointers: we now have a reverse index from device and offset on that device (specifically, offset within a bucket) back to btree nodes and (non cached) data extents. The first 40 backpointers within a bucket are stored in the alloc key; after that backpointers spill over to the next backpointers btree. This is to help avoid performance regressions from additional btree updates on large streaming workloads. This patch adds all the code for creating, checking and repairing backpointers. The next patch in the series is going to use backpointers for copygc - finally getting rid of the need to scan all extents to do copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Btree write buffer	Kent Overstreet
	This adds a new method of doing btree updates - a straight write buffer, implemented as a flat fixed size array. This is only useful when we don't need to read from the btree in order to do the update, and when reading is infrequent - perfect for the LRU btree. This will make LRU btree updates fast enough that we'll be able to use it for persistently indexing buckets by fragmentation, which will be a massive boost to copygc performance. Changes: - A new btree_insert_type enum, for btree_insert_entries. Specifies btree, btree key cache, or btree write buffer. - bch2_trans_update_buffered(): updates via the btree write buffer don't need a btree path, so we need a new update path. - Transaction commit path changes: The update to the btree write buffer both mutates global, and can fail if there isn't currently room. Therefore we do all write buffer updates in the transaction all at once, and also if it fails we have to revert filesystem usage counter changes. If there isn't room we flush the write buffer in the transaction commit error path and retry. - A new persistent option, for specifying the number of entries in the write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Convert EAGAIN errors to private error codes	Kent Overstreet
	More error code cleanup, for better error messages and debugability. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Convert EROFS errors to private error codes	Kent Overstreet
	More error code improvements - this gets us more useful error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix for long running btree transactions & key cache	Kent Overstreet
	While a btree transaction is running, we hold a SRCU read lock on the btree key cache that prevents btree key cache keys from being freed - this is so that relock() operations won't access freed memory. The downside of this is that long running btree transactions prevent memory from being freed from the key cache. This adds a check in bch2_trans_begin() - if the transaction has been running longer than 1 second, drop and retake the SRCU read lock and zero out pointers to unlock key cache paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: More errcode cleanup	Kent Overstreet
	We shouldn't be overloading standard error codes now that we have provisions for bcachefs-specific errorcodes: this patch converts super.c and super-io.c to per error site errcodes, with a bit of cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Kill btree_insert_ret enum	Kent Overstreet
	Replace with standard bcachefs-private error codes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Add error path to btree_split()	Kent Overstreet
	The next patch in the series is (finally!) going to change btree splits (and interior updates in general) to not take intent locks all the way up to the root - instead only locking the nodes they'll need to modify. However, this will be introducing a race since if we're not holding a write lock on a btree node it can be written out by another thread, and then we might not have enough space for a new bset entry. We can handle this by retrying - we just need to introduce a new error path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Delete old deadlock avoidance code	Kent Overstreet
	This deletes our old lock ordering based deadlock avoidance code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Deadlock cycle detector	Kent Overstreet
	We've outgrown our own deadlock avoidance strategy. The btree iterator API provides an interface where the user doesn't need to concern themselves with lock ordering - different btree iterators can be traversed in any order. Without special care, this will lead to deadlocks. Our previous strategy was to define a lock ordering internally, and whenever we attempt to take a lock and trylock() fails, we'd check if the current btree transaction is holding any locks that cause a lock ordering violation. If so, we'd issue a transaction restart, and then bch2_trans_begin() would re-traverse all previously used iterators, but in the correct order. That approach had some issues, though. - Sometimes we'd issue transaction restarts unnecessarily, when no deadlock would have actually occured. Lock ordering restarts have become our primary cause of transaction restarts, on some workloads totally 20% of actual transaction commits. - To avoid deadlock or livelock, we'd often have to take intent locks when we only wanted a read lock: with the lock ordering approach, it is actually illegal to hold _any_ read lock while blocking on an intent lock, and this has been causing us unnecessary lock contention. - It was getting fragile - the various lock ordering rules are not trivial, and we'd been seeing occasional livelock issues related to this machinery. So, since bcachefs is already a relational database masquerading as a filesystem, we're stealing the next traditional database technique and switching to a cycle detector for avoiding deadlocks. When we block taking a btree lock, after adding ourself to the waitlist but before sleeping, we do a DFS of btree transactions waiting on other btree transactions, starting with the current transaction and walking our held locks, and transactions blocking on our held locks. If we find a cycle, we emit a transaction restart. Occasionally (e.g. the btree split path) we can not allow the lock() operation to fail, so if necessary we'll tell another transaction that it has to fail. Result: trans_restart_would_deadlock events are reduced by a factor of 10 to 100, and we'll be able to delete a whole bunch of grotty, fragile code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Add private error codes for ENOSPC	Kent Overstreet
	Continuing the saga of introducing private dedicated error codes for each error path, this patch converts ENOSPC to error codes that are subtypes of ENOSPC. We've recently had a test failure where we got -ENOSPC where we shouldn't have, and didn't have enough information to tell where it came from, so this patch will solve that problem. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Errcodes can now subtype standard error codes	Kent Overstreet
	The next patch is going to be adding private error codes for all the places we return -ENOSPC. Additionally, this patch updates return paths at all module boundaries to call bch2_err_class(), to return the standard error code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: BTREE_ITER_NO_NODE -> BCH_ERR codes	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Add distinct error code for key_cache_upgrade	Kent Overstreet
	This aids in debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Convert fsck errors to errcode.h	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: EINTR -> BCH_ERR_transaction_restart	Kent Overstreet
	Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Improved errcodes	Kent Overstreet
	Instead of overloading standard error codes (EINTR/EAGAIN), and defining short lists of error codes in multiple places that potentially end up overlapping & conflicting, we're now going to have one master list of error codes. Error codes are defined with an x-macro: thus we also have bch2_err_str() now. Also, error codes have a class field. Now, instead of checking for errors with ==, code should use bch2_err_matches(), which returns true if the error is equal to or a sub-error of the error class. This means we can define unique errors for every source location where an error is generated, which will help improve our error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: When fsck finds redundant snapshot keys, trigger snapshots cleanup	Kent Overstreet
	Fsck now checks for keys in different snapshot IDs that are now redundant due to other snapshots being deleted - it needs to for its own algorithms to not get confused. When it detects this it should re-run the post snapshot deletion cleanup - this patch does that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Convert bucket_alloc_ret to negative error codes	Kent Overstreet
	Start a new header, errcode.h, for bcachefs-private error codes - more error codes will be converted later. This patch just converts bucket_alloc_ret so that they can be mixed with standard error codes and passed as ERR_PTR errors - the ec.c code was doing this already, but incorrectly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>