diff options
author | Filipe Manana <fdmanana@suse.com> | 2021-03-11 14:31:06 +0000 |
---|---|---|
committer | David Sterba <dsterba@suse.com> | 2021-03-16 20:32:22 +0100 |
commit | 485df75554257e883d0ce39bb886e8212349748e (patch) | |
tree | 88e824c2d9aefc7678b1e9d556c44ad597ce4096 /include/trace | |
parent | dbcc7d57bffc0c8cac9dac11bec548597d59a6a5 (diff) |
btrfs: always pin deleted leaves when there are active tree mod log users
When freeing a tree block we may end up adding its extent back to the
free space cache/tree, as long as there are no more references for it,
it was created in the current transaction and writeback for it never
happened. This is generally fine, however when we have tree mod log
operations it can result in inconsistent versions of a btree after
unwinding extent buffers with the recorded tree mod log operations.
This is because:
* We only log operations for nodes (adding and removing key/pointers),
for leaves we don't do anything;
* This means that we can log a MOD_LOG_KEY_REMOVE_WHILE_FREEING operation
for a node that points to a leaf that was deleted;
* Before we apply the logged operation to unwind a node, we can have
that leaf's extent allocated again, either as a node or as a leaf, and
possibly for another btree. This is possible if the leaf was created in
the current transaction and writeback for it never started, in which
case btrfs_free_tree_block() returns its extent back to the free space
cache/tree;
* Then, before applying the tree mod log operation, some task allocates
the metadata extent just freed before, and uses it either as a leaf or
as a node for some btree (can be the same or another one, it does not
matter);
* After applying the MOD_LOG_KEY_REMOVE_WHILE_FREEING operation we now
get the target node with an item pointing to the metadata extent that
now has content different from what it had before the leaf was deleted.
It might now belong to a different btree and be a node and not a leaf
anymore.
As a consequence, the results of searches after the unwinding can be
unpredictable and produce unexpected results.
So make sure we pin extent buffers corresponding to leaves when there
are tree mod log users.
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'include/trace')
0 files changed, 0 insertions, 0 deletions