summaryrefslogtreecommitdiff
path: root/fs/btrfs
diff options
context:
space:
mode:
authorQu Wenruo <wqu@suse.com>2024-01-27 10:18:36 +1030
committerDavid Sterba <dsterba@suse.com>2024-03-05 17:13:23 +0100
commitdd6a5719098a9eb0801af9978542115ac5115a02 (patch)
treea3f63011470dff404cd9560b7d0f3000da305d38 /fs/btrfs
parent25da852d83e93bb2019434bc05e7cdfa62c07240 (diff)
btrfs: tree-checker: dump the page status if hit something wrong
[BUG] There is a bug report about very suspicious tree-checker got triggered: BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] SELinux: inode_doinit_use_xattr: getxattr returned 117 for dev=dm-0 ino=5737268 [ANALYZE] The root cause is still unclear, but there are some clues already: - Unaligned eb bytenr The block bytenr is 8550954455682405139, which is not even aligned to 2. This bytenr is fetched from extent buffer header, not from eb->start. This means, at the initial time of read, eb header bytenr is still correct (the very basis check to continue read), but later something wrong happened, got at least the first page corrupted. Thus we got such obviously incorrect value. - Invalid extent buffer header owner The read itself is triggered for subvolume 256, but the eb header owner is 11858205567642294356, which is not really possible. The problem here is, subvolume id is limited to (1 << 48 - 1), and this one definitely goes beyond that limit. So this value is another garbage. We already got two garbage from an extent buffer, which passed the initial bytenr and csum checks, but later the contents become garbage at some point. This looks like a page lifespan problem (e.g. we didn't properly hold the page). [ENHANCEMENT] The current tree-checker only outputs things from the extent buffer, nothing with the page status. So this patch would enhance the tree-checker output by also dumping the first page, which would look like this: page:00000000aa9f3ce8 refcount:4 mapcount:0 mapping:00000000169aa6b6 index:0x1d0c pfn:0x1022e5 memcg:ffff888103456000 aops:btree_aops [btrfs] ino:1 flags: 0x2ffff0000008000(private|node=0|zone=2|lastcpupid=0xffff) page_type: 0xffffffff() raw: 02ffff0000008000 0000000000000000 dead000000000122 ffff88811e06e220 raw: 0000000000001d0c ffff888102fdb1d8 00000004ffffffff ffff888103456000 page dumped because: eb page dump BTRFS critical (device dm-3): corrupt leaf: root=5 block=30457856 slot=6 ino=257 file_offset=0, invalid disk_bytenr for file extent, have 10617606235235216665, should be aligned to 4096 BTRFS error (device dm-3): read time tree block corruption detected on logical 30457856 mirror 1 From the dump we can see some extra info, something can help us to do extra cross-checks: - Page refcount if it's too low, it definitely means something bad. - Page aops Any mapped eb page should have btree_aops with inode number 1. - Page index Since a mapped eb page should has its bytenr matching the page position, (index << PAGE_SHIFT) should match the bytenr of the bytenr from the critical line. - Page Private flags A mapped eb page should have Private flag set to indicate it's managed by btrfs. Link: https://lore.kernel.org/linux-btrfs/CAHk-=whNdMaN9ntZ47XRKP6DBes2E5w7fi-0U3H2+PS18p+Pzw@mail.gmail.com/ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs')
-rw-r--r--fs/btrfs/tree-checker.c6
1 files changed, 6 insertions, 0 deletions
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 4fa95eca285e..c8fbcae4e88e 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -65,6 +65,7 @@ static void generic_err(const struct extent_buffer *eb, int slot,
vaf.fmt = fmt;
vaf.va = &args;
+ dump_page(folio_page(eb->folios[0], 0), "eb page dump");
btrfs_crit(fs_info,
"corrupt %s: root=%llu block=%llu slot=%d, %pV",
btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -92,6 +93,7 @@ static void file_extent_err(const struct extent_buffer *eb, int slot,
vaf.fmt = fmt;
vaf.va = &args;
+ dump_page(folio_page(eb->folios[0], 0), "eb page dump");
btrfs_crit(fs_info,
"corrupt %s: root=%llu block=%llu slot=%d ino=%llu file_offset=%llu, %pV",
btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -152,6 +154,7 @@ static void dir_item_err(const struct extent_buffer *eb, int slot,
vaf.fmt = fmt;
vaf.va = &args;
+ dump_page(folio_page(eb->folios[0], 0), "eb page dump");
btrfs_crit(fs_info,
"corrupt %s: root=%llu block=%llu slot=%d ino=%llu, %pV",
btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -647,6 +650,7 @@ static void block_group_err(const struct extent_buffer *eb, int slot,
vaf.fmt = fmt;
vaf.va = &args;
+ dump_page(folio_page(eb->folios[0], 0), "eb page dump");
btrfs_crit(fs_info,
"corrupt %s: root=%llu block=%llu slot=%d bg_start=%llu bg_len=%llu, %pV",
btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -1003,6 +1007,7 @@ static void dev_item_err(const struct extent_buffer *eb, int slot,
vaf.fmt = fmt;
vaf.va = &args;
+ dump_page(folio_page(eb->folios[0], 0), "eb page dump");
btrfs_crit(eb->fs_info,
"corrupt %s: root=%llu block=%llu slot=%d devid=%llu %pV",
btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -1258,6 +1263,7 @@ static void extent_err(const struct extent_buffer *eb, int slot,
vaf.fmt = fmt;
vaf.va = &args;
+ dump_page(folio_page(eb->folios[0], 0), "eb page dump");
btrfs_crit(eb->fs_info,
"corrupt %s: block=%llu slot=%d extent bytenr=%llu len=%llu %pV",
btrfs_header_level(eb) == 0 ? "leaf" : "node",