mm/memtest: add results of early memtest to /proc/meminfo

Currently the memtest results are only presented in dmesg.

When running a large fleet of devices without ECC RAM it's currently not easy to do bulk monitoring for memory corruption. You have to parse dmesg, but that's a ring buffer so the error might disappear after some time. In general I do not consider dmesg to be a great API to query RAM status.

In several companies I've seen such errors remain undetected and cause issues for way too long. So I think it makes sense to provide a monitoring API, so that we can safely detect and act upon them.

This adds a /proc/meminfo entry which can be easily used by scripts.

Link: https://lkml.kernel.org/r/20230321103430.7130-1-tomas.mudrunka@gmail.com
Signed-off-by: Tomas Mudrunka <tomas.mudrunka@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
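As an illustration of how a monitoring tool might consume the new entry, here is a minimal user-space sketch in C. It is not part of the patch: the helper name early_memtest_bad_kb() and its return convention are assumptions made for this example. It relies only on the line format the patch emits ("EarlyMemtestBad:" followed by a value in kB) and treats a missing line as "no result available", since the entry is only printed when CONFIG_MEMTEST is enabled and the early memtest has run.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan meminfo-formatted text from f for the "EarlyMemtestBad:" line.
 *
 * Returns  1 and stores the value (in kB) in *kb if the line is found,
 *          0 if the line is absent (no early memtest result available),
 *         -1 if the line is present but its value cannot be parsed.
 *
 * early_memtest_bad_kb() is a hypothetical helper for this sketch,
 * not a kernel or libc API.
 */
static int early_memtest_bad_kb(FILE *f, unsigned long *kb)
{
	static const char key[] = "EarlyMemtestBad:";
	char line[256];

	assert(f != NULL && kb != NULL);

	while (fgets(line, sizeof(line), f)) {
		/* Only lines that start with the key are of interest. */
		if (strncmp(line, key, sizeof(key) - 1) != 0)
			continue;
		/* %lu skips the padding spaces in front of the number. */
		if (sscanf(line + sizeof(key) - 1, "%lu", kb) == 1)
			return 1;
		return -1;
	}
	return 0;
}
```

A caller would typically open /proc/meminfo with fopen(), pass the stream to the helper, and raise an alert when it returns 1 with a nonzero value. A return of 0 should probably be reported separately from "0 kB bad", because it means no test result exists rather than a clean test.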
author: Tomas Mudrunka <tomas.mudrunka@gmail.com> 2023-03-21 11:34:30 +0100
committer: Andrew Morton <akpm@linux-foundation.org> 2023-04-05 19:42:55 -0700
commit: bd23024b9774e681cbe6cc3afcb24244dfcb2390 (patch)
tree: 660d52ca5ef5b776a2299b5a189add72d34c39c9 /fs/proc/meminfo.c
parent: c9bb52738b39fabc8b6b9446f0d194eedb3e5a10 (diff)
1 files changed, 13 insertions, 0 deletions
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 440960110a42..b43d0bd42762 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -6,6 +6,7 @@
 #include <linux/hugetlb.h>
 #include <linux/mman.h>
 #include <linux/mmzone.h>
+#include <linux/memblock.h>
 #include <linux/proc_fs.h>
 #include <linux/percpu.h>
 #include <linux/seq_file.h>
@@ -131,6 +132,18 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "VmallocChunk:   ", 0ul);
 	show_val_kb(m, "Percpu:         ", pcpu_nr_pages());
 
+#ifdef CONFIG_MEMTEST
+	if (early_memtest_done) {
+		unsigned long early_memtest_bad_size_kb;
+
+		early_memtest_bad_size_kb = early_memtest_bad_size>>10;
+		if (early_memtest_bad_size && !early_memtest_bad_size_kb)
+			early_memtest_bad_size_kb = 1;
+		/* When 0 is reported, it means there actually was a successful test */
+		seq_printf(m, "EarlyMemtestBad:   %5lu kB\n", early_memtest_bad_size_kb);
+	}
+#endif
+
 #ifdef CONFIG_MEMORY_FAILURE
 	seq_printf(m, "HardwareCorrupted: %5lu kB\n",
 		   atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10));
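
The byte-to-kB conversion in the EarlyMemtestBad hunk above can be tried in isolation with the following user-space sketch. It is an illustration, not kernel code: bad_bytes_to_kb() and the self-check function are hypothetical names, and the input is taken as unsigned long long, which is an assumption about the width of early_memtest_bad_size. The point it shows is that any nonzero amount of bad memory below 1 kB is reported as 1 kB, so a reported 0 can only mean that the test found nothing.

```c
#include <assert.h>

/*
 * Mirror of the conversion in the patch: shift bytes down to kB, but
 * never let a nonzero byte count collapse to 0 kB.
 * bad_bytes_to_kb() is a hypothetical name used only in this sketch.
 */
static unsigned long bad_bytes_to_kb(unsigned long long bad_bytes)
{
	unsigned long kb = (unsigned long)(bad_bytes >> 10);

	if (bad_bytes && !kb)
		kb = 1;
	return kb;
}

/* A few checks that document the behaviour at the boundaries. */
static void bad_bytes_to_kb_selfcheck(void)
{
	assert(bad_bytes_to_kb(0) == 0);	/* clean test stays 0 */
	assert(bad_bytes_to_kb(1) == 1);	/* sub-kB damage rounds up to 1 */
	assert(bad_bytes_to_kb(1023) == 1);
	assert(bad_bytes_to_kb(1024) == 1);
	assert(bad_bytes_to_kb(1025) == 1);	/* above 1 kB the shift truncates */
	assert(bad_bytes_to_kb(4096) == 4);
}
```

Note that only the sub-kB case is rounded up. Larger values are truncated by the shift, so the reported figure is a lower bound on the bad memory size once it exceeds 1 kB.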