diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-08-29 14:53:51 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-08-29 14:53:51 -0700 |
commit | d68b4b6f307d155475cce541f2aee938032ed22e (patch) | |
tree | c2a6487ac8b1bce963b5b352b42e461a6fa8da15 | |
parent | b96a3e9142fdf346b05b20e867b4f0dfca119e96 (diff) | |
parent | dce8f8ed1de1d9d6d27c5ccd202ce4ec163b100c (diff) |
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
205 files changed, 2853 insertions, 1392 deletions
diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory index d8b0f80b9e33..a95e0f17c35a 100644 --- a/Documentation/ABI/testing/sysfs-devices-memory +++ b/Documentation/ABI/testing/sysfs-devices-memory @@ -110,3 +110,11 @@ Description: link is created for memory section 9 on node0. /sys/devices/system/node/node0/memory9 -> ../../memory/memory9 + +What: /sys/devices/system/memory/crash_hotplug +Date: Aug 2023 +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> +Description: + (RO) indicates whether or not the kernel directly supports + modifying the crash elfcorehdr for memory hot un/plug and/or + on/offline changes. diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 183a07c4f191..7ecd5c8161a6 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -688,3 +688,11 @@ Description: (RO) the list of CPUs that are isolated and don't participate in load balancing. These CPUs are set by boot parameter "isolcpus=". + +What: /sys/devices/system/cpu/crash_hotplug +Date: Aug 2023 +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> +Description: + (RO) indicates whether or not the kernel directly supports + modifying the crash elfcorehdr for CPU hot un/plug and/or + on/offline changes. diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst index 2994958c7ce8..cfe034cf1e87 100644 --- a/Documentation/admin-guide/mm/memory-hotplug.rst +++ b/Documentation/admin-guide/mm/memory-hotplug.rst @@ -291,6 +291,14 @@ The following files are currently defined: Availability depends on the CONFIG_ARCH_MEMORY_PROBE kernel configuration option. ``uevent`` read-write: generic udev file for device subsystems. +``crash_hotplug`` read-only: when changes to the system memory map + occur due to hot un/plug of memory, this file contains + '1' if the kernel updates the kdump capture kernel memory + map itself (via elfcorehdr), or '0' if userspace must update + the kdump capture kernel memory map. + + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel + configuration option. ====================== ========================================================= .. note:: diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst index b9ae591d0b18..9511e405aabd 100644 --- a/Documentation/core-api/cpu_hotplug.rst +++ b/Documentation/core-api/cpu_hotplug.rst @@ -741,6 +741,24 @@ will receive all events. A script like:: can process the event further. +When changes to the CPUs in the system occur, the sysfs file +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel +updates the kdump capture kernel list of CPUs itself (via elfcorehdr), +or '0' if userspace must update the kdump capture kernel list of CPUs. + +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration +option. + +To skip userspace processing of CPU hot un/plug events for kdump +(i.e. the unload-then-reload to obtain a current list of CPUs), this sysfs +file can be used in a udev rule as follows: + + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" + +For a CPU hot un/plug event, if the architecture supports kernel updates +of the elfcorehdr (which contains the list of CPUs), then the rule skips +the unload-then-reload of the kdump capture kernel. + Kernel Inline Documentations Reference ====================================== diff --git a/arch/Kconfig b/arch/Kconfig index 63c5d6a2022b..ec49c0100550 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -11,19 +11,6 @@ source "arch/$(SRCARCH)/Kconfig" menu "General architecture-dependent options" -config CRASH_CORE - bool - -config KEXEC_CORE - select CRASH_CORE - bool - -config KEXEC_ELF - bool - -config HAVE_IMA_KEXEC - bool - config ARCH_HAS_SUBPAGE_FAULTS bool help @@ -748,7 +735,9 @@ config HAS_LTO_CLANG depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm) depends on ARCH_SUPPORTS_LTO_CLANG depends on !FTRACE_MCOUNT_USE_RECORDMCOUNT - depends on !KASAN || KASAN_HW_TAGS + # https://github.com/ClangBuiltLinux/linux/issues/1721 + depends on (!KASAN || KASAN_HW_TAGS || CLANG_VERSION >= 170000) || !DEBUG_INFO + depends on (!KCOV || CLANG_VERSION >= 170000) || !DEBUG_INFO depends on !GCOV_KERNEL help The compiler and Kconfig options support building with Clang's diff --git a/arch/arc/configs/axs101_defconfig b/arch/arc/configs/axs101_defconfig index 81764160451f..89720d6d7e0d 100644 --- a/arch/arc/configs/axs101_defconfig +++ b/arch/arc/configs/axs101_defconfig @@ -9,7 +9,7 @@ CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set diff --git a/arch/arc/configs/axs103_defconfig b/arch/arc/configs/axs103_defconfig index d5181275490e..73ec01ed0492 100644 --- a/arch/arc/configs/axs103_defconfig +++ b/arch/arc/configs/axs103_defconfig @@ -9,7 +9,7 @@ CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set diff --git a/arch/arc/configs/axs103_smp_defconfig b/arch/arc/configs/axs103_smp_defconfig index 07c89281c2e3..4da0f626fa9d 100644 --- a/arch/arc/configs/axs103_smp_defconfig +++ b/arch/arc/configs/axs103_smp_defconfig @@ -9,7 +9,7 @@ CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/arc/configs/haps_hs_smp_defconfig b/arch/arc/configs/haps_hs_smp_defconfig index 61107e8bac33..6fc98c1b9b36 100644 --- a/arch/arc/configs/haps_hs_smp_defconfig +++ b/arch/arc/configs/haps_hs_smp_defconfig @@ -11,7 +11,7 @@ CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/arc/configs/hsdk_defconfig b/arch/arc/configs/hsdk_defconfig index 4ee2a1507b57..9e79154b5535 100644 --- a/arch/arc/configs/hsdk_defconfig +++ b/arch/arc/configs/hsdk_defconfig @@ -9,7 +9,7 @@ CONFIG_NAMESPACES=y # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y CONFIG_BLK_DEV_RAM=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/arc/configs/nsim_700_defconfig b/arch/arc/configs/nsim_700_defconfig index 3e9829775992..51092c39e360 100644 --- a/arch/arc/configs/nsim_700_defconfig +++ b/arch/arc/configs/nsim_700_defconfig @@ -12,7 +12,7 @@ CONFIG_NAMESPACES=y # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/arc/configs/nsimosci_defconfig b/arch/arc/configs/nsimosci_defconfig index 502c87f351c8..70c17bca4939 100644 --- a/arch/arc/configs/nsimosci_defconfig +++ b/arch/arc/configs/nsimosci_defconfig @@ -11,7 +11,7 @@ CONFIG_NAMESPACES=y # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/arc/configs/nsimosci_hs_defconfig b/arch/arc/configs/nsimosci_hs_defconfig index f721cc3997d0..59a3b6642fe7 100644 --- a/arch/arc/configs/nsimosci_hs_defconfig +++ b/arch/arc/configs/nsimosci_hs_defconfig @@ -11,7 +11,7 @@ CONFIG_NAMESPACES=y # CONFIG_PID_NS is not set CONFIG_BLK_DEV_INITRD=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/arc/configs/tb10x_defconfig b/arch/arc/configs/tb10x_defconfig index 941bbadd6bf2..1a68e4beebca 100644 --- a/arch/arc/configs/tb10x_defconfig +++ b/arch/arc/configs/tb10x_defconfig @@ -16,7 +16,7 @@ CONFIG_INITRAMFS_ROOT_GID=501 # CONFIG_RD_GZIP is not set CONFIG_KALLSYMS_ALL=y # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_COMPAT_BRK is not set CONFIG_ISA_ARCOMPACT=y CONFIG_MODULES=y diff --git a/arch/arc/configs/vdk_hs38_defconfig b/arch/arc/configs/vdk_hs38_defconfig index d3ef189c75f8..50c343913825 100644 --- a/arch/arc/configs/vdk_hs38_defconfig +++ b/arch/arc/configs/vdk_hs38_defconfig @@ -4,7 +4,7 @@ CONFIG_HIGH_RES_TIMERS=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set diff --git a/arch/arc/configs/vdk_hs38_smp_defconfig b/arch/arc/configs/vdk_hs38_smp_defconfig index 944b347025fd..6d9e1d9f71d2 100644 --- a/arch/arc/configs/vdk_hs38_smp_defconfig +++ b/arch/arc/configs/vdk_hs38_smp_defconfig @@ -4,7 +4,7 @@ CONFIG_HIGH_RES_TIMERS=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 7a27550ff3c1..9557808e8937 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -250,7 +250,7 @@ config ARCH_MTD_XIP bool config ARM_PATCH_PHYS_VIRT - bool "Patch physical to virtual translations at runtime" if EMBEDDED + bool "Patch physical to virtual translations at runtime" if !ARCH_MULTIPLATFORM default y depends on MMU help @@ -1645,20 +1645,8 @@ config XIP_DEFLATED_DATA copied, saving some precious ROM space. A possible drawback is a slightly longer boot delay. -config KEXEC - bool "Kexec system call (EXPERIMENTAL)" - depends on (!SMP || PM_SLEEP_SMP) - depends on MMU - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. +config ARCH_SUPPORTS_KEXEC + def_bool (!SMP || PM_SLEEP_SMP) && MMU config ATAGS_PROC bool "Export atags in procfs" @@ -1668,17 +1656,8 @@ config ATAGS_PROC Should the atags used to boot the kernel be exported in an "atags" file in procfs. Useful with kexec. -config CRASH_DUMP - bool "Build kdump crash kernel (EXPERIMENTAL)" - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. The crash dump kernel must be compiled to a - memory address not used by the main kernel - - For more details see Documentation/admin-guide/kdump/kdump.rst +config ARCH_SUPPORTS_CRASH_DUMP + def_bool y config AUTO_ZRELADDR bool "Auto calculation of the decompressed kernel image address" if !ARCH_MULTIPLATFORM diff --git a/arch/arm/configs/aspeed_g4_defconfig b/arch/arm/configs/aspeed_g4_defconfig index a5c65d28ca63..ed66a2958912 100644 --- a/arch/arm/configs/aspeed_g4_defconfig +++ b/arch/arm/configs/aspeed_g4_defconfig @@ -15,7 +15,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_UID16 is not set # CONFIG_SYSFS_SYSCALL is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_ASPEED=y diff --git a/arch/arm/configs/aspeed_g5_defconfig b/arch/arm/configs/aspeed_g5_defconfig index c7c11cbaa39d..b6487a4b45c0 100644 --- a/arch/arm/configs/aspeed_g5_defconfig +++ b/arch/arm/configs/aspeed_g5_defconfig @@ -15,7 +15,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_UID16 is not set # CONFIG_SYSFS_SYSCALL is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y CONFIG_ARCH_MULTI_V6=y CONFIG_ARCH_ASPEED=y diff --git a/arch/arm/configs/at91_dt_defconfig b/arch/arm/configs/at91_dt_defconfig index 8cc602853cc5..71b5acc78187 100644 --- a/arch/arm/configs/at91_dt_defconfig +++ b/arch/arm/configs/at91_dt_defconfig @@ -7,7 +7,7 @@ CONFIG_CGROUPS=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_ARCH_MULTI_V4T=y CONFIG_ARCH_MULTI_V5=y # CONFIG_ARCH_MULTI_V7 is not set diff --git a/arch/arm/configs/axm55xx_defconfig b/arch/arm/configs/axm55xx_defconfig index d1c550894a65..516689dc6cf1 100644 --- a/arch/arm/configs/axm55xx_defconfig +++ b/arch/arm/configs/axm55xx_defconfig @@ -21,7 +21,7 @@ CONFIG_NAMESPACES=y CONFIG_SCHED_AUTOGROUP=y CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_ARCH_AXXIA=y CONFIG_ARM_LPAE=y diff --git a/arch/arm/configs/bcm2835_defconfig b/arch/arm/configs/bcm2835_defconfig index ce092abcd323..225a16c0323c 100644 --- a/arch/arm/configs/bcm2835_defconfig +++ b/arch/arm/configs/bcm2835_defconfig @@ -19,7 +19,7 @@ CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_CC_STACKPROTECTOR_REGULAR=y CONFIG_ARCH_MULTI_V6=y diff --git a/arch/arm/configs/clps711x_defconfig b/arch/arm/configs/clps711x_defconfig index adcee238822a..d7ed1e7c6a90 100644 --- a/arch/arm/configs/clps711x_defconfig +++ b/arch/arm/configs/clps711x_defconfig @@ -3,7 +3,7 @@ CONFIG_SYSVIPC=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_BLK_DEV_INITRD=y CONFIG_RD_LZMA=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_JUMP_LABEL=y CONFIG_PARTITION_ADVANCED=y CONFIG_ARCH_CLPS711X=y diff --git a/arch/arm/configs/keystone_defconfig b/arch/arm/configs/keystone_defconfig index 1cb145633a91..d95686d19401 100644 --- a/arch/arm/configs/keystone_defconfig +++ b/arch/arm/configs/keystone_defconfig @@ -14,7 +14,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_ELF_CORE is not set # CONFIG_BASE_FULL is not set CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_ARCH_KEYSTONE=y CONFIG_ARM_LPAE=y diff --git a/arch/arm/configs/lpc18xx_defconfig b/arch/arm/configs/lpc18xx_defconfig index 56eae6a0a311..d169da9b2824 100644 --- a/arch/arm/configs/lpc18xx_defconfig +++ b/arch/arm/configs/lpc18xx_defconfig @@ -14,7 +14,7 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_SIGNALFD is not set # CONFIG_EVENTFD is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_MMU is not set CONFIG_ARCH_LPC18XX=y CONFIG_SET_MEM_PARAM=y diff --git a/arch/arm/configs/lpc32xx_defconfig b/arch/arm/configs/lpc32xx_defconfig index e2b0ff0b253f..42053e45f730 100644 --- a/arch/arm/configs/lpc32xx_defconfig +++ b/arch/arm/configs/lpc32xx_defconfig @@ -9,7 +9,7 @@ CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_LPC32XX=y CONFIG_AEABI=y diff --git a/arch/arm/configs/milbeaut_m10v_defconfig b/arch/arm/configs/milbeaut_m10v_defconfig index 7d4284502325..f5eeac9c65c3 100644 --- a/arch/arm/configs/milbeaut_m10v_defconfig +++ b/arch/arm/configs/milbeaut_m10v_defconfig @@ -3,7 +3,7 @@ CONFIG_NO_HZ_IDLE=y CONFIG_HIGH_RES_TIMERS=y CONFIG_CGROUPS=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y CONFIG_ARCH_MILBEAUT=y CONFIG_ARCH_MILBEAUT_M10V=y diff --git a/arch/arm/configs/moxart_defconfig b/arch/arm/configs/moxart_defconfig index ea31f116d577..bb6a5222e42f 100644 --- a/arch/arm/configs/moxart_defconfig +++ b/arch/arm/configs/moxart_defconfig @@ -10,7 +10,7 @@ CONFIG_IKCONFIG_PROC=y # CONFIG_TIMERFD is not set # CONFIG_EVENTFD is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_BLK_DEV_BSG is not set CONFIG_ARCH_MULTI_V4=y # CONFIG_ARCH_MULTI_V7 is not set diff --git a/arch/arm/configs/multi_v4t_defconfig b/arch/arm/configs/multi_v4t_defconfig index b60000a89aff..a7fabf1d88ff 100644 --- a/arch/arm/configs/multi_v4t_defconfig +++ b/arch/arm/configs/multi_v4t_defconfig @@ -2,7 +2,7 @@ CONFIG_KERNEL_LZMA=y CONFIG_SYSVIPC=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_ARCH_MULTI_V4T=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_AT91=y diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig index c7b2550d706c..766f2f83e2c9 100644 --- a/arch/arm/configs/multi_v7_defconfig +++ b/arch/arm/configs/multi_v7_defconfig @@ -3,7 +3,7 @@ CONFIG_NO_HZ_IDLE=y CONFIG_HIGH_RES_TIMERS=y CONFIG_CGROUPS=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y CONFIG_ARCH_VIRT=y CONFIG_ARCH_AIROHA=y diff --git a/arch/arm/configs/pxa_defconfig b/arch/arm/configs/pxa_defconfig index b0c3355e2599..23c131b0854b 100644 --- a/arch/arm/configs/pxa_defconfig +++ b/arch/arm/configs/pxa_defconfig @@ -11,7 +11,7 @@ CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=13 CONFIG_BLK_DEV_INITRD=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y # CONFIG_ARCH_MULTI_V7 is not set CONFIG_ARCH_PXA=y diff --git a/arch/arm/configs/qcom_defconfig b/arch/arm/configs/qcom_defconfig index ab686b9c2ad8..737d51412eb2 100644 --- a/arch/arm/configs/qcom_defconfig +++ b/arch/arm/configs/qcom_defconfig @@ -7,7 +7,7 @@ CONFIG_IKCONFIG_PROC=y CONFIG_CGROUPS=y CONFIG_BLK_DEV_INITRD=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_ARCH_QCOM=y CONFIG_ARCH_MSM8X60=y diff --git a/arch/arm/configs/sama5_defconfig b/arch/arm/configs/sama5_defconfig index 56ee511210de..0e030063130f 100644 --- a/arch/arm/configs/sama5_defconfig +++ b/arch/arm/configs/sama5_defconfig @@ -5,7 +5,7 @@ CONFIG_HIGH_RES_TIMERS=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_CGROUPS=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_ARCH_AT91=y CONFIG_SOC_SAMA5D2=y CONFIG_SOC_SAMA5D3=y diff --git a/arch/arm/configs/sama7_defconfig b/arch/arm/configs/sama7_defconfig index a89767c89d17..be0cfed4ecf1 100644 --- a/arch/arm/configs/sama7_defconfig +++ b/arch/arm/configs/sama7_defconfig @@ -12,7 +12,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_FHANDLE is not set # CONFIG_IO_URING is not set CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_ARCH_AT91=y CONFIG_SOC_SAMA7G5=y CONFIG_ATMEL_CLOCKSOURCE_TCB=y diff --git a/arch/arm/configs/socfpga_defconfig b/arch/arm/configs/socfpga_defconfig index d6dfae196f84..e82c3866b810 100644 --- a/arch/arm/configs/socfpga_defconfig +++ b/arch/arm/configs/socfpga_defconfig @@ -7,7 +7,7 @@ CONFIG_CGROUPS=y CONFIG_CPUSETS=y CONFIG_NAMESPACES=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_ARCH_INTEL_SOCFPGA=y CONFIG_ARM_THUMBEE=y diff --git a/arch/arm/configs/stm32_defconfig b/arch/arm/configs/stm32_defconfig index dc1a32f50b7e..e95aba916547 100644 --- a/arch/arm/configs/stm32_defconfig +++ b/arch/arm/configs/stm32_defconfig @@ -11,7 +11,7 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_SIGNALFD is not set # CONFIG_EVENTFD is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_BLK_DEV_BSG is not set # CONFIG_MMU is not set CONFIG_ARCH_STM32=y diff --git a/arch/arm/configs/tegra_defconfig b/arch/arm/configs/tegra_defconfig index 3c6af935e932..613f07b8ce15 100644 --- a/arch/arm/configs/tegra_defconfig +++ b/arch/arm/configs/tegra_defconfig @@ -14,7 +14,7 @@ CONFIG_NAMESPACES=y CONFIG_USER_NS=y CONFIG_BLK_DEV_INITRD=y # CONFIG_ELF_CORE is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y CONFIG_ARCH_TEGRA=y CONFIG_SMP=y diff --git a/arch/arm/configs/vf610m4_defconfig b/arch/arm/configs/vf610m4_defconfig index 2e47cc57a928..963ff0a03311 100644 --- a/arch/arm/configs/vf610m4_defconfig +++ b/arch/arm/configs/vf610m4_defconfig @@ -5,7 +5,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_RD_XZ is not set # CONFIG_RD_LZ4 is not set CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_MMU is not set CONFIG_ARCH_MXC=y CONFIG_SOC_VF610=y diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 18605f1b3580..26c1d2ced4ce 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -32,7 +32,7 @@ void handle_IRQ(unsigned int, struct pt_regs *); #include <linux/cpumask.h> extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask, - bool exclude_self); + int exclude_cpu); #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace #endif diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 6756203e45f3..3431c0553f45 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -846,7 +846,7 @@ static void raise_nmi(cpumask_t *mask) __ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask); } -void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { - nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_nmi); + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, raise_nmi); } diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c060a26dec28..b10515c0200b 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1460,60 +1460,28 @@ config PARAVIRT_TIME_ACCOUNTING If in doubt, say N here. -config KEXEC - depends on PM_SLEEP_SMP - select KEXEC_CORE - bool "kexec system call" - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - -config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select HAVE_IMA_KEXEC if IMA - help - This is new version of kexec system call. This system call is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. +config ARCH_SUPPORTS_KEXEC + def_bool PM_SLEEP_SMP -config KEXEC_SIG - bool "Verify kernel signature during kexec_file_load() syscall" - depends on KEXEC_FILE - help - Select this option to verify a signature with loaded kernel - image. If configured, any attempt of loading a image without - valid signature will fail. +config ARCH_SUPPORTS_KEXEC_FILE + def_bool y - In addition to that option, you need to enable signature - verification for the corresponding kernel image type being - loaded in order for this to work. +config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE + select HAVE_IMA_KEXEC if IMA -config KEXEC_IMAGE_VERIFY_SIG - bool "Enable Image signature verification support" - default y - depends on KEXEC_SIG - depends on EFI && SIGNED_PE_FILE_VERIFICATION - help - Enable Image signature verification support. +config ARCH_SUPPORTS_KEXEC_SIG + def_bool y -comment "Support for PE file signature verification disabled" - depends on KEXEC_SIG - depends on !EFI || !SIGNED_PE_FILE_VERIFICATION +config ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG + def_bool y -config CRASH_DUMP - bool "Build kdump crash kernel" - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. +config ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG + def_bool y - For more details see Documentation/admin-guide/kdump/kdump.rst +config ARCH_SUPPORTS_CRASH_DUMP + def_bool y config TRANS_TABLE def_bool y diff --git a/arch/hexagon/configs/comet_defconfig b/arch/hexagon/configs/comet_defconfig index 9b2b1cc0794a..6cb764947596 100644 --- a/arch/hexagon/configs/comet_defconfig +++ b/arch/hexagon/configs/comet_defconfig @@ -14,7 +14,7 @@ CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=18 CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_BLK_DEV_BSG is not set CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 3ab75f36c037..53faa122b0f4 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -362,31 +362,13 @@ config IA64_HP_AML_NFW the "force" module parameter, e.g., with the "aml_nfw.force" kernel command line option. -config KEXEC - bool "kexec system call" - depends on !SMP || HOTPLUG_CPU - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. +endmenu -config CRASH_DUMP - bool "kernel crash dumps" - depends on IA64_MCA_RECOVERY && (!SMP || HOTPLUG_CPU) - help - Generate crash dump after being started by kexec. +config ARCH_SUPPORTS_KEXEC + def_bool !SMP || HOTPLUG_CPU -endmenu +config ARCH_SUPPORTS_CRASH_DUMP + def_bool IA64_MCA_RECOVERY && (!SMP || HOTPLUG_CPU) menu "Power management and ACPI options" diff --git a/arch/ia64/include/asm/cmpxchg.h b/arch/ia64/include/asm/cmpxchg.h index 8b2e644ef6a1..d85ee1a0a227 100644 --- a/arch/ia64/include/asm/cmpxchg.h +++ b/arch/ia64/include/asm/cmpxchg.h @@ -13,4 +13,21 @@ #define arch_cmpxchg_local arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg64 +#ifdef CONFIG_IA64_DEBUG_CMPXCHG +# define CMPXCHG_BUGCHECK_DECL int _cmpxchg_bugcheck_count = 128; +# define CMPXCHG_BUGCHECK(v) \ +do { \ + if (_cmpxchg_bugcheck_count-- <= 0) { \ + void *ip; \ + extern int _printk(const char *fmt, ...); \ + ip = (void *) ia64_getreg(_IA64_REG_IP); \ + _printk("CMPXCHG_BUGCHECK: stuck at %p on word %p\n", ip, (v));\ + break; \ + } \ +} while (0) +#else /* !CONFIG_IA64_DEBUG_CMPXCHG */ +# define CMPXCHG_BUGCHECK_DECL +# define CMPXCHG_BUGCHECK(v) +#endif /* !CONFIG_IA64_DEBUG_CMPXCHG */ + #endif /* _ASM_IA64_CMPXCHG_H */ diff --git a/arch/ia64/include/uapi/asm/cmpxchg.h b/arch/ia64/include/uapi/asm/cmpxchg.h index 85cba138311f..a59b5de6eec6 100644 --- a/arch/ia64/include/uapi/asm/cmpxchg.h +++ b/arch/ia64/include/uapi/asm/cmpxchg.h @@ -133,23 +133,6 @@ extern long ia64_cmpxchg_called_with_bad_pointer(void); #define cmpxchg64_local cmpxchg64 #endif -#ifdef CONFIG_IA64_DEBUG_CMPXCHG -# define CMPXCHG_BUGCHECK_DECL int _cmpxchg_bugcheck_count = 128; -# define CMPXCHG_BUGCHECK(v) \ -do { \ - if (_cmpxchg_bugcheck_count-- <= 0) { \ - void *ip; \ - extern int _printk(const char *fmt, ...); \ - ip = (void *) ia64_getreg(_IA64_REG_IP); \ - _printk("CMPXCHG_BUGCHECK: stuck at %p on word %p\n", ip, (v));\ - break; \ - } \ -} while (0) -#else /* !CONFIG_IA64_DEBUG_CMPXCHG */ -# define CMPXCHG_BUGCHECK_DECL -# define CMPXCHG_BUGCHECK(v) -#endif /* !CONFIG_IA64_DEBUG_CMPXCHG */ - #endif /* !__ASSEMBLY__ */ #endif /* _UAPI_ASM_IA64_CMPXCHG_H */ diff --git a/arch/ia64/kernel/mca.c b/arch/ia64/kernel/mca.c index 92ede80d17fe..2671688d349a 100644 --- a/arch/ia64/kernel/mca.c +++ b/arch/ia64/kernel/mca.c @@ -1630,10 +1630,10 @@ default_monarch_init_process(struct notifier_block *self, unsigned long val, voi } printk("\n\n"); if (read_trylock(&tasklist_lock)) { - do_each_thread (g, t) { + for_each_process_thread(g, t) { printk("\nBacktrace of pid %d (%s)\n", t->pid, t->comm); show_stack(t, NULL, KERN_DEFAULT); - } while_each_thread (g, t); + } read_unlock(&tasklist_lock); } /* FIXME: This will not restore zapped printk locks. */ diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig index 34d7a3de9a79..ecf282dee513 100644 --- a/arch/loongarch/Kconfig +++ b/arch/loongarch/Kconfig @@ -538,28 +538,16 @@ config CPU_HAS_PREFETCH bool default y -config KEXEC - bool "Kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. +config ARCH_SUPPORTS_KEXEC + def_bool y - The name comes from the similarity to the exec system call. +config ARCH_SUPPORTS_CRASH_DUMP + def_bool y -config CRASH_DUMP - bool "Build kdump crash kernel" +config ARCH_SELECTS_CRASH_DUMP + def_bool y + depends on CRASH_DUMP select RELOCATABLE - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. - - For more details see Documentation/admin-guide/kdump/kdump.rst config RELOCATABLE bool "Relocatable kernel" diff --git a/arch/loongarch/include/asm/irq.h b/arch/loongarch/include/asm/irq.h index a115e8999c69..218b4da0ea90 100644 --- a/arch/loongarch/include/asm/irq.h +++ b/arch/loongarch/include/asm/irq.h @@ -40,7 +40,7 @@ void spurious_interrupt(void); #define NR_IRQS_LEGACY 16 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace -void arch_trigger_cpumask_backtrace(const struct cpumask *mask, bool exclude_self); +void arch_trigger_cpumask_backtrace(const struct cpumask *mask, int exclude_cpu); #define MAX_IO_PICS 2 #define NR_IRQS (64 + (256 * MAX_IO_PICS)) diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c index 4ee1e9d6a65f..ba457e43f5be 100644 --- a/arch/loongarch/kernel/process.c +++ b/arch/loongarch/kernel/process.c @@ -338,9 +338,9 @@ static void raise_backtrace(cpumask_t *mask) } } -void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { - nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace); + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, raise_backtrace); } #ifdef CONFIG_64BIT diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig index dc792b321f1e..3e318bf9504c 100644 --- a/arch/m68k/Kconfig +++ b/arch/m68k/Kconfig @@ -89,23 +89,8 @@ config MMU_SUN3 bool depends on MMU && !MMU_MOTOROLA && !MMU_COLDFIRE -config KEXEC - bool "kexec system call" - depends on M68KCLASSIC && MMU - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. +config ARCH_SUPPORTS_KEXEC + def_bool M68KCLASSIC && MMU config BOOTINFO_PROC bool "Export bootinfo in procfs" diff --git a/arch/m68k/configs/amcore_defconfig b/arch/m68k/configs/amcore_defconfig index 041adcf6ecfc..67a0d157122d 100644 --- a/arch/m68k/configs/amcore_defconfig +++ b/arch/m68k/configs/amcore_defconfig @@ -8,7 +8,7 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_AIO is not set # CONFIG_ADVISE_SYSCALLS is not set # CONFIG_MEMBARRIER is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/m68k/configs/m5475evb_defconfig b/arch/m68k/configs/m5475evb_defconfig index 93f7c7a07553..2473dc30228e 100644 --- a/arch/m68k/configs/m5475evb_defconfig +++ b/arch/m68k/configs/m5475evb_defconfig @@ -8,7 +8,7 @@ CONFIG_LOG_BUF_SHIFT=14 # CONFIG_EVENTFD is not set # CONFIG_SHMEM is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MODULES=y # CONFIG_BLK_DEV_BSG is not set CONFIG_COLDFIRE=y diff --git a/arch/m68k/configs/stmark2_defconfig b/arch/m68k/configs/stmark2_defconfig index 8898ae321779..7787a4dd7c3c 100644 --- a/arch/m68k/configs/stmark2_defconfig +++ b/arch/m68k/configs/stmark2_defconfig @@ -9,7 +9,7 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_AIO is not set # CONFIG_ADVISE_SYSCALLS is not set # CONFIG_MEMBARRIER is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set CONFIG_COLDFIRE=y diff --git a/arch/microblaze/configs/mmu_defconfig b/arch/microblaze/configs/mmu_defconfig index 38dcdce367ec..7b2d7f6f23c0 100644 --- a/arch/microblaze/configs/mmu_defconfig +++ b/arch/microblaze/configs/mmu_defconfig @@ -7,7 +7,7 @@ CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y # CONFIG_BASE_FULL is not set CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_XILINX_MICROBLAZE0_USE_MSR_INSTR=1 CONFIG_XILINX_MICROBLAZE0_USE_PCMP_INSTR=1 CONFIG_XILINX_MICROBLAZE0_USE_BARREL=1 diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index fc6fba925aea..bc8421859006 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -2878,33 +2878,11 @@ config HZ config SCHED_HRTICK def_bool HIGH_RES_TIMERS -config KEXEC - bool "Kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - -config CRASH_DUMP - bool "Kernel crash dumps" - help - Generate crash dump after being started by kexec. - This should be normally only set in special crash dump kernels - which are loaded in the main kernel with kexec-tools into - a specially reserved region and then later executed after - a crash by kdump/kexec. The crash dump kernel must be compiled - to a memory address not used by the main kernel or firmware using - PHYSICAL_START. +config ARCH_SUPPORTS_KEXEC + def_bool y + +config ARCH_SUPPORTS_CRASH_DUMP + def_bool y config PHYSICAL_START hex "Physical address where the kernel is loaded" diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c index c5561016f577..1ad2602a0383 100644 --- a/arch/mips/cavium-octeon/setup.c +++ b/arch/mips/cavium-octeon/setup.c @@ -1240,7 +1240,7 @@ static int __init octeon_no_pci_init(void) */ octeon_dummy_iospace = vzalloc(IO_SPACE_LIMIT); set_io_port_base((unsigned long)octeon_dummy_iospace); - ioport_resource.start = MAX_RESOURCE; + ioport_resource.start = RESOURCE_SIZE_MAX; ioport_resource.end = 0; return 0; } diff --git a/arch/mips/configs/ath25_defconfig b/arch/mips/configs/ath25_defconfig index afd1c16242e9..1d939ba9738d 100644 --- a/arch/mips/configs/ath25_defconfig +++ b/arch/mips/configs/ath25_defconfig @@ -11,7 +11,7 @@ CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_FHANDLE is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/mips/configs/ath79_defconfig b/arch/mips/configs/ath79_defconfig index 0b741716c852..8caa03a41327 100644 --- a/arch/mips/configs/ath79_defconfig +++ b/arch/mips/configs/ath79_defconfig @@ -5,7 +5,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_RD_GZIP is not set # CONFIG_AIO is not set # CONFIG_KALLSYMS is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/mips/configs/bcm47xx_defconfig b/arch/mips/configs/bcm47xx_defconfig index 62c462a23edc..6a68a96d13f8 100644 --- a/arch/mips/configs/bcm47xx_defconfig +++ b/arch/mips/configs/bcm47xx_defconfig @@ -2,7 +2,7 @@ CONFIG_SYSVIPC=y CONFIG_HIGH_RES_TIMERS=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_BCM47XX=y CONFIG_PCI=y # CONFIG_SUSPEND is not set diff --git a/arch/mips/configs/ci20_defconfig b/arch/mips/configs/ci20_defconfig index 812287a5b4fd..cdf2a782dee1 100644 --- a/arch/mips/configs/ci20_defconfig +++ b/arch/mips/configs/ci20_defconfig @@ -18,7 +18,7 @@ CONFIG_NAMESPACES=y CONFIG_USER_NS=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MACH_INGENIC_SOC=y CONFIG_JZ4780_CI20=y CONFIG_HIGHMEM=y diff --git a/arch/mips/configs/cu1000-neo_defconfig b/arch/mips/configs/cu1000-neo_defconfig index afe39ce7568e..19517beaf540 100644 --- a/arch/mips/configs/cu1000-neo_defconfig +++ b/arch/mips/configs/cu1000-neo_defconfig @@ -15,7 +15,7 @@ CONFIG_NAMESPACES=y CONFIG_USER_NS=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set CONFIG_MACH_INGENIC_SOC=y diff --git a/arch/mips/configs/cu1830-neo_defconfig b/arch/mips/configs/cu1830-neo_defconfig index 347c9fd1c21d..b403e67ab105 100644 --- a/arch/mips/configs/cu1830-neo_defconfig +++ b/arch/mips/configs/cu1830-neo_defconfig @@ -15,7 +15,7 @@ CONFIG_NAMESPACES=y CONFIG_USER_NS=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set CONFIG_MACH_INGENIC_SOC=y diff --git a/arch/mips/configs/db1xxx_defconfig b/arch/mips/configs/db1xxx_defconfig index 08d29b56e175..b2d9253ff786 100644 --- a/arch/mips/configs/db1xxx_defconfig +++ b/arch/mips/configs/db1xxx_defconfig @@ -17,7 +17,7 @@ CONFIG_CGROUP_FREEZER=y CONFIG_CGROUP_DEVICE=y CONFIG_CGROUP_CPUACCT=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MIPS_ALCHEMY=y CONFIG_HZ_100=y CONFIG_PCI=y diff --git a/arch/mips/configs/gcw0_defconfig b/arch/mips/configs/gcw0_defconfig index 460683b52285..bc1ef66e3999 100644 --- a/arch/mips/configs/gcw0_defconfig +++ b/arch/mips/configs/gcw0_defconfig @@ -2,7 +2,7 @@ CONFIG_DEFAULT_HOSTNAME="gcw0" CONFIG_NO_HZ_IDLE=y CONFIG_HIGH_RES_TIMERS=y CONFIG_PREEMPT_VOLUNTARY=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_MACH_INGENIC_SOC=y CONFIG_JZ4770_GCW0=y diff --git a/arch/mips/configs/generic_defconfig b/arch/mips/configs/generic_defconfig index c2cd2b181ef3..071e2205c7ed 100644 --- a/arch/mips/configs/generic_defconfig +++ b/arch/mips/configs/generic_defconfig @@ -17,7 +17,7 @@ CONFIG_SCHED_AUTOGROUP=y CONFIG_BLK_DEV_INITRD=y CONFIG_BPF_SYSCALL=y CONFIG_USERFAULTFD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set CONFIG_CPU_LITTLE_ENDIAN=y diff --git a/arch/mips/configs/loongson2k_defconfig b/arch/mips/configs/loongson2k_defconfig index ec3ee8df737d..4b7f914d01d0 100644 --- a/arch/mips/configs/loongson2k_defconfig +++ b/arch/mips/configs/loongson2k_defconfig @@ -18,7 +18,7 @@ CONFIG_SCHED_AUTOGROUP=y CONFIG_SYSFS_DEPRECATED=y CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MACH_LOONGSON64=y # CONFIG_CPU_LOONGSON3_CPUCFG_EMULATION is not set CONFIG_HZ_256=y diff --git a/arch/mips/configs/loongson3_defconfig b/arch/mips/configs/loongson3_defconfig index 129426351237..2b4133176930 100644 --- a/arch/mips/configs/loongson3_defconfig +++ b/arch/mips/configs/loongson3_defconfig @@ -26,7 +26,7 @@ CONFIG_SYSFS_DEPRECATED=y CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y CONFIG_BPF_SYSCALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_EVENTS=y CONFIG_MACH_LOONGSON64=y CONFIG_CPU_HAS_MSA=y diff --git a/arch/mips/configs/malta_qemu_32r6_defconfig b/arch/mips/configs/malta_qemu_32r6_defconfig index 82183ec6bc31..b21f48863d81 100644 --- a/arch/mips/configs/malta_qemu_32r6_defconfig +++ b/arch/mips/configs/malta_qemu_32r6_defconfig @@ -5,7 +5,7 @@ CONFIG_NO_HZ=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MIPS_MALTA=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R6=y diff --git a/arch/mips/configs/maltaaprp_defconfig b/arch/mips/configs/maltaaprp_defconfig index 9a199867a5e7..ecfa8a396c33 100644 --- a/arch/mips/configs/maltaaprp_defconfig +++ b/arch/mips/configs/maltaaprp_defconfig @@ -5,7 +5,7 @@ CONFIG_AUDIT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MIPS_MALTA=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R2=y diff --git a/arch/mips/configs/maltasmvp_defconfig b/arch/mips/configs/maltasmvp_defconfig index e5502d66a474..5cb4f262a4ea 100644 --- a/arch/mips/configs/maltasmvp_defconfig +++ b/arch/mips/configs/maltasmvp_defconfig @@ -5,7 +5,7 @@ CONFIG_NO_HZ=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MIPS_MALTA=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R2=y diff --git a/arch/mips/configs/maltasmvp_eva_defconfig b/arch/mips/configs/maltasmvp_eva_defconfig index a378aad97138..5e1498296782 100644 --- a/arch/mips/configs/maltasmvp_eva_defconfig +++ b/arch/mips/configs/maltasmvp_eva_defconfig @@ -5,7 +5,7 @@ CONFIG_NO_HZ=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MIPS_MALTA=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R2=y diff --git a/arch/mips/configs/maltaup_defconfig b/arch/mips/configs/maltaup_defconfig index fc6f88cae7be..c8594505d676 100644 --- a/arch/mips/configs/maltaup_defconfig +++ b/arch/mips/configs/maltaup_defconfig @@ -6,7 +6,7 @@ CONFIG_NO_HZ=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MIPS_MALTA=y CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_CPU_MIPS32_R2=y diff --git a/arch/mips/configs/omega2p_defconfig b/arch/mips/configs/omega2p_defconfig index 91fe2822f897..7c1c1b974d8f 100644 --- a/arch/mips/configs/omega2p_defconfig +++ b/arch/mips/configs/omega2p_defconfig @@ -17,7 +17,7 @@ CONFIG_NAMESPACES=y CONFIG_USER_NS=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/mips/configs/pic32mzda_defconfig b/arch/mips/configs/pic32mzda_defconfig index 0e494c24246f..166d2ad372d1 100644 --- a/arch/mips/configs/pic32mzda_defconfig +++ b/arch/mips/configs/pic32mzda_defconfig @@ -7,7 +7,7 @@ CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=14 CONFIG_RELAY=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_COMPAT_BRK is not set CONFIG_MACH_PIC32=y CONFIG_DTB_PIC32_MZDA_SK=y diff --git a/arch/mips/configs/qi_lb60_defconfig b/arch/mips/configs/qi_lb60_defconfig index c27c8c7151a1..5f5b0254d75e 100644 --- a/arch/mips/configs/qi_lb60_defconfig +++ b/arch/mips/configs/qi_lb60_defconfig @@ -3,7 +3,7 @@ CONFIG_SYSVIPC=y # CONFIG_CROSS_MEMORY_ATTACH is not set CONFIG_LOG_BUF_SHIFT=14 CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set CONFIG_MACH_INGENIC_SOC=y diff --git a/arch/mips/configs/rs90_defconfig b/arch/mips/configs/rs90_defconfig index 85ea2a6775f5..4b9e36d6400e 100644 --- a/arch/mips/configs/rs90_defconfig +++ b/arch/mips/configs/rs90_defconfig @@ -15,7 +15,7 @@ CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y # CONFIG_IO_URING is not set # CONFIG_ADVISE_SYSCALLS is not set # CONFIG_KALLSYMS is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_PERF_EVENTS is not set CONFIG_PROFILING=y CONFIG_MACH_INGENIC_SOC=y diff --git a/arch/mips/configs/rt305x_defconfig b/arch/mips/configs/rt305x_defconfig index bf017d493002..332f9094e847 100644 --- a/arch/mips/configs/rt305x_defconfig +++ b/arch/mips/configs/rt305x_defconfig @@ -7,7 +7,7 @@ CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_AIO is not set CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/mips/configs/vocore2_defconfig b/arch/mips/configs/vocore2_defconfig index e47d4cc3353b..7c8ebb1b56da 100644 --- a/arch/mips/configs/vocore2_defconfig +++ b/arch/mips/configs/vocore2_defconfig @@ -17,7 +17,7 @@ CONFIG_NAMESPACES=y CONFIG_USER_NS=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/mips/configs/xway_defconfig b/arch/mips/configs/xway_defconfig index eb5acf1f24ae..08c0aa03fd56 100644 --- a/arch/mips/configs/xway_defconfig +++ b/arch/mips/configs/xway_defconfig @@ -7,7 +7,7 @@ CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_AIO is not set CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h index 75abfa834ab7..3a848e7e69f7 100644 --- a/arch/mips/include/asm/irq.h +++ b/arch/mips/include/asm/irq.h @@ -77,7 +77,7 @@ extern int cp0_fdc_irq; extern int get_c0_fdc_int(void); void arch_trigger_cpumask_backtrace(const struct cpumask *mask, - bool exclude_self); + int exclude_cpu); #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace #endif /* _ASM_IRQ_H */ diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c index a3225912c862..5387ed0a5186 100644 --- a/arch/mips/kernel/process.c +++ b/arch/mips/kernel/process.c @@ -750,9 +750,9 @@ static void raise_backtrace(cpumask_t *mask) } } -void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { - nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace); + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, raise_backtrace); } int mips_get_process_fp_mode(struct task_struct *task) diff --git a/arch/nios2/configs/10m50_defconfig b/arch/nios2/configs/10m50_defconfig index 63151ebf1470..048f74e0dc6d 100644 --- a/arch/nios2/configs/10m50_defconfig +++ b/arch/nios2/configs/10m50_defconfig @@ -9,7 +9,7 @@ CONFIG_LOG_BUF_SHIFT=14 # CONFIG_EVENTFD is not set # CONFIG_SHMEM is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_NIOS2_MEM_BASE=0x8000000 diff --git a/arch/nios2/configs/3c120_defconfig b/arch/nios2/configs/3c120_defconfig index 0daf3038d7aa..48fe353f8d2d 100644 --- a/arch/nios2/configs/3c120_defconfig +++ b/arch/nios2/configs/3c120_defconfig @@ -9,7 +9,7 @@ CONFIG_LOG_BUF_SHIFT=14 # CONFIG_EVENTFD is not set # CONFIG_SHMEM is not set # CONFIG_AIO is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_NIOS2_MEM_BASE=0x10000000 diff --git a/arch/openrisc/configs/or1klitex_defconfig b/arch/openrisc/configs/or1klitex_defconfig index d3fb964b4f85..466f31a091be 100644 --- a/arch/openrisc/configs/or1klitex_defconfig +++ b/arch/openrisc/configs/or1klitex_defconfig @@ -6,7 +6,7 @@ CONFIG_USER_NS=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SGETMASK_SYSCALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_OPENRISC_BUILTIN_DTB="or1klitex" CONFIG_HZ_100=y CONFIG_OPENRISC_HAVE_SHADOW_GPRS=y diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig index 4fd36642a779..a15ab147af2e 100644 --- a/arch/parisc/Kconfig +++ b/arch/parisc/Kconfig @@ -359,29 +359,17 @@ config NR_CPUS default "8" if 64BIT default "16" -config KEXEC - bool "Kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - It is an ongoing process to be certain the hardware in a machine - shutdown, so do not be surprised if this code does not - initially work for you. - -config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select KEXEC_ELF - help - This enables the kexec_file_load() System call. This is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. - endmenu +config ARCH_SUPPORTS_KEXEC + def_bool y + +config ARCH_SUPPORTS_KEXEC_FILE + def_bool y + +config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE + select KEXEC_ELF + source "drivers/parisc/Kconfig" diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 938294c996dc..21edd664689e 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -592,41 +592,21 @@ config PPC64_SUPPORTS_MEMORY_FAILURE default "y" if PPC_POWERNV select ARCH_SUPPORTS_MEMORY_FAILURE -config KEXEC - bool "kexec system call" - depends on PPC_BOOK3S || PPC_E500 || (44x && !SMP) - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - -config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select HAVE_IMA_KEXEC if IMA - select KEXEC_ELF - depends on PPC64 - depends on CRYPTO=y - depends on CRYPTO_SHA256=y - help - This is a new version of the kexec system call. This call is - file based and takes in file descriptors as system call arguments - for kernel and initramfs as opposed to a list of segments as is the - case for the older kexec call. +config ARCH_SUPPORTS_KEXEC + def_bool PPC_BOOK3S || PPC_E500 || (44x && !SMP) + +config ARCH_SUPPORTS_KEXEC_FILE + def_bool PPC64 && CRYPTO=y && CRYPTO_SHA256=y -config ARCH_HAS_KEXEC_PURGATORY +config ARCH_SUPPORTS_KEXEC_PURGATORY def_bool KEXEC_FILE +config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE + select KEXEC_ELF + select HAVE_IMA_KEXEC if IMA + config PPC64_BIG_ENDIAN_ELF_ABI_V2 # Option is available to BFD, but LLD does not support ELFv1 so this is # always true there. @@ -686,14 +666,13 @@ config RELOCATABLE_TEST loaded at, which tends to be non-zero and therefore test the relocation code. -config CRASH_DUMP - bool "Build a dump capture kernel" - depends on PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP) +config ARCH_SUPPORTS_CRASH_DUMP + def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP) + +config ARCH_SELECTS_CRASH_DUMP + def_bool y + depends on CRASH_DUMP select RELOCATABLE if PPC64 || 44x || PPC_85xx - help - Build a kernel suitable for use as a dump capture kernel. - The same kernel binary can be used as production kernel and dump - capture kernel. config FA_DUMP bool "Firmware-assisted dump" diff --git a/arch/powerpc/configs/40x/klondike_defconfig b/arch/powerpc/configs/40x/klondike_defconfig index acafbb8f6808..a974d1e945cc 100644 --- a/arch/powerpc/configs/40x/klondike_defconfig +++ b/arch/powerpc/configs/40x/klondike_defconfig @@ -4,7 +4,7 @@ CONFIG_LOG_BUF_SHIFT=14 CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_BLK_DEV_INITRD=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_APM8018X=y diff --git a/arch/powerpc/configs/44x/fsp2_defconfig b/arch/powerpc/configs/44x/fsp2_defconfig index 3fdfbb29b854..5492537f4c6c 100644 --- a/arch/powerpc/configs/44x/fsp2_defconfig +++ b/arch/powerpc/configs/44x/fsp2_defconfig @@ -15,7 +15,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_RD_LZ4 is not set CONFIG_KALLSYMS_ALL=y CONFIG_BPF_SYSCALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y diff --git a/arch/powerpc/configs/52xx/tqm5200_defconfig b/arch/powerpc/configs/52xx/tqm5200_defconfig index e6735b945327..688f703d8e22 100644 --- a/arch/powerpc/configs/52xx/tqm5200_defconfig +++ b/arch/powerpc/configs/52xx/tqm5200_defconfig @@ -3,7 +3,7 @@ CONFIG_LOG_BUF_SHIFT=14 CONFIG_BLK_DEV_INITRD=y # CONFIG_KALLSYMS is not set # CONFIG_EPOLL is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODVERSIONS=y diff --git a/arch/powerpc/configs/mgcoge_defconfig b/arch/powerpc/configs/mgcoge_defconfig index 2101bfe6db94..f65001e7877f 100644 --- a/arch/powerpc/configs/mgcoge_defconfig +++ b/arch/powerpc/configs/mgcoge_defconfig @@ -9,7 +9,7 @@ CONFIG_BLK_DEV_INITRD=y # CONFIG_RD_GZIP is not set CONFIG_KALLSYMS_ALL=y # CONFIG_PCSPKR_PLATFORM is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PARTITION_ADVANCED=y # CONFIG_PPC_PMAC is not set CONFIG_PPC_82xx=y diff --git a/arch/powerpc/configs/microwatt_defconfig b/arch/powerpc/configs/microwatt_defconfig index 795a127908e7..a64fb1ef8c75 100644 --- a/arch/powerpc/configs/microwatt_defconfig +++ b/arch/powerpc/configs/microwatt_defconfig @@ -8,7 +8,7 @@ CONFIG_CGROUPS=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_SLUB_DEBUG is not set # CONFIG_COMPAT_BRK is not set diff --git a/arch/powerpc/configs/ps3_defconfig b/arch/powerpc/configs/ps3_defconfig index 1ea732c19235..2b175ddf82f0 100644 --- a/arch/powerpc/configs/ps3_defconfig +++ b/arch/powerpc/configs/ps3_defconfig @@ -3,7 +3,7 @@ CONFIG_POSIX_MQUEUE=y CONFIG_HIGH_RES_TIMERS=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_PERF_EVENTS is not set CONFIG_PROFILING=y CONFIG_PPC64=y diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h index f257cacb49a9..ba1a5974e714 100644 --- a/arch/powerpc/include/asm/irq.h +++ b/arch/powerpc/include/asm/irq.h @@ -55,7 +55,7 @@ int irq_choose_cpu(const struct cpumask *mask); #if defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI) extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask, - bool exclude_self); + int exclude_cpu); #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace #endif diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c index 5de8597eaab8..b15f15dcacb5 100644 --- a/arch/powerpc/kernel/stacktrace.c +++ b/arch/powerpc/kernel/stacktrace.c @@ -221,8 +221,8 @@ static void raise_backtrace_ipi(cpumask_t *mask) } } -void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { - nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi); + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, raise_backtrace_ipi); } #endif /* defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI) */ diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index edb2dd1f53eb..8c464a5d8246 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -245,7 +245,7 @@ static void watchdog_smp_panic(int cpu) __cpumask_clear_cpu(c, &wd_smp_cpus_ipi); } } else { - trigger_allbutself_cpu_backtrace(); + trigger_allbutcpu_cpu_backtrace(cpu); cpumask_clear(&wd_smp_cpus_ipi); } @@ -416,7 +416,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) xchg(&__wd_nmi_output, 1); // see wd_lockup_ipi if (sysctl_hardlockup_all_cpu_backtrace) - trigger_allbutself_cpu_backtrace(); + trigger_allbutcpu_cpu_backtrace(cpu); if (hardlockup_panic) nmi_panic(regs, "Hard LOCKUP"); diff --git a/arch/riscv/Kbuild b/arch/riscv/Kbuild index afa83e307a2e..d25ad1c19f88 100644 --- a/arch/riscv/Kbuild +++ b/arch/riscv/Kbuild @@ -5,7 +5,7 @@ obj-$(CONFIG_BUILTIN_DTB) += boot/dts/ obj-y += errata/ obj-$(CONFIG_KVM) += kvm/ -obj-$(CONFIG_ARCH_HAS_KEXEC_PURGATORY) += purgatory/ +obj-$(CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY) += purgatory/ # for cleaning subdir- += boot diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index b08d0e1a93b9..f9fbfbf11ad2 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -656,48 +656,30 @@ config RISCV_BOOT_SPINWAIT If unsure what to do here, say N. -config KEXEC - bool "Kexec system call" - depends on MMU +config ARCH_SUPPORTS_KEXEC + def_bool MMU + +config ARCH_SELECTS_KEXEC + def_bool y + depends on KEXEC select HOTPLUG_CPU if SMP - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - The name comes from the similarity to the exec system call. +config ARCH_SUPPORTS_KEXEC_FILE + def_bool 64BIT && MMU -config KEXEC_FILE - bool "kexec file based systmem call" - depends on 64BIT && MMU +config ARCH_SELECTS_KEXEC_FILE + def_bool y + depends on KEXEC_FILE select HAVE_IMA_KEXEC if IMA - select KEXEC_CORE select KEXEC_ELF - help - This is new version of kexec system call. This system call is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. - - If you don't know what to do here, say Y. -config ARCH_HAS_KEXEC_PURGATORY +config ARCH_SUPPORTS_KEXEC_PURGATORY def_bool KEXEC_FILE depends on CRYPTO=y depends on CRYPTO_SHA256=y -config CRASH_DUMP - bool "Build kdump crash kernel" - help - Generate crash dump after being started by kexec. This should - be normally only set in special crash dump kernels which are - loaded in the main kernel with kexec-tools into a specially - reserved region and then later executed after a crash by - kdump/kexec. - - For more details see Documentation/admin-guide/kdump/kdump.rst +config ARCH_SUPPORTS_CRASH_DUMP + def_bool y config COMPAT bool "Kernel support for 32-bit U-mode" diff --git a/arch/riscv/configs/nommu_k210_defconfig b/arch/riscv/configs/nommu_k210_defconfig index e36fffd6fb18..146c46d0525b 100644 --- a/arch/riscv/configs/nommu_k210_defconfig +++ b/arch/riscv/configs/nommu_k210_defconfig @@ -21,7 +21,7 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_IO_URING is not set # CONFIG_ADVISE_SYSCALLS is not set # CONFIG_KALLSYMS is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set CONFIG_SLUB=y diff --git a/arch/riscv/configs/nommu_k210_sdcard_defconfig b/arch/riscv/configs/nommu_k210_sdcard_defconfig index c1ad85f0a4f7..95d8d1808f19 100644 --- a/arch/riscv/configs/nommu_k210_sdcard_defconfig +++ b/arch/riscv/configs/nommu_k210_sdcard_defconfig @@ -13,7 +13,7 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_IO_URING is not set # CONFIG_ADVISE_SYSCALLS is not set # CONFIG_KALLSYMS is not set -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_COMPAT_BRK is not set CONFIG_SLUB=y diff --git a/arch/riscv/kernel/elf_kexec.c b/arch/riscv/kernel/elf_kexec.c index c08bb5c3b385..f4099059ed8f 100644 --- a/arch/riscv/kernel/elf_kexec.c +++ b/arch/riscv/kernel/elf_kexec.c @@ -260,7 +260,7 @@ static void *elf_kexec_load(struct kimage *image, char *kernel_buf, cmdline = modified_cmdline; } -#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY +#ifdef CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY /* Add purgatory to the image */ kbuf.top_down = true; kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; @@ -274,7 +274,7 @@ static void *elf_kexec_load(struct kimage *image, char *kernel_buf, sizeof(kernel_start), 0); if (ret) pr_err("Error update purgatory ret=%d\n", ret); -#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */ +#endif /* CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY */ /* Add the initrd to the image */ if (initrd != NULL) { diff --git a/arch/s390/Kbuild b/arch/s390/Kbuild index 8e4d74f51115..a5d3503b353c 100644 --- a/arch/s390/Kbuild +++ b/arch/s390/Kbuild @@ -7,7 +7,7 @@ obj-$(CONFIG_S390_HYPFS) += hypfs/ obj-$(CONFIG_APPLDATA_BASE) += appldata/ obj-y += net/ obj-$(CONFIG_PCI) += pci/ -obj-$(CONFIG_ARCH_HAS_KEXEC_PURGATORY) += purgatory/ +obj-$(CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY) += purgatory/ # for cleaning subdir- += boot tools diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 4f364cfe9dd0..661b6de69c27 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -215,6 +215,7 @@ config S390 select HAVE_VIRT_CPU_ACCOUNTING_IDLE select IOMMU_HELPER if PCI select IOMMU_SUPPORT if PCI + select KEXEC select MMU_GATHER_MERGE_VMAS select MMU_GATHER_NO_GATHER select MMU_GATHER_RCU_TABLE_FREE @@ -246,6 +247,25 @@ config PGTABLE_LEVELS source "kernel/livepatch/Kconfig" +config ARCH_SUPPORTS_KEXEC + def_bool y + +config ARCH_SUPPORTS_KEXEC_FILE + def_bool CRYPTO && CRYPTO_SHA256 && CRYPTO_SHA256_S390 + +config ARCH_SUPPORTS_KEXEC_SIG + def_bool MODULE_SIG_FORMAT + +config ARCH_SUPPORTS_KEXEC_PURGATORY + def_bool KEXEC_FILE + +config ARCH_SUPPORTS_CRASH_DUMP + def_bool y + help + Refer to <file:Documentation/arch/s390/zfcpdump.rst> for more details on this. + This option also enables s390 zfcpdump. + See also <file:Documentation/arch/s390/zfcpdump.rst> + menu "Processor type and features" config HAVE_MARCH_Z10_FEATURES @@ -484,36 +504,6 @@ config SCHED_TOPOLOGY source "kernel/Kconfig.hz" -config KEXEC - def_bool y - select KEXEC_CORE - -config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - depends on CRYPTO - depends on CRYPTO_SHA256 - depends on CRYPTO_SHA256_S390 - help - Enable the kexec file based system call. In contrast to the normal - kexec system call this system call takes file descriptors for the - kernel and initramfs as arguments. - -config ARCH_HAS_KEXEC_PURGATORY - def_bool y - depends on KEXEC_FILE - -config KEXEC_SIG - bool "Verify kernel signature during kexec_file_load() syscall" - depends on KEXEC_FILE && MODULE_SIG_FORMAT - help - This option makes kernel signature verification mandatory for - the kexec_file_load() syscall. - - In addition to that option, you need to enable signature - verification for the corresponding kernel image type being - loaded in order for this to work. - config CERT_STORE bool "Get user certificates via DIAG320" depends on KEYS @@ -746,22 +736,6 @@ config VFIO_AP endmenu -menu "Dump support" - -config CRASH_DUMP - bool "kernel crash dumps" - select KEXEC - help - Generate crash dump after being started by kexec. - Crash dump kernels are loaded in the main kernel with kexec-tools - into a specially reserved region and then later executed after - a crash by kdump/kexec. - Refer to <file:Documentation/arch/s390/zfcpdump.rst> for more details on this. - This option also enables s390 zfcpdump. - See also <file:Documentation/arch/s390/zfcpdump.rst> - -endmenu - config CCW def_bool y diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index 6be32254211c..33530b044953 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -549,44 +549,14 @@ menu "Kernel features" source "kernel/Kconfig.hz" -config KEXEC - bool "kexec system call (EXPERIMENTAL)" - depends on MMU - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - -config CRASH_DUMP - bool "kernel crash dumps (EXPERIMENTAL)" - depends on BROKEN_ON_SMP - help - Generate crash dump after being started by kexec. - This should be normally only set in special crash dump kernels - which are loaded in the main kernel with kexec-tools into - a specially reserved region and then later executed after - a crash by kdump/kexec. The crash dump kernel must be compiled - to a memory address not used by the main kernel using - PHYSICAL_START. - - For more details see Documentation/admin-guide/kdump/kdump.rst - -config KEXEC_JUMP - bool "kexec jump (EXPERIMENTAL)" - depends on KEXEC && HIBERNATION - help - Jump between original kernel and kexeced kernel and invoke - code via KEXEC +config ARCH_SUPPORTS_KEXEC + def_bool MMU + +config ARCH_SUPPORTS_CRASH_DUMP + def_bool BROKEN_ON_SMP + +config ARCH_SUPPORTS_KEXEC_JUMP + def_bool y config PHYSICAL_START hex "Physical address where the kernel is loaded" if (EXPERT || CRASH_DUMP) diff --git a/arch/sh/configs/rsk7264_defconfig b/arch/sh/configs/rsk7264_defconfig index de9e350cdf28..a88cb3b77957 100644 --- a/arch/sh/configs/rsk7264_defconfig +++ b/arch/sh/configs/rsk7264_defconfig @@ -9,7 +9,7 @@ CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS_ALL=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PERF_COUNTERS=y # CONFIG_VM_EVENT_COUNTERS is not set CONFIG_MMAP_ALLOW_UNINITIALIZED=y diff --git a/arch/sh/configs/rsk7269_defconfig b/arch/sh/configs/rsk7269_defconfig index 7ca6a5c25a9f..d9a7ce783c9b 100644 --- a/arch/sh/configs/rsk7269_defconfig +++ b/arch/sh/configs/rsk7269_defconfig @@ -1,6 +1,6 @@ CONFIG_LOG_BUF_SHIFT=14 CONFIG_CC_OPTIMIZE_FOR_SIZE=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y # CONFIG_VM_EVENT_COUNTERS is not set # CONFIG_BLK_DEV_BSG is not set CONFIG_SWAP_IO_SPACE=y diff --git a/arch/sparc/include/asm/irq_64.h b/arch/sparc/include/asm/irq_64.h index b436029f1ced..8c4c0c87f998 100644 --- a/arch/sparc/include/asm/irq_64.h +++ b/arch/sparc/include/asm/irq_64.h @@ -87,7 +87,7 @@ static inline unsigned long get_softint(void) } void arch_trigger_cpumask_backtrace(const struct cpumask *mask, - bool exclude_self); + int exclude_cpu); #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace extern void *hardirq_stack[NR_CPUS]; diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c index b51d8fb0ecdc..1ea3f37fa985 100644 --- a/arch/sparc/kernel/process_64.c +++ b/arch/sparc/kernel/process_64.c @@ -236,7 +236,7 @@ static void __global_reg_poll(struct global_reg_snapshot *gp) } } -void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { struct thread_info *tp = current_thread_info(); struct pt_regs *regs = get_irq_regs(); @@ -252,7 +252,7 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) memset(global_cpu_snapshot, 0, sizeof(global_cpu_snapshot)); - if (cpumask_test_cpu(this_cpu, mask) && !exclude_self) + if (cpumask_test_cpu(this_cpu, mask) && this_cpu != exclude_cpu) __global_reg_self(tp, regs, this_cpu); smp_fetch_global_regs(); @@ -260,7 +260,7 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) for_each_cpu(cpu, mask) { struct global_reg_snapshot *gp; - if (exclude_self && cpu == this_cpu) + if (cpu == exclude_cpu) continue; gp = &global_cpu_snapshot[cpu].reg; diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index ad3a53e28aac..bd9a1804cf72 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2006,88 +2006,37 @@ config EFI_RUNTIME_MAP source "kernel/Kconfig.hz" -config KEXEC - bool "kexec system call" - select KEXEC_CORE - help - kexec is a system call that implements the ability to shutdown your - current kernel, and to start another kernel. It is like a reboot - but it is independent of the system firmware. And like a reboot - you can start any kernel with it, not just Linux. - - The name comes from the similarity to the exec system call. - - It is an ongoing process to be certain the hardware in a machine - is properly shutdown, so do not be surprised if this code does not - initially work for you. As of this writing the exact hardware - interface is strongly in flux, so no good recommendation can be - made. - -config KEXEC_FILE - bool "kexec file based system call" - select KEXEC_CORE - select HAVE_IMA_KEXEC if IMA - depends on X86_64 - depends on CRYPTO=y - depends on CRYPTO_SHA256=y - help - This is new version of kexec system call. This system call is - file based and takes file descriptors as system call argument - for kernel and initramfs as opposed to list of segments as - accepted by previous system call. +config ARCH_SUPPORTS_KEXEC + def_bool y -config ARCH_HAS_KEXEC_PURGATORY - def_bool KEXEC_FILE +config ARCH_SUPPORTS_KEXEC_FILE + def_bool X86_64 && CRYPTO && CRYPTO_SHA256 -config KEXEC_SIG - bool "Verify kernel signature during kexec_file_load() syscall" +config ARCH_SELECTS_KEXEC_FILE + def_bool y depends on KEXEC_FILE - help + select HAVE_IMA_KEXEC if IMA - This option makes the kexec_file_load() syscall check for a valid - signature of the kernel image. The image can still be loaded without - a valid signature unless you also enable KEXEC_SIG_FORCE, though if - there's a signature that we can check, then it must be valid. +config ARCH_SUPPORTS_KEXEC_PURGATORY + def_bool KEXEC_FILE - In addition to this option, you need to enable signature - verification for the corresponding kernel image type being - loaded in order for this to work. +config ARCH_SUPPORTS_KEXEC_SIG + def_bool y -config KEXEC_SIG_FORCE - bool "Require a valid signature in kexec_file_load() syscall" - depends on KEXEC_SIG - help - This option makes kernel signature verification mandatory for - the kexec_file_load() syscall. +config ARCH_SUPPORTS_KEXEC_SIG_FORCE + def_bool y -config KEXEC_BZIMAGE_VERIFY_SIG - bool "Enable bzImage signature verification support" - depends on KEXEC_SIG - depends on SIGNED_PE_FILE_VERIFICATION - select SYSTEM_TRUSTED_KEYRING - help - Enable bzImage signature verification support. +config ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG + def_bool y -config CRASH_DUMP - bool "kernel crash dumps" - depends on X86_64 || (X86_32 && HIGHMEM) - help - Generate crash dump after being started by kexec. - This should be normally only set in special crash dump kernels - which are loaded in the main kernel with kexec-tools into - a specially reserved region and then later executed after - a crash by kdump/kexec. The crash dump kernel must be compiled - to a memory address not used by the main kernel or BIOS using - PHYSICAL_START, or it must be built as a relocatable image - (CONFIG_RELOCATABLE=y). - For more details see Documentation/admin-guide/kdump/kdump.rst +config ARCH_SUPPORTS_KEXEC_JUMP + def_bool y -config KEXEC_JUMP - bool "kexec jump" - depends on KEXEC && HIBERNATION - help - Jump between original kernel and kexeced kernel and invoke - code in physical address mode via KEXEC +config ARCH_SUPPORTS_CRASH_DUMP + def_bool X86_64 || (X86_32 && HIGHMEM) + +config ARCH_SUPPORTS_CRASH_HOTPLUG + def_bool y config PHYSICAL_START hex "Physical address where the kernel is loaded" if (EXPERT || CRASH_DUMP) diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h index 29e083b92813..836c170d3087 100644 --- a/arch/x86/include/asm/irq.h +++ b/arch/x86/include/asm/irq.h @@ -42,7 +42,7 @@ extern void init_ISA_irqs(void); #ifdef CONFIG_X86_LOCAL_APIC void arch_trigger_cpumask_backtrace(const struct cpumask *mask, - bool exclude_self); + int exclude_cpu); #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace #endif diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h index 5b77bbc28f96..3be6a98751f0 100644 --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -209,6 +209,24 @@ typedef void crash_vmclear_fn(void); extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss; extern void kdump_nmi_shootdown_cpus(void); +#ifdef CONFIG_CRASH_HOTPLUG +void arch_crash_handle_hotplug_event(struct kimage *image); +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event + +#ifdef CONFIG_HOTPLUG_CPU +int arch_crash_hotplug_cpu_support(void); +#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support +#endif + +#ifdef CONFIG_MEMORY_HOTPLUG +int arch_crash_hotplug_memory_support(void); +#define crash_hotplug_memory_support arch_crash_hotplug_memory_support +#endif + +unsigned int arch_crash_get_elfcorehdr_size(void); +#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size +#endif + #endif /* __ASSEMBLY__ */ #endif /* _ASM_X86_KEXEC_H */ diff --git a/arch/x86/include/asm/rmwcc.h b/arch/x86/include/asm/rmwcc.h index 7fa611216417..4b081e0d3306 100644 --- a/arch/x86/include/asm/rmwcc.h +++ b/arch/x86/include/asm/rmwcc.h @@ -2,12 +2,7 @@ #ifndef _ASM_X86_RMWcc #define _ASM_X86_RMWcc -/* This counts to 12. Any more, it will return 13th argument. */ -#define __RMWcc_ARGS(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _n, X...) _n -#define RMWcc_ARGS(X...) __RMWcc_ARGS(, ##X, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) - -#define __RMWcc_CONCAT(a, b) a ## b -#define RMWcc_CONCAT(a, b) __RMWcc_CONCAT(a, b) +#include <linux/args.h> #define __CLOBBERS_MEM(clb...) "memory", ## clb @@ -48,7 +43,7 @@ cc_label: c = true; \ #define GEN_UNARY_RMWcc_3(op, var, cc) \ GEN_UNARY_RMWcc_4(op, var, cc, "%[var]") -#define GEN_UNARY_RMWcc(X...) RMWcc_CONCAT(GEN_UNARY_RMWcc_, RMWcc_ARGS(X))(X) +#define GEN_UNARY_RMWcc(X...) CONCATENATE(GEN_UNARY_RMWcc_, COUNT_ARGS(X))(X) #define GEN_BINARY_RMWcc_6(op, var, cc, vcon, _val, arg0) \ __GEN_RMWcc(op " %[val], " arg0, var, cc, \ @@ -57,7 +52,7 @@ cc_label: c = true; \ #define GEN_BINARY_RMWcc_5(op, var, cc, vcon, val) \ GEN_BINARY_RMWcc_6(op, var, cc, vcon, val, "%[var]") -#define GEN_BINARY_RMWcc(X...) RMWcc_CONCAT(GEN_BINARY_RMWcc_, RMWcc_ARGS(X))(X) +#define GEN_BINARY_RMWcc(X...) CONCATENATE(GEN_BINARY_RMWcc_, COUNT_ARGS(X))(X) #define GEN_UNARY_SUFFIXED_RMWcc(op, suffix, var, cc, clobbers...) \ __GEN_RMWcc(op " %[var]\n\t" suffix, var, cc, \ diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h index a6e8373a5170..3fa87e5e11ab 100644 --- a/arch/x86/include/asm/sections.h +++ b/arch/x86/include/asm/sections.h @@ -2,8 +2,6 @@ #ifndef _ASM_X86_SECTIONS_H #define _ASM_X86_SECTIONS_H -#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed - #include <asm-generic/sections.h> #include <asm/extable.h> @@ -18,20 +16,4 @@ extern char __end_of_kernel_reserve[]; extern unsigned long _brk_start, _brk_end; -static inline bool arch_is_kernel_initmem_freed(unsigned long addr) -{ - /* - * If _brk_start has not been cleared, brk allocation is incomplete, - * and we can not make assumptions about its use. - */ - if (_brk_start) - return 0; - - /* - * After brk allocation is complete, space between _brk_end and _end - * is available for allocation. - */ - return addr >= _brk_end && addr < (unsigned long)&_end; -} - #endif /* _ASM_X86_SECTIONS_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 4070a01c11b7..00df34c263cc 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -33,11 +33,10 @@ KCSAN_SANITIZE := n KMSAN_SANITIZE_head$(BITS).o := n KMSAN_SANITIZE_nmi.o := n -# If instrumentation of this dir is enabled, boot hangs during first second. -# Probably could be more selective here, but note that files related to irqs, -# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to -# non-deterministic coverage. -KCOV_INSTRUMENT := n +# If instrumentation of the following files is enabled, boot hangs during +# first second. +KCOV_INSTRUMENT_head$(BITS).o := n +KCOV_INSTRUMENT_sev.o := n CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c index 34a992e275ef..d6e01f924299 100644 --- a/arch/x86/kernel/apic/hw_nmi.c +++ b/arch/x86/kernel/apic/hw_nmi.c @@ -34,9 +34,9 @@ static void nmi_raise_cpu_backtrace(cpumask_t *mask) apic->send_IPI_mask(mask, NMI_VECTOR); } -void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self) +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) { - nmi_trigger_cpumask_backtrace(mask, exclude_self, + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, nmi_raise_cpu_backtrace); } diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index cdd92ab43cda..587c7743fd21 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -158,8 +158,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs) crash_save_cpu(regs, safe_smp_processor_id()); } -#ifdef CONFIG_KEXEC_FILE - +#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_HOTPLUG) static int get_nr_ram_ranges_callback(struct resource *res, void *arg) { unsigned int *nr_ranges = arg; @@ -231,7 +230,7 @@ static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg) /* Prepare elf headers. Return addr and size */ static int prepare_elf_headers(struct kimage *image, void **addr, - unsigned long *sz) + unsigned long *sz, unsigned long *nr_mem_ranges) { struct crash_mem *cmem; int ret; @@ -249,6 +248,9 @@ static int prepare_elf_headers(struct kimage *image, void **addr, if (ret) goto out; + /* Return the computed number of memory ranges, for hotplug usage */ + *nr_mem_ranges = cmem->nr_ranges; + /* By default prepare 64bit headers */ ret = crash_prepare_elf64_headers(cmem, IS_ENABLED(CONFIG_X86_64), addr, sz); @@ -256,7 +258,9 @@ out: vfree(cmem); return ret; } +#endif +#ifdef CONFIG_KEXEC_FILE static int add_e820_entry(struct boot_params *params, struct e820_entry *entry) { unsigned int nr_e820_entries; @@ -371,18 +375,42 @@ out: int crash_load_segments(struct kimage *image) { int ret; + unsigned long pnum = 0; struct kexec_buf kbuf = { .image = image, .buf_min = 0, .buf_max = ULONG_MAX, .top_down = false }; /* Prepare elf headers and add a segment */ - ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz); + ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz, &pnum); if (ret) return ret; - image->elf_headers = kbuf.buffer; - image->elf_headers_sz = kbuf.bufsz; + image->elf_headers = kbuf.buffer; + image->elf_headers_sz = kbuf.bufsz; + kbuf.memsz = kbuf.bufsz; + +#ifdef CONFIG_CRASH_HOTPLUG + /* + * The elfcorehdr segment size accounts for VMCOREINFO, kernel_map, + * maximum CPUs and maximum memory ranges. + */ + if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) + pnum = 2 + CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES; + else + pnum += 2 + CONFIG_NR_CPUS_DEFAULT; + + if (pnum < (unsigned long)PN_XNUM) { + kbuf.memsz = pnum * sizeof(Elf64_Phdr); + kbuf.memsz += sizeof(Elf64_Ehdr); + + image->elfcorehdr_index = image->nr_segments; + + /* Mark as usable to crash kernel, else crash kernel fails on boot */ + image->elf_headers_sz = kbuf.memsz; + } else { + pr_err("number of Phdrs %lu exceeds max\n", pnum); + } +#endif - kbuf.memsz = kbuf.bufsz; kbuf.buf_align = ELF_CORE_HEADER_ALIGN; kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(&kbuf); @@ -395,3 +423,103 @@ int crash_load_segments(struct kimage *image) return ret; } #endif /* CONFIG_KEXEC_FILE */ + +#ifdef CONFIG_CRASH_HOTPLUG + +#undef pr_fmt +#define pr_fmt(fmt) "crash hp: " fmt + +/* These functions provide the value for the sysfs crash_hotplug nodes */ +#ifdef CONFIG_HOTPLUG_CPU +int arch_crash_hotplug_cpu_support(void) +{ + return crash_check_update_elfcorehdr(); +} +#endif + +#ifdef CONFIG_MEMORY_HOTPLUG +int arch_crash_hotplug_memory_support(void) +{ + return crash_check_update_elfcorehdr(); +} +#endif + +unsigned int arch_crash_get_elfcorehdr_size(void) +{ + unsigned int sz; + + /* kernel_map, VMCOREINFO and maximum CPUs */ + sz = 2 + CONFIG_NR_CPUS_DEFAULT; + if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) + sz += CONFIG_CRASH_MAX_MEMORY_RANGES; + sz *= sizeof(Elf64_Phdr); + return sz; +} + +/** + * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes + * @image: a pointer to kexec_crash_image + * + * Prepare the new elfcorehdr and replace the existing elfcorehdr. + */ +void arch_crash_handle_hotplug_event(struct kimage *image) +{ + void *elfbuf = NULL, *old_elfcorehdr; + unsigned long nr_mem_ranges; + unsigned long mem, memsz; + unsigned long elfsz = 0; + + /* + * As crash_prepare_elf64_headers() has already described all + * possible CPUs, there is no need to update the elfcorehdr + * for additional CPU changes. + */ + if ((image->file_mode || image->elfcorehdr_updated) && + ((image->hp_action == KEXEC_CRASH_HP_ADD_CPU) || + (image->hp_action == KEXEC_CRASH_HP_REMOVE_CPU))) + return; + + /* + * Create the new elfcorehdr reflecting the changes to CPU and/or + * memory resources. + */ + if (prepare_elf_headers(image, &elfbuf, &elfsz, &nr_mem_ranges)) { + pr_err("unable to create new elfcorehdr"); + goto out; + } + + /* + * Obtain address and size of the elfcorehdr segment, and + * check it against the new elfcorehdr buffer. + */ + mem = image->segment[image->elfcorehdr_index].mem; + memsz = image->segment[image->elfcorehdr_index].memsz; + if (elfsz > memsz) { + pr_err("update elfcorehdr elfsz %lu > memsz %lu", + elfsz, memsz); + goto out; + } + + /* + * Copy new elfcorehdr over the old elfcorehdr at destination. + */ + old_elfcorehdr = kmap_local_page(pfn_to_page(mem >> PAGE_SHIFT)); + if (!old_elfcorehdr) { + pr_err("mapping elfcorehdr segment failed\n"); + goto out; + } + + /* + * Temporarily invalidate the crash image while the + * elfcorehdr is updated. + */ + xchg(&kexec_crash_image, NULL); + memcpy_flushcache(old_elfcorehdr, elfbuf, elfsz); + xchg(&kexec_crash_image, image); + kunmap_local(old_elfcorehdr); + pr_debug("updated elfcorehdr\n"); + +out: + vfree(elfbuf); +} +#endif diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c index dd40d3fea74e..631512f7ec85 100644 --- a/arch/x86/pci/amd_bus.c +++ b/arch/x86/pci/amd_bus.c @@ -51,6 +51,14 @@ static struct pci_root_info __init *find_pci_root_info(int node, int link) return NULL; } +static inline resource_size_t cap_resource(u64 val) +{ + if (val > RESOURCE_SIZE_MAX) + return RESOURCE_SIZE_MAX; + + return val; +} + /** * early_root_info_init() * called before pcibios_scan_root and pci_scan_bus diff --git a/arch/x86/pci/bus_numa.c b/arch/x86/pci/bus_numa.c index 2752c02e3f0e..e4a525e59eaf 100644 --- a/arch/x86/pci/bus_numa.c +++ b/arch/x86/pci/bus_numa.c @@ -101,7 +101,7 @@ void update_res(struct pci_root_info *info, resource_size_t start, if (start > end) return; - if (start == MAX_RESOURCE) + if (start == RESOURCE_SIZE_MAX) return; if (!merge) diff --git a/arch/xtensa/configs/cadence_csp_defconfig b/arch/xtensa/configs/cadence_csp_defconfig index 8c66b9307f34..91c4c4cae8a7 100644 --- a/arch/xtensa/configs/cadence_csp_defconfig +++ b/arch/xtensa/configs/cadence_csp_defconfig @@ -21,7 +21,7 @@ CONFIG_INITRAMFS_SOURCE="$$KERNEL_INITRAMFS_SOURCE" # CONFIG_RD_LZO is not set # CONFIG_RD_LZ4 is not set CONFIG_CC_OPTIMIZE_FOR_SIZE=y -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y CONFIG_PROFILING=y CONFIG_MODULES=y CONFIG_MODULE_FORCE_LOAD=y diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index fe6690ecf563..43dab03958f1 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -282,6 +282,16 @@ static ssize_t print_cpus_nohz_full(struct device *dev, static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL); #endif +#ifdef CONFIG_CRASH_HOTPLUG +static ssize_t crash_hotplug_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + return sysfs_emit(buf, "%d\n", crash_hotplug_cpu_support()); +} +static DEVICE_ATTR_ADMIN_RO(crash_hotplug); +#endif + static void cpu_device_release(struct device *dev) { /* @@ -469,6 +479,9 @@ static struct attribute *cpu_root_attrs[] = { #ifdef CONFIG_NO_HZ_FULL &dev_attr_nohz_full.attr, #endif +#ifdef CONFIG_CRASH_HOTPLUG + &dev_attr_crash_hotplug.attr, +#endif #ifdef CONFIG_GENERIC_CPU_AUTOPROBE &dev_attr_modalias.attr, #endif diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 8191709c9ad2..f3b9a4d0fa3b 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -497,6 +497,16 @@ static ssize_t auto_online_blocks_store(struct device *dev, static DEVICE_ATTR_RW(auto_online_blocks); +#ifdef CONFIG_CRASH_HOTPLUG +#include <linux/kexec.h> +static ssize_t crash_hotplug_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "%d\n", crash_hotplug_memory_support()); +} +static DEVICE_ATTR_RO(crash_hotplug); +#endif + /* * Some architectures will have custom drivers to do this, and * will not need to do it from userspace. The fake hot-add code @@ -896,6 +906,9 @@ static struct attribute *memory_root_attrs[] = { &dev_attr_block_size_bytes.attr, &dev_attr_auto_online_blocks.attr, +#ifdef CONFIG_CRASH_HOTPLUG + &dev_attr_crash_hotplug.attr, +#endif NULL }; diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 0fcc8615fb4f..1052b0f2d4cf 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -692,23 +692,23 @@ static const struct file_operations full_fops = { static const struct memdev { const char *name; - umode_t mode; const struct file_operations *fops; fmode_t fmode; + umode_t mode; } devlist[] = { #ifdef CONFIG_DEVMEM - [DEVMEM_MINOR] = { "mem", 0, &mem_fops, FMODE_UNSIGNED_OFFSET }, + [DEVMEM_MINOR] = { "mem", &mem_fops, FMODE_UNSIGNED_OFFSET, 0 }, #endif - [3] = { "null", 0666, &null_fops, FMODE_NOWAIT }, + [3] = { "null", &null_fops, FMODE_NOWAIT, 0666 }, #ifdef CONFIG_DEVPORT - [4] = { "port", 0, &port_fops, 0 }, + [4] = { "port", &port_fops, 0, 0 }, #endif - [5] = { "zero", 0666, &zero_fops, FMODE_NOWAIT }, - [7] = { "full", 0666, &full_fops, 0 }, - [8] = { "random", 0666, &random_fops, FMODE_NOWAIT }, - [9] = { "urandom", 0666, &urandom_fops, FMODE_NOWAIT }, + [5] = { "zero", &zero_fops, FMODE_NOWAIT, 0666 }, + [7] = { "full", &full_fops, 0, 0666 }, + [8] = { "random", &random_fops, FMODE_NOWAIT, 0666 }, + [9] = { "urandom", &urandom_fops, FMODE_NOWAIT, 0666 }, #ifdef CONFIG_PRINTK - [11] = { "kmsg", 0644, &kmsg_fops, 0 }, + [11] = { "kmsg", &kmsg_fops, 0, 0644 }, #endif }; diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c index 6b2d8a1e2aa9..290e856fe9e9 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c @@ -21,6 +21,7 @@ * DEALINGS IN THE SOFTWARE. */ +#include <linux/math.h> #include <linux/string_helpers.h> #include "i915_reg.h" diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.h b/drivers/gpu/drm/i915/display/intel_dpll_mgr.h index ba62eb5d7c51..04e6810954b2 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.h +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.h @@ -29,13 +29,6 @@ #include "intel_wakeref.h" -/*FIXME: Move this to a more appropriate place. */ -#define abs_diff(a, b) ({ \ - typeof(a) __a = (a); \ - typeof(b) __b = (b); \ - (void) (&__a == &__b); \ - __a > __b ? (__a - __b) : (__b - __a); }) - enum tc_port; struct drm_i915_private; struct intel_atomic_state; diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c index af1612044eef..841316582ea9 100644 --- a/drivers/gpu/ipu-v3/ipu-image-convert.c +++ b/drivers/gpu/ipu-v3/ipu-image-convert.c @@ -7,7 +7,10 @@ #include <linux/interrupt.h> #include <linux/dma-mapping.h> +#include <linux/math.h> + #include <video/imx-ipu-image-convert.h> + #include "ipu-prv.h" /* @@ -543,7 +546,7 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, unsigned int in_pos; unsigned int in_pos_aligned; unsigned int in_pos_rounded; - unsigned int abs_diff; + unsigned int diff; /* * Tiles in the right row / bottom column may not be allowed to @@ -575,15 +578,11 @@ static void find_best_seam(struct ipu_image_convert_ctx *ctx, (in_edge - in_pos_rounded) % in_burst) continue; - if (in_pos < in_pos_aligned) - abs_diff = in_pos_aligned - in_pos; - else - abs_diff = in_pos - in_pos_aligned; - - if (abs_diff < min_diff) { + diff = abs_diff(in_pos, in_pos_aligned); + if (diff < min_diff) { in_seam = in_pos_rounded; out_seam = out_pos; - min_diff = abs_diff; + min_diff = diff; } } diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig index 09e422da482f..4b9036c6d45b 100644 --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -89,6 +89,7 @@ config ALPINE_MSI config AL_FIC bool "Amazon's Annapurna Labs Fabric Interrupt Controller" depends on OF + depends on HAS_IOMEM select GENERIC_IRQ_CHIP select IRQ_DOMAIN help diff --git a/drivers/net/ethernet/altera/Kconfig b/drivers/net/ethernet/altera/Kconfig index 17985319088c..4ef819a9a1ad 100644 --- a/drivers/net/ethernet/altera/Kconfig +++ b/drivers/net/ethernet/altera/Kconfig @@ -2,6 +2,7 @@ config ALTERA_TSE tristate "Altera Triple-Speed Ethernet MAC support" depends on HAS_DMA + depends on HAS_IOMEM select PHYLIB select PHYLINK select PCS_LYNX diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c index 82d35dbbfa6c..9be63a1f1f0c 100644 --- a/drivers/tty/serial/omap-serial.c +++ b/drivers/tty/serial/omap-serial.c @@ -222,16 +222,11 @@ static inline int calculate_baud_abs_diff(struct uart_port *port, unsigned int baud, unsigned int mode) { unsigned int n = port->uartclk / (mode * baud); - int abs_diff; if (n == 0) n = 1; - abs_diff = baud - (port->uartclk / (mode * n)); - if (abs_diff < 0) - abs_diff = -abs_diff; - - return abs_diff; + return abs_diff(baud, port->uartclk / (mode * n)); } /* diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 63db04b9113a..27d8e3a1aace 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -3031,7 +3031,7 @@ void __do_SAK(struct tty_struct *tty) } while_each_pid_task(session, PIDTYPE_SID, p); /* Now kill any processes that happen to have the tty open */ - do_each_thread(g, p) { + for_each_process_thread(g, p) { if (p->signal->tty == tty) { tty_notice(tty, "SAK: killed process %d (%s): by controlling tty\n", task_pid_nr(p), p->comm); @@ -3048,7 +3048,7 @@ void __do_SAK(struct tty_struct *tty) PIDTYPE_SID); } task_unlock(p); - } while_each_thread(g, p); + } read_unlock(&tasklist_lock); put_pid(session); } diff --git a/drivers/video/fbdev/core/svgalib.c b/drivers/video/fbdev/core/svgalib.c index 9e01322fabe3..2cba158888ea 100644 --- a/drivers/video/fbdev/core/svgalib.c +++ b/drivers/video/fbdev/core/svgalib.c @@ -14,6 +14,7 @@ #include <linux/kernel.h> #include <linux/string.h> #include <linux/fb.h> +#include <linux/math.h> #include <linux/svga.h> #include <asm/types.h> #include <asm/io.h> @@ -372,12 +373,6 @@ EXPORT_SYMBOL(svga_get_caps); * F_VCO = (F_BASE * M) / N * F_OUT = F_VCO / (2^R) */ - -static inline u32 abs_diff(u32 a, u32 b) -{ - return (a > b) ? (a - b) : (b - a); -} - int svga_compute_pll(const struct svga_pll *pll, u32 f_wanted, u16 *m, u16 *n, u16 *r, int node) { u16 am, an, ar; diff --git a/fs/adfs/dir_f.h b/fs/adfs/dir_f.h index a5393e6cf9f4..4e6c53d59ebd 100644 --- a/fs/adfs/dir_f.h +++ b/fs/adfs/dir_f.h @@ -58,9 +58,4 @@ struct adfs_newdirtail { __u8 dircheckbyte; } __attribute__((packed)); -union adfs_dirtail { - struct adfs_olddirtail old; - struct adfs_newdirtail new; -}; - #endif diff --git a/fs/efs/efs.h b/fs/efs/efs.h index 13a4d9622633..918d2b9abb76 100644 --- a/fs/efs/efs.h +++ b/fs/efs/efs.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * Copyright (c) 1999 Al Smith + * Copyright (c) 1999 Al Smith, <Al.Smith@aeschi.ch.eu.org> * * Portions derived from work (c) 1995,1996 Christian Vogelgsang. * Portions derived from IRIX header files (c) 1988 Silicon Graphics @@ -19,9 +19,6 @@ #define EFS_VERSION "1.0a" -static const char cprt[] = "EFS: "EFS_VERSION" - (c) 1999 Al Smith <Al.Smith@aeschi.ch.eu.org>"; - - /* 1 block is 512 bytes */ #define EFS_BLOCKSIZE_BITS 9 #define EFS_BLOCKSIZE (1 << EFS_BLOCKSIZE_BITS) diff --git a/fs/exec.c b/fs/exec.c index 0b9484358a49..6518e33ea813 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1277,8 +1277,8 @@ int begin_new_exec(struct linux_binprm * bprm) /* * Must be called _before_ exec_mmap() as bprm->mm is - * not visible until then. This also enables the update - * to be lockless. + * not visible until then. Doing it here also ensures + * we don't race against replace_mm_exe_file(). */ retval = set_mm_exe_file(bprm->mm, bprm->file); if (retval) diff --git a/fs/fs_struct.c b/fs/fs_struct.c index 04b3f5b9c629..64c2d0814ed6 100644 --- a/fs/fs_struct.c +++ b/fs/fs_struct.c @@ -62,7 +62,7 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root) int count = 0; read_lock(&tasklist_lock); - do_each_thread(g, p) { + for_each_process_thread(g, p) { task_lock(p); fs = p->fs; if (fs) { @@ -79,7 +79,7 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root) spin_unlock(&fs->lock); } task_unlock(p); - } while_each_thread(g, p); + } read_unlock(&tasklist_lock); while (count--) path_put(old_root); diff --git a/fs/hfsplus/extents.c b/fs/hfsplus/extents.c index 7a542f3dbe50..3c572e44f2ad 100644 --- a/fs/hfsplus/extents.c +++ b/fs/hfsplus/extents.c @@ -448,9 +448,9 @@ int hfsplus_file_extend(struct inode *inode, bool zeroout) if (sbi->alloc_file->i_size * 8 < sbi->total_blocks - sbi->free_blocks + 8) { /* extend alloc file */ - pr_err("extend alloc file! (%llu,%u,%u)\n", - sbi->alloc_file->i_size * 8, - sbi->total_blocks, sbi->free_blocks); + pr_err_ratelimited("extend alloc file! (%llu,%u,%u)\n", + sbi->alloc_file->i_size * 8, + sbi->total_blocks, sbi->free_blocks); return -ENOSPC; } diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c index 6ce8617b562d..7342de296ec3 100644 --- a/fs/nilfs2/alloc.c +++ b/fs/nilfs2/alloc.c @@ -205,7 +205,8 @@ static int nilfs_palloc_get_block(struct inode *inode, unsigned long blkoff, int ret; spin_lock(lock); - if (prev->bh && blkoff == prev->blkoff) { + if (prev->bh && blkoff == prev->blkoff && + likely(buffer_uptodate(prev->bh))) { get_bh(prev->bh); *bhp = prev->bh; spin_unlock(lock); diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index d588c719d743..1a8bd5993476 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -1025,7 +1025,7 @@ int nilfs_load_inode_block(struct inode *inode, struct buffer_head **pbh) int err; spin_lock(&nilfs->ns_inode_lock); - if (ii->i_bh == NULL) { + if (ii->i_bh == NULL || unlikely(!buffer_uptodate(ii->i_bh))) { spin_unlock(&nilfs->ns_inode_lock); err = nilfs_ifile_get_inode_block(ii->i_root->ifile, inode->i_ino, pbh); @@ -1034,7 +1034,10 @@ int nilfs_load_inode_block(struct inode *inode, struct buffer_head **pbh) spin_lock(&nilfs->ns_inode_lock); if (ii->i_bh == NULL) ii->i_bh = *pbh; - else { + else if (unlikely(!buffer_uptodate(ii->i_bh))) { + __brelse(ii->i_bh); + ii->i_bh = *pbh; + } else { brelse(*pbh); *pbh = ii->i_bh; } diff --git a/fs/ocfs2/cluster/netdebug.c b/fs/ocfs2/cluster/netdebug.c index 35c05c18de59..bc27301eab6d 100644 --- a/fs/ocfs2/cluster/netdebug.c +++ b/fs/ocfs2/cluster/netdebug.c @@ -44,17 +44,17 @@ static LIST_HEAD(send_tracking); void o2net_debug_add_nst(struct o2net_send_tracking *nst) { - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); list_add(&nst->st_net_debug_item, &send_tracking); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); } void o2net_debug_del_nst(struct o2net_send_tracking *nst) { - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); if (!list_empty(&nst->st_net_debug_item)) list_del_init(&nst->st_net_debug_item); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); } static struct o2net_send_tracking @@ -84,9 +84,9 @@ static void *nst_seq_start(struct seq_file *seq, loff_t *pos) { struct o2net_send_tracking *nst, *dummy_nst = seq->private; - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); nst = next_nst(dummy_nst); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); return nst; } @@ -95,13 +95,13 @@ static void *nst_seq_next(struct seq_file *seq, void *v, loff_t *pos) { struct o2net_send_tracking *nst, *dummy_nst = seq->private; - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); nst = next_nst(dummy_nst); list_del_init(&dummy_nst->st_net_debug_item); if (nst) list_add(&dummy_nst->st_net_debug_item, &nst->st_net_debug_item); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); return nst; /* unused, just needs to be null when done */ } @@ -112,7 +112,7 @@ static int nst_seq_show(struct seq_file *seq, void *v) ktime_t now; s64 sock, send, status; - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); nst = next_nst(dummy_nst); if (!nst) goto out; @@ -145,7 +145,7 @@ static int nst_seq_show(struct seq_file *seq, void *v) (long long)status); out: - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); return 0; } @@ -191,16 +191,16 @@ static const struct file_operations nst_seq_fops = { void o2net_debug_add_sc(struct o2net_sock_container *sc) { - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); list_add(&sc->sc_net_debug_item, &sock_containers); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); } void o2net_debug_del_sc(struct o2net_sock_container *sc) { - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); list_del_init(&sc->sc_net_debug_item); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); } struct o2net_sock_debug { @@ -236,9 +236,9 @@ static void *sc_seq_start(struct seq_file *seq, loff_t *pos) struct o2net_sock_debug *sd = seq->private; struct o2net_sock_container *sc, *dummy_sc = sd->dbg_sock; - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); sc = next_sc(dummy_sc); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); return sc; } @@ -248,12 +248,12 @@ static void *sc_seq_next(struct seq_file *seq, void *v, loff_t *pos) struct o2net_sock_debug *sd = seq->private; struct o2net_sock_container *sc, *dummy_sc = sd->dbg_sock; - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); sc = next_sc(dummy_sc); list_del_init(&dummy_sc->sc_net_debug_item); if (sc) list_add(&dummy_sc->sc_net_debug_item, &sc->sc_net_debug_item); - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); return sc; /* unused, just needs to be null when done */ } @@ -349,7 +349,7 @@ static int sc_seq_show(struct seq_file *seq, void *v) struct o2net_sock_debug *sd = seq->private; struct o2net_sock_container *sc, *dummy_sc = sd->dbg_sock; - spin_lock(&o2net_debug_lock); + spin_lock_bh(&o2net_debug_lock); sc = next_sc(dummy_sc); if (sc) { @@ -359,7 +359,7 @@ static int sc_seq_show(struct seq_file *seq, void *v) sc_show_sock_stats(seq, sc); } - spin_unlock(&o2net_debug_lock); + spin_unlock_bh(&o2net_debug_lock); return 0; } diff --git a/fs/ocfs2/cluster/quorum.c b/fs/ocfs2/cluster/quorum.c index 189c111bc371..15d0ed9c13e5 100644 --- a/fs/ocfs2/cluster/quorum.c +++ b/fs/ocfs2/cluster/quorum.c @@ -93,7 +93,7 @@ static void o2quo_make_decision(struct work_struct *work) int lowest_hb, lowest_reachable = 0, fence = 0; struct o2quo_state *qs = &o2quo_state; - spin_lock(&qs->qs_lock); + spin_lock_bh(&qs->qs_lock); lowest_hb = find_first_bit(qs->qs_hb_bm, O2NM_MAX_NODES); if (lowest_hb != O2NM_MAX_NODES) @@ -146,14 +146,14 @@ static void o2quo_make_decision(struct work_struct *work) out: if (fence) { - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); o2quo_fence_self(); } else { mlog(ML_NOTICE, "not fencing this node, heartbeating: %d, " "connected: %d, lowest: %d (%sreachable)\n", qs->qs_heartbeating, qs->qs_connected, lowest_hb, lowest_reachable ? "" : "un"); - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); } @@ -196,7 +196,7 @@ void o2quo_hb_up(u8 node) { struct o2quo_state *qs = &o2quo_state; - spin_lock(&qs->qs_lock); + spin_lock_bh(&qs->qs_lock); qs->qs_heartbeating++; mlog_bug_on_msg(qs->qs_heartbeating == O2NM_MAX_NODES, @@ -211,7 +211,7 @@ void o2quo_hb_up(u8 node) else o2quo_clear_hold(qs, node); - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); } /* hb going down releases any holds we might have had due to this node from @@ -220,7 +220,7 @@ void o2quo_hb_down(u8 node) { struct o2quo_state *qs = &o2quo_state; - spin_lock(&qs->qs_lock); + spin_lock_bh(&qs->qs_lock); qs->qs_heartbeating--; mlog_bug_on_msg(qs->qs_heartbeating < 0, @@ -233,7 +233,7 @@ void o2quo_hb_down(u8 node) o2quo_clear_hold(qs, node); - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); } /* this tells us that we've decided that the node is still heartbeating @@ -245,14 +245,14 @@ void o2quo_hb_still_up(u8 node) { struct o2quo_state *qs = &o2quo_state; - spin_lock(&qs->qs_lock); + spin_lock_bh(&qs->qs_lock); mlog(0, "node %u\n", node); qs->qs_pending = 1; o2quo_clear_hold(qs, node); - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); } /* This is analogous to hb_up. as a node's connection comes up we delay the @@ -264,7 +264,7 @@ void o2quo_conn_up(u8 node) { struct o2quo_state *qs = &o2quo_state; - spin_lock(&qs->qs_lock); + spin_lock_bh(&qs->qs_lock); qs->qs_connected++; mlog_bug_on_msg(qs->qs_connected == O2NM_MAX_NODES, @@ -279,7 +279,7 @@ void o2quo_conn_up(u8 node) else o2quo_clear_hold(qs, node); - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); } /* we've decided that we won't ever be connecting to the node again. if it's @@ -290,7 +290,7 @@ void o2quo_conn_err(u8 node) { struct o2quo_state *qs = &o2quo_state; - spin_lock(&qs->qs_lock); + spin_lock_bh(&qs->qs_lock); if (test_bit(node, qs->qs_conn_bm)) { qs->qs_connected--; @@ -307,7 +307,7 @@ void o2quo_conn_err(u8 node) mlog(0, "node %u, %d total\n", node, qs->qs_connected); - spin_unlock(&qs->qs_lock); + spin_unlock_bh(&qs->qs_lock); } void o2quo_init(void) diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index c19c730c26e2..e8e7d47265aa 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -114,9 +114,9 @@ int ocfs2_compute_replay_slots(struct ocfs2_super *osb) if (osb->replay_map) return 0; - replay_map = kzalloc(sizeof(struct ocfs2_replay_map) + - (osb->max_slots * sizeof(char)), GFP_KERNEL); - + replay_map = kzalloc(struct_size(replay_map, rm_replay_slots, + osb->max_slots), + GFP_KERNEL); if (!replay_map) { mlog_errno(-ENOMEM); return -ENOMEM; @@ -178,16 +178,13 @@ int ocfs2_recovery_init(struct ocfs2_super *osb) osb->recovery_thread_task = NULL; init_waitqueue_head(&osb->recovery_event); - rm = kzalloc(sizeof(struct ocfs2_recovery_map) + - osb->max_slots * sizeof(unsigned int), + rm = kzalloc(struct_size(rm, rm_entries, osb->max_slots), GFP_KERNEL); if (!rm) { mlog_errno(-ENOMEM); return -ENOMEM; } - rm->rm_entries = (unsigned int *)((char *)rm + - sizeof(struct ocfs2_recovery_map)); osb->recovery_map = rm; return 0; diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h index 41c382f68529..41c9fe7e62f9 100644 --- a/fs/ocfs2/journal.h +++ b/fs/ocfs2/journal.h @@ -29,7 +29,7 @@ struct ocfs2_dinode; struct ocfs2_recovery_map { unsigned int rm_used; - unsigned int *rm_entries; + unsigned int rm_entries[]; }; diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c index e4a684d45308..5cd6d7771cea 100644 --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -1535,6 +1535,10 @@ static int ocfs2_rename(struct mnt_idmap *idmap, status = ocfs2_add_entry(handle, new_dentry, old_inode, OCFS2_I(old_inode)->ip_blkno, new_dir_bh, &target_insert); + if (status < 0) { + mlog_errno(status); + goto bail; + } } inode_set_ctime_current(old_inode); diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c index 988d1c076861..6b906424902b 100644 --- a/fs/ocfs2/super.c +++ b/fs/ocfs2/super.c @@ -1517,8 +1517,7 @@ static int ocfs2_show_options(struct seq_file *s, struct dentry *root) seq_printf(s, ",localflocks,"); if (osb->osb_cluster_stack[0]) - seq_show_option_n(s, "cluster_stack", osb->osb_cluster_stack, - OCFS2_STACK_LABEL_LEN); + seq_show_option(s, "cluster_stack", osb->osb_cluster_stack); if (opts & OCFS2_MOUNT_USRQUOTA) seq_printf(s, ",usrquota"); if (opts & OCFS2_MOUNT_GRPQUOTA) diff --git a/fs/proc/base.c b/fs/proc/base.c index 02e1cf03d94b..ffd54617c354 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -3815,11 +3815,10 @@ static struct task_struct *first_tid(struct pid *pid, int tid, loff_t f_pos, /* If we haven't found our starting place yet start * with the leader and walk nr threads forward. */ - pos = task = task->group_leader; - do { + for_each_thread(task, pos) { if (!nr--) goto found; - } while_each_thread(task, pos); + }; fail: pos = NULL; goto out; diff --git a/include/kunit/test.h b/include/kunit/test.h index d33114097d0d..68ff01aee244 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -12,6 +12,7 @@ #include <kunit/assert.h> #include <kunit/try-catch.h> +#include <linux/args.h> #include <linux/compiler.h> #include <linux/container_of.h> #include <linux/err.h> diff --git a/include/linux/args.h b/include/linux/args.h new file mode 100644 index 000000000000..8ff60a54eb7d --- /dev/null +++ b/include/linux/args.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _LINUX_ARGS_H +#define _LINUX_ARGS_H + +/* + * How do these macros work? + * + * In __COUNT_ARGS() _0 to _12 are just placeholders from the start + * in order to make sure _n is positioned over the correct number + * from 12 to 0 (depending on X, which is a variadic argument list). + * They serve no purpose other than occupying a position. Since each + * macro parameter must have a distinct identifier, those identifiers + * are as good as any. + * + * In COUNT_ARGS() we use actual integers, so __COUNT_ARGS() returns + * that as _n. + */ + +/* This counts to 12. Any more, it will return 13th argument. */ +#define __COUNT_ARGS(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _n, X...) _n +#define COUNT_ARGS(X...) __COUNT_ARGS(, ##X, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) + +/* Concatenate two parameters, but allow them to be expanded beforehand. */ +#define __CONCAT(a, b) a ## b +#define CONCATENATE(a, b) __CONCAT(a, b) + +#endif /* _LINUX_ARGS_H */ diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index f196c19f8e55..7c67c17321d4 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -5,6 +5,7 @@ #ifndef __LINUX_ARM_SMCCC_H #define __LINUX_ARM_SMCCC_H +#include <linux/args.h> #include <linux/init.h> #include <uapi/linux/const.h> @@ -413,31 +414,26 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, #endif -#define ___count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x +#define __constraint_read_2 "r" (arg0) +#define __constraint_read_3 __constraint_read_2, "r" (arg1) +#define __constraint_read_4 __constraint_read_3, "r" (arg2) +#define __constraint_read_5 __constraint_read_4, "r" (arg3) +#define __constraint_read_6 __constraint_read_5, "r" (arg4) +#define __constraint_read_7 __constraint_read_6, "r" (arg5) +#define __constraint_read_8 __constraint_read_7, "r" (arg6) +#define __constraint_read_9 __constraint_read_8, "r" (arg7) -#define __count_args(...) \ - ___count_args(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0) - -#define __constraint_read_0 "r" (arg0) -#define __constraint_read_1 __constraint_read_0, "r" (arg1) -#define __constraint_read_2 __constraint_read_1, "r" (arg2) -#define __constraint_read_3 __constraint_read_2, "r" (arg3) -#define __constraint_read_4 __constraint_read_3, "r" (arg4) -#define __constraint_read_5 __constraint_read_4, "r" (arg5) -#define __constraint_read_6 __constraint_read_5, "r" (arg6) -#define __constraint_read_7 __constraint_read_6, "r" (arg7) - -#define __declare_arg_0(a0, res) \ +#define __declare_arg_2(a0, res) \ struct arm_smccc_res *___res = res; \ register unsigned long arg0 asm("r0") = (u32)a0 -#define __declare_arg_1(a0, a1, res) \ +#define __declare_arg_3(a0, a1, res) \ typeof(a1) __a1 = a1; \ struct arm_smccc_res *___res = res; \ register unsigned long arg0 asm("r0") = (u32)a0; \ register typeof(a1) arg1 asm("r1") = __a1 -#define __declare_arg_2(a0, a1, a2, res) \ +#define __declare_arg_4(a0, a1, a2, res) \ typeof(a1) __a1 = a1; \ typeof(a2) __a2 = a2; \ struct arm_smccc_res *___res = res; \ @@ -445,7 +441,7 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, register typeof(a1) arg1 asm("r1") = __a1; \ register typeof(a2) arg2 asm("r2") = __a2 -#define __declare_arg_3(a0, a1, a2, a3, res) \ +#define __declare_arg_5(a0, a1, a2, a3, res) \ typeof(a1) __a1 = a1; \ typeof(a2) __a2 = a2; \ typeof(a3) __a3 = a3; \ @@ -455,34 +451,26 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, register typeof(a2) arg2 asm("r2") = __a2; \ register typeof(a3) arg3 asm("r3") = __a3 -#define __declare_arg_4(a0, a1, a2, a3, a4, res) \ +#define __declare_arg_6(a0, a1, a2, a3, a4, res) \ typeof(a4) __a4 = a4; \ - __declare_arg_3(a0, a1, a2, a3, res); \ + __declare_arg_5(a0, a1, a2, a3, res); \ register typeof(a4) arg4 asm("r4") = __a4 -#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res) \ +#define __declare_arg_7(a0, a1, a2, a3, a4, a5, res) \ typeof(a5) __a5 = a5; \ - __declare_arg_4(a0, a1, a2, a3, a4, res); \ + __declare_arg_6(a0, a1, a2, a3, a4, res); \ register typeof(a5) arg5 asm("r5") = __a5 -#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res) \ +#define __declare_arg_8(a0, a1, a2, a3, a4, a5, a6, res) \ typeof(a6) __a6 = a6; \ - __declare_arg_5(a0, a1, a2, a3, a4, a5, res); \ + __declare_arg_7(a0, a1, a2, a3, a4, a5, res); \ register typeof(a6) arg6 asm("r6") = __a6 -#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res) \ +#define __declare_arg_9(a0, a1, a2, a3, a4, a5, a6, a7, res) \ typeof(a7) __a7 = a7; \ - __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res); \ + __declare_arg_8(a0, a1, a2, a3, a4, a5, a6, res); \ register typeof(a7) arg7 asm("r7") = __a7 -#define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__) -#define __declare_args(count, ...) ___declare_args(count, __VA_ARGS__) - -#define ___constraints(count) \ - : __constraint_read_ ## count \ - : smccc_sve_clobbers "memory" -#define __constraints(count) ___constraints(count) - /* * We have an output list that is not necessarily used, and GCC feels * entitled to optimise the whole sequence away. "volatile" is what @@ -494,11 +482,14 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, register unsigned long r1 asm("r1"); \ register unsigned long r2 asm("r2"); \ register unsigned long r3 asm("r3"); \ - __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \ + CONCATENATE(__declare_arg_, \ + COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__); \ asm volatile(SMCCC_SVE_CHECK \ inst "\n" : \ "=r" (r0), "=r" (r1), "=r" (r2), "=r" (r3) \ - __constraints(__count_args(__VA_ARGS__))); \ + : CONCATENATE(__constraint_read_, \ + COUNT_ARGS(__VA_ARGS__)) \ + : smccc_sve_clobbers "memory"); \ if (___res) \ *___res = (typeof(*___res)){r0, r1, r2, r3}; \ } while (0) @@ -542,8 +533,12 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1, */ #define __fail_smccc_1_1(...) \ do { \ - __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \ - asm ("" : __constraints(__count_args(__VA_ARGS__))); \ + CONCATENATE(__declare_arg_, \ + COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__); \ + asm ("" : \ + : CONCATENATE(__constraint_read_, \ + COUNT_ARGS(__VA_ARGS__)) \ + : smccc_sve_clobbers "memory"); \ if (___res) \ ___res->a0 = SMCCC_RET_NOT_SUPPORTED; \ } while (0) diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h index de62a722431e..0c06561bf5ff 100644 --- a/include/linux/crash_core.h +++ b/include/linux/crash_core.h @@ -28,6 +28,8 @@ VMCOREINFO_BYTES) typedef u32 note_buf_t[CRASH_CORE_NOTE_BYTES/4]; +/* Per cpu memory for storing cpu states in case of system crash. */ +extern note_buf_t __percpu *crash_notes; void crash_update_vmcoreinfo_safecopy(void *ptr); void crash_save_vmcoreinfo(void); @@ -84,4 +86,29 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram, int parse_crashkernel_low(char *cmdline, unsigned long long system_ram, unsigned long long *crash_size, unsigned long long *crash_base); +/* Alignment required for elf header segment */ +#define ELF_CORE_HEADER_ALIGN 4096 + +struct crash_mem { + unsigned int max_nr_ranges; + unsigned int nr_ranges; + struct range ranges[]; +}; + +extern int crash_exclude_mem_range(struct crash_mem *mem, + unsigned long long mstart, + unsigned long long mend); +extern int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map, + void **addr, unsigned long *sz); + +struct kimage; +struct kexec_segment; + +#define KEXEC_CRASH_HP_NONE 0 +#define KEXEC_CRASH_HP_ADD_CPU 1 +#define KEXEC_CRASH_HP_REMOVE_CPU 2 +#define KEXEC_CRASH_HP_ADD_MEMORY 3 +#define KEXEC_CRASH_HP_REMOVE_MEMORY 4 +#define KEXEC_CRASH_HP_INVALID_CPU -1U + #endif /* LINUX_CRASH_CORE_H */ diff --git a/include/linux/genl_magic_func.h b/include/linux/genl_magic_func.h index 2984b0cb24b1..d4da060b7532 100644 --- a/include/linux/genl_magic_func.h +++ b/include/linux/genl_magic_func.h @@ -2,6 +2,7 @@ #ifndef GENL_MAGIC_FUNC_H #define GENL_MAGIC_FUNC_H +#include <linux/args.h> #include <linux/build_bug.h> #include <linux/genl_magic_struct.h> @@ -23,7 +24,7 @@ #define GENL_struct(tag_name, tag_number, s_name, s_fields) \ [tag_name] = { .type = NLA_NESTED }, -static struct nla_policy CONCAT_(GENL_MAGIC_FAMILY, _tla_nl_policy)[] = { +static struct nla_policy CONCATENATE(GENL_MAGIC_FAMILY, _tla_nl_policy)[] = { #include GENL_MAGIC_INCLUDE_FILE }; @@ -209,7 +210,7 @@ static int s_name ## _from_attrs_for_change(struct s_name *s, \ * Magic: define op number to op name mapping {{{1 * {{{2 */ -static const char *CONCAT_(GENL_MAGIC_FAMILY, _genl_cmd_to_str)(__u8 cmd) +static const char *CONCATENATE(GENL_MAGIC_FAMILY, _genl_cmd_to_str)(__u8 cmd) { switch (cmd) { #undef GENL_op @@ -235,7 +236,7 @@ static const char *CONCAT_(GENL_MAGIC_FAMILY, _genl_cmd_to_str)(__u8 cmd) .cmd = op_name, \ }, -#define ZZZ_genl_ops CONCAT_(GENL_MAGIC_FAMILY, _genl_ops) +#define ZZZ_genl_ops CONCATENATE(GENL_MAGIC_FAMILY, _genl_ops) static struct genl_ops ZZZ_genl_ops[] __read_mostly = { #include GENL_MAGIC_INCLUDE_FILE }; @@ -248,32 +249,32 @@ static struct genl_ops ZZZ_genl_ops[] __read_mostly = { * and provide register/unregister functions. * {{{2 */ -#define ZZZ_genl_family CONCAT_(GENL_MAGIC_FAMILY, _genl_family) +#define ZZZ_genl_family CONCATENATE(GENL_MAGIC_FAMILY, _genl_family) static struct genl_family ZZZ_genl_family; /* * Magic: define multicast groups * Magic: define multicast group registration helper */ -#define ZZZ_genl_mcgrps CONCAT_(GENL_MAGIC_FAMILY, _genl_mcgrps) +#define ZZZ_genl_mcgrps CONCATENATE(GENL_MAGIC_FAMILY, _genl_mcgrps) static const struct genl_multicast_group ZZZ_genl_mcgrps[] = { #undef GENL_mc_group #define GENL_mc_group(group) { .name = #group, }, #include GENL_MAGIC_INCLUDE_FILE }; -enum CONCAT_(GENL_MAGIC_FAMILY, group_ids) { +enum CONCATENATE(GENL_MAGIC_FAMILY, group_ids) { #undef GENL_mc_group -#define GENL_mc_group(group) CONCAT_(GENL_MAGIC_FAMILY, _group_ ## group), +#define GENL_mc_group(group) CONCATENATE(GENL_MAGIC_FAMILY, _group_ ## group), #include GENL_MAGIC_INCLUDE_FILE }; #undef GENL_mc_group #define GENL_mc_group(group) \ -static int CONCAT_(GENL_MAGIC_FAMILY, _genl_multicast_ ## group)( \ +static int CONCATENATE(GENL_MAGIC_FAMILY, _genl_multicast_ ## group)( \ struct sk_buff *skb, gfp_t flags) \ { \ unsigned int group_id = \ - CONCAT_(GENL_MAGIC_FAMILY, _group_ ## group); \ + CONCATENATE(GENL_MAGIC_FAMILY, _group_ ## group); \ return genlmsg_multicast(&ZZZ_genl_family, skb, 0, \ group_id, flags); \ } @@ -289,8 +290,8 @@ static struct genl_family ZZZ_genl_family __ro_after_init = { #ifdef GENL_MAGIC_FAMILY_HDRSZ .hdrsize = NLA_ALIGN(GENL_MAGIC_FAMILY_HDRSZ), #endif - .maxattr = ARRAY_SIZE(CONCAT_(GENL_MAGIC_FAMILY, _tla_nl_policy))-1, - .policy = CONCAT_(GENL_MAGIC_FAMILY, _tla_nl_policy), + .maxattr = ARRAY_SIZE(CONCATENATE(GENL_MAGIC_FAMILY, _tla_nl_policy))-1, + .policy = CONCATENATE(GENL_MAGIC_FAMILY, _tla_nl_policy), .ops = ZZZ_genl_ops, .n_ops = ARRAY_SIZE(ZZZ_genl_ops), .mcgrps = ZZZ_genl_mcgrps, @@ -299,12 +300,12 @@ static struct genl_family ZZZ_genl_family __ro_after_init = { .module = THIS_MODULE, }; -int CONCAT_(GENL_MAGIC_FAMILY, _genl_register)(void) +int CONCATENATE(GENL_MAGIC_FAMILY, _genl_register)(void) { return genl_register_family(&ZZZ_genl_family); } -void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void) +void CONCATENATE(GENL_MAGIC_FAMILY, _genl_unregister)(void) { genl_unregister_family(&ZZZ_genl_family); } diff --git a/include/linux/genl_magic_struct.h b/include/linux/genl_magic_struct.h index f81d48987528..a419d93789ff 100644 --- a/include/linux/genl_magic_struct.h +++ b/include/linux/genl_magic_struct.h @@ -14,14 +14,12 @@ # error "you need to define GENL_MAGIC_INCLUDE_FILE before inclusion" #endif +#include <linux/args.h> #include <linux/genetlink.h> #include <linux/types.h> -#define CONCAT__(a,b) a ## b -#define CONCAT_(a,b) CONCAT__(a,b) - -extern int CONCAT_(GENL_MAGIC_FAMILY, _genl_register)(void); -extern void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void); +extern int CONCATENATE(GENL_MAGIC_FAMILY, _genl_register)(void); +extern void CONCATENATE(GENL_MAGIC_FAMILY, _genl_unregister)(void); /* * Extension of genl attribute validation policies {{{2 diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 0d91e0af0125..cee8fe87e9f4 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -29,6 +29,7 @@ #include <linux/panic.h> #include <linux/printk.h> #include <linux/build_bug.h> +#include <linux/sprintf.h> #include <linux/static_call_types.h> #include <linux/instruction_pointer.h> #include <asm/byteorder.h> @@ -203,35 +204,6 @@ static inline void might_fault(void) { } void do_exit(long error_code) __noreturn; -extern int num_to_str(char *buf, int size, - unsigned long long num, unsigned int width); - -/* lib/printf utilities */ - -extern __printf(2, 3) int sprintf(char *buf, const char * fmt, ...); -extern __printf(2, 0) int vsprintf(char *buf, const char *, va_list); -extern __printf(3, 4) -int snprintf(char *buf, size_t size, const char *fmt, ...); -extern __printf(3, 0) -int vsnprintf(char *buf, size_t size, const char *fmt, va_list args); -extern __printf(3, 4) -int scnprintf(char *buf, size_t size, const char *fmt, ...); -extern __printf(3, 0) -int vscnprintf(char *buf, size_t size, const char *fmt, va_list args); -extern __printf(2, 3) __malloc -char *kasprintf(gfp_t gfp, const char *fmt, ...); -extern __printf(2, 0) __malloc -char *kvasprintf(gfp_t gfp, const char *fmt, va_list args); -extern __printf(2, 0) -const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args); - -extern __scanf(2, 3) -int sscanf(const char *, const char *, ...); -extern __scanf(2, 0) -int vsscanf(const char *, const char *, va_list); - -extern int no_hash_pointers_enable(char *str); - extern int get_option(char **str, int *pint); extern char *get_options(const char *str, int nints, int *ints); extern unsigned long long memparse(const char *ptr, char **retptr); @@ -457,13 +429,6 @@ ftrace_vprintk(const char *fmt, va_list ap) static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { } #endif /* CONFIG_TRACING */ -/* This counts to 12. Any more, it will return 13th argument. */ -#define __COUNT_ARGS(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _n, X...) _n -#define COUNT_ARGS(X...) __COUNT_ARGS(, ##X, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0) - -#define __CONCAT(a, b) a ## b -#define CONCATENATE(a, b) __CONCAT(a, b) - /* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */ #ifdef CONFIG_FTRACE_MCOUNT_RECORD # define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 22b5cd24f581..32c78078552c 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -33,6 +33,7 @@ extern note_buf_t __percpu *crash_notes; #include <linux/compat.h> #include <linux/ioport.h> #include <linux/module.h> +#include <linux/highmem.h> #include <asm/kexec.h> /* Verify architecture specific macros are defined */ @@ -230,21 +231,6 @@ static inline int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf) } #endif -/* Alignment required for elf header segment */ -#define ELF_CORE_HEADER_ALIGN 4096 - -struct crash_mem { - unsigned int max_nr_ranges; - unsigned int nr_ranges; - struct range ranges[]; -}; - -extern int crash_exclude_mem_range(struct crash_mem *mem, - unsigned long long mstart, - unsigned long long mend); -extern int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map, - void **addr, unsigned long *sz); - #ifndef arch_kexec_apply_relocations_add /* * arch_kexec_apply_relocations_add - apply relocations of type RELA @@ -334,6 +320,10 @@ struct kimage { unsigned int preserve_context : 1; /* If set, we are using file mode kexec syscall */ unsigned int file_mode:1; +#ifdef CONFIG_CRASH_HOTPLUG + /* If set, allow changes to elfcorehdr of kexec_load'd image */ + unsigned int update_elfcorehdr:1; +#endif #ifdef ARCH_HAS_KIMAGE_ARCH struct kimage_arch arch; @@ -360,6 +350,12 @@ struct kimage { struct purgatory_info purgatory_info; #endif +#ifdef CONFIG_CRASH_HOTPLUG + int hp_action; + int elfcorehdr_index; + bool elfcorehdr_updated; +#endif + #ifdef CONFIG_IMA_KEXEC /* Virtual address of IMA measurement buffer for kexec syscall */ void *ima_buffer; @@ -404,9 +400,9 @@ bool kexec_load_permitted(int kexec_image_type); /* List of defined/legal kexec flags */ #ifndef CONFIG_KEXEC_JUMP -#define KEXEC_FLAGS KEXEC_ON_CRASH +#define KEXEC_FLAGS (KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR) #else -#define KEXEC_FLAGS (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT) +#define KEXEC_FLAGS (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT | KEXEC_UPDATE_ELFCOREHDR) #endif /* List of defined/legal kexec file flags */ @@ -490,6 +486,24 @@ static inline int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, g static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) { } #endif +#ifndef arch_crash_handle_hotplug_event +static inline void arch_crash_handle_hotplug_event(struct kimage *image) { } +#endif + +int crash_check_update_elfcorehdr(void); + +#ifndef crash_hotplug_cpu_support +static inline int crash_hotplug_cpu_support(void) { return 0; } +#endif + +#ifndef crash_hotplug_memory_support +static inline int crash_hotplug_memory_support(void) { return 0; } +#endif + +#ifndef crash_get_elfcorehdr_size +static inline unsigned int crash_get_elfcorehdr_size(void) { return 0; } +#endif + #else /* !CONFIG_KEXEC_CORE */ struct pt_regs; struct task_struct; diff --git a/include/linux/kthread.h b/include/linux/kthread.h index f1f95a71a4bc..2c30ade43bc8 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -88,7 +88,6 @@ void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask); int kthread_stop(struct task_struct *k); bool kthread_should_stop(void); bool kthread_should_park(void); -bool __kthread_should_park(struct task_struct *k); bool kthread_should_stop_or_park(void); bool kthread_freezable_should_stop(bool *was_frozen); void *kthread_func(struct task_struct *k); diff --git a/include/linux/limits.h b/include/linux/limits.h index f6bcc9369010..38eb7f6f7e88 100644 --- a/include/linux/limits.h +++ b/include/linux/limits.h @@ -10,6 +10,8 @@ #define SSIZE_MAX ((ssize_t)(SIZE_MAX >> 1)) #define PHYS_ADDR_MAX (~(phys_addr_t)0) +#define RESOURCE_SIZE_MAX ((resource_size_t)~0) + #define U8_MAX ((u8)~0U) #define S8_MAX ((s8)(U8_MAX >> 1)) #define S8_MIN ((s8)(-S8_MAX - 1)) diff --git a/include/linux/math.h b/include/linux/math.h index 2d388650c556..dd4152711de7 100644 --- a/include/linux/math.h +++ b/include/linux/math.h @@ -156,6 +156,25 @@ __STRUCT_FRACT(u32) ({ signed type __x = (x); __x < 0 ? -__x : __x; }), other) /** + * abs_diff - return absolute value of the difference between the arguments + * @a: the first argument + * @b: the second argument + * + * @a and @b have to be of the same type. With this restriction we compare + * signed to signed and unsigned to unsigned. The result is the subtraction + * the smaller of the two from the bigger, hence result is always a positive + * value. + * + * Return: an absolute value of the difference between the @a and @b. + */ +#define abs_diff(a, b) ({ \ + typeof(a) __a = (a); \ + typeof(b) __b = (b); \ + (void)(&__a == &__b); \ + __a > __b ? (__a - __b) : (__b - __a); \ +}) + +/** * reciprocal_scale - "scale" a value into range [0, ep_ro) * @val: value * @ep_ro: right open interval endpoint diff --git a/include/linux/nmi.h b/include/linux/nmi.h index e3e6a64b98e0..e92e378df000 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -157,31 +157,31 @@ static inline void touch_nmi_watchdog(void) #ifdef arch_trigger_cpumask_backtrace static inline bool trigger_all_cpu_backtrace(void) { - arch_trigger_cpumask_backtrace(cpu_online_mask, false); + arch_trigger_cpumask_backtrace(cpu_online_mask, -1); return true; } -static inline bool trigger_allbutself_cpu_backtrace(void) +static inline bool trigger_allbutcpu_cpu_backtrace(int exclude_cpu) { - arch_trigger_cpumask_backtrace(cpu_online_mask, true); + arch_trigger_cpumask_backtrace(cpu_online_mask, exclude_cpu); return true; } static inline bool trigger_cpumask_backtrace(struct cpumask *mask) { - arch_trigger_cpumask_backtrace(mask, false); + arch_trigger_cpumask_backtrace(mask, -1); return true; } static inline bool trigger_single_cpu_backtrace(int cpu) { - arch_trigger_cpumask_backtrace(cpumask_of(cpu), false); + arch_trigger_cpumask_backtrace(cpumask_of(cpu), -1); return true; } /* generic implementation */ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask, - bool exclude_self, + int exclude_cpu, void (*raise)(cpumask_t *mask)); bool nmi_cpu_backtrace(struct pt_regs *regs); @@ -190,7 +190,7 @@ static inline bool trigger_all_cpu_backtrace(void) { return false; } -static inline bool trigger_allbutself_cpu_backtrace(void) +static inline bool trigger_allbutcpu_cpu_backtrace(int exclude_cpu) { return false; } diff --git a/include/linux/pci.h b/include/linux/pci.h index c69a2cc1f412..9c856e9aada5 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -23,7 +23,7 @@ #ifndef LINUX_PCI_H #define LINUX_PCI_H - +#include <linux/args.h> #include <linux/mod_devicetable.h> #include <linux/types.h> diff --git a/include/linux/range.h b/include/linux/range.h index 7efb6a9b069b..6ad0b73cb7ad 100644 --- a/include/linux/range.h +++ b/include/linux/range.h @@ -31,12 +31,4 @@ int clean_sort_range(struct range *range, int az); void sort_range(struct range *range, int nr_range); -#define MAX_RESOURCE ((resource_size_t)~0) -static inline resource_size_t cap_resource(u64 val) -{ - if (val > MAX_RESOURCE) - return MAX_RESOURCE; - - return val; -} #endif diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h index 669e8cff40c7..0014d3adaf84 100644 --- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -649,12 +649,9 @@ extern void flush_itimer_signals(void); extern bool current_is_single_threaded(void); /* - * Careful: do_each_thread/while_each_thread is a double loop so - * 'break' will not work as expected - use goto instead. + * Without tasklist/siglock it is only rcu-safe if g can't exit/exec, + * otherwise next_thread(t) will never reach g after list_del_rcu(g). */ -#define do_each_thread(g, t) \ - for (g = t = &init_task ; (g = t = next_task(g)) != &init_task ; ) do - #define while_each_thread(g, t) \ while ((t = next_thread(t)) != g) diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h new file mode 100644 index 000000000000..33dcbec71925 --- /dev/null +++ b/include/linux/sprintf.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_KERNEL_SPRINTF_H_ +#define _LINUX_KERNEL_SPRINTF_H_ + +#include <linux/compiler_attributes.h> +#include <linux/types.h> + +int num_to_str(char *buf, int size, unsigned long long num, unsigned int width); + +__printf(2, 3) int sprintf(char *buf, const char * fmt, ...); +__printf(2, 0) int vsprintf(char *buf, const char *, va_list); +__printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...); +__printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args); +__printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...); +__printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args); +__printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...); +__printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args); +__printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args); + +__scanf(2, 3) int sscanf(const char *, const char *, ...); +__scanf(2, 0) int vsscanf(const char *, const char *, va_list); + +/* These are for specific cases, do not use without real need */ +extern bool no_hash_pointers; +int no_hash_pointers_enable(char *str); + +#endif /* _LINUX_KERNEL_SPRINTF_H */ diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h index 1f7fc1fc590c..e609cd7da47e 100644 --- a/include/trace/bpf_probe.h +++ b/include/trace/bpf_probe.h @@ -12,6 +12,8 @@ #undef __perf_task #define __perf_task(t) (t) +#include <linux/args.h> + /* cast any integer, pointer, or small struct to u64 */ #define UINTTYPE(size) \ __typeof__(__builtin_choose_expr(size == 1, (u8)1, \ diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h index 981016e05cfa..01766dd839b0 100644 --- a/include/uapi/linux/kexec.h +++ b/include/uapi/linux/kexec.h @@ -12,6 +12,7 @@ /* kexec flags for different usage scenarios */ #define KEXEC_ON_CRASH 0x00000001 #define KEXEC_PRESERVE_CONTEXT 0x00000002 +#define KEXEC_UPDATE_ELFCOREHDR 0x00000004 #define KEXEC_ARCH_MASK 0xffff0000 /* diff --git a/init/Kconfig b/init/Kconfig index 5e7d4885d1bf..6d35728b94b2 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1791,14 +1791,6 @@ config DEBUG_RSEQ If unsure, say N. -config EMBEDDED - bool "Embedded system" - select EXPERT - help - This option should be enabled if compiling the kernel for - an embedded system so certain expert options are available - for configuration. - config HAVE_PERF_EVENTS bool help @@ -1928,6 +1920,8 @@ config BINDGEN_VERSION_TEXT config TRACEPOINTS bool +source "kernel/Kconfig.kexec" + endmenu # General setup source "arch/Kconfig" diff --git a/ipc/sem.c b/ipc/sem.c index 00f88aa01ac5..a39cdc7bf88f 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -152,7 +152,7 @@ struct sem_undo { struct list_head list_id; /* per semaphore array list: * all undos for one array */ int semid; /* semaphore set identifier */ - short *semadj; /* array of adjustments */ + short semadj[]; /* array of adjustments */ /* one per semaphore */ }; @@ -1938,8 +1938,7 @@ static struct sem_undo *find_alloc_undo(struct ipc_namespace *ns, int semid) rcu_read_unlock(); /* step 2: allocate new undo structure */ - new = kvzalloc(sizeof(struct sem_undo) + sizeof(short)*nsems, - GFP_KERNEL_ACCOUNT); + new = kvzalloc(struct_size(new, semadj, nsems), GFP_KERNEL_ACCOUNT); if (!new) { ipc_rcu_putref(&sma->sem_perm, sem_rcu_free); return ERR_PTR(-ENOMEM); @@ -1967,7 +1966,6 @@ static struct sem_undo *find_alloc_undo(struct ipc_namespace *ns, int semid) goto success; } /* step 5: initialize & link new undo structure */ - new->semadj = (short *) &new[1]; new->ulp = ulp; new->semid = semid; assert_spin_locked(&ulp->lock); diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec new file mode 100644 index 000000000000..9bfe68fe9676 --- /dev/null +++ b/kernel/Kconfig.kexec @@ -0,0 +1,150 @@ +# SPDX-License-Identifier: GPL-2.0-only + +menu "Kexec and crash features" + +config CRASH_CORE + bool + +config KEXEC_CORE + select CRASH_CORE + bool + +config KEXEC_ELF + bool + +config HAVE_IMA_KEXEC + bool + +config KEXEC + bool "Enable kexec system call" + depends on ARCH_SUPPORTS_KEXEC + select KEXEC_CORE + help + kexec is a system call that implements the ability to shutdown your + current kernel, and to start another kernel. It is like a reboot + but it is independent of the system firmware. And like a reboot + you can start any kernel with it, not just Linux. + + The name comes from the similarity to the exec system call. + + It is an ongoing process to be certain the hardware in a machine + is properly shutdown, so do not be surprised if this code does not + initially work for you. As of this writing the exact hardware + interface is strongly in flux, so no good recommendation can be + made. + +config KEXEC_FILE + bool "Enable kexec file based system call" + depends on ARCH_SUPPORTS_KEXEC_FILE + select KEXEC_CORE + help + This is new version of kexec system call. This system call is + file based and takes file descriptors as system call argument + for kernel and initramfs as opposed to list of segments as + accepted by kexec system call. + +config KEXEC_SIG + bool "Verify kernel signature during kexec_file_load() syscall" + depends on ARCH_SUPPORTS_KEXEC_SIG + depends on KEXEC_FILE + help + This option makes the kexec_file_load() syscall check for a valid + signature of the kernel image. The image can still be loaded without + a valid signature unless you also enable KEXEC_SIG_FORCE, though if + there's a signature that we can check, then it must be valid. + + In addition to this option, you need to enable signature + verification for the corresponding kernel image type being + loaded in order for this to work. + +config KEXEC_SIG_FORCE + bool "Require a valid signature in kexec_file_load() syscall" + depends on ARCH_SUPPORTS_KEXEC_SIG_FORCE + depends on KEXEC_SIG + help + This option makes kernel signature verification mandatory for + the kexec_file_load() syscall. + +config KEXEC_IMAGE_VERIFY_SIG + bool "Enable Image signature verification support (ARM)" + default ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG + depends on ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG + depends on KEXEC_SIG + depends on EFI && SIGNED_PE_FILE_VERIFICATION + help + Enable Image signature verification support. + +config KEXEC_BZIMAGE_VERIFY_SIG + bool "Enable bzImage signature verification support" + depends on ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG + depends on KEXEC_SIG + depends on SIGNED_PE_FILE_VERIFICATION + select SYSTEM_TRUSTED_KEYRING + help + Enable bzImage signature verification support. + +config KEXEC_JUMP + bool "kexec jump" + depends on ARCH_SUPPORTS_KEXEC_JUMP + depends on KEXEC && HIBERNATION + help + Jump between original kernel and kexeced kernel and invoke + code in physical address mode via KEXEC + +config CRASH_DUMP + bool "kernel crash dumps" + depends on ARCH_SUPPORTS_CRASH_DUMP + depends on ARCH_SUPPORTS_KEXEC + select CRASH_CORE + select KEXEC_CORE + select KEXEC + help + Generate crash dump after being started by kexec. + This should be normally only set in special crash dump kernels + which are loaded in the main kernel with kexec-tools into + a specially reserved region and then later executed after + a crash by kdump/kexec. The crash dump kernel must be compiled + to a memory address not used by the main kernel or BIOS using + PHYSICAL_START, or it must be built as a relocatable image + (CONFIG_RELOCATABLE=y). + For more details see Documentation/admin-guide/kdump/kdump.rst + + For s390, this option also enables zfcpdump. + See also <file:Documentation/s390/zfcpdump.rst> + +config CRASH_HOTPLUG + bool "Update the crash elfcorehdr on system configuration changes" + default y + depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG) + depends on ARCH_SUPPORTS_CRASH_HOTPLUG + help + Enable direct update to the crash elfcorehdr (which contains + the list of CPUs and memory regions to be dumped upon a crash) + in response to hot plug/unplug or online/offline of CPUs or + memory. This is a much more advanced approach than userspace + attempting that. + + If unsure, say Y. + +config CRASH_MAX_MEMORY_RANGES + int "Specify the maximum number of memory regions for the elfcorehdr" + default 8192 + depends on CRASH_HOTPLUG + help + For the kexec_file_load() syscall path, specify the maximum number of + memory regions that the elfcorehdr buffer/segment can accommodate. + These regions are obtained via walk_system_ram_res(); eg. the + 'System RAM' entries in /proc/iomem. + This value is combined with NR_CPUS_DEFAULT and multiplied by + sizeof(Elf64_Phdr) to determine the final elfcorehdr memory buffer/ + segment size. + The value 8192, for example, covers a (sparsely populated) 1TiB system + consisting of 128MiB memblocks, while resulting in an elfcorehdr + memory buffer/segment size under 1MiB. This represents a sane choice + to accommodate both baremetal and virtual machine configurations. + + For the kexec_load() syscall path, CRASH_MAX_MEMORY_RANGES is part of + the computation behind the value provided through the + /sys/kernel/crash_elfcorehdr_size attribute. + +endmenu diff --git a/kernel/acct.c b/kernel/acct.c index 010667ce6080..10f769e13f72 100644 --- a/kernel/acct.c +++ b/kernel/acct.c @@ -445,7 +445,7 @@ static void fill_ac(acct_t *ac) memset(ac, 0, sizeof(acct_t)); ac->ac_version = ACCT_VERSION | ACCT_BYTEORDER; - strlcpy(ac->ac_comm, current->comm, sizeof(ac->ac_comm)); + strscpy(ac->ac_comm, current->comm, sizeof(ac->ac_comm)); /* calculate run_time in nsec*/ run_time = ktime_get_ns(); diff --git a/kernel/configs/tiny-base.config b/kernel/configs/tiny-base.config index 2f0e6bf6db2c..ffb9dcafca26 100644 --- a/kernel/configs/tiny-base.config +++ b/kernel/configs/tiny-base.config @@ -1 +1 @@ -CONFIG_EMBEDDED=y +CONFIG_EXPERT=y diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 693445e1f7f6..03a7932cde0a 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -10,6 +10,9 @@ #include <linux/utsname.h> #include <linux/vmalloc.h> #include <linux/sizes.h> +#include <linux/kexec.h> +#include <linux/memory.h> +#include <linux/cpuhotplug.h> #include <asm/page.h> #include <asm/sections.h> @@ -17,6 +20,10 @@ #include <crypto/sha1.h> #include "kallsyms_internal.h" +#include "kexec_internal.h" + +/* Per cpu memory for storing cpu states in case of system crash. */ +note_buf_t __percpu *crash_notes; /* vmcoreinfo stuff */ unsigned char *vmcoreinfo_data; @@ -314,6 +321,187 @@ static int __init parse_crashkernel_dummy(char *arg) } early_param("crashkernel", parse_crashkernel_dummy); +int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map, + void **addr, unsigned long *sz) +{ + Elf64_Ehdr *ehdr; + Elf64_Phdr *phdr; + unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz; + unsigned char *buf; + unsigned int cpu, i; + unsigned long long notes_addr; + unsigned long mstart, mend; + + /* extra phdr for vmcoreinfo ELF note */ + nr_phdr = nr_cpus + 1; + nr_phdr += mem->nr_ranges; + + /* + * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping + * area (for example, ffffffff80000000 - ffffffffa0000000 on x86_64). + * I think this is required by tools like gdb. So same physical + * memory will be mapped in two ELF headers. One will contain kernel + * text virtual addresses and other will have __va(physical) addresses. + */ + + nr_phdr++; + elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr); + elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN); + + buf = vzalloc(elf_sz); + if (!buf) + return -ENOMEM; + + ehdr = (Elf64_Ehdr *)buf; + phdr = (Elf64_Phdr *)(ehdr + 1); + memcpy(ehdr->e_ident, ELFMAG, SELFMAG); + ehdr->e_ident[EI_CLASS] = ELFCLASS64; + ehdr->e_ident[EI_DATA] = ELFDATA2LSB; + ehdr->e_ident[EI_VERSION] = EV_CURRENT; + ehdr->e_ident[EI_OSABI] = ELF_OSABI; + memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD); + ehdr->e_type = ET_CORE; + ehdr->e_machine = ELF_ARCH; + ehdr->e_version = EV_CURRENT; + ehdr->e_phoff = sizeof(Elf64_Ehdr); + ehdr->e_ehsize = sizeof(Elf64_Ehdr); + ehdr->e_phentsize = sizeof(Elf64_Phdr); + + /* Prepare one phdr of type PT_NOTE for each possible CPU */ + for_each_possible_cpu(cpu) { + phdr->p_type = PT_NOTE; + notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu)); + phdr->p_offset = phdr->p_paddr = notes_addr; + phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t); + (ehdr->e_phnum)++; + phdr++; + } + + /* Prepare one PT_NOTE header for vmcoreinfo */ + phdr->p_type = PT_NOTE; + phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note(); + phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE; + (ehdr->e_phnum)++; + phdr++; + + /* Prepare PT_LOAD type program header for kernel text region */ + if (need_kernel_map) { + phdr->p_type = PT_LOAD; + phdr->p_flags = PF_R|PF_W|PF_X; + phdr->p_vaddr = (unsigned long) _text; + phdr->p_filesz = phdr->p_memsz = _end - _text; + phdr->p_offset = phdr->p_paddr = __pa_symbol(_text); + ehdr->e_phnum++; + phdr++; + } + + /* Go through all the ranges in mem->ranges[] and prepare phdr */ + for (i = 0; i < mem->nr_ranges; i++) { + mstart = mem->ranges[i].start; + mend = mem->ranges[i].end; + + phdr->p_type = PT_LOAD; + phdr->p_flags = PF_R|PF_W|PF_X; + phdr->p_offset = mstart; + + phdr->p_paddr = mstart; + phdr->p_vaddr = (unsigned long) __va(mstart); + phdr->p_filesz = phdr->p_memsz = mend - mstart + 1; + phdr->p_align = 0; + ehdr->e_phnum++; + pr_debug("Crash PT_LOAD ELF header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n", + phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz, + ehdr->e_phnum, phdr->p_offset); + phdr++; + } + + *addr = buf; + *sz = elf_sz; + return 0; +} + +int crash_exclude_mem_range(struct crash_mem *mem, + unsigned long long mstart, unsigned long long mend) +{ + int i, j; + unsigned long long start, end, p_start, p_end; + struct range temp_range = {0, 0}; + + for (i = 0; i < mem->nr_ranges; i++) { + start = mem->ranges[i].start; + end = mem->ranges[i].end; + p_start = mstart; + p_end = mend; + + if (mstart > end || mend < start) + continue; + + /* Truncate any area outside of range */ + if (mstart < start) + p_start = start; + if (mend > end) + p_end = end; + + /* Found completely overlapping range */ + if (p_start == start && p_end == end) { + mem->ranges[i].start = 0; + mem->ranges[i].end = 0; + if (i < mem->nr_ranges - 1) { + /* Shift rest of the ranges to left */ + for (j = i; j < mem->nr_ranges - 1; j++) { + mem->ranges[j].start = + mem->ranges[j+1].start; + mem->ranges[j].end = + mem->ranges[j+1].end; + } + + /* + * Continue to check if there are another overlapping ranges + * from the current position because of shifting the above + * mem ranges. + */ + i--; + mem->nr_ranges--; + continue; + } + mem->nr_ranges--; + return 0; + } + + if (p_start > start && p_end < end) { + /* Split original range */ + mem->ranges[i].end = p_start - 1; + temp_range.start = p_end + 1; + temp_range.end = end; + } else if (p_start != start) + mem->ranges[i].end = p_start - 1; + else + mem->ranges[i].start = p_end + 1; + break; + } + + /* If a split happened, add the split to array */ + if (!temp_range.end) + return 0; + + /* Split happened */ + if (i == mem->max_nr_ranges - 1) + return -ENOMEM; + + /* Location where new range should go */ + j = i + 1; + if (j < mem->nr_ranges) { + /* Move over all ranges one slot towards the end */ + for (i = mem->nr_ranges - 1; i >= j; i--) + mem->ranges[i + 1] = mem->ranges[i]; + } + + mem->ranges[j].start = temp_range.start; + mem->ranges[j].end = temp_range.end; + mem->nr_ranges++; + return 0; +} + Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type, void *data, size_t data_len) { @@ -513,3 +701,206 @@ static int __init crash_save_vmcoreinfo_init(void) } subsys_initcall(crash_save_vmcoreinfo_init); + +static int __init crash_notes_memory_init(void) +{ + /* Allocate memory for saving cpu registers. */ + size_t size, align; + + /* + * crash_notes could be allocated across 2 vmalloc pages when percpu + * is vmalloc based . vmalloc doesn't guarantee 2 continuous vmalloc + * pages are also on 2 continuous physical pages. In this case the + * 2nd part of crash_notes in 2nd page could be lost since only the + * starting address and size of crash_notes are exported through sysfs. + * Here round up the size of crash_notes to the nearest power of two + * and pass it to __alloc_percpu as align value. This can make sure + * crash_notes is allocated inside one physical page. + */ + size = sizeof(note_buf_t); + align = min(roundup_pow_of_two(sizeof(note_buf_t)), PAGE_SIZE); + + /* + * Break compile if size is bigger than PAGE_SIZE since crash_notes + * definitely will be in 2 pages with that. + */ + BUILD_BUG_ON(size > PAGE_SIZE); + + crash_notes = __alloc_percpu(size, align); + if (!crash_notes) { + pr_warn("Memory allocation for saving cpu register states failed\n"); + return -ENOMEM; + } + return 0; +} +subsys_initcall(crash_notes_memory_init); + +#ifdef CONFIG_CRASH_HOTPLUG +#undef pr_fmt +#define pr_fmt(fmt) "crash hp: " fmt + +/* + * This routine utilized when the crash_hotplug sysfs node is read. + * It reflects the kernel's ability/permission to update the crash + * elfcorehdr directly. + */ +int crash_check_update_elfcorehdr(void) +{ + int rc = 0; + + /* Obtain lock while reading crash information */ + if (!kexec_trylock()) { + pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n"); + return 0; + } + if (kexec_crash_image) { + if (kexec_crash_image->file_mode) + rc = 1; + else + rc = kexec_crash_image->update_elfcorehdr; + } + /* Release lock now that update complete */ + kexec_unlock(); + + return rc; +} + +/* + * To accurately reflect hot un/plug changes of cpu and memory resources + * (including onling and offlining of those resources), the elfcorehdr + * (which is passed to the crash kernel via the elfcorehdr= parameter) + * must be updated with the new list of CPUs and memories. + * + * In order to make changes to elfcorehdr, two conditions are needed: + * First, the segment containing the elfcorehdr must be large enough + * to permit a growing number of resources; the elfcorehdr memory size + * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. + * Second, purgatory must explicitly exclude the elfcorehdr from the + * list of segments it checks (since the elfcorehdr changes and thus + * would require an update to purgatory itself to update the digest). + */ +static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu) +{ + struct kimage *image; + + /* Obtain lock while changing crash information */ + if (!kexec_trylock()) { + pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n"); + return; + } + + /* Check kdump is not loaded */ + if (!kexec_crash_image) + goto out; + + image = kexec_crash_image; + + /* Check that updating elfcorehdr is permitted */ + if (!(image->file_mode || image->update_elfcorehdr)) + goto out; + + if (hp_action == KEXEC_CRASH_HP_ADD_CPU || + hp_action == KEXEC_CRASH_HP_REMOVE_CPU) + pr_debug("hp_action %u, cpu %u\n", hp_action, cpu); + else + pr_debug("hp_action %u\n", hp_action); + + /* + * The elfcorehdr_index is set to -1 when the struct kimage + * is allocated. Find the segment containing the elfcorehdr, + * if not already found. + */ + if (image->elfcorehdr_index < 0) { + unsigned long mem; + unsigned char *ptr; + unsigned int n; + + for (n = 0; n < image->nr_segments; n++) { + mem = image->segment[n].mem; + ptr = kmap_local_page(pfn_to_page(mem >> PAGE_SHIFT)); + if (ptr) { + /* The segment containing elfcorehdr */ + if (memcmp(ptr, ELFMAG, SELFMAG) == 0) + image->elfcorehdr_index = (int)n; + kunmap_local(ptr); + } + } + } + + if (image->elfcorehdr_index < 0) { + pr_err("unable to locate elfcorehdr segment"); + goto out; + } + + /* Needed in order for the segments to be updated */ + arch_kexec_unprotect_crashkres(); + + /* Differentiate between normal load and hotplug update */ + image->hp_action = hp_action; + + /* Now invoke arch-specific update handler */ + arch_crash_handle_hotplug_event(image); + + /* No longer handling a hotplug event */ + image->hp_action = KEXEC_CRASH_HP_NONE; + image->elfcorehdr_updated = true; + + /* Change back to read-only */ + arch_kexec_protect_crashkres(); + + /* Errors in the callback is not a reason to rollback state */ +out: + /* Release lock now that update complete */ + kexec_unlock(); +} + +static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v) +{ + switch (val) { + case MEM_ONLINE: + crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY, + KEXEC_CRASH_HP_INVALID_CPU); + break; + + case MEM_OFFLINE: + crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY, + KEXEC_CRASH_HP_INVALID_CPU); + break; + } + return NOTIFY_OK; +} + +static struct notifier_block crash_memhp_nb = { + .notifier_call = crash_memhp_notifier, + .priority = 0 +}; + +static int crash_cpuhp_online(unsigned int cpu) +{ + crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu); + return 0; +} + +static int crash_cpuhp_offline(unsigned int cpu) +{ + crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu); + return 0; +} + +static int __init crash_hotplug_init(void) +{ + int result = 0; + + if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) + register_memory_notifier(&crash_memhp_nb); + + if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) { + result = cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN, + "crash/cpuhp", crash_cpuhp_online, crash_cpuhp_offline); + } + + return result; +} + +subsys_initcall(crash_hotplug_init); +#endif diff --git a/kernel/cred.c b/kernel/cred.c index 811ad654abd1..98cb4eca23fb 100644 --- a/kernel/cred.c +++ b/kernel/cred.c @@ -4,6 +4,9 @@ * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved. * Written by David Howells (dhowells@redhat.com) */ + +#define pr_fmt(fmt) "CRED: " fmt + #include <linux/export.h> #include <linux/cred.h> #include <linux/slab.h> @@ -835,32 +838,32 @@ EXPORT_SYMBOL(creds_are_invalid); static void dump_invalid_creds(const struct cred *cred, const char *label, const struct task_struct *tsk) { - printk(KERN_ERR "CRED: %s credentials: %p %s%s%s\n", + pr_err("%s credentials: %p %s%s%s\n", label, cred, cred == &init_cred ? "[init]" : "", cred == tsk->real_cred ? "[real]" : "", cred == tsk->cred ? "[eff]" : ""); - printk(KERN_ERR "CRED: ->magic=%x, put_addr=%p\n", + pr_err("->magic=%x, put_addr=%p\n", cred->magic, cred->put_addr); - printk(KERN_ERR "CRED: ->usage=%d, subscr=%d\n", + pr_err("->usage=%d, subscr=%d\n", atomic_read(&cred->usage), read_cred_subscribers(cred)); - printk(KERN_ERR "CRED: ->*uid = { %d,%d,%d,%d }\n", + pr_err("->*uid = { %d,%d,%d,%d }\n", from_kuid_munged(&init_user_ns, cred->uid), from_kuid_munged(&init_user_ns, cred->euid), from_kuid_munged(&init_user_ns, cred->suid), from_kuid_munged(&init_user_ns, cred->fsuid)); - printk(KERN_ERR "CRED: ->*gid = { %d,%d,%d,%d }\n", + pr_err("->*gid = { %d,%d,%d,%d }\n", from_kgid_munged(&init_user_ns, cred->gid), from_kgid_munged(&init_user_ns, cred->egid), from_kgid_munged(&init_user_ns, cred->sgid), from_kgid_munged(&init_user_ns, cred->fsgid)); #ifdef CONFIG_SECURITY - printk(KERN_ERR "CRED: ->security is %p\n", cred->security); + pr_err("->security is %p\n", cred->security); if ((unsigned long) cred->security >= PAGE_SIZE && (((unsigned long) cred->security & 0xffffff00) != (POISON_FREE << 24 | POISON_FREE << 16 | POISON_FREE << 8))) - printk(KERN_ERR "CRED: ->security {%x, %x}\n", + pr_err("->security {%x, %x}\n", ((u32*)cred->security)[0], ((u32*)cred->security)[1]); #endif @@ -871,8 +874,8 @@ static void dump_invalid_creds(const struct cred *cred, const char *label, */ void __noreturn __invalid_creds(const struct cred *cred, const char *file, unsigned line) { - printk(KERN_ERR "CRED: Invalid credentials\n"); - printk(KERN_ERR "CRED: At %s:%u\n", file, line); + pr_err("Invalid credentials\n"); + pr_err("At %s:%u\n", file, line); dump_invalid_creds(cred, "Specified", current); BUG(); } @@ -898,14 +901,14 @@ void __validate_process_creds(struct task_struct *tsk, return; invalid_creds: - printk(KERN_ERR "CRED: Invalid process credentials\n"); - printk(KERN_ERR "CRED: At %s:%u\n", file, line); + pr_err("Invalid process credentials\n"); + pr_err("At %s:%u\n", file, line); dump_invalid_creds(tsk->real_cred, "Real", tsk); if (tsk->cred != tsk->real_cred) dump_invalid_creds(tsk->cred, "Effective", tsk); else - printk(KERN_ERR "CRED: Effective creds == Real creds\n"); + pr_err("Effective creds == Real creds\n"); BUG(); } EXPORT_SYMBOL(__validate_process_creds); diff --git a/kernel/fork.c b/kernel/fork.c index f81149739eb9..a9c18d480dc5 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1404,8 +1404,8 @@ EXPORT_SYMBOL_GPL(mmput_async); * This changes mm's executable file (shown as symlink /proc/[pid]/exe). * * Main users are mmput() and sys_execve(). Callers prevent concurrent - * invocations: in mmput() nobody alive left, in execve task is single - * threaded. + * invocations: in mmput() nobody alive left, in execve it happens before + * the new mm is made visible to anyone. * * Can only fail if new_exe_file != NULL. */ @@ -1440,9 +1440,7 @@ int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) /** * replace_mm_exe_file - replace a reference to the mm's executable file * - * This changes mm's executable file (shown as symlink /proc/[pid]/exe), - * dealing with concurrent invocation and without grabbing the mmap lock in - * write mode. + * This changes mm's executable file (shown as symlink /proc/[pid]/exe). * * Main user is sys_prctl(PR_SET_MM_MAP/EXE_FILE). */ @@ -1472,22 +1470,20 @@ int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) return ret; } - /* set the new file, lockless */ ret = deny_write_access(new_exe_file); if (ret) return -EACCES; get_file(new_exe_file); - old_exe_file = xchg(&mm->exe_file, new_exe_file); + /* set the new file */ + mmap_write_lock(mm); + old_exe_file = rcu_dereference_raw(mm->exe_file); + rcu_assign_pointer(mm->exe_file, new_exe_file); + mmap_write_unlock(mm); + if (old_exe_file) { - /* - * Don't race with dup_mmap() getting the file and disallowing - * write access while someone might open the file writable. - */ - mmap_read_lock(mm); allow_write_access(old_exe_file); fput(old_exe_file); - mmap_read_unlock(mm); } return 0; } diff --git a/kernel/gcov/Makefile b/kernel/gcov/Makefile index 16f8ecc7d882..ccd02afaeffb 100644 --- a/kernel/gcov/Makefile +++ b/kernel/gcov/Makefile @@ -3,4 +3,6 @@ ccflags-y := -DSRCTREE='"$(srctree)"' -DOBJTREE='"$(objtree)"' obj-y := base.o fs.o obj-$(CONFIG_CC_IS_GCC) += gcc_base.o gcc_4_7.o +CFLAGS_gcc_base.o += -Wno-missing-prototypes -Wno-missing-declarations obj-$(CONFIG_CC_IS_CLANG) += clang.o +CFLAGS_clang.o += -Wno-missing-prototypes -Wno-missing-declarations diff --git a/kernel/kexec.c b/kernel/kexec.c index 92d301f98776..107f355eac10 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -129,6 +129,11 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments, if (flags & KEXEC_PRESERVE_CONTEXT) image->preserve_context = 1; +#ifdef CONFIG_CRASH_HOTPLUG + if (flags & KEXEC_UPDATE_ELFCOREHDR) + image->update_elfcorehdr = 1; +#endif + ret = machine_kexec_prepare(image); if (ret) goto out; diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index e2f2574d8b74..9dc728982d79 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -49,9 +49,6 @@ atomic_t __kexec_lock = ATOMIC_INIT(0); -/* Per cpu memory for storing cpu states in case of system crash. */ -note_buf_t __percpu *crash_notes; - /* Flag to indicate we are going to kexec a new kernel */ bool kexec_in_progress = false; @@ -277,6 +274,12 @@ struct kimage *do_kimage_alloc_init(void) /* Initialize the list of unusable pages */ INIT_LIST_HEAD(&image->unusable_pages); +#ifdef CONFIG_CRASH_HOTPLUG + image->hp_action = KEXEC_CRASH_HP_NONE; + image->elfcorehdr_index = -1; + image->elfcorehdr_updated = false; +#endif + return image; } @@ -1218,40 +1221,6 @@ void crash_save_cpu(struct pt_regs *regs, int cpu) final_note(buf); } -static int __init crash_notes_memory_init(void) -{ - /* Allocate memory for saving cpu registers. */ - size_t size, align; - - /* - * crash_notes could be allocated across 2 vmalloc pages when percpu - * is vmalloc based . vmalloc doesn't guarantee 2 continuous vmalloc - * pages are also on 2 continuous physical pages. In this case the - * 2nd part of crash_notes in 2nd page could be lost since only the - * starting address and size of crash_notes are exported through sysfs. - * Here round up the size of crash_notes to the nearest power of two - * and pass it to __alloc_percpu as align value. This can make sure - * crash_notes is allocated inside one physical page. - */ - size = sizeof(note_buf_t); - align = min(roundup_pow_of_two(sizeof(note_buf_t)), PAGE_SIZE); - - /* - * Break compile if size is bigger than PAGE_SIZE since crash_notes - * definitely will be in 2 pages with that. - */ - BUILD_BUG_ON(size > PAGE_SIZE); - - crash_notes = __alloc_percpu(size, align); - if (!crash_notes) { - pr_warn("Memory allocation for saving cpu register states failed\n"); - return -ENOMEM; - } - return 0; -} -subsys_initcall(crash_notes_memory_init); - - /* * Move into place and start executing a preloaded standalone * executable. If nothing was preloaded return an error. diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 881ba0d1714c..e2ec9d7b9a1f 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -685,7 +685,7 @@ static int kexec_calculate_store_digests(struct kimage *image) struct kexec_sha_region *sha_regions; struct purgatory_info *pi = &image->purgatory_info; - if (!IS_ENABLED(CONFIG_ARCH_HAS_KEXEC_PURGATORY)) + if (!IS_ENABLED(CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY)) return 0; zero_buf = __va(page_to_pfn(ZERO_PAGE(0)) << PAGE_SHIFT); @@ -726,6 +726,12 @@ static int kexec_calculate_store_digests(struct kimage *image) for (j = i = 0; i < image->nr_segments; i++) { struct kexec_segment *ksegment; +#ifdef CONFIG_CRASH_HOTPLUG + /* Exclude elfcorehdr segment to allow future changes via hotplug */ + if (j == image->elfcorehdr_index) + continue; +#endif + ksegment = &image->segment[i]; /* * Skip purgatory as it will be modified once we put digest @@ -790,7 +796,7 @@ out: return ret; } -#ifdef CONFIG_ARCH_HAS_KEXEC_PURGATORY +#ifdef CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY /* * kexec_purgatory_setup_kbuf - prepare buffer to load purgatory. * @pi: Purgatory to be loaded. @@ -1150,185 +1156,4 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name, return 0; } -#endif /* CONFIG_ARCH_HAS_KEXEC_PURGATORY */ - -int crash_exclude_mem_range(struct crash_mem *mem, - unsigned long long mstart, unsigned long long mend) -{ - int i, j; - unsigned long long start, end, p_start, p_end; - struct range temp_range = {0, 0}; - - for (i = 0; i < mem->nr_ranges; i++) { - start = mem->ranges[i].start; - end = mem->ranges[i].end; - p_start = mstart; - p_end = mend; - - if (mstart > end || mend < start) - continue; - - /* Truncate any area outside of range */ - if (mstart < start) - p_start = start; - if (mend > end) - p_end = end; - - /* Found completely overlapping range */ - if (p_start == start && p_end == end) { - mem->ranges[i].start = 0; - mem->ranges[i].end = 0; - if (i < mem->nr_ranges - 1) { - /* Shift rest of the ranges to left */ - for (j = i; j < mem->nr_ranges - 1; j++) { - mem->ranges[j].start = - mem->ranges[j+1].start; - mem->ranges[j].end = - mem->ranges[j+1].end; - } - - /* - * Continue to check if there are another overlapping ranges - * from the current position because of shifting the above - * mem ranges. - */ - i--; - mem->nr_ranges--; - continue; - } - mem->nr_ranges--; - return 0; - } - - if (p_start > start && p_end < end) { - /* Split original range */ - mem->ranges[i].end = p_start - 1; - temp_range.start = p_end + 1; - temp_range.end = end; - } else if (p_start != start) - mem->ranges[i].end = p_start - 1; - else - mem->ranges[i].start = p_end + 1; - break; - } - - /* If a split happened, add the split to array */ - if (!temp_range.end) - return 0; - - /* Split happened */ - if (i == mem->max_nr_ranges - 1) - return -ENOMEM; - - /* Location where new range should go */ - j = i + 1; - if (j < mem->nr_ranges) { - /* Move over all ranges one slot towards the end */ - for (i = mem->nr_ranges - 1; i >= j; i--) - mem->ranges[i + 1] = mem->ranges[i]; - } - - mem->ranges[j].start = temp_range.start; - mem->ranges[j].end = temp_range.end; - mem->nr_ranges++; - return 0; -} - -int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map, - void **addr, unsigned long *sz) -{ - Elf64_Ehdr *ehdr; - Elf64_Phdr *phdr; - unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz; - unsigned char *buf; - unsigned int cpu, i; - unsigned long long notes_addr; - unsigned long mstart, mend; - - /* extra phdr for vmcoreinfo ELF note */ - nr_phdr = nr_cpus + 1; - nr_phdr += mem->nr_ranges; - - /* - * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping - * area (for example, ffffffff80000000 - ffffffffa0000000 on x86_64). - * I think this is required by tools like gdb. So same physical - * memory will be mapped in two ELF headers. One will contain kernel - * text virtual addresses and other will have __va(physical) addresses. - */ - - nr_phdr++; - elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr); - elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN); - - buf = vzalloc(elf_sz); - if (!buf) - return -ENOMEM; - - ehdr = (Elf64_Ehdr *)buf; - phdr = (Elf64_Phdr *)(ehdr + 1); - memcpy(ehdr->e_ident, ELFMAG, SELFMAG); - ehdr->e_ident[EI_CLASS] = ELFCLASS64; - ehdr->e_ident[EI_DATA] = ELFDATA2LSB; - ehdr->e_ident[EI_VERSION] = EV_CURRENT; - ehdr->e_ident[EI_OSABI] = ELF_OSABI; - memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD); - ehdr->e_type = ET_CORE; - ehdr->e_machine = ELF_ARCH; - ehdr->e_version = EV_CURRENT; - ehdr->e_phoff = sizeof(Elf64_Ehdr); - ehdr->e_ehsize = sizeof(Elf64_Ehdr); - ehdr->e_phentsize = sizeof(Elf64_Phdr); - - /* Prepare one phdr of type PT_NOTE for each present CPU */ - for_each_present_cpu(cpu) { - phdr->p_type = PT_NOTE; - notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu)); - phdr->p_offset = phdr->p_paddr = notes_addr; - phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t); - (ehdr->e_phnum)++; - phdr++; - } - - /* Prepare one PT_NOTE header for vmcoreinfo */ - phdr->p_type = PT_NOTE; - phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note(); - phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE; - (ehdr->e_phnum)++; - phdr++; - - /* Prepare PT_LOAD type program header for kernel text region */ - if (need_kernel_map) { - phdr->p_type = PT_LOAD; - phdr->p_flags = PF_R|PF_W|PF_X; - phdr->p_vaddr = (unsigned long) _text; - phdr->p_filesz = phdr->p_memsz = _end - _text; - phdr->p_offset = phdr->p_paddr = __pa_symbol(_text); - ehdr->e_phnum++; - phdr++; - } - - /* Go through all the ranges in mem->ranges[] and prepare phdr */ - for (i = 0; i < mem->nr_ranges; i++) { - mstart = mem->ranges[i].start; - mend = mem->ranges[i].end; - - phdr->p_type = PT_LOAD; - phdr->p_flags = PF_R|PF_W|PF_X; - phdr->p_offset = mstart; - - phdr->p_paddr = mstart; - phdr->p_vaddr = (unsigned long) __va(mstart); - phdr->p_filesz = phdr->p_memsz = mend - mstart + 1; - phdr->p_align = 0; - ehdr->e_phnum++; - pr_debug("Crash PT_LOAD ELF header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n", - phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz, - ehdr->e_phnum, phdr->p_offset); - phdr++; - } - - *addr = buf; - *sz = elf_sz; - return 0; -} +#endif /* CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY */ diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c index aad7a3bfd846..1d4bc493b2f4 100644 --- a/kernel/ksysfs.c +++ b/kernel/ksysfs.c @@ -165,6 +165,18 @@ static ssize_t vmcoreinfo_show(struct kobject *kobj, } KERNEL_ATTR_RO(vmcoreinfo); +#ifdef CONFIG_CRASH_HOTPLUG +static ssize_t crash_elfcorehdr_size_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + unsigned int sz = crash_get_elfcorehdr_size(); + + return sysfs_emit(buf, "%u\n", sz); +} +KERNEL_ATTR_RO(crash_elfcorehdr_size); + +#endif + #endif /* CONFIG_CRASH_CORE */ /* whether file capabilities are enabled */ @@ -255,6 +267,9 @@ static struct attribute * kernel_attrs[] = { #endif #ifdef CONFIG_CRASH_CORE &vmcoreinfo_attr.attr, +#ifdef CONFIG_CRASH_HOTPLUG + &crash_elfcorehdr_size_attr.attr, +#endif #endif #ifndef CONFIG_TINY_RCU &rcu_expedited_attr.attr, diff --git a/kernel/kthread.c b/kernel/kthread.c index 4fff7df17a68..1eea53050bab 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -159,11 +159,10 @@ bool kthread_should_stop(void) } EXPORT_SYMBOL(kthread_should_stop); -bool __kthread_should_park(struct task_struct *k) +static bool __kthread_should_park(struct task_struct *k) { return test_bit(KTHREAD_SHOULD_PARK, &to_kthread(k)->flags); } -EXPORT_SYMBOL_GPL(__kthread_should_park); /** * kthread_should_park - should this kthread park now? diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index 111607d91489..e85b5ad3e206 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -819,34 +819,26 @@ static int very_verbose(struct lock_class *class) * Is this the address of a static object: */ #ifdef __KERNEL__ -/* - * Check if an address is part of freed initmem. After initmem is freed, - * memory can be allocated from it, and such allocations would then have - * addresses within the range [_stext, _end]. - */ -#ifndef arch_is_kernel_initmem_freed -static int arch_is_kernel_initmem_freed(unsigned long addr) -{ - if (system_state < SYSTEM_FREEING_INITMEM) - return 0; - - return init_section_contains((void *)addr, 1); -} -#endif - static int static_obj(const void *obj) { - unsigned long start = (unsigned long) &_stext, - end = (unsigned long) &_end, - addr = (unsigned long) obj; + unsigned long addr = (unsigned long) obj; - if (arch_is_kernel_initmem_freed(addr)) - return 0; + if (is_kernel_core_data(addr)) + return 1; + + /* + * keys are allowed in the __ro_after_init section. + */ + if (is_kernel_rodata(addr)) + return 1; /* - * static variable? + * in initdata section and used during bootup only? + * NOTE: On some platforms the initdata section is + * outside of the _stext ... _end range. */ - if ((addr >= start) && (addr < end)) + if (system_state < SYSTEM_FREEING_INITMEM && + init_section_contains((void *)addr, 1)) return 1; /* diff --git a/kernel/relay.c b/kernel/relay.c index a80fa01042e9..83fe0325cde1 100644 --- a/kernel/relay.c +++ b/kernel/relay.c @@ -375,7 +375,7 @@ static struct dentry *relay_create_buf_file(struct rchan *chan, */ static struct rchan_buf *relay_open_buf(struct rchan *chan, unsigned int cpu) { - struct rchan_buf *buf = NULL; + struct rchan_buf *buf; struct dentry *dentry; if (chan->is_global) diff --git a/kernel/signal.c b/kernel/signal.c index 128e9bb3d1a2..09019017d669 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -22,6 +22,7 @@ #include <linux/sched/cputime.h> #include <linux/file.h> #include <linux/fs.h> +#include <linux/mm.h> #include <linux/proc_fs.h> #include <linux/tty.h> #include <linux/binfmts.h> @@ -1260,7 +1261,17 @@ int send_signal_locked(int sig, struct kernel_siginfo *info, static void print_fatal_signal(int signr) { struct pt_regs *regs = task_pt_regs(current); - pr_info("potentially unexpected fatal signal %d.\n", signr); + struct file *exe_file; + + exe_file = get_task_exe_file(current); + if (exe_file) { + pr_info("%pD: %s: potentially unexpected fatal signal %d.\n", + exe_file, current->comm, signr); + fput(exe_file); + } else { + pr_info("%s: potentially unexpected fatal signal %d.\n", + current->comm, signr); + } #if defined(__i386__) && !defined(__arch_um__) pr_info("code at %08lx: ", regs->ip); diff --git a/kernel/watchdog.c b/kernel/watchdog.c index be38276a365f..d145305d95fe 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -151,9 +151,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) */ if (is_hardlockup(cpu)) { unsigned int this_cpu = smp_processor_id(); - struct cpumask backtrace_mask; - - cpumask_copy(&backtrace_mask, cpu_online_mask); /* Only print hardlockups once. */ if (per_cpu(watchdog_hardlockup_warned, cpu)) @@ -167,10 +164,8 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) show_regs(regs); else dump_stack(); - cpumask_clear_cpu(cpu, &backtrace_mask); } else { - if (trigger_single_cpu_backtrace(cpu)) - cpumask_clear_cpu(cpu, &backtrace_mask); + trigger_single_cpu_backtrace(cpu); } /* @@ -179,7 +174,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs) */ if (sysctl_hardlockup_all_cpu_backtrace && !test_and_set_bit(0, &watchdog_hardlockup_all_cpu_dumped)) - trigger_cpumask_backtrace(&backtrace_mask); + trigger_allbutcpu_cpu_backtrace(cpu); if (hardlockup_panic) nmi_panic(regs, "Hard LOCKUP"); @@ -523,7 +518,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) dump_stack(); if (softlockup_all_cpu_backtrace) { - trigger_allbutself_cpu_backtrace(); + trigger_allbutcpu_cpu_backtrace(smp_processor_id()); clear_bit_unlock(0, &soft_lockup_nmi_warn); } diff --git a/lib/Kconfig b/lib/Kconfig index 5c2da561c516..c686f4adc124 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -415,6 +415,7 @@ config REED_SOLOMON_DEC16 # config BCH tristate + select BITREVERSE config BCH_CONST_PARAMS bool diff --git a/lib/bch.c b/lib/bch.c index c8095f30f254..5f71fd76eca8 100644 --- a/lib/bch.c +++ b/lib/bch.c @@ -71,6 +71,7 @@ #include <linux/module.h> #include <linux/slab.h> #include <linux/bitops.h> +#include <linux/bitrev.h> #include <asm/byteorder.h> #include <linux/bch.h> @@ -114,47 +115,12 @@ struct gf_poly_deg1 { unsigned int c[2]; }; -static u8 swap_bits_table[] = { - 0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0, - 0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0, - 0x08, 0x88, 0x48, 0xc8, 0x28, 0xa8, 0x68, 0xe8, - 0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8, 0x78, 0xf8, - 0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4, - 0x14, 0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4, - 0x0c, 0x8c, 0x4c, 0xcc, 0x2c, 0xac, 0x6c, 0xec, - 0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c, 0xfc, - 0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2, - 0x12, 0x92, 0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2, - 0x0a, 0x8a, 0x4a, 0xca, 0x2a, 0xaa, 0x6a, 0xea, - 0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa, - 0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6, - 0x16, 0x96, 0x56, 0xd6, 0x36, 0xb6, 0x76, 0xf6, - 0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae, 0x6e, 0xee, - 0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe, - 0x01, 0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1, - 0x11, 0x91, 0x51, 0xd1, 0x31, 0xb1, 0x71, 0xf1, - 0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69, 0xe9, - 0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9, - 0x05, 0x85, 0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5, - 0x15, 0x95, 0x55, 0xd5, 0x35, 0xb5, 0x75, 0xf5, - 0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed, - 0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd, - 0x03, 0x83, 0x43, 0xc3, 0x23, 0xa3, 0x63, 0xe3, - 0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3, 0x73, 0xf3, - 0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb, - 0x1b, 0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb, - 0x07, 0x87, 0x47, 0xc7, 0x27, 0xa7, 0x67, 0xe7, - 0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77, 0xf7, - 0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef, - 0x1f, 0x9f, 0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff, -}; - static u8 swap_bits(struct bch_control *bch, u8 in) { if (!bch->swap_bits) return in; - return swap_bits_table[in]; + return bitrev8(in); } /* diff --git a/lib/error-inject.c b/lib/error-inject.c index 32c14770508e..887acd9a6ea6 100644 --- a/lib/error-inject.c +++ b/lib/error-inject.c @@ -217,8 +217,6 @@ static int __init ei_debugfs_init(void) struct dentry *dir, *file; dir = debugfs_create_dir("error_injection", NULL); - if (!dir) - return -ENOMEM; file = debugfs_create_file("list", 0444, dir, NULL, &ei_fops); if (!file) { diff --git a/lib/kstrtox.c b/lib/kstrtox.c index 08c14019841a..d586e6af5e5a 100644 --- a/lib/kstrtox.c +++ b/lib/kstrtox.c @@ -59,7 +59,7 @@ unsigned int _parse_integer_limit(const char *s, unsigned int base, unsigned lon rv = 0; while (max_chars--) { unsigned int c = *s; - unsigned int lc = c | 0x20; /* don't tolower() this line */ + unsigned int lc = _tolower(c); unsigned int val; if ('0' <= c && c <= '9') diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c index 5274bbb026d7..33c154264bfe 100644 --- a/lib/nmi_backtrace.c +++ b/lib/nmi_backtrace.c @@ -34,7 +34,7 @@ static unsigned long backtrace_flag; * they are passed being updated as a side effect of this call. */ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask, - bool exclude_self, + int exclude_cpu, void (*raise)(cpumask_t *mask)) { int i, this_cpu = get_cpu(); @@ -49,8 +49,8 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask, } cpumask_copy(to_cpumask(backtrace_mask), mask); - if (exclude_self) - cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask)); + if (exclude_cpu != -1) + cpumask_clear_cpu(exclude_cpu, to_cpumask(backtrace_mask)); /* * Don't try to send an NMI to this cpu; it may work on some diff --git a/lib/notifier-error-inject.c b/lib/notifier-error-inject.c index 2b24ea6c9497..954c3412d22d 100644 --- a/lib/notifier-error-inject.c +++ b/lib/notifier-error-inject.c @@ -83,9 +83,6 @@ static int __init err_inject_init(void) notifier_err_inject_dir = debugfs_create_dir("notifier-error-inject", NULL); - if (!notifier_err_inject_dir) - return -ENOMEM; - return 0; } diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 67e6f83fe0f8..717dcb830127 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -368,16 +368,13 @@ static int dmirror_do_read(struct dmirror *dmirror, unsigned long start, for (pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++) { void *entry; struct page *page; - void *tmp; entry = xa_load(&dmirror->pt, pfn); page = xa_untag_pointer(entry); if (!page) return -ENOENT; - tmp = kmap(page); - memcpy(ptr, tmp, PAGE_SIZE); - kunmap(page); + memcpy_from_page(ptr, page, 0, PAGE_SIZE); ptr += PAGE_SIZE; bounce->cpages++; @@ -437,16 +434,13 @@ static int dmirror_do_write(struct dmirror *dmirror, unsigned long start, for (pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++) { void *entry; struct page *page; - void *tmp; entry = xa_load(&dmirror->pt, pfn); page = xa_untag_pointer(entry); if (!page || xa_pointer_tag(entry) != DPT_XA_TAG_WRITE) return -ENOENT; - tmp = kmap(page); - memcpy(tmp, ptr, PAGE_SIZE); - kunmap(page); + memcpy_to_page(page, 0, ptr, PAGE_SIZE); ptr += PAGE_SIZE; bounce->cpages++; diff --git a/lib/test_printf.c b/lib/test_printf.c index 7677ebccf3c3..69b6a5e177f2 100644 --- a/lib/test_printf.c +++ b/lib/test_printf.c @@ -12,6 +12,7 @@ #include <linux/random.h> #include <linux/rtc.h> #include <linux/slab.h> +#include <linux/sprintf.h> #include <linux/string.h> #include <linux/bitmap.h> @@ -41,8 +42,6 @@ KSTM_MODULE_GLOBALS(); static char *test_buffer __initdata; static char *alloced_buffer __initdata; -extern bool no_hash_pointers; - static int __printf(4, 0) __init do_test(int bufsize, const char *expect, int elen, const char *fmt, va_list ap) diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 40f560959b16..afb88b24fa74 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -34,6 +34,7 @@ #include <linux/dcache.h> #include <linux/cred.h> #include <linux/rtc.h> +#include <linux/sprintf.h> #include <linux/time.h> #include <linux/uuid.h> #include <linux/of.h> diff --git a/mm/kfence/report.c b/mm/kfence/report.c index 197430a5be4a..c509aed326ce 100644 --- a/mm/kfence/report.c +++ b/mm/kfence/report.c @@ -13,6 +13,7 @@ #include <linux/printk.h> #include <linux/sched/debug.h> #include <linux/seq_file.h> +#include <linux/sprintf.h> #include <linux/stacktrace.h> #include <linux/string.h> #include <trace/events/error_report.h> @@ -26,8 +27,6 @@ #define ARCH_FUNC_PREFIX "" #endif -extern bool no_hash_pointers; - /* Helper function to either print to a seq_file or to console. */ __printf(2, 3) static void seq_con_printf(struct seq_file *seq, const char *fmt, ...) diff --git a/scripts/bloat-o-meter b/scripts/bloat-o-meter index 36303afa9dfc..888ce286a351 100755 --- a/scripts/bloat-o-meter +++ b/scripts/bloat-o-meter @@ -100,12 +100,12 @@ def print_result(symboltype, symbolformat): print("Total: Before=%d, After=%d, chg %+.2f%%" % (otot, ntot, percent)) if args.c: - print_result("Function", "tT") - print_result("Data", "dDbB") + print_result("Function", "tTwW") + print_result("Data", "dDbBvV") print_result("RO Data", "rR") elif args.d: - print_result("Data", "dDbBrR") + print_result("Data", "dDbBrRvV") elif args.t: - print_result("Function", "tT") + print_result("Function", "tTwW") else: - print_result("Function", "tTdDbBrR") + print_result("Function", "tTdDbBrRvVwW") diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index a9841148cde2..7d16f863edf1 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -74,6 +74,8 @@ my $git_command ='export LANGUAGE=en_US.UTF-8; git'; my $tabsize = 8; my ${CONFIG_} = "CONFIG_"; +my %maybe_linker_symbol; # for externs in c exceptions, when seen in *vmlinux.lds.h + sub help { my ($exitcode) = @_; @@ -3270,7 +3272,7 @@ sub process { # A Fixes:, link or signature tag line $commit_log_possible_stack_dump)) { WARN("COMMIT_LOG_LONG_LINE", - "Possible unwrapped commit description (prefer a maximum 75 chars per line)\n" . $herecurr); + "Prefer a maximum 75 chars per line (possible unwrapped commit description?)\n" . $herecurr); $commit_log_long_line = 1; } @@ -6051,6 +6053,9 @@ sub process { # check for line continuations outside of #defines, preprocessor #, and asm + } elsif ($realfile =~ m@/vmlinux.lds.h$@) { + $line =~ s/(\w+)/$maybe_linker_symbol{$1}++/ge; + #print "REAL: $realfile\nln: $line\nkeys:", sort keys %maybe_linker_symbol; } else { if ($prevline !~ /^..*\\$/ && $line !~ /^\+\s*\#.*\\$/ && # preprocessor @@ -7120,6 +7125,21 @@ sub process { } } elsif ($realfile =~ /\.c$/ && defined $stat && + $stat =~ /^\+extern struct\s+(\w+)\s+(\w+)\[\];/) + { + my ($st_type, $st_name) = ($1, $2); + + for my $s (keys %maybe_linker_symbol) { + #print "Linker symbol? $st_name : $s\n"; + goto LIKELY_LINKER_SYMBOL + if $st_name =~ /$s/; + } + WARN("AVOID_EXTERNS", + "found a file-scoped extern type:$st_type name:$st_name in .c file\n" + . "is this a linker symbol ?\n" . $herecurr); + LIKELY_LINKER_SYMBOL: + + } elsif ($realfile =~ /\.c$/ && defined $stat && $stat =~ /^.\s*extern\s+/) { WARN("AVOID_EXTERNS", diff --git a/scripts/gdb/linux/constants.py.in b/scripts/gdb/linux/constants.py.in index 50a92c4e9984..e3517d4ab8ec 100644 --- a/scripts/gdb/linux/constants.py.in +++ b/scripts/gdb/linux/constants.py.in @@ -18,8 +18,11 @@ #include <linux/irq.h> #include <linux/mount.h> #include <linux/of_fdt.h> +#include <linux/page_ext.h> #include <linux/radix-tree.h> +#include <linux/slab.h> #include <linux/threads.h> +#include <linux/vmalloc.h> /* We need to stringify expanded macros so that they can be parsed */ @@ -64,6 +67,9 @@ LX_GDBPARSED(IRQ_HIDDEN) /* linux/module.h */ LX_GDBPARSED(MOD_TEXT) +LX_GDBPARSED(MOD_DATA) +LX_GDBPARSED(MOD_RODATA) +LX_GDBPARSED(MOD_RO_AFTER_INIT) /* linux/mount.h */ LX_VALUE(MNT_NOSUID) @@ -86,6 +92,28 @@ LX_GDBPARSED(RADIX_TREE_MAP_SIZE) LX_GDBPARSED(RADIX_TREE_MAP_SHIFT) LX_GDBPARSED(RADIX_TREE_MAP_MASK) +/* linux/vmalloc.h */ +LX_VALUE(VM_IOREMAP) +LX_VALUE(VM_ALLOC) +LX_VALUE(VM_MAP) +LX_VALUE(VM_USERMAP) +LX_VALUE(VM_DMA_COHERENT) + +/* linux/page_ext.h */ +if IS_BUILTIN(CONFIG_PAGE_OWNER): + LX_GDBPARSED(PAGE_EXT_OWNER) + LX_GDBPARSED(PAGE_EXT_OWNER_ALLOCATED) + +/* linux/slab.h */ +LX_GDBPARSED(SLAB_RED_ZONE) +LX_GDBPARSED(SLAB_POISON) +LX_GDBPARSED(SLAB_KMALLOC) +LX_GDBPARSED(SLAB_HWCACHE_ALIGN) +LX_GDBPARSED(SLAB_CACHE_DMA) +LX_GDBPARSED(SLAB_CACHE_DMA32) +LX_GDBPARSED(SLAB_STORE_USER) +LX_GDBPARSED(SLAB_PANIC) + /* Kernel Configs */ LX_CONFIG(CONFIG_GENERIC_CLOCKEVENTS) LX_CONFIG(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) @@ -102,3 +130,30 @@ LX_CONFIG(CONFIG_X86_MCE_AMD) LX_CONFIG(CONFIG_X86_MCE) LX_CONFIG(CONFIG_X86_IO_APIC) LX_CONFIG(CONFIG_HAVE_KVM) +LX_CONFIG(CONFIG_NUMA) +LX_CONFIG(CONFIG_ARM64) +LX_CONFIG(CONFIG_ARM64_4K_PAGES) +LX_CONFIG(CONFIG_ARM64_16K_PAGES) +LX_CONFIG(CONFIG_ARM64_64K_PAGES) +if IS_BUILTIN(CONFIG_ARM64): + LX_VALUE(CONFIG_ARM64_PA_BITS) + LX_VALUE(CONFIG_ARM64_VA_BITS) + LX_VALUE(CONFIG_ARM64_PAGE_SHIFT) + LX_VALUE(CONFIG_ARCH_FORCE_MAX_ORDER) +LX_CONFIG(CONFIG_SPARSEMEM) +LX_CONFIG(CONFIG_SPARSEMEM_EXTREME) +LX_CONFIG(CONFIG_SPARSEMEM_VMEMMAP) +LX_CONFIG(CONFIG_KASAN) +LX_CONFIG(CONFIG_KASAN_GENERIC) +LX_CONFIG(CONFIG_KASAN_SW_TAGS) +LX_CONFIG(CONFIG_KASAN_HW_TAGS) +if IS_BUILTIN(CONFIG_KASAN_GENERIC) or IS_BUILTIN(CONFIG_KASAN_SW_TAGS): + LX_VALUE(CONFIG_KASAN_SHADOW_OFFSET) +LX_CONFIG(CONFIG_VMAP_STACK) +if IS_BUILTIN(CONFIG_NUMA): + LX_VALUE(CONFIG_NODES_SHIFT) +LX_CONFIG(CONFIG_DEBUG_VIRTUAL) +LX_CONFIG(CONFIG_STACKDEPOT) +LX_CONFIG(CONFIG_PAGE_OWNER) +LX_CONFIG(CONFIG_SLUB_DEBUG) +LX_CONFIG(CONFIG_SLAB_FREELIST_HARDENED) diff --git a/scripts/gdb/linux/mm.py b/scripts/gdb/linux/mm.py index 30d837f3dfae..ad5641dcb068 100644 --- a/scripts/gdb/linux/mm.py +++ b/scripts/gdb/linux/mm.py @@ -1,222 +1,398 @@ -# SPDX-License-Identifier: GPL-2.0-only +# SPDX-License-Identifier: GPL-2.0 # -# gdb helper commands and functions for Linux kernel debugging -# -# routines to introspect page table +# Copyright (c) 2023 MediaTek Inc. # # Authors: -# Dmitrii Bundin <dmitrii.bundin.a@gmail.com> +# Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> # import gdb +import math +from linux import utils, constants + +def DIV_ROUND_UP(n,d): + return ((n) + (d) - 1) // (d) -from linux import utils +def test_bit(nr, addr): + if addr.dereference() & (0x1 << nr): + return True + else: + return False -PHYSICAL_ADDRESS_MASK = gdb.parse_and_eval('0xfffffffffffff') +class page_ops(): + ops = None + def __init__(self): + if not constants.LX_CONFIG_SPARSEMEM_VMEMMAP: + raise gdb.GdbError('Only support CONFIG_SPARSEMEM_VMEMMAP now') + if constants.LX_CONFIG_ARM64 and utils.is_target_arch('aarch64'): + self.ops = aarch64_page_ops() + else: + raise gdb.GdbError('Only support aarch64 now') +class aarch64_page_ops(): + def __init__(self): + self.SUBSECTION_SHIFT = 21 + self.SEBSECTION_SIZE = 1 << self.SUBSECTION_SHIFT + self.MODULES_VSIZE = 128 * 1024 * 1024 -def page_mask(level=1): - # 4KB - if level == 1: - return gdb.parse_and_eval('(u64) ~0xfff') - # 2MB - elif level == 2: - return gdb.parse_and_eval('(u64) ~0x1fffff') - # 1GB - elif level == 3: - return gdb.parse_and_eval('(u64) ~0x3fffffff') - else: - raise Exception(f'Unknown page level: {level}') - - -#page_offset_base in case CONFIG_DYNAMIC_MEMORY_LAYOUT is disabled -POB_NO_DYNAMIC_MEM_LAYOUT = '0xffff888000000000' -def _page_offset_base(): - pob_symbol = gdb.lookup_global_symbol('page_offset_base') - pob = pob_symbol.name if pob_symbol else POB_NO_DYNAMIC_MEM_LAYOUT - return gdb.parse_and_eval(pob) - - -def is_bit_defined_tupled(data, offset): - return offset, bool(data >> offset & 1) - -def content_tupled(data, bit_start, bit_end): - return (bit_start, bit_end), data >> bit_start & ((1 << (1 + bit_end - bit_start)) - 1) - -def entry_va(level, phys_addr, translating_va): - def start_bit(level): - if level == 5: - return 48 - elif level == 4: - return 39 - elif level == 3: - return 30 - elif level == 2: - return 21 - elif level == 1: - return 12 + if constants.LX_CONFIG_ARM64_64K_PAGES: + self.SECTION_SIZE_BITS = 29 + else: + self.SECTION_SIZE_BITS = 27 + self.MAX_PHYSMEM_BITS = constants.LX_CONFIG_ARM64_VA_BITS + + self.PAGE_SHIFT = constants.LX_CONFIG_ARM64_PAGE_SHIFT + self.PAGE_SIZE = 1 << self.PAGE_SHIFT + self.PAGE_MASK = (~(self.PAGE_SIZE - 1)) & ((1 << 64) - 1) + + self.VA_BITS = constants.LX_CONFIG_ARM64_VA_BITS + if self.VA_BITS > 48: + self.VA_BITS_MIN = 48 + self.vabits_actual = gdb.parse_and_eval('vabits_actual') + else: + self.VA_BITS_MIN = self.VA_BITS + self.vabits_actual = self.VA_BITS + self.kimage_voffset = gdb.parse_and_eval('kimage_voffset') & ((1 << 64) - 1) + + self.SECTIONS_SHIFT = self.MAX_PHYSMEM_BITS - self.SECTION_SIZE_BITS + + if str(constants.LX_CONFIG_ARCH_FORCE_MAX_ORDER).isdigit(): + self.MAX_ORDER = constants.LX_CONFIG_ARCH_FORCE_MAX_ORDER + else: + self.MAX_ORDER = 11 + + self.MAX_ORDER_NR_PAGES = 1 << (self.MAX_ORDER - 1) + self.PFN_SECTION_SHIFT = self.SECTION_SIZE_BITS - self.PAGE_SHIFT + self.NR_MEM_SECTIONS = 1 << self.SECTIONS_SHIFT + self.PAGES_PER_SECTION = 1 << self.PFN_SECTION_SHIFT + self.PAGE_SECTION_MASK = (~(self.PAGES_PER_SECTION - 1)) & ((1 << 64) - 1) + + if constants.LX_CONFIG_SPARSEMEM_EXTREME: + self.SECTIONS_PER_ROOT = self.PAGE_SIZE // gdb.lookup_type("struct mem_section").sizeof + else: + self.SECTIONS_PER_ROOT = 1 + + self.NR_SECTION_ROOTS = DIV_ROUND_UP(self.NR_MEM_SECTIONS, self.SECTIONS_PER_ROOT) + self.SECTION_ROOT_MASK = self.SECTIONS_PER_ROOT - 1 + self.SUBSECTION_SHIFT = 21 + self.SEBSECTION_SIZE = 1 << self.SUBSECTION_SHIFT + self.PFN_SUBSECTION_SHIFT = self.SUBSECTION_SHIFT - self.PAGE_SHIFT + self.PAGES_PER_SUBSECTION = 1 << self.PFN_SUBSECTION_SHIFT + + self.SECTION_HAS_MEM_MAP = 1 << int(gdb.parse_and_eval('SECTION_HAS_MEM_MAP_BIT')) + self.SECTION_IS_EARLY = 1 << int(gdb.parse_and_eval('SECTION_IS_EARLY_BIT')) + + self.struct_page_size = utils.get_page_type().sizeof + self.STRUCT_PAGE_MAX_SHIFT = (int)(math.log(self.struct_page_size, 2)) + + self.PAGE_OFFSET = self._PAGE_OFFSET(self.VA_BITS) + self.MODULES_VADDR = self._PAGE_END(self.VA_BITS_MIN) + self.MODULES_END = self.MODULES_VADDR + self.MODULES_VSIZE + + self.VMEMMAP_SHIFT = (self.PAGE_SHIFT - self.STRUCT_PAGE_MAX_SHIFT) + self.VMEMMAP_SIZE = ((self._PAGE_END(self.VA_BITS_MIN) - self.PAGE_OFFSET) >> self.VMEMMAP_SHIFT) + self.VMEMMAP_START = (-(1 << (self.VA_BITS - self.VMEMMAP_SHIFT))) & 0xffffffffffffffff + self.VMEMMAP_END = self.VMEMMAP_START + self.VMEMMAP_SIZE + + self.VMALLOC_START = self.MODULES_END + self.VMALLOC_END = self.VMEMMAP_START - 256 * 1024 * 1024 + + self.memstart_addr = gdb.parse_and_eval("memstart_addr") + self.PHYS_OFFSET = self.memstart_addr + self.vmemmap = gdb.Value(self.VMEMMAP_START).cast(utils.get_page_type().pointer()) - (self.memstart_addr >> self.PAGE_SHIFT) + + self.KERNEL_START = gdb.parse_and_eval("_text") + self.KERNEL_END = gdb.parse_and_eval("_end") + + if constants.LX_CONFIG_KASAN_GENERIC or constants.LX_CONFIG_KASAN_SW_TAGS: + if constants.LX_CONFIG_KASAN_GENERIC: + self.KASAN_SHADOW_SCALE_SHIFT = 3 else: - raise Exception(f'Unknown level {level}') - - entry_offset = ((translating_va >> start_bit(level)) & 511) * 8 - entry_va = _page_offset_base() + phys_addr + entry_offset - return entry_va - -class Cr3(): - def __init__(self, cr3, page_levels): - self.cr3 = cr3 - self.page_levels = page_levels - self.page_level_write_through = is_bit_defined_tupled(cr3, 3) - self.page_level_cache_disabled = is_bit_defined_tupled(cr3, 4) - self.next_entry_physical_address = cr3 & PHYSICAL_ADDRESS_MASK & page_mask() - - def next_entry(self, va): - next_level = self.page_levels - return PageHierarchyEntry(entry_va(next_level, self.next_entry_physical_address, va), next_level) - - def mk_string(self): - return f"""\ -cr3: - {'cr3 binary data': <30} {hex(self.cr3)} - {'next entry physical address': <30} {hex(self.next_entry_physical_address)} - --- - {'bit' : <4} {self.page_level_write_through[0]: <10} {'page level write through': <30} {self.page_level_write_through[1]} - {'bit' : <4} {self.page_level_cache_disabled[0]: <10} {'page level cache disabled': <30} {self.page_level_cache_disabled[1]} -""" - - -class PageHierarchyEntry(): - def __init__(self, address, level): - data = int.from_bytes( - memoryview(gdb.selected_inferior().read_memory(address, 8)), - "little" - ) - if level == 1: - self.is_page = True - self.entry_present = is_bit_defined_tupled(data, 0) - self.read_write = is_bit_defined_tupled(data, 1) - self.user_access_allowed = is_bit_defined_tupled(data, 2) - self.page_level_write_through = is_bit_defined_tupled(data, 3) - self.page_level_cache_disabled = is_bit_defined_tupled(data, 4) - self.entry_was_accessed = is_bit_defined_tupled(data, 5) - self.dirty = is_bit_defined_tupled(data, 6) - self.pat = is_bit_defined_tupled(data, 7) - self.global_translation = is_bit_defined_tupled(data, 8) - self.page_physical_address = data & PHYSICAL_ADDRESS_MASK & page_mask(level) - self.next_entry_physical_address = None - self.hlat_restart_with_ordinary = is_bit_defined_tupled(data, 11) - self.protection_key = content_tupled(data, 59, 62) - self.executed_disable = is_bit_defined_tupled(data, 63) + self.KASAN_SHADOW_SCALE_SHIFT = 4 + self.KASAN_SHADOW_OFFSET = constants.LX_CONFIG_KASAN_SHADOW_OFFSET + self.KASAN_SHADOW_END = (1 << (64 - self.KASAN_SHADOW_SCALE_SHIFT)) + self.KASAN_SHADOW_OFFSET + self.PAGE_END = self.KASAN_SHADOW_END - (1 << (self.vabits_actual - self.KASAN_SHADOW_SCALE_SHIFT)) + else: + self.PAGE_END = self._PAGE_END(self.VA_BITS_MIN) + + if constants.LX_CONFIG_NUMA and constants.LX_CONFIG_NODES_SHIFT: + self.NODE_SHIFT = constants.LX_CONFIG_NODES_SHIFT + else: + self.NODE_SHIFT = 0 + + self.MAX_NUMNODES = 1 << self.NODE_SHIFT + + def SECTION_NR_TO_ROOT(self, sec): + return sec // self.SECTIONS_PER_ROOT + + def __nr_to_section(self, nr): + root = self.SECTION_NR_TO_ROOT(nr) + mem_section = gdb.parse_and_eval("mem_section") + return mem_section[root][nr & self.SECTION_ROOT_MASK] + + def pfn_to_section_nr(self, pfn): + return pfn >> self.PFN_SECTION_SHIFT + + def section_nr_to_pfn(self, sec): + return sec << self.PFN_SECTION_SHIFT + + def __pfn_to_section(self, pfn): + return self.__nr_to_section(self.pfn_to_section_nr(pfn)) + + def pfn_to_section(self, pfn): + return self.__pfn_to_section(pfn) + + def subsection_map_index(self, pfn): + return (pfn & ~(self.PAGE_SECTION_MASK)) // self.PAGES_PER_SUBSECTION + + def pfn_section_valid(self, ms, pfn): + if constants.LX_CONFIG_SPARSEMEM_VMEMMAP: + idx = self.subsection_map_index(pfn) + return test_bit(idx, ms['usage']['subsection_map']) + else: + return True + + def valid_section(self, mem_section): + if mem_section != None and (mem_section['section_mem_map'] & self.SECTION_HAS_MEM_MAP): + return True + return False + + def early_section(self, mem_section): + if mem_section != None and (mem_section['section_mem_map'] & self.SECTION_IS_EARLY): + return True + return False + + def pfn_valid(self, pfn): + ms = None + if self.PHYS_PFN(self.PFN_PHYS(pfn)) != pfn: + return False + if self.pfn_to_section_nr(pfn) >= self.NR_MEM_SECTIONS: + return False + ms = self.__pfn_to_section(pfn) + + if not self.valid_section(ms): + return False + return self.early_section(ms) or self.pfn_section_valid(ms, pfn) + + def _PAGE_OFFSET(self, va): + return (-(1 << (va))) & 0xffffffffffffffff + + def _PAGE_END(self, va): + return (-(1 << (va - 1))) & 0xffffffffffffffff + + def kasan_reset_tag(self, addr): + if constants.LX_CONFIG_KASAN_SW_TAGS or constants.LX_CONFIG_KASAN_HW_TAGS: + return int(addr) | (0xff << 56) + else: + return addr + + def __is_lm_address(self, addr): + if (addr - self.PAGE_OFFSET) < (self.PAGE_END - self.PAGE_OFFSET): + return True + else: + return False + def __lm_to_phys(self, addr): + return addr - self.PAGE_OFFSET + self.PHYS_OFFSET + + def __kimg_to_phys(self, addr): + return addr - self.kimage_voffset + + def __virt_to_phys_nodebug(self, va): + untagged_va = self.kasan_reset_tag(va) + if self.__is_lm_address(untagged_va): + return self.__lm_to_phys(untagged_va) + else: + return self.__kimg_to_phys(untagged_va) + + def __virt_to_phys(self, va): + if constants.LX_CONFIG_DEBUG_VIRTUAL: + if not self.__is_lm_address(self.kasan_reset_tag(va)): + raise gdb.GdbError("Warning: virt_to_phys used for non-linear address: 0x%lx\n" % va) + return self.__virt_to_phys_nodebug(va) + + def virt_to_phys(self, va): + return self.__virt_to_phys(va) + + def PFN_PHYS(self, pfn): + return pfn << self.PAGE_SHIFT + + def PHYS_PFN(self, phys): + return phys >> self.PAGE_SHIFT + + def __phys_to_virt(self, pa): + return (pa - self.PHYS_OFFSET) | self.PAGE_OFFSET + + def __phys_to_pfn(self, pa): + return self.PHYS_PFN(pa) + + def __pfn_to_phys(self, pfn): + return self.PFN_PHYS(pfn) + + def __pa_symbol_nodebug(self, x): + return self.__kimg_to_phys(x) + + def __phys_addr_symbol(self, x): + if constants.LX_CONFIG_DEBUG_VIRTUAL: + if x < self.KERNEL_START or x > self.KERNEL_END: + raise gdb.GdbError("0x%x exceed kernel range" % x) + return self.__pa_symbol_nodebug(x) + + def __pa_symbol(self, x): + return self.__phys_addr_symbol(x) + + def __va(self, pa): + return self.__phys_to_virt(pa) + + def pfn_to_kaddr(self, pfn): + return self.__va(pfn << self.PAGE_SHIFT) + + def virt_to_pfn(self, va): + return self.__phys_to_pfn(self.__virt_to_phys(va)) + + def sym_to_pfn(self, x): + return self.__phys_to_pfn(self.__pa_symbol(x)) + + def page_to_pfn(self, page): + return int(page.cast(utils.get_page_type().pointer()) - self.vmemmap.cast(utils.get_page_type().pointer())) + + def page_to_phys(self, page): + return self.__pfn_to_phys(self.page_to_pfn(page)) + + def pfn_to_page(self, pfn): + return (self.vmemmap + pfn).cast(utils.get_page_type().pointer()) + + def page_to_virt(self, page): + if constants.LX_CONFIG_DEBUG_VIRTUAL: + return self.__va(self.page_to_phys(page)) else: - page_size = is_bit_defined_tupled(data, 7) - page_size_bit = page_size[1] - self.is_page = page_size_bit - self.entry_present = is_bit_defined_tupled(data, 0) - self.read_write = is_bit_defined_tupled(data, 1) - self.user_access_allowed = is_bit_defined_tupled(data, 2) - self.page_level_write_through = is_bit_defined_tupled(data, 3) - self.page_level_cache_disabled = is_bit_defined_tupled(data, 4) - self.entry_was_accessed = is_bit_defined_tupled(data, 5) - self.page_size = page_size - self.dirty = is_bit_defined_tupled( - data, 6) if page_size_bit else None - self.global_translation = is_bit_defined_tupled( - data, 8) if page_size_bit else None - self.pat = is_bit_defined_tupled( - data, 12) if page_size_bit else None - self.page_physical_address = data & PHYSICAL_ADDRESS_MASK & page_mask(level) if page_size_bit else None - self.next_entry_physical_address = None if page_size_bit else data & PHYSICAL_ADDRESS_MASK & page_mask() - self.hlat_restart_with_ordinary = is_bit_defined_tupled(data, 11) - self.protection_key = content_tupled(data, 59, 62) if page_size_bit else None - self.executed_disable = is_bit_defined_tupled(data, 63) - self.address = address - self.page_entry_binary_data = data - self.page_hierarchy_level = level - - def next_entry(self, va): - if self.is_page or not self.entry_present[1]: - return None - - next_level = self.page_hierarchy_level - 1 - return PageHierarchyEntry(entry_va(next_level, self.next_entry_physical_address, va), next_level) - - - def mk_string(self): - if not self.entry_present[1]: - return f"""\ -level {self.page_hierarchy_level}: - {'entry address': <30} {hex(self.address)} - {'page entry binary data': <30} {hex(self.page_entry_binary_data)} - --- - PAGE ENTRY IS NOT PRESENT! -""" - elif self.is_page: - def page_size_line(ps_bit, ps, level): - return "" if level == 1 else f"{'bit': <3} {ps_bit: <5} {'page size': <30} {ps}" - - return f"""\ -level {self.page_hierarchy_level}: - {'entry address': <30} {hex(self.address)} - {'page entry binary data': <30} {hex(self.page_entry_binary_data)} - {'page size': <30} {'1GB' if self.page_hierarchy_level == 3 else '2MB' if self.page_hierarchy_level == 2 else '4KB' if self.page_hierarchy_level == 1 else 'Unknown page size for level:' + self.page_hierarchy_level} - {'page physical address': <30} {hex(self.page_physical_address)} - --- - {'bit': <4} {self.entry_present[0]: <10} {'entry present': <30} {self.entry_present[1]} - {'bit': <4} {self.read_write[0]: <10} {'read/write access allowed': <30} {self.read_write[1]} - {'bit': <4} {self.user_access_allowed[0]: <10} {'user access allowed': <30} {self.user_access_allowed[1]} - {'bit': <4} {self.page_level_write_through[0]: <10} {'page level write through': <30} {self.page_level_write_through[1]} - {'bit': <4} {self.page_level_cache_disabled[0]: <10} {'page level cache disabled': <30} {self.page_level_cache_disabled[1]} - {'bit': <4} {self.entry_was_accessed[0]: <10} {'entry has been accessed': <30} {self.entry_was_accessed[1]} - {"" if self.page_hierarchy_level == 1 else f"{'bit': <4} {self.page_size[0]: <10} {'page size': <30} {self.page_size[1]}"} - {'bit': <4} {self.dirty[0]: <10} {'page dirty': <30} {self.dirty[1]} - {'bit': <4} {self.global_translation[0]: <10} {'global translation': <30} {self.global_translation[1]} - {'bit': <4} {self.hlat_restart_with_ordinary[0]: <10} {'restart to ordinary': <30} {self.hlat_restart_with_ordinary[1]} - {'bit': <4} {self.pat[0]: <10} {'pat': <30} {self.pat[1]} - {'bits': <4} {str(self.protection_key[0]): <10} {'protection key': <30} {self.protection_key[1]} - {'bit': <4} {self.executed_disable[0]: <10} {'execute disable': <30} {self.executed_disable[1]} -""" + __idx = int((page.cast(gdb.lookup_type("unsigned long")) - self.VMEMMAP_START).cast(utils.get_ulong_type())) // self.struct_page_size + return self.PAGE_OFFSET + (__idx * self.PAGE_SIZE) + + def virt_to_page(self, va): + if constants.LX_CONFIG_DEBUG_VIRTUAL: + return self.pfn_to_page(self.virt_to_pfn(va)) else: - return f"""\ -level {self.page_hierarchy_level}: - {'entry address': <30} {hex(self.address)} - {'page entry binary data': <30} {hex(self.page_entry_binary_data)} - {'next entry physical address': <30} {hex(self.next_entry_physical_address)} - --- - {'bit': <4} {self.entry_present[0]: <10} {'entry present': <30} {self.entry_present[1]} - {'bit': <4} {self.read_write[0]: <10} {'read/write access allowed': <30} {self.read_write[1]} - {'bit': <4} {self.user_access_allowed[0]: <10} {'user access allowed': <30} {self.user_access_allowed[1]} - {'bit': <4} {self.page_level_write_through[0]: <10} {'page level write through': <30} {self.page_level_write_through[1]} - {'bit': <4} {self.page_level_cache_disabled[0]: <10} {'page level cache disabled': <30} {self.page_level_cache_disabled[1]} - {'bit': <4} {self.entry_was_accessed[0]: <10} {'entry has been accessed': <30} {self.entry_was_accessed[1]} - {'bit': <4} {self.page_size[0]: <10} {'page size': <30} {self.page_size[1]} - {'bit': <4} {self.hlat_restart_with_ordinary[0]: <10} {'restart to ordinary': <30} {self.hlat_restart_with_ordinary[1]} - {'bit': <4} {self.executed_disable[0]: <10} {'execute disable': <30} {self.executed_disable[1]} -""" - - -class TranslateVM(gdb.Command): - """Prints the entire paging structure used to translate a given virtual address. - -Having an address space of the currently executed process translates the virtual address -and prints detailed information of all paging structure levels used for the transaltion. -Currently supported arch: x86""" + __idx = int(self.kasan_reset_tag(va) - self.PAGE_OFFSET) // self.PAGE_SIZE + addr = self.VMEMMAP_START + (__idx * self.struct_page_size) + return gdb.Value(addr).cast(utils.get_page_type().pointer()) + + def page_address(self, page): + return self.page_to_virt(page) + + def folio_address(self, folio): + return self.page_address(folio['page'].address) + +class LxPFN2Page(gdb.Command): + """PFN to struct page""" def __init__(self): - super(TranslateVM, self).__init__('translate-vm', gdb.COMMAND_USER) + super(LxPFN2Page, self).__init__("lx-pfn_to_page", gdb.COMMAND_USER) def invoke(self, arg, from_tty): - if utils.is_target_arch("x86"): - vm_address = gdb.parse_and_eval(f'{arg}') - cr3_data = gdb.parse_and_eval('$cr3') - cr4 = gdb.parse_and_eval('$cr4') - page_levels = 5 if cr4 & (1 << 12) else 4 - page_entry = Cr3(cr3_data, page_levels) - while page_entry: - gdb.write(page_entry.mk_string()) - page_entry = page_entry.next_entry(vm_address) - else: - gdb.GdbError("Virtual address translation is not" - "supported for this arch") + argv = gdb.string_to_argv(arg) + pfn = int(argv[0]) + page = page_ops().ops.pfn_to_page(pfn) + gdb.write("pfn_to_page(0x%x) = 0x%x\n" % (pfn, page)) + +LxPFN2Page() + +class LxPage2PFN(gdb.Command): + """struct page to PFN""" + + def __init__(self): + super(LxPage2PFN, self).__init__("lx-page_to_pfn", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + struct_page_addr = int(argv[0], 16) + page = gdb.Value(struct_page_addr).cast(utils.get_page_type().pointer()) + pfn = page_ops().ops.page_to_pfn(page) + gdb.write("page_to_pfn(0x%x) = 0x%x\n" % (page, pfn)) + +LxPage2PFN() + +class LxPageAddress(gdb.Command): + """struct page to linear mapping address""" + + def __init__(self): + super(LxPageAddress, self).__init__("lx-page_address", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + struct_page_addr = int(argv[0], 16) + page = gdb.Value(struct_page_addr).cast(utils.get_page_type().pointer()) + addr = page_ops().ops.page_address(page) + gdb.write("page_address(0x%x) = 0x%x\n" % (page, addr)) +LxPageAddress() + +class LxPage2Phys(gdb.Command): + """struct page to physical address""" + + def __init__(self): + super(LxPage2Phys, self).__init__("lx-page_to_phys", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + struct_page_addr = int(argv[0], 16) + page = gdb.Value(struct_page_addr).cast(utils.get_page_type().pointer()) + phys_addr = page_ops().ops.page_to_phys(page) + gdb.write("page_to_phys(0x%x) = 0x%x\n" % (page, phys_addr)) + +LxPage2Phys() + +class LxVirt2Phys(gdb.Command): + """virtual address to physical address""" + + def __init__(self): + super(LxVirt2Phys, self).__init__("lx-virt_to_phys", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + linear_addr = int(argv[0], 16) + phys_addr = page_ops().ops.virt_to_phys(linear_addr) + gdb.write("virt_to_phys(0x%x) = 0x%x\n" % (linear_addr, phys_addr)) + +LxVirt2Phys() + +class LxVirt2Page(gdb.Command): + """virtual address to struct page""" + + def __init__(self): + super(LxVirt2Page, self).__init__("lx-virt_to_page", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + linear_addr = int(argv[0], 16) + page = page_ops().ops.virt_to_page(linear_addr) + gdb.write("virt_to_page(0x%x) = 0x%x\n" % (linear_addr, page)) + +LxVirt2Page() + +class LxSym2PFN(gdb.Command): + """symbol address to PFN""" + + def __init__(self): + super(LxSym2PFN, self).__init__("lx-sym_to_pfn", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + sym_addr = int(argv[0], 16) + pfn = page_ops().ops.sym_to_pfn(sym_addr) + gdb.write("sym_to_pfn(0x%x) = %d\n" % (sym_addr, pfn)) + +LxSym2PFN() + +class LxPFN2Kaddr(gdb.Command): + """PFN to kernel address""" + + def __init__(self): + super(LxPFN2Kaddr, self).__init__("lx-pfn_to_kaddr", gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + argv = gdb.string_to_argv(arg) + pfn = int(argv[0]) + kaddr = page_ops().ops.pfn_to_kaddr(pfn) + gdb.write("pfn_to_kaddr(%d) = 0x%x\n" % (pfn, kaddr)) -TranslateVM() +LxPFN2Kaddr() diff --git a/scripts/gdb/linux/modules.py b/scripts/gdb/linux/modules.py index 261f28640f4c..298dfcc25eae 100644 --- a/scripts/gdb/linux/modules.py +++ b/scripts/gdb/linux/modules.py @@ -73,11 +73,17 @@ class LxLsmod(gdb.Command): " " if utils.get_long_type().sizeof == 8 else "")) for module in module_list(): - layout = module['mem'][constants.LX_MOD_TEXT] + text = module['mem'][constants.LX_MOD_TEXT] + text_addr = str(text['base']).split()[0] + total_size = 0 + + for i in range(constants.LX_MOD_TEXT, constants.LX_MOD_RO_AFTER_INIT + 1): + total_size += module['mem'][i]['size'] + gdb.write("{address} {name:<19} {size:>8} {ref}".format( - address=str(layout['base']).split()[0], + address=text_addr, name=module['name'].string(), - size=str(layout['size']), + size=str(total_size), ref=str(module['refcnt']['counter'] - 1))) t = self._module_use_type.get_type().pointer() @@ -91,5 +97,35 @@ class LxLsmod(gdb.Command): gdb.write("\n") - LxLsmod() + +def help(): + t = """Usage: lx-getmod-by-textaddr [Heximal Address] + Example: lx-getmod-by-textaddr 0xffff800002d305ac\n""" + gdb.write("Unrecognized command\n") + raise gdb.GdbError(t) + +class LxFindTextAddrinMod(gdb.Command): + '''Look up loaded kernel module by text address.''' + + def __init__(self): + super(LxFindTextAddrinMod, self).__init__('lx-getmod-by-textaddr', gdb.COMMAND_SUPPORT) + + def invoke(self, arg, from_tty): + args = gdb.string_to_argv(arg) + + if len(args) != 1: + help() + + addr = gdb.Value(int(args[0], 16)).cast(utils.get_ulong_type()) + for mod in module_list(): + mod_text_start = mod['mem'][constants.LX_MOD_TEXT]['base'] + mod_text_end = mod_text_start + mod['mem'][constants.LX_MOD_TEXT]['size'].cast(utils.get_ulong_type()) + + if addr >= mod_text_start and addr < mod_text_end: + s = "0x%x" % addr + " is in " + mod['name'].string() + ".ko\n" + gdb.write(s) + return + gdb.write("0x%x is not in any module text section\n" % addr) + +LxFindTextAddrinMod() diff --git a/scripts/gdb/linux/page_owner.py b/scripts/gdb/linux/page_owner.py new file mode 100644 index 000000000000..844fd5d0c912 --- /dev/null +++ b/scripts/gdb/linux/page_owner.py @@ -0,0 +1,190 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2023 MediaTek Inc. +# +# Authors: +# Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> +# + +import gdb +from linux import utils, stackdepot, constants, mm + +if constants.LX_CONFIG_PAGE_OWNER: + page_ext_t = utils.CachedType('struct page_ext') + page_owner_t = utils.CachedType('struct page_owner') + + PAGE_OWNER_STACK_DEPTH = 16 + PAGE_EXT_OWNER = constants.LX_PAGE_EXT_OWNER + PAGE_EXT_INVALID = 0x1 + PAGE_EXT_OWNER_ALLOCATED = constants.LX_PAGE_EXT_OWNER_ALLOCATED + +def help(): + t = """Usage: lx-dump-page-owner [Option] + Option: + --pfn [Decimal pfn] + Example: + lx-dump-page-owner --pfn 655360\n""" + gdb.write("Unrecognized command\n") + raise gdb.GdbError(t) + +class DumpPageOwner(gdb.Command): + """Dump page owner""" + + min_pfn = None + max_pfn = None + p_ops = None + migrate_reason_names = None + + def __init__(self): + super(DumpPageOwner, self).__init__("lx-dump-page-owner", gdb.COMMAND_SUPPORT) + + def invoke(self, args, from_tty): + if not constants.LX_CONFIG_PAGE_OWNER: + raise gdb.GdbError('CONFIG_PAGE_OWNER does not enable') + + page_owner_inited = gdb.parse_and_eval('page_owner_inited') + if page_owner_inited['key']['enabled']['counter'] != 0x1: + raise gdb.GdbError('page_owner_inited is not enabled') + + self.p_ops = mm.page_ops().ops + self.get_page_owner_info() + argv = gdb.string_to_argv(args) + if len(argv) == 0: + self.read_page_owner() + elif len(argv) == 2: + if argv[0] == "--pfn": + pfn = int(argv[1]) + self.read_page_owner_by_addr(self.p_ops.pfn_to_page(pfn)) + else: + help() + else: + help() + + def get_page_owner_info(self): + self.min_pfn = int(gdb.parse_and_eval("min_low_pfn")) + self.max_pfn = int(gdb.parse_and_eval("max_pfn")) + self.page_ext_size = int(gdb.parse_and_eval("page_ext_size")) + self.migrate_reason_names = gdb.parse_and_eval('migrate_reason_names') + + def page_ext_invalid(self, page_ext): + if page_ext == gdb.Value(0): + return True + if page_ext.cast(utils.get_ulong_type()) & PAGE_EXT_INVALID == PAGE_EXT_INVALID: + return True + return False + + def get_entry(self, base, index): + return (base.cast(utils.get_ulong_type()) + self.page_ext_size * index).cast(page_ext_t.get_type().pointer()) + + def lookup_page_ext(self, page): + pfn = self.p_ops.page_to_pfn(page) + section = self.p_ops.pfn_to_section(pfn) + page_ext = section["page_ext"] + if self.page_ext_invalid(page_ext): + return gdb.Value(0) + return self.get_entry(page_ext, pfn) + + def page_ext_get(self, page): + page_ext = self.lookup_page_ext(page) + if page_ext != gdb.Value(0): + return page_ext + else: + return gdb.Value(0) + + def get_page_owner(self, page_ext): + addr = page_ext.cast(utils.get_ulong_type()) + gdb.parse_and_eval("page_owner_ops")["offset"].cast(utils.get_ulong_type()) + return addr.cast(page_owner_t.get_type().pointer()) + + def read_page_owner_by_addr(self, struct_page_addr): + page = gdb.Value(struct_page_addr).cast(utils.get_page_type().pointer()) + pfn = self.p_ops.page_to_pfn(page) + + if pfn < self.min_pfn or pfn > self.max_pfn or (not self.p_ops.pfn_valid(pfn)): + gdb.write("pfn is invalid\n") + return + + page = self.p_ops.pfn_to_page(pfn) + page_ext = self.page_ext_get(page) + + if page_ext == gdb.Value(0): + gdb.write("page_ext is null\n") + return + + if not (page_ext['flags'] & (1 << PAGE_EXT_OWNER)): + gdb.write("page_owner flag is invalid\n") + raise gdb.GdbError('page_owner info is not present (never set?)\n') + + if mm.test_bit(PAGE_EXT_OWNER_ALLOCATED, page_ext['flags'].address): + gdb.write('page_owner tracks the page as allocated\n') + else: + gdb.write('page_owner tracks the page as freed\n') + + if not (page_ext['flags'] & (1 << PAGE_EXT_OWNER_ALLOCATED)): + gdb.write("page_owner is not allocated\n") + + try: + page_owner = self.get_page_owner(page_ext) + gdb.write("Page last allocated via order %d, gfp_mask: 0x%x, pid: %d, tgid: %d (%s), ts %u ns, free_ts %u ns\n" %\ + (page_owner["order"], page_owner["gfp_mask"],\ + page_owner["pid"], page_owner["tgid"], page_owner["comm"],\ + page_owner["ts_nsec"], page_owner["free_ts_nsec"])) + gdb.write("PFN: %d, Flags: 0x%x\n" % (pfn, page['flags'])) + if page_owner["handle"] == 0: + gdb.write('page_owner allocation stack trace missing\n') + else: + stackdepot.stack_depot_print(page_owner["handle"]) + + if page_owner["free_handle"] == 0: + gdb.write('page_owner free stack trace missing\n') + else: + gdb.write('page last free stack trace:\n') + stackdepot.stack_depot_print(page_owner["free_handle"]) + if page_owner['last_migrate_reason'] != -1: + gdb.write('page has been migrated, last migrate reason: %s\n' % self.migrate_reason_names[page_owner['last_migrate_reason']]) + except: + gdb.write("\n") + + def read_page_owner(self): + pfn = self.min_pfn + + # Find a valid PFN or the start of a MAX_ORDER_NR_PAGES area + while ((not self.p_ops.pfn_valid(pfn)) and (pfn & (self.p_ops.MAX_ORDER_NR_PAGES - 1))) != 0: + pfn += 1 + + while pfn < self.max_pfn: + # + # If the new page is in a new MAX_ORDER_NR_PAGES area, + # validate the area as existing, skip it if not + # + if ((pfn & (self.p_ops.MAX_ORDER_NR_PAGES - 1)) == 0) and (not self.p_ops.pfn_valid(pfn)): + pfn += (self.p_ops.MAX_ORDER_NR_PAGES - 1) + continue; + + page = self.p_ops.pfn_to_page(pfn) + page_ext = self.page_ext_get(page) + if page_ext == gdb.Value(0): + pfn += 1 + continue + + if not (page_ext['flags'] & (1 << PAGE_EXT_OWNER)): + pfn += 1 + continue + if not (page_ext['flags'] & (1 << PAGE_EXT_OWNER_ALLOCATED)): + pfn += 1 + continue + + try: + page_owner = self.get_page_owner(page_ext) + gdb.write("Page allocated via order %d, gfp_mask: 0x%x, pid: %d, tgid: %d (%s), ts %u ns, free_ts %u ns\n" %\ + (page_owner["order"], page_owner["gfp_mask"],\ + page_owner["pid"], page_owner["tgid"], page_owner["comm"],\ + page_owner["ts_nsec"], page_owner["free_ts_nsec"])) + gdb.write("PFN: %d, Flags: 0x%x\n" % (pfn, page['flags'])) + stackdepot.stack_depot_print(page_owner["handle"]) + pfn += (1 << page_owner["order"]) + continue + except: + gdb.write("\n") + pfn += 1 + +DumpPageOwner() diff --git a/scripts/gdb/linux/pgtable.py b/scripts/gdb/linux/pgtable.py new file mode 100644 index 000000000000..30d837f3dfae --- /dev/null +++ b/scripts/gdb/linux/pgtable.py @@ -0,0 +1,222 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# gdb helper commands and functions for Linux kernel debugging +# +# routines to introspect page table +# +# Authors: +# Dmitrii Bundin <dmitrii.bundin.a@gmail.com> +# + +import gdb + +from linux import utils + +PHYSICAL_ADDRESS_MASK = gdb.parse_and_eval('0xfffffffffffff') + + +def page_mask(level=1): + # 4KB + if level == 1: + return gdb.parse_and_eval('(u64) ~0xfff') + # 2MB + elif level == 2: + return gdb.parse_and_eval('(u64) ~0x1fffff') + # 1GB + elif level == 3: + return gdb.parse_and_eval('(u64) ~0x3fffffff') + else: + raise Exception(f'Unknown page level: {level}') + + +#page_offset_base in case CONFIG_DYNAMIC_MEMORY_LAYOUT is disabled +POB_NO_DYNAMIC_MEM_LAYOUT = '0xffff888000000000' +def _page_offset_base(): + pob_symbol = gdb.lookup_global_symbol('page_offset_base') + pob = pob_symbol.name if pob_symbol else POB_NO_DYNAMIC_MEM_LAYOUT + return gdb.parse_and_eval(pob) + + +def is_bit_defined_tupled(data, offset): + return offset, bool(data >> offset & 1) + +def content_tupled(data, bit_start, bit_end): + return (bit_start, bit_end), data >> bit_start & ((1 << (1 + bit_end - bit_start)) - 1) + +def entry_va(level, phys_addr, translating_va): + def start_bit(level): + if level == 5: + return 48 + elif level == 4: + return 39 + elif level == 3: + return 30 + elif level == 2: + return 21 + elif level == 1: + return 12 + else: + raise Exception(f'Unknown level {level}') + + entry_offset = ((translating_va >> start_bit(level)) & 511) * 8 + entry_va = _page_offset_base() + phys_addr + entry_offset + return entry_va + +class Cr3(): + def __init__(self, cr3, page_levels): + self.cr3 = cr3 + self.page_levels = page_levels + self.page_level_write_through = is_bit_defined_tupled(cr3, 3) + self.page_level_cache_disabled = is_bit_defined_tupled(cr3, 4) + self.next_entry_physical_address = cr3 & PHYSICAL_ADDRESS_MASK & page_mask() + + def next_entry(self, va): + next_level = self.page_levels + return PageHierarchyEntry(entry_va(next_level, self.next_entry_physical_address, va), next_level) + + def mk_string(self): + return f"""\ +cr3: + {'cr3 binary data': <30} {hex(self.cr3)} + {'next entry physical address': <30} {hex(self.next_entry_physical_address)} + --- + {'bit' : <4} {self.page_level_write_through[0]: <10} {'page level write through': <30} {self.page_level_write_through[1]} + {'bit' : <4} {self.page_level_cache_disabled[0]: <10} {'page level cache disabled': <30} {self.page_level_cache_disabled[1]} +""" + + +class PageHierarchyEntry(): + def __init__(self, address, level): + data = int.from_bytes( + memoryview(gdb.selected_inferior().read_memory(address, 8)), + "little" + ) + if level == 1: + self.is_page = True + self.entry_present = is_bit_defined_tupled(data, 0) + self.read_write = is_bit_defined_tupled(data, 1) + self.user_access_allowed = is_bit_defined_tupled(data, 2) + self.page_level_write_through = is_bit_defined_tupled(data, 3) + self.page_level_cache_disabled = is_bit_defined_tupled(data, 4) + self.entry_was_accessed = is_bit_defined_tupled(data, 5) + self.dirty = is_bit_defined_tupled(data, 6) + self.pat = is_bit_defined_tupled(data, 7) + self.global_translation = is_bit_defined_tupled(data, 8) + self.page_physical_address = data & PHYSICAL_ADDRESS_MASK & page_mask(level) + self.next_entry_physical_address = None + self.hlat_restart_with_ordinary = is_bit_defined_tupled(data, 11) + self.protection_key = content_tupled(data, 59, 62) + self.executed_disable = is_bit_defined_tupled(data, 63) + else: + page_size = is_bit_defined_tupled(data, 7) + page_size_bit = page_size[1] + self.is_page = page_size_bit + self.entry_present = is_bit_defined_tupled(data, 0) + self.read_write = is_bit_defined_tupled(data, 1) + self.user_access_allowed = is_bit_defined_tupled(data, 2) + self.page_level_write_through = is_bit_defined_tupled(data, 3) + self.page_level_cache_disabled = is_bit_defined_tupled(data, 4) + self.entry_was_accessed = is_bit_defined_tupled(data, 5) + self.page_size = page_size + self.dirty = is_bit_defined_tupled( + data, 6) if page_size_bit else None + self.global_translation = is_bit_defined_tupled( + data, 8) if page_size_bit else None + self.pat = is_bit_defined_tupled( + data, 12) if page_size_bit else None + self.page_physical_address = data & PHYSICAL_ADDRESS_MASK & page_mask(level) if page_size_bit else None + self.next_entry_physical_address = None if page_size_bit else data & PHYSICAL_ADDRESS_MASK & page_mask() + self.hlat_restart_with_ordinary = is_bit_defined_tupled(data, 11) + self.protection_key = content_tupled(data, 59, 62) if page_size_bit else None + self.executed_disable = is_bit_defined_tupled(data, 63) + self.address = address + self.page_entry_binary_data = data + self.page_hierarchy_level = level + + def next_entry(self, va): + if self.is_page or not self.entry_present[1]: + return None + + next_level = self.page_hierarchy_level - 1 + return PageHierarchyEntry(entry_va(next_level, self.next_entry_physical_address, va), next_level) + + + def mk_string(self): + if not self.entry_present[1]: + return f"""\ +level {self.page_hierarchy_level}: + {'entry address': <30} {hex(self.address)} + {'page entry binary data': <30} {hex(self.page_entry_binary_data)} + --- + PAGE ENTRY IS NOT PRESENT! +""" + elif self.is_page: + def page_size_line(ps_bit, ps, level): + return "" if level == 1 else f"{'bit': <3} {ps_bit: <5} {'page size': <30} {ps}" + + return f"""\ +level {self.page_hierarchy_level}: + {'entry address': <30} {hex(self.address)} + {'page entry binary data': <30} {hex(self.page_entry_binary_data)} + {'page size': <30} {'1GB' if self.page_hierarchy_level == 3 else '2MB' if self.page_hierarchy_level == 2 else '4KB' if self.page_hierarchy_level == 1 else 'Unknown page size for level:' + self.page_hierarchy_level} + {'page physical address': <30} {hex(self.page_physical_address)} + --- + {'bit': <4} {self.entry_present[0]: <10} {'entry present': <30} {self.entry_present[1]} + {'bit': <4} {self.read_write[0]: <10} {'read/write access allowed': <30} {self.read_write[1]} + {'bit': <4} {self.user_access_allowed[0]: <10} {'user access allowed': <30} {self.user_access_allowed[1]} + {'bit': <4} {self.page_level_write_through[0]: <10} {'page level write through': <30} {self.page_level_write_through[1]} + {'bit': <4} {self.page_level_cache_disabled[0]: <10} {'page level cache disabled': <30} {self.page_level_cache_disabled[1]} + {'bit': <4} {self.entry_was_accessed[0]: <10} {'entry has been accessed': <30} {self.entry_was_accessed[1]} + {"" if self.page_hierarchy_level == 1 else f"{'bit': <4} {self.page_size[0]: <10} {'page size': <30} {self.page_size[1]}"} + {'bit': <4} {self.dirty[0]: <10} {'page dirty': <30} {self.dirty[1]} + {'bit': <4} {self.global_translation[0]: <10} {'global translation': <30} {self.global_translation[1]} + {'bit': <4} {self.hlat_restart_with_ordinary[0]: <10} {'restart to ordinary': <30} {self.hlat_restart_with_ordinary[1]} + {'bit': <4} {self.pat[0]: <10} {'pat': <30} {self.pat[1]} + {'bits': <4} {str(self.protection_key[0]): <10} {'protection key': <30} {self.protection_key[1]} + {'bit': <4} {self.executed_disable[0]: <10} {'execute disable': <30} {self.executed_disable[1]} +""" + else: + return f"""\ +level {self.page_hierarchy_level}: + {'entry address': <30} {hex(self.address)} + {'page entry binary data': <30} {hex(self.page_entry_binary_data)} + {'next entry physical address': <30} {hex(self.next_entry_physical_address)} + --- + {'bit': <4} {self.entry_present[0]: <10} {'entry present': <30} {self.entry_present[1]} + {'bit': <4} {self.read_write[0]: <10} {'read/write access allowed': <30} {self.read_write[1]} + {'bit': <4} {self.user_access_allowed[0]: <10} {'user access allowed': <30} {self.user_access_allowed[1]} + {'bit': <4} {self.page_level_write_through[0]: <10} {'page level write through': <30} {self.page_level_write_through[1]} + {'bit': <4} {self.page_level_cache_disabled[0]: <10} {'page level cache disabled': <30} {self.page_level_cache_disabled[1]} + {'bit': <4} {self.entry_was_accessed[0]: <10} {'entry has been accessed': <30} {self.entry_was_accessed[1]} + {'bit': <4} {self.page_size[0]: <10} {'page size': <30} {self.page_size[1]} + {'bit': <4} {self.hlat_restart_with_ordinary[0]: <10} {'restart to ordinary': <30} {self.hlat_restart_with_ordinary[1]} + {'bit': <4} {self.executed_disable[0]: <10} {'execute disable': <30} {self.executed_disable[1]} +""" + + +class TranslateVM(gdb.Command): + """Prints the entire paging structure used to translate a given virtual address. + +Having an address space of the currently executed process translates the virtual address +and prints detailed information of all paging structure levels used for the transaltion. +Currently supported arch: x86""" + + def __init__(self): + super(TranslateVM, self).__init__('translate-vm', gdb.COMMAND_USER) + + def invoke(self, arg, from_tty): + if utils.is_target_arch("x86"): + vm_address = gdb.parse_and_eval(f'{arg}') + cr3_data = gdb.parse_and_eval('$cr3') + cr4 = gdb.parse_and_eval('$cr4') + page_levels = 5 if cr4 & (1 << 12) else 4 + page_entry = Cr3(cr3_data, page_levels) + while page_entry: + gdb.write(page_entry.mk_string()) + page_entry = page_entry.next_entry(vm_address) + else: + gdb.GdbError("Virtual address translation is not" + "supported for this arch") + + +TranslateVM() diff --git a/scripts/gdb/linux/slab.py b/scripts/gdb/linux/slab.py new file mode 100644 index 000000000000..f012ba38c7d9 --- /dev/null +++ b/scripts/gdb/linux/slab.py @@ -0,0 +1,326 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2023 MediaTek Inc. +# +# Authors: +# Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> +# + +import gdb +import re +import traceback +from linux import lists, utils, stackdepot, constants, mm + +SLAB_RED_ZONE = constants.LX_SLAB_RED_ZONE +SLAB_POISON = constants.LX_SLAB_POISON +SLAB_KMALLOC = constants.LX_SLAB_KMALLOC +SLAB_HWCACHE_ALIGN = constants.LX_SLAB_HWCACHE_ALIGN +SLAB_CACHE_DMA = constants.LX_SLAB_CACHE_DMA +SLAB_CACHE_DMA32 = constants.LX_SLAB_CACHE_DMA32 +SLAB_STORE_USER = constants.LX_SLAB_STORE_USER +SLAB_PANIC = constants.LX_SLAB_PANIC + +OO_SHIFT = 16 +OO_MASK = (1 << OO_SHIFT) - 1 + +if constants.LX_CONFIG_SLUB_DEBUG: + slab_type = utils.CachedType("struct slab") + slab_ptr_type = slab_type.get_type().pointer() + kmem_cache_type = utils.CachedType("struct kmem_cache") + kmem_cache_ptr_type = kmem_cache_type.get_type().pointer() + freeptr_t = utils.CachedType("freeptr_t") + freeptr_t_ptr = freeptr_t.get_type().pointer() + + track_type = gdb.lookup_type('struct track') + track_alloc = int(gdb.parse_and_eval('TRACK_ALLOC')) + track_free = int(gdb.parse_and_eval('TRACK_FREE')) + +def slab_folio(slab): + return slab.cast(gdb.lookup_type("struct folio").pointer()) + +def slab_address(slab): + p_ops = mm.page_ops().ops + folio = slab_folio(slab) + return p_ops.folio_address(folio) + +def for_each_object(cache, addr, slab_objects): + p = addr + if cache['flags'] & SLAB_RED_ZONE: + p += int(cache['red_left_pad']) + while p < addr + (slab_objects * cache['size']): + yield p + p = p + int(cache['size']) + +def get_info_end(cache): + if (cache['offset'] >= cache['inuse']): + return cache['inuse'] + gdb.lookup_type("void").pointer().sizeof + else: + return cache['inuse'] + +def get_orig_size(cache, obj): + if cache['flags'] & SLAB_STORE_USER and cache['flags'] & SLAB_KMALLOC: + p = mm.page_ops().ops.kasan_reset_tag(obj) + p += get_info_end(cache) + p += gdb.lookup_type('struct track').sizeof * 2 + p = p.cast(utils.get_uint_type().pointer()) + return p.dereference() + else: + return cache['object_size'] + +def get_track(cache, object_pointer, alloc): + p = object_pointer + get_info_end(cache) + p += (alloc * track_type.sizeof) + return p + +def oo_objects(x): + return int(x['x']) & OO_MASK + +def oo_order(x): + return int(x['x']) >> OO_SHIFT + +def reciprocal_divide(a, R): + t = (a * int(R['m'])) >> 32 + return (t + ((a - t) >> int(R['sh1']))) >> int(R['sh2']) + +def __obj_to_index(cache, addr, obj): + return reciprocal_divide(int(mm.page_ops().ops.kasan_reset_tag(obj)) - addr, cache['reciprocal_size']) + +def swab64(x): + result = (((x & 0x00000000000000ff) << 56) | \ + ((x & 0x000000000000ff00) << 40) | \ + ((x & 0x0000000000ff0000) << 24) | \ + ((x & 0x00000000ff000000) << 8) | \ + ((x & 0x000000ff00000000) >> 8) | \ + ((x & 0x0000ff0000000000) >> 24) | \ + ((x & 0x00ff000000000000) >> 40) | \ + ((x & 0xff00000000000000) >> 56)) + return result + +def freelist_ptr_decode(cache, ptr, ptr_addr): + if constants.LX_CONFIG_SLAB_FREELIST_HARDENED: + return ptr['v'] ^ cache['random'] ^ swab64(int(ptr_addr)) + else: + return ptr['v'] + +def get_freepointer(cache, obj): + obj = mm.page_ops().ops.kasan_reset_tag(obj) + ptr_addr = obj + cache['offset'] + p = ptr_addr.cast(freeptr_t_ptr).dereference() + return freelist_ptr_decode(cache, p, ptr_addr) + +def loc_exist(loc_track, addr, handle, waste): + for loc in loc_track: + if loc['addr'] == addr and loc['handle'] == handle and loc['waste'] == waste: + return loc + return None + +def add_location(loc_track, cache, track, orig_size): + jiffies = gdb.parse_and_eval("jiffies_64") + age = jiffies - track['when'] + handle = 0 + waste = cache['object_size'] - int(orig_size) + pid = int(track['pid']) + cpuid = int(track['cpu']) + addr = track['addr'] + if constants.LX_CONFIG_STACKDEPOT: + handle = track['handle'] + + loc = loc_exist(loc_track, addr, handle, waste) + if loc: + loc['count'] += 1 + if track['when']: + loc['sum_time'] += age + loc['min_time'] = min(loc['min_time'], age) + loc['max_time'] = max(loc['max_time'], age) + loc['min_pid'] = min(loc['min_pid'], pid) + loc['max_pid'] = max(loc['max_pid'], pid) + loc['cpus'].add(cpuid) + else: + loc_track.append({ + 'count' : 1, + 'addr' : addr, + 'sum_time' : age, + 'min_time' : age, + 'max_time' : age, + 'min_pid' : pid, + 'max_pid' : pid, + 'handle' : handle, + 'waste' : waste, + 'cpus' : {cpuid} + } + ) + +def slabtrace(alloc, cache_name): + + def __fill_map(obj_map, cache, slab): + p = slab['freelist'] + addr = slab_address(slab) + while p != gdb.Value(0): + index = __obj_to_index(cache, addr, p) + obj_map[index] = True # free objects + p = get_freepointer(cache, p) + + # process every slab page on the slab_list (partial and full list) + def process_slab(loc_track, slab_list, alloc, cache): + for slab in lists.list_for_each_entry(slab_list, slab_ptr_type, "slab_list"): + obj_map[:] = [False] * oo_objects(cache['oo']) + __fill_map(obj_map, cache, slab) + addr = slab_address(slab) + for object_pointer in for_each_object(cache, addr, slab['objects']): + if obj_map[__obj_to_index(cache, addr, object_pointer)] == True: + continue + p = get_track(cache, object_pointer, alloc) + track = gdb.Value(p).cast(track_type.pointer()) + if alloc == track_alloc: + size = get_orig_size(cache, object_pointer) + else: + size = cache['object_size'] + add_location(loc_track, cache, track, size) + continue + + slab_caches = gdb.parse_and_eval("slab_caches") + if mm.page_ops().ops.MAX_NUMNODES > 1: + nr_node_ids = int(gdb.parse_and_eval("nr_node_ids")) + else: + nr_node_ids = 1 + + target_cache = None + loc_track = [] + + for cache in lists.list_for_each_entry(slab_caches, kmem_cache_ptr_type, 'list'): + if cache['name'].string() == cache_name: + target_cache = cache + break + + obj_map = [False] * oo_objects(target_cache['oo']) + + if target_cache['flags'] & SLAB_STORE_USER: + for i in range(0, nr_node_ids): + cache_node = target_cache['node'][i] + if cache_node['nr_slabs']['counter'] == 0: + continue + process_slab(loc_track, cache_node['partial'], alloc, target_cache) + process_slab(loc_track, cache_node['full'], alloc, target_cache) + else: + raise gdb.GdbError("SLAB_STORE_USER is not set in %s" % target_cache['name'].string()) + + for loc in sorted(loc_track, key=lambda x:x['count'], reverse=True): + if loc['addr']: + addr = loc['addr'].cast(utils.get_ulong_type().pointer()) + gdb.write("%d %s" % (loc['count'], str(addr).split(' ')[-1])) + else: + gdb.write("%d <not-available>" % loc['count']) + + if loc['waste']: + gdb.write(" waste=%d/%d" % (loc['count'] * loc['waste'], loc['waste'])) + + if loc['sum_time'] != loc['min_time']: + gdb.write(" age=%d/%d/%d" % (loc['min_time'], loc['sum_time']/loc['count'], loc['max_time'])) + else: + gdb.write(" age=%d" % loc['min_time']) + + if loc['min_pid'] != loc['max_pid']: + gdb.write(" pid=%d-%d" % (loc['min_pid'], loc['max_pid'])) + else: + gdb.write(" pid=%d" % loc['min_pid']) + + if constants.LX_NR_CPUS > 1: + nr_cpu = gdb.parse_and_eval('__num_online_cpus')['counter'] + if nr_cpu > 1: + gdb.write(" cpus=") + for i in loc['cpus']: + gdb.write("%d," % i) + gdb.write("\n") + if constants.LX_CONFIG_STACKDEPOT: + if loc['handle']: + stackdepot.stack_depot_print(loc['handle']) + gdb.write("\n") + +def help(): + t = """Usage: lx-slabtrace --cache_name [cache_name] [Options] + Options: + --alloc + print information of allocation trace of the allocated objects + --free + print information of freeing trace of the allocated objects + Example: + lx-slabtrace --cache_name kmalloc-1k --alloc + lx-slabtrace --cache_name kmalloc-1k --free\n""" + gdb.write("Unrecognized command\n") + raise gdb.GdbError(t) + +class LxSlabTrace(gdb.Command): + """Show specific cache slabtrace""" + + def __init__(self): + super(LxSlabTrace, self).__init__("lx-slabtrace", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + if not constants.LX_CONFIG_SLUB_DEBUG: + raise gdb.GdbError("CONFIG_SLUB_DEBUG is not enabled") + + argv = gdb.string_to_argv(arg) + alloc = track_alloc # default show alloc_traces + + if len(argv) == 3: + if argv[2] == '--alloc': + alloc = track_alloc + elif argv[2] == '--free': + alloc = track_free + else: + help() + if len(argv) >= 2 and argv[0] == '--cache_name': + slabtrace(alloc, argv[1]) + else: + help() +LxSlabTrace() + +def slabinfo(): + nr_node_ids = None + + if not constants.LX_CONFIG_SLUB_DEBUG: + raise gdb.GdbError("CONFIG_SLUB_DEBUG is not enabled") + + def count_free(slab): + total_free = 0 + for slab in lists.list_for_each_entry(slab, slab_ptr_type, 'slab_list'): + total_free += int(slab['objects'] - slab['inuse']) + return total_free + + gdb.write("{:^18} | {:^20} | {:^12} | {:^12} | {:^8} | {:^11} | {:^13}\n".format('Pointer', 'name', 'active_objs', 'num_objs', 'objsize', 'objperslab', 'pagesperslab')) + gdb.write("{:-^18} | {:-^20} | {:-^12} | {:-^12} | {:-^8} | {:-^11} | {:-^13}\n".format('', '', '', '', '', '', '')) + + slab_caches = gdb.parse_and_eval("slab_caches") + if mm.page_ops().ops.MAX_NUMNODES > 1: + nr_node_ids = int(gdb.parse_and_eval("nr_node_ids")) + else: + nr_node_ids = 1 + + for cache in lists.list_for_each_entry(slab_caches, kmem_cache_ptr_type, 'list'): + nr_objs = 0 + nr_free = 0 + nr_slabs = 0 + for i in range(0, nr_node_ids): + cache_node = cache['node'][i] + try: + nr_slabs += cache_node['nr_slabs']['counter'] + nr_objs = int(cache_node['total_objects']['counter']) + nr_free = count_free(cache_node['partial']) + except: + raise gdb.GdbError(traceback.format_exc()) + active_objs = nr_objs - nr_free + num_objs = nr_objs + active_slabs = nr_slabs + objects_per_slab = oo_objects(cache['oo']) + cache_order = oo_order(cache['oo']) + gdb.write("{:18s} | {:20.19s} | {:12} | {:12} | {:8} | {:11} | {:13}\n".format(hex(cache), cache['name'].string(), str(active_objs), str(num_objs), str(cache['size']), str(objects_per_slab), str(1 << cache_order))) + +class LxSlabInfo(gdb.Command): + """Show slabinfo""" + + def __init__(self): + super(LxSlabInfo, self).__init__("lx-slabinfo", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + slabinfo() +LxSlabInfo() diff --git a/scripts/gdb/linux/stackdepot.py b/scripts/gdb/linux/stackdepot.py new file mode 100644 index 000000000000..047d329a6a12 --- /dev/null +++ b/scripts/gdb/linux/stackdepot.py @@ -0,0 +1,55 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2023 MediaTek Inc. +# +# Authors: +# Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> +# + +import gdb +from linux import utils, constants + +if constants.LX_CONFIG_STACKDEPOT: + stack_record_type = utils.CachedType('struct stack_record') + DEPOT_STACK_ALIGN = 4 + +def stack_depot_fetch(handle): + global DEPOT_STACK_ALIGN + global stack_record_type + + stack_depot_disabled = gdb.parse_and_eval('stack_depot_disabled') + + if stack_depot_disabled: + raise gdb.GdbError("stack_depot_disabled\n") + + handle_parts_t = gdb.lookup_type("union handle_parts") + parts = handle.cast(handle_parts_t) + offset = parts['offset'] << DEPOT_STACK_ALIGN + pool_index_cached = gdb.parse_and_eval('pool_index') + + if parts['pool_index'] > pool_index_cached: + gdb.write("pool index %d out of bounds (%d) for stack id 0x%08x\n" % (parts['pool_index'], pool_index_cached, handle)) + return gdb.Value(0), 0 + + stack_pools = gdb.parse_and_eval('stack_pools') + + try: + pool = stack_pools[parts['pool_index']] + stack = (pool + gdb.Value(offset).cast(utils.get_size_t_type())).cast(stack_record_type.get_type().pointer()) + size = int(stack['size'].cast(utils.get_ulong_type())) + return stack['entries'], size + except Exception as e: + gdb.write("%s\n" % e) + return gdb.Value(0), 0 + +def stack_depot_print(handle): + if not constants.LX_CONFIG_STACKDEPOT: + raise gdb.GdbError("CONFIG_STACKDEPOT is not enabled") + + entries, nr_entries = stack_depot_fetch(handle) + if nr_entries > 0: + for i in range(0, nr_entries): + try: + gdb.execute("x /i 0x%x" % (int(entries[i]))) + except Exception as e: + gdb.write("%s\n" % e) diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py index fdad3f32c747..5179edd1b627 100644 --- a/scripts/gdb/linux/symbols.py +++ b/scripts/gdb/linux/symbols.py @@ -89,29 +89,34 @@ lx-symbols command.""" return name return None - def _section_arguments(self, module): + def _section_arguments(self, module, module_addr): try: sect_attrs = module['sect_attrs'].dereference() except gdb.error: - return "" + return str(module_addr) + attrs = sect_attrs['attrs'] section_name_to_address = { attrs[n]['battr']['attr']['name'].string(): attrs[n]['address'] for n in range(int(sect_attrs['nsections']))} + + textaddr = section_name_to_address.get(".text", module_addr) args = [] for section_name in [".data", ".data..read_mostly", ".rodata", ".bss", - ".text", ".text.hot", ".text.unlikely"]: + ".text.hot", ".text.unlikely"]: address = section_name_to_address.get(section_name) if address: args.append(" -s {name} {addr}".format( name=section_name, addr=str(address))) - return "".join(args) + return "{textaddr} {sections}".format( + textaddr=textaddr, sections="".join(args)) - def load_module_symbols(self, module): + def load_module_symbols(self, module, module_file=None): module_name = module['name'].string() module_addr = str(module['mem'][constants.LX_MOD_TEXT]['base']).split()[0] - module_file = self._get_module_file(module_name) + if not module_file: + module_file = self._get_module_file(module_name) if not module_file and not self.module_files_updated: self._update_module_files() module_file = self._get_module_file(module_name) @@ -125,16 +130,28 @@ lx-symbols command.""" module_addr = hex(int(module_addr, 0) + plt_offset + plt_size) gdb.write("loading @{addr}: {filename}\n".format( addr=module_addr, filename=module_file)) - cmdline = "add-symbol-file {filename} {addr}{sections}".format( + cmdline = "add-symbol-file {filename} {sections}".format( filename=module_file, - addr=module_addr, - sections=self._section_arguments(module)) + sections=self._section_arguments(module, module_addr)) gdb.execute(cmdline, to_string=True) if module_name not in self.loaded_modules: self.loaded_modules.append(module_name) else: gdb.write("no module object found for '{0}'\n".format(module_name)) + def load_ko_symbols(self, mod_path): + self.loaded_modules = [] + module_list = modules.module_list() + + for module in module_list: + module_name = module['name'].string() + module_pattern = ".*/{0}\.ko(?:.debug)?$".format( + module_name.replace("_", r"[_\-]")) + if re.match(module_pattern, mod_path) and os.path.exists(mod_path): + self.load_module_symbols(module, mod_path) + return + raise gdb.GdbError("%s is not a valid .ko\n" % mod_path) + def load_all_symbols(self): gdb.write("loading vmlinux\n") @@ -173,6 +190,11 @@ lx-symbols command.""" self.module_files = [] self.module_files_updated = False + argv = gdb.string_to_argv(arg) + if len(argv) == 1: + self.load_ko_symbols(argv[0]) + return + self.load_all_symbols() if hasattr(gdb, 'Breakpoint'): diff --git a/scripts/gdb/linux/utils.py b/scripts/gdb/linux/utils.py index 9f44df13761e..7d5278d815fa 100644 --- a/scripts/gdb/linux/utils.py +++ b/scripts/gdb/linux/utils.py @@ -35,12 +35,32 @@ class CachedType: long_type = CachedType("long") +ulong_type = CachedType("unsigned long") +uint_type = CachedType("unsigned int") atomic_long_type = CachedType("atomic_long_t") +size_t_type = CachedType("size_t") +struct_page_type = CachedType("struct page") + +def get_uint_type(): + global uint_type + return uint_type.get_type() + +def get_page_type(): + global struct_page_type + return struct_page_type.get_type() def get_long_type(): global long_type return long_type.get_type() +def get_ulong_type(): + global ulong_type + return ulong_type.get_type() + +def get_size_t_type(): + global size_t_type + return size_t_type.get_type() + def offset_of(typeobj, field): element = gdb.Value(0).cast(typeobj) return int(str(element[field].address).split()[0], 16) diff --git a/scripts/gdb/linux/vmalloc.py b/scripts/gdb/linux/vmalloc.py new file mode 100644 index 000000000000..48e4a4fae7bb --- /dev/null +++ b/scripts/gdb/linux/vmalloc.py @@ -0,0 +1,56 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (c) 2023 MediaTek Inc. +# +# Authors: +# Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> +# + +import gdb +import re +from linux import lists, utils, stackdepot, constants, mm + +vmap_area_type = utils.CachedType('struct vmap_area') +vmap_area_ptr_type = vmap_area_type.get_type().pointer() + +def is_vmalloc_addr(x): + pg_ops = mm.page_ops().ops + addr = pg_ops.kasan_reset_tag(x) + return addr >= pg_ops.VMALLOC_START and addr < pg_ops.VMALLOC_END + +class LxVmallocInfo(gdb.Command): + """Show vmallocinfo""" + + def __init__(self): + super(LxVmallocInfo, self).__init__("lx-vmallocinfo", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + vmap_area_list = gdb.parse_and_eval('vmap_area_list') + for vmap_area in lists.list_for_each_entry(vmap_area_list, vmap_area_ptr_type, "list"): + if not vmap_area['vm']: + gdb.write("0x%x-0x%x %10d vm_map_ram\n" % (vmap_area['va_start'], vmap_area['va_end'], + vmap_area['va_end'] - vmap_area['va_start'])) + continue + v = vmap_area['vm'] + gdb.write("0x%x-0x%x %10d" % (v['addr'], v['addr'] + v['size'], v['size'])) + if v['caller']: + gdb.write(" %s" % str(v['caller']).split(' ')[-1]) + if v['nr_pages']: + gdb.write(" pages=%d" % v['nr_pages']) + if v['phys_addr']: + gdb.write(" phys=0x%x" % v['phys_addr']) + if v['flags'] & constants.LX_VM_IOREMAP: + gdb.write(" ioremap") + if v['flags'] & constants.LX_VM_ALLOC: + gdb.write(" vmalloc") + if v['flags'] & constants.LX_VM_MAP: + gdb.write(" vmap") + if v['flags'] & constants.LX_VM_USERMAP: + gdb.write(" user") + if v['flags'] & constants.LX_VM_DMA_COHERENT: + gdb.write(" dma-coherent") + if is_vmalloc_addr(v['pages']): + gdb.write(" vpages") + gdb.write("\n") + +LxVmallocInfo() diff --git a/scripts/gdb/vmlinux-gdb.py b/scripts/gdb/vmlinux-gdb.py index 2d32308c3f7a..fc53cdf286f1 100644 --- a/scripts/gdb/vmlinux-gdb.py +++ b/scripts/gdb/vmlinux-gdb.py @@ -41,6 +41,11 @@ else: import linux.genpd import linux.device import linux.vfs - import linux.mm + import linux.pgtable import linux.radixtree import linux.interrupts + import linux.mm + import linux.stackdepot + import linux.page_owner + import linux.slab + import linux.vmalloc diff --git a/scripts/headers_install.sh b/scripts/headers_install.sh index 36b56b746fce..afdddc82f02b 100755 --- a/scripts/headers_install.sh +++ b/scripts/headers_install.sh @@ -76,7 +76,6 @@ arch/arc/include/uapi/asm/swab.h:CONFIG_ARC_HAS_SWAPE arch/arm/include/uapi/asm/ptrace.h:CONFIG_CPU_ENDIAN_BE8 arch/hexagon/include/uapi/asm/ptrace.h:CONFIG_HEXAGON_ARCH_VERSION arch/hexagon/include/uapi/asm/user.h:CONFIG_HEXAGON_ARCH_VERSION -arch/ia64/include/uapi/asm/cmpxchg.h:CONFIG_IA64_DEBUG_CMPXCHG arch/m68k/include/uapi/asm/ptrace.h:CONFIG_COLDFIRE arch/nios2/include/uapi/asm/swab.h:CONFIG_NIOS2_CI_SWAB_NO arch/nios2/include/uapi/asm/swab.h:CONFIG_NIOS2_CI_SWAB_SUPPORT diff --git a/tools/testing/selftests/proc/proc-empty-vm.c b/tools/testing/selftests/proc/proc-empty-vm.c index e1052443d070..b16c13688b88 100644 --- a/tools/testing/selftests/proc/proc-empty-vm.c +++ b/tools/testing/selftests/proc/proc-empty-vm.c @@ -1,3 +1,4 @@ +#if defined __amd64__ || defined __i386__ /* * Copyright (c) 2022 Alexey Dobriyan <adobriyan@gmail.com> * @@ -37,6 +38,10 @@ #include <sys/wait.h> #include <unistd.h> +#ifdef __amd64__ +#define TEST_VSYSCALL +#endif + /* * 0: vsyscall VMA doesn't exist vsyscall=none * 1: vsyscall VMA is --xp vsyscall=xonly @@ -119,6 +124,7 @@ static void sigaction_SIGSEGV(int _, siginfo_t *__, void *___) _exit(EXIT_FAILURE); } +#ifdef TEST_VSYSCALL static void sigaction_SIGSEGV_vsyscall(int _, siginfo_t *__, void *___) { _exit(g_vsyscall); @@ -170,6 +176,7 @@ static void vsyscall(void) exit(1); } } +#endif static int test_proc_pid_maps(pid_t pid) { @@ -299,7 +306,9 @@ int main(void) { int rv = EXIT_SUCCESS; +#ifdef TEST_VSYSCALL vsyscall(); +#endif switch (g_vsyscall) { case 0: @@ -346,6 +355,14 @@ int main(void) #ifdef __amd64__ munmap(NULL, ((size_t)1 << 47) - 4096); +#elif defined __i386__ + { + size_t len; + + for (len = -4096;; len -= 4096) { + munmap(NULL, len); + } + } #else #error "implement 'unmap everything'" #endif @@ -386,3 +403,9 @@ int main(void) return rv; } +#else +int main(void) +{ + return 4; +} +#endif diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index 6327c9c400e0..507555714b1d 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -41,7 +41,6 @@ CONFIG_KALLSYMS=y CONFIG_BUG=y CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y CONFIG_JUMP_LABEL=y -CONFIG_EMBEDDED=n CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_SHMEM=y |