summaryrefslogtreecommitdiff
path: root/include/linux
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2023-06-26 13:59:56 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2023-06-26 13:59:56 -0700
commit9244724fbf8ab394a7210e8e93bf037abc859514 (patch)
tree95c2b9caf65ac531b6649247e99dc554e3bca96c /include/linux
parent7cffdbe3607a6cc2dc02d135e13732ec36bc4e28 (diff)
parentbf5a8c26ad7caf0772a1cd48c8a0924e48bdbaf0 (diff)
Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP updates from Thomas Gleixner: "A large update for SMP management: - Parallel CPU bringup The reason why people are interested in parallel bringup is to shorten the (kexec) reboot time of cloud servers to reduce the downtime of the VM tenants. The current fully serialized bringup does the following per AP: 1) Prepare callbacks (allocate, intialize, create threads) 2) Kick the AP alive (e.g. INIT/SIPI on x86) 3) Wait for the AP to report alive state 4) Let the AP continue through the atomic bringup 5) Let the AP run the threaded bringup to full online state There are two significant delays: #3 The time for an AP to report alive state in start_secondary() on x86 has been measured in the range between 350us and 3.5ms depending on vendor and CPU type, BIOS microcode size etc. #4 The atomic bringup does the microcode update. This has been measured to take up to ~8ms on the primary threads depending on the microcode patch size to apply. On a two socket SKL server with 56 cores (112 threads) the boot CPU spends on current mainline about 800ms busy waiting for the APs to come up and apply microcode. That's more than 80% of the actual onlining procedure. This can be reduced significantly by splitting the bringup mechanism into two parts: 1) Run the prepare callbacks and kick the AP alive for each AP which needs to be brought up. The APs wake up, do their firmware initialization and run the low level kernel startup code including microcode loading in parallel up to the first synchronization point. (#1 and #2 above) 2) Run the rest of the bringup code strictly serialized per CPU (#3 - #5 above) as it's done today. Parallelizing that stage of the CPU bringup might be possible in theory, but it's questionable whether required surgery would be justified for a pretty small gain. If the system is large enough the first AP is already waiting at the first synchronization point when the boot CPU finished the wake-up of the last AP. That reduces the AP bringup time on that SKL from ~800ms to ~80ms, i.e. by a factor ~10x. The actual gain varies wildly depending on the system, CPU, microcode patch size and other factors. There are some opportunities to reduce the overhead further, but that needs some deep surgery in the x86 CPU bringup code. For now this is only enabled on x86, but the core functionality obviously works for all SMP capable architectures. - Enhancements for SMP function call tracing so it is possible to locate the scheduling and the actual execution points. That allows to measure IPI delivery time precisely" * tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) trace,smp: Add tracepoints for scheduling remotelly called functions trace,smp: Add tracepoints around remotelly called functions MAINTAINERS: Add CPU HOTPLUG entry x86/smpboot: Fix the parallel bringup decision x86/realmode: Make stack lock work in trampoline_compat() x86/smp: Initialize cpu_primary_thread_mask late cpu/hotplug: Fix off by one in cpuhp_bringup_mask() x86/apic: Fix use of X{,2}APIC_ENABLE in asm with older binutils x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it x86/smpboot: Support parallel startup of secondary CPUs x86/smpboot: Implement a bit spinlock to protect the realmode stack x86/apic: Save the APIC virtual base address cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE x86/apic: Provide cpu_primary_thread mask x86/smpboot: Enable split CPU startup cpu/hotplug: Provide a split up CPUHP_BRINGUP mechanism cpu/hotplug: Reset task stack state in _cpu_up() cpu/hotplug: Remove unused state functions riscv: Switch to hotplug core state synchronization parisc: Switch to hotplug core state synchronization ...
Diffstat (limited to 'include/linux')
-rw-r--r--include/linux/cpu.h4
-rw-r--r--include/linux/cpuhotplug.h17
2 files changed, 17 insertions, 4 deletions
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4893c4ac026d..6e6e57ec69e8 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -190,8 +190,6 @@ void arch_cpu_finalize_init(void);
static inline void arch_cpu_finalize_init(void) { }
#endif
-int cpu_report_state(int cpu);
-int cpu_check_up_prepare(int cpu);
void cpu_set_state_online(int cpu);
void play_idle_precise(u64 duration_ns, u64 latency_ns);
@@ -201,8 +199,6 @@ static inline void play_idle(unsigned long duration_us)
}
#ifdef CONFIG_HOTPLUG_CPU
-bool cpu_wait_death(unsigned int cpu, int seconds);
-bool cpu_report_death(void);
void cpuhp_report_idle_dead(void);
#else
static inline void cpuhp_report_idle_dead(void) { }
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 3ceb9dfa0993..25b6e6e6ba6b 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -133,6 +133,7 @@ enum cpuhp_state {
CPUHP_MIPS_SOC_PREPARE,
CPUHP_BP_PREPARE_DYN,
CPUHP_BP_PREPARE_DYN_END = CPUHP_BP_PREPARE_DYN + 20,
+ CPUHP_BP_KICK_AP,
CPUHP_BRINGUP_CPU,
/*
@@ -518,4 +519,20 @@ void cpuhp_online_idle(enum cpuhp_state state);
static inline void cpuhp_online_idle(enum cpuhp_state state) { }
#endif
+struct task_struct;
+
+void cpuhp_ap_sync_alive(void);
+void arch_cpuhp_sync_state_poll(void);
+void arch_cpuhp_cleanup_kick_cpu(unsigned int cpu);
+int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *tidle);
+bool arch_cpuhp_init_parallel_bringup(void);
+
+#ifdef CONFIG_HOTPLUG_CORE_SYNC_DEAD
+void cpuhp_ap_report_dead(void);
+void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu);
+#else
+static inline void cpuhp_ap_report_dead(void) { }
+static inline void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu) { }
+#endif
+
#endif