Merge tag 'wq-for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo: "Two doc update patches and the following three fixes: - On single node systems, the default pool is used but the node_nr_active for the default pool was set to min_active. This effectively limited the max concurrency of unbound pools on single node systems to 8 causing performance regressions on some workloads. Fixed by setting the default pool's node_nr_active to max_active. - wq_update_node_max_active() could trigger divide-by-zero if the intersection between the allowed CPUs for an unbound workqueue and online CPUs becomes empty. - When kick_pool() was trying to repatriate a worker to a CPU in its pod by setting task->wake_cpu, it didn't consider whether the CPU being selected is online or not which obviously can lead to subobtimal behaviors. On s390, this triggered a crash in arch code. The workqueue patch removes the gross misbehavior but doesn't fix the crash completely as there's a race window in which CPUs can go down after wake_cpu is set. Need to decide whether the fix should be on the core or arch side" * tag 'wq-for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Fix divide error in wq_update_node_max_active() workqueue: The default node_nr_active should have its max set to max_active workqueue: Fix selection of wake_cpu in kick_pool() docs/zh_CN: core-api: Update translation of workqueue.rst to 6.9-rc1 Documentation/core-api: Update events_freezable_power references.
author: Linus Torvalds <torvalds@linux-foundation.org> 2024-04-29 15:57:37 -0700
committer: Linus Torvalds <torvalds@linux-foundation.org> 2024-04-29 15:57:37 -0700
commit: 98369dccd2f8e16bf4c6621053af7aa4821dcf8e (patch)
tree: 04f8e6b2af1bfb06d5696898b2b8a8c7b3929433 /kernel
parent: d03d4188908883e1705987795a09aeed31424f66 (diff)
parent: 91f098704c25106d88706fc9f8bcfce01fdb97df (diff)
1 files changed, 16 insertions, 3 deletions
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0066c8f6c154..d2dbe099286b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1277,8 +1277,12 @@ static bool kick_pool(struct worker_pool *pool)
 	    !cpumask_test_cpu(p->wake_cpu, pool->attrs->__pod_cpumask)) {
 		struct work_struct *work = list_first_entry(&pool->worklist,
 						struct work_struct, entry);
-		p->wake_cpu = cpumask_any_distribute(pool->attrs->__pod_cpumask);
-		get_work_pwq(work)->stats[PWQ_STAT_REPATRIATED]++;
+		int wake_cpu = cpumask_any_and_distribute(pool->attrs->__pod_cpumask,
+							  cpu_online_mask);
+		if (wake_cpu < nr_cpu_ids) {
+			p->wake_cpu = wake_cpu;
+			get_work_pwq(work)->stats[PWQ_STAT_REPATRIATED]++;
+		}
 	}
 #endif
 	wake_up_process(p);
@@ -1594,6 +1598,15 @@ static void wq_update_node_max_active(struct workqueue_struct *wq, int off_cpu)
 	if (off_cpu >= 0)
 		total_cpus--;
 
+	/* If all CPUs of the wq get offline, use the default values */
+	if (unlikely(!total_cpus)) {
+		for_each_node(node)
+			wq_node_nr_active(wq, node)->max = min_active;
+
+		wq_node_nr_active(wq, NUMA_NO_NODE)->max = max_active;
+		return;
+	}
+
 	for_each_node(node) {
 		int node_cpus;
 
@@ -1606,7 +1619,7 @@ static void wq_update_node_max_active(struct workqueue_struct *wq, int off_cpu)
 			      min_active, max_active);
 	}
 
-	wq_node_nr_active(wq, NUMA_NO_NODE)->max = min_active;
+	wq_node_nr_active(wq, NUMA_NO_NODE)->max = max_active;
 }
 
 /**
author	Linus Torvalds <torvalds@linux-foundation.org>	2024-04-29 15:57:37 -0700
committer	Linus Torvalds <torvalds@linux-foundation.org>	2024-04-29 15:57:37 -0700
commit	98369dccd2f8e16bf4c6621053af7aa4821dcf8e (patch)
tree	04f8e6b2af1bfb06d5696898b2b8a8c7b3929433 /kernel
parent	d03d4188908883e1705987795a09aeed31424f66 (diff)
parent	91f098704c25106d88706fc9f8bcfce01fdb97df (diff)