summaryrefslogtreecommitdiff
path: root/drivers/thermal
AgeCommit message (Collapse)Author
2024-06-17Merge tag 'thermal-v6.10-rc4' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux Merge thermal driver fixes for 6.10-rc5 from Daniel Lezcano: "- Remove the filtered mode for mt8188 as it is not supported on this platform (Julien Panis) - Fail in case the golden temperature is zero as that means the efuse data is not correctly set (Julien Panis)" * tag 'thermal-v6.10-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
2024-06-14thermal: core: Change PM notifier priority to the minimumRafael J. Wysocki
It is reported that commit 5a5efdaffda5 ("thermal: core: Resume thermal zones asynchronously") causes battery data in sysfs on Thinkpad P1 Gen2 to become invalid after a resume from S3 (and it is necessary to reboot the machine to restore correct battery data). Some investigation into the problem indicated that it happened because, after the commit in question, the ACPI battery PM notifier ran in parallel with thermal_zone_device_resume() for one of the thermal zones which apparently confused the platform firmware on the affected system. While the exact reason for the firmware confusion remains unclear, it is arguably not particularly relevant, and the expected behavior of the affected system can be restored by making the thermal PM notifier run at the lowest priority which avoids interference between work items spawned by it and the other PM notifiers (that will run before those work items now). Fixes: 5a5efdaffda5 ("thermal: core: Resume thermal zones asynchronously") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218881 Reported-by: fhortner@yahoo.de Tested-by: fhortner@yahoo.de Cc: 6.8+ <stable@vger.kernel.org> # 6.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-14thermal: core: Synchronize suspend-prepare and post-suspend actionsRafael J. Wysocki
After commit 5a5efdaffda5 ("thermal: core: Resume thermal zones asynchronously") it is theoretically possible that, if a system suspend starts immediately after a system resume, thermal_zone_device_resume() spawned by the thermal PM notifier for one of the thermal zones at the end of the system resume will run after the PM thermal notifier for the suspend-prepare action. If that happens, tz->suspended set by the latter will be reset by the former which may lead to unexpected consequences. To avoid that race, synchronize thermal_zone_device_resume() with the suspend-prepare thermal PM notifier with the help of additional bool field and completion in struct thermal_zone_device. Note that this also ensures running __thermal_zone_device_update() at least once for each thermal zone between system resume and the following system suspend in case it is needed to start thermal mitigation. Fixes: 5a5efdaffda5 ("thermal: core: Resume thermal zones asynchronously") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-12thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse ↵Julien Panis
data This patch prevents from registering thermal entries and letting the driver misbehave if efuse data is invalid. A device is not properly calibrated if the golden temperature is zero. Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver") Signed-off-by: Julien Panis <jpanis@baylibre.com> Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20240604-mtk-thermal-calib-check-v2-1-8f258254051d@baylibre.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal: core: Avoid calling .trip_crossed() for critical and hot tripsRafael J. Wysocki
Invoking the governor .trip_crossed() callback for critical and hot trips is pointless because they are handled directly by the core, so make thermal_governor_trip_crossed() avoid doing that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11thermal: gov_bang_bang: Drop unnecessary cooling device target state checksRafael J. Wysocki
Some cooling device target state checks in bang_bang_control() done before setting the new target state are not necessary after recent changes, so drop them. Also avoid updating the target state before checking it for unexpected values. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11thermal: trip: Use READ_ONCE() for lockless access to trip propertiesRafael J. Wysocki
When accessing trip temperature and hysteresis without locking, it is better to use READ_ONCE() to prevent compiler optimizations possibly affecting the read from being applied. Of course, for the READ_ONCE() to be effective, WRITE_ONCE() needs to be used when updating their values. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11thermal: trip: Make thermal_zone_set_trips() use trip thresholdsRafael J. Wysocki
Modify thermal_zone_set_trips() to use trip thresholds instead of computing the low temperature for each trip to avoid deriving both the high and low temperature levels from the same trip (which may happen if the zone temperature falls into the hysteresis range of one trip). Accordingly, make __thermal_zone_device_update() call thermal_zone_set_trips() later, when threshold values have been updated for all trips. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips()Rafael J. Wysocki
Drop the pointless double underline prefix from the function name as per the subject. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal: trip: Use common set of trip type namesRafael J. Wysocki
Use the same set of trip type names in sysfs and in the thermal debug code output. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Move some statements from under thermal_dbg->lockRafael J. Wysocki
The tz_dbg local variable assignments in thermal_debug_tz_trip_up(), thermal_debug_tz_trip_down(), and thermal_debug_update_trip_stats() need not be carried out under thermal_dbg->lock, so move them from under that lock (to avoid possible future confusion). While at it, reorder local variable definitions in thermal_debug_tz_trip_up() for more clarity. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Compute maximum temperature for mitigation episode as a wholeRafael J. Wysocki
Notice that the maximum temperature above the trip point must be the same for all of the trip points involved in a given mitigation episode, so it need not be computerd for each of them separately. It is sufficient to compute the maximum temperature for the mitigation episode as a whole and print it accordingly, so do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Adjust check for trips without statistics in tze_seq_show()Rafael J. Wysocki
Initialize the trip_temp field in struct trip_stats to THERMAL_TEMP_INVALID and adjust the check for trips without statistics in tze_seq_show() to look at that field instead of comparing min and max. This will mostly be useful to simplify subsequent changes. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Fix up units in "mitigations" filesRafael J. Wysocki
Print temperature units as m°C rather than °mC (the meaning of which is unclear) and add time unit to the duration column. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Print mitigation timestamp value in millisecondsRafael J. Wysocki
Because mitigation episode duration is printed in milliseconds, there is no reason to print timestamp information for mitigation episodes in smaller units which also makes it somewhat harder to interpret the numbers. Print it in milliseconds for consistency. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Do not extend mitigation episodes beyond system resumeRafael J. Wysocki
Because thermal zone handling by the thermal core is started from scratch during resume from system-wide suspend, prevent the debug code from extending mitigation episodes beyond that point by ending the mitigation episode currently in progress, if any, for each thermal zone. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal/debugfs: Use helper to update trip point overstepping durationRafael J. Wysocki
Add a helper for updating trip point overstepping duration to be called from thermal_debug_tz_trip_down(). Subsequently, it will also be used during resume from system-wide suspend. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11thermal: gov_step_wise: Restore passive polling managementRafael J. Wysocki
Consider a thermal zone with one passive trip point, a cooling device with 3 states (0, 1, 2) bound to it, passive polling enabled (nonzero passive_delay_jiffies) and no regular polling (polling_delay_jiffies equal to 0) that is managed by the Step-Wise governor. Suppose that the initial state of the cooling device is 0 and the zone temperature is below the trip point to start with. When the trip point is crossed, tz->passive is incremented by the thermal core and the governor's .manage() callback is invoked. It sets 'throttle' to 'true' for the trip in question and get_target_state() returns 1 for the instance corresponding to the cooling device (say that 'upper' and 'lower' are set to 2 and 0 for it, respectively), so its state changes to 1. Passive polling is still active for the zone, so next time the temperature is updated, the governor's .manage() callback will be invoked again. If the temperature is still rising, it will change the state of the cooling device to 2. Now suppose that next time the zone temperature is updated, it falls below the trip point, so tz->passive is decremented for the zone (say it becomes 0 then) and the governor's .manage() callbacks runs. It finds that the temperature trend for the zone is 'falling' and 'throttle' will be set to 'false' for the trip in question, so the cooling device's state will be changed to 1. However, because tz->polling is 0 for the zone, the governor's .manage() callback may not be invoked again for a long time and the cooling device's state will not be reset back to 0. This can happen because commit 042a3d80f118 ("thermal: core: Move passive polling management to the core") removed passive polling management from the Step-Wise governor. Before that change, thermal_zone_trip_update() would bump up tz->passive when changing the target state for a thermal instance from "no target" to a specific value and it would drop tz->passive when changing it back to "no target" which would cause passive polling to be active for the zone until the governor has reset the states of all cooling devices. In particular, in the example above tz->passive would be incremented when changing the state of the cooling device from 0 to 1 and then it would be still nonzero when the state of the cooling device was changed from 2 to 1. To prevent this problem from occurring, restore the passive polling management in the Step-Wise governor by partially reverting the commit in question and update the comment in the restored code to explain its role more clearly. Fixes: 042a3d80f118 ("thermal: core: Move passive polling management to the core") Closes: https://lore.kernel.org/linux-pm/ZmVfcEOxmjUHZTSX@hovoldconsulting.com Reported-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07thermal: intel: intel_pch: Improve cooling logZhang Rui
The intel_pch_thermal cooling mechanism currently only provides one of the following final conclusions: 1. intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [48C] 2. intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 30700 ms delay 3. intel_pch_thermal 0000:00:12.0: CPU-PCH is hot [60C] after 60000 ms delay. S0ix might fail 4. intel_pch_thermal 0000:00:12.0: Wakeup event detected, abort cooling This does not provide sufficient context about what is happening, especially for case 4. Add one line log to indicate when PCH overheats and the cooling delay has started. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07thermal: int3403: remove unused struct 'int3403_performance_state'Dr. David Alan Gilbert
'int3403_performance_state' has never been used since the original commit 4384b8fe162d ("Thermal: introduce int3403 thermal driver"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07thermal: int3400: Use sizeof(*pointer) instead of sizeof(type)Erick Archer
It is preferred to use sizeof(*pointer) instead of sizeof(type) due to the type of the variable can change and one needs not change the former (unlike the latter). This patch has no effect on runtime behavior. Signed-off-by: Erick Archer <erick.archer@outlook.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07thermal: intel: intel_soc_dts_thermal: Switch to new Intel CPU model definesTony Luck
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07thermal: intel: intel_tcc_cooling: Switch to new Intel CPU model definesTony Luck
New CPU #defines encode vendor and family as well as model. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07thermal: core: Do not fail cdev registration because of invalid initial stateRafael J. Wysocki
It is reported that commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") causes the ACPI fan driver to fail probing on some systems which turns out to be due to the _FST control method returning an invalid value until _FSL is first evaluated for the given fan. If this happens, the .get_cur_state() cooling device callback returns an error and __thermal_cooling_device_register() fails as uses that callback after commit 31a0fa0019b0. Arguably, _FST should not return an invalid value even if it is evaluated before _FSL, so this may be regarded as a platform firmware issue, but at the same time it is not a good enough reason for failing the cooling device registration where the initial cooling device state is only needed to initialize a thermal debug facility. Accordingly, modify __thermal_cooling_device_register() to avoid calling thermal_debug_cdev_add() instead of returning an error if the initial .get_cur_state() callback invocation fails. Fixes: 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@collabora.com Reported-by: Laura Nao <laura.nao@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Laura Nao <laura.nao@collabora.com>
2024-06-04thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188Julien Panis
Filtered mode is not supported on mt8188 SoC and is the source of bad results. Move to immediate mode which provides good temperatures. Fixes: f4745f546e60 ("thermal/drivers/mediatek/lvts_thermal: Add MT8188 support") Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Julien Panis <jpanis@baylibre.com> Link: https://lore.kernel.org/r/20240516-mtk-thermal-mt8188-mode-fix-v2-1-40a317442c62@baylibre.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-05-27thermal: trip: Trigger trip down notifications when trips involved in ↵Rafael J. Wysocki
mitigation become invalid When a trip point becomes invalid after being crossed on the way up, it is involved in a mitigation episode that needs to be adjusted to compensate for the trip going away. For this reason, introduce thermal_zone_trip_down() as a wrapper around thermal_trip_crossed() and make thermal_zone_set_trip_temp() call it if the new temperature of the trip at hand is equal to THERMAL_TEMP_INVALID and it has been crossed on the way up to trigger all of the necessary adjustments in user space, the thermal debug code and the zone governor. Fixes: 8c69a777e480 ("thermal: core: Fix the handling of invalid trip points") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27thermal: core: Introduce thermal_trip_crossed()Rafael J. Wysocki
Add a helper function called thermal_trip_crossed() to be invoked by __thermal_zone_device_update() in order to notify user space, the thermal debug code and the zone governor about trip crossing. Subsequently, this will also be used in the case when a trip point becomes invalid after being crossed on the way up. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27thermal/debugfs: Allow tze_seq_show() to print statistics for invalid tripsRafael J. Wysocki
Commit a6258fde8de3 ("thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats") modified tze_seq_show() to skip invalid trips, but it overlooked the fact that a trip may become invalid during a mitigation eposide involving it, in which case its statistics should still be reported. For this reason, remove the invalid trip temperature check from the main loop in tze_seq_show(). The trips that have never been valid will still be skipped after this change because there are no statistics to report for them. Fixes: a6258fde8de3 ("thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27thermal/debugfs: Print initial trip temperature and hysteresis in tze_seq_show()Rafael J. Wysocki
The temperature and hysteresis of a trip point may change during a mitigation episode it is involved in (it may even become invalid altogether), so in order to avoid possible confusion related to that, store the temperature and hysteresis of trip points at the time they are crossed on the way up and print those values instead of their current temperature and hysteresis. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-22tracing/treewide: Remove second parameter of __assign_str()Steven Rostedt (Google)
With the rework of how the __string() handles dynamic strings where it saves off the source string in field in the helper structure[1], the assignment of that value to the trace event field is stored in the helper value and does not need to be passed in again. This means that with: __string(field, mystring) Which use to be assigned with __assign_str(field, mystring), no longer needs the second parameter and it is unused. With this, __assign_str() will now only get a single parameter. There's over 700 users of __assign_str() and because coccinelle does not handle the TRACE_EVENT() macro I ended up using the following sed script: git grep -l __assign_str | while read a ; do sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file; mv /tmp/test-file $a; done I then searched for __assign_str() that did not end with ';' as those were multi line assignments that the sed script above would fail to catch. Note, the same updates will need to be done for: __assign_str_len() __assign_rel_str() __assign_rel_str_len() I tested this with both an allmodconfig and an allyesconfig (build only for both). [1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/ Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts. Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-22Merge tag 'driver-core-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the small set of driver core and kernfs changes for 6.10-rc1. Nothing major here at all, just a small set of changes for some driver core apis, and minor fixups. Included in here are: - sysfs_bin_attr_simple_read() helper added and used - device_show_string() helper added and used All usages of these were acked by the various maintainers. Also in here are: - kernfs minor cleanup - removed unused functions - typo fix in documentation - pay attention to sysfs_create_link() failures in module.c finally All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: device property: Fix a typo in the description of device_get_child_node_count() kernfs: mount: Remove unnecessary ‘NULL’ values from knparent scsi: Use device_show_string() helper for sysfs attributes platform/x86: Use device_show_string() helper for sysfs attributes perf: Use device_show_string() helper for sysfs attributes IB/qib: Use device_show_string() helper for sysfs attributes hwmon: Use device_show_string() helper for sysfs attributes driver core: Add device_show_string() helper for sysfs attributes treewide: Use sysfs_bin_attr_simple_read() helper sysfs: Add sysfs_bin_attr_simple_read() helper module: don't ignore sysfs_create_link() failures driver core: Remove unused platform_notify, platform_notify_remove
2024-05-21Merge tag 'thermal-6.10-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "These fix the MediaTek lvts_thermal driver and the handling of trip points that start as invalid and are adjusted later by user space via sysfs. Specifics: - Fix and clean up the MediaTek lvts_thermal driver (Julien Panis) - Prevent invalid trip point handling from triggering spurious trip point crossing events and allow passive polling to stop when a passive trip point involved in it becomes invalid (Rafael Wysocki)" * tag 'thermal-6.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: core: Fix the handling of invalid trip points thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl index thermal/drivers/mediatek/lvts_thermal: Remove unused members from struct lvts_ctrl_data thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_data
2024-05-17thermal: core: Fix the handling of invalid trip pointsRafael J. Wysocki
Commit 9ad18043fb35 ("thermal: core: Send trip crossing notifications at init time if needed") overlooked the case when a trip point that has started as invalid is set to a valid temperature later. Namely, the initial threshold value for all trips is zero, so if a previously invalid trip becomes valid and its (new) low temperature is above the zone temperature, a spurious trip crossing notification will occur and it may trigger the WARN_ON() in handle_thermal_trip(). To address this, set the initial threshold for all trips to INT_MAX. There is also the case when a valid writable trip becomes invalid that requires special handling. First, in accordance with the change mentioned above, the trip's threshold needs to be set to INT_MAX to avoid the same issue. Second, if the trip in question is passive and it has been crossed by the thermal zone temperature on the way up, the zone's passive count has been incremented and it is in the passive polling mode, so its passive count needs to be adjusted to allow the passive polling to be turned off eventually. Fixes: 9ad18043fb35 ("thermal: core: Send trip crossing notifications at init time if needed") Fixes: 042a3d80f118 ("thermal: core: Move passive polling management to the core") Reported-by: Zhang Rui <zhang.rui@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Wendy Wang <wendy.wang@intel.com>
2024-05-15Merge tag 'thermal-v6.10-rc1-2' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux Merge thermal driver fixes for 6.10-rc1 from Daniel Lezcano: "- Check for a NULL pointer before using it in the probe routine of the Mediatek LVTS driver (Julien Panis) - Remove the num_lvts_sensor and cal_offset fields of the lvts_ctrl_data as they are not used. These are not functional fixes but slight memory usage fix of the Mediatek LVTS driver (Julien Panis) - Fix wrong lvts_ctrl index leading to a NULL pointer dereference in the Mediatek LVTS driver (Julien Panis)" * tag 'thermal-v6.10-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl index thermal/drivers/mediatek/lvts_thermal: Remove unused members from struct lvts_ctrl_data thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_data
2024-05-14Merge tag 'acpi-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These are ACPICA updates coming from the 20240322 release upstream, an ACPI DPTF driver update adding new platform support for it, some new quirks and some assorted fixes and cleanups. Specifics: - Add EINJ CXL error types to actbl1.h (Ben Cheatham) - Add support for RAS2 table to ACPICA (Shiju Jose) - Fix various spelling mistakes in text files and code comments in ACPICA (Colin Ian King) - Fix spelling and typos in ACPICA (Saket Dumbre) - Modify ACPI_OBJECT_COMMON_HEADER (lijun) - Add RISC-V RINTC affinity structure support to ACPICA (Haibo Xu) - Fix CXL 3.0 structure (RDPAS) in the CEDT table (Hojin Nam) - Add missin increment of registered GPE count to ACPICA (Daniil Tatianin) - Mark new ACPICA release 20240322 (Saket Dumbre) - Add support for the AEST V2 table to ACPICA (Ruidong Tian) - Disable -Wstringop-truncation for some ACPICA code in the kernel to avoid a compiler warning that is not very useful (Arnd Bergmann) - Make the kernel indicate support for several ACPI features that are in fact supported to the platform firmware through _OSC and fix the Generic Initiator Affinity _OSC bit (Armin Wolf) - Make the ACPI core set the owner value for ACPI drivers, drop the owner setting from a number of drivers and eliminate the owner field from struct acpi_driver (Krzysztof Kozlowski) - Rearrange fields in several structures to effectively eliminate computations from container_of() in some cases (Andy Shevchenko) - Do some assorted cleanups of the ACPI device enumeration code (Andy Shevchenko) - Make the ACPI device enumeration code skip devices with _STA values clearly identified by the specification as invalid (Rafael Wysocki) - Rework the handling of the NHLT table to simplify and clarify it and drop some obsolete pieces (Cezary Rojewski) - Add ACPI IRQ override quirks for Asus Vivobook Pro N6506MV, TongFang GXxHRXx and GMxHGxx, and XMG APEX 17 M23 (Guenter Schafranek, Tamim Khan, Christoffer Sandberg) - Add reference to UEFI DSD Guide to the documentation related to the ACPI handling of device properties (Sakari Ailus) - Fix SRAT lookup of CFMWS ranges with numa_fill_memblks(), remove lefover architecture-dependent code from the ACPI NUMA handling code and simplify it on top of that (Robert Richter) - Add a num-cs device property to specify the number of chip selects for Intel Braswell to the ACPI LPSS (Intel SoC) driver and remove a nested CONFIG_PM #ifdef from it (Andy Shevchenko) - Move three x86-specific ACPI files to the x86 directory (Andy Shevchenko) - Mark SMO8810 accel on Dell XPS 15 9550 as always present and add a PNP_UART1_SKIP quirk for Lenovo Blade2 tablets (Hans de Goede) - Move acpi_blacklisted() declaration to asm/acpi.h (Kuppuswamy Sathyanarayanan) - Add Lunar Lake support to the ACPI DPTF driver (Sumeet Pawnikar) - Mark the einj_driver driver's remove callback as __exit because it cannot get unbound via sysfs (Uwe Kleine-König) - Fix a typo in the ACPI documentation regarding the layout of sysfs subdirectory representing the ACPI namespace (John Watts) - Make the ACPI pfrut utility print the update_cap field during capability query (Chen Yu) - Add HAS_IOPORT dependencies to PNP (Niklas Schnelle)" * tag 'acpi-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits) ACPI/NUMA: Squash acpi_numa_memory_affinity_init() into acpi_parse_memory_affinity() ACPI/NUMA: Squash acpi_numa_slit_init() into acpi_parse_slit() ACPI/NUMA: Remove architecture dependent remainings x86/numa: Fix SRAT lookup of CFMWS ranges with numa_fill_memblks() ACPI: video: Add backlight=native quirk for Lenovo Slim 7 16ARH7 ACPI: scan: Avoid enumerating devices with clearly invalid _STA values ACPI: Move acpi_blacklisted() declaration to asm/acpi.h ACPI: resource: Skip IRQ override on Asus Vivobook Pro N6506MV ACPICA: AEST: Add support for the AEST V2 table ACPI: tools: pfrut: Print the update_cap field during capability query ACPI: property: Add reference to UEFI DSD Guide Documentation: firmware-guide: ACPI: Fix namespace typo PNP: add HAS_IOPORT dependencies ACPI: resource: Do IRQ override on TongFang GXxHRXx and GMxHGxx ACPI: resource: Do IRQ override on GMxBGxx (XMG APEX 17 M23) ACPICA: Update acpixf.h for new ACPICA release 20240322 ACPICA: events/evgpeinit: don't forget to increment registered GPE count ACPICA: Fix CXL 3.0 structure (RDPAS) in the CEDT table ACPICA: SRAT: Add dump and compiler support for RINTC affinity structure ACPICA: SRAT: Add RISC-V RINTC affinity structure ...
2024-05-14Merge tag 'thermal-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "The most significant part of this is a rework of thermal governors, including a redesign of the thermal governor interface and changes to make some of them take trip point hysteresis into account properly, as well as some related cleanups of the thermal governors and thermal core. The above is based on preliminary changes refactoring thermal data structures and moving the definitions of some of them into the thermal core which also ensure that trip point crossing notifications will be sent to user space via netlink and recorded in the debug statistics in temperature order. In addition, netlink bind/unbind notifications are added to the thermal core and the Intel HFI driver is modified to use them to avoid sending netlink messages until there are subscribers. Apart from that, multiple thermal drivers are updated which includes new hardware support (MediaTek MT8188 and MT8186, Amlogic A1 thermal sensor, Loongson-2K2000, Lmh QCM2290), fixes, cleanups and documentation updates, and the recently added thermal debug code is fixed and cleaned up. Specifics: - Redesign the thermal governor interface to allow the governors to work in a more straightforward way (Rafael Wysocki) - Make thermal governors take the current trip point thresholds into account in their computations which allows trip hysteresis to be observed more accurately (Rafael Wysocki) - Make the thermal core manage passive polling for thermal zones and remove passive polling management from thermal governors (Rafael Wysocki) - Refactor trip point representation and move the definition of thermal governor and thermal zone device structures to the thermal core (Rafael Wysocki) - Sort trip point crossing notifications and debug recording of trip point crossing events by temperature (Rafael Wysocki) - Improve the handling of cooling device states and thermal mitigation episodes in progress in the thermal debug code (Rafael Wysocki) - Avoid excessive updates of trip point statistics and clean up the printing of thermal mitigation episode information (Rafael Wysocki) - Clean up thermal governors and thermal core (Rafael Wysocki) - Allow thermal drivers to register notifiers that will be invoked on netlink events like BIND and UNBIND, so that they can adjust their activity depending on whether or not there are any subscribers of netlink messages coming from them, and make the Intel HFI driver use this mechanism (Stanislaw Gruszka) - Adjust the update delay and capabilities-per-event values in the Intel HFI thermal driver to prevent it from missing events and allow it to process more data in one go (Ricardo Neri) - Add missing MODULE_DESCRIPTION() to multiple files in the int340x_thermal and intel_soc_dts_iosf drivers (Srinivas Pandruvada) - Replace deprecated strncpy() with strscpy() in the int340x_thermal driver (Justin Stitt) - Add QCM2290 compatible DT bindings for Lmh and fix a NULL pointer dereference in the lmh driver when the SCM is not present (Konrad Dybcio) - Use the strreplace() function instead of doing it manually in the Armada driver (Rasmus Villemoes) - Convert st,stih407-thermal to DT schema and fix up missing properties (Raphael Gallais-Pou) - Add suspend/resume by restoring the context of the tsens sensor (Priyansh Jain) - Support A1 SoC family Thermal Sensor controller and add the DT bindings (Dmitry Rokosov) - Improve the temperature approximation calculation and consolidate the Tj constant into a shared area of the structure instead of duplicating it on the Rcar Gen3 (Niklas Söderlund) - Fix the Mediatek LVTS sensor coefficient for the MT8192 in order to support it correctly (Hsin-Te Yuan) - Fix a NULL pointer dereference in the tsens driver when the function compute_intercept_slope() is called with a NULL parameter (Aleksandr Mishin) - Remove some unused fields in struct qpnp_tm_chip and k3_bandgap (Christophe Jaillet) - Fix up calibration efuse data decoding, consolidate the code by checking boundaries and refactor some part of the LVTS Mediatek driver. After setting the scene, add MT8186 and MT8188 along with the DT bindings (Nicolas Pitre) - Add Loongson-2K2000 support after some minor code adjustements and providing the DT bindings definition (Binbin Zhou)" * tag 'thermal-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits) thermal: intel: hfi: Increase the number of CPU capabilities per netlink event thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL thermal: intel: Add missing module description thermal: core: Move passive polling management to the core thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid thermal: trip: Add missing empty code line thermal/debugfs: Avoid printing zero duration for mitigation events in progress thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add() thermal/debugfs: Create records for cdev states as they get used thermal: core: Introduce thermal_governor_trip_crossed() thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats() thermal/debugfs: Clean up thermal_debug_update_temp() thermal/debugfs: Avoid excessive updates of trip point statistics thermal: core: Relocate critical and hot trip handling thermal: core: Drop the .throttle() governor callback thermal: gov_user_space: Use .trip_crossed() instead of .throttle() thermal: gov_fair_share: Eliminate unnecessary integer divisions ...
2024-05-13Merge tag 'sched-core-2024-05-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Add cpufreq pressure feedback for the scheduler - Rework misfit load-balancing wrt affinity restrictions - Clean up and simplify the code around ::overutilized and ::overload access. - Simplify sched_balance_newidle() - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES handling that changed the output. - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch() - Reorganize, clean up and unify most of the higher level scheduler balancing function names around the sched_balance_*() prefix - Simplify the balancing flag code (sched_balance_running) - Miscellaneous cleanups & fixes * tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) sched/pelt: Remove shift of thermal clock sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() thermal/cpufreq: Remove arch_update_thermal_pressure() sched/cpufreq: Take cpufreq feedback into account cpufreq: Add a cpufreq pressure feedback for the scheduler sched/fair: Fix update of rd->sg_overutilized sched/vtime: Do not include <asm/vtime.h> header s390/irq,nmi: Include <asm/vtime.h> header directly s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover sched/vtime: Get rid of generic vtime_task_switch() implementation sched/vtime: Remove confusing arch_vtime_task_switch() declaration sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized() sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded() sched/fair: Rename root_domain::overload to ::overloaded sched/fair: Use helper functions to access root_domain::overload sched/fair: Check root_domain::overload value before update sched/fair: Combine EAS check with root_domain::overutilized access sched/fair: Simplify the continue_balancing logic in sched_balance_newidle() ...
2024-05-13Merge branches 'acpi-x86', 'acpi-dptf' and 'acpi-apei'Rafael J. Wysocki
Merge x86-specific ACPI updates, an ACPI DPTF driver update adding new platform support to it, and an ACPI APEI update: - Add a num-cs device property to specify the number of chip selects for Intel Braswell to the ACPI LPSS (Intel SoC) driver and remove a nested CONFIG_PM #ifdef from it (Andy Shevchenko). - Move three x86-specific ACPI files to the x86 directory (Andy Shevchenko). - Mark SMO8810 accel on Dell XPS 15 9550 as always present and add a PNP_UART1_SKIP quirk for Lenovo Blade2 tablets (Hans de Goede). - Move acpi_blacklisted() declaration to asm/acpi.h (Kuppuswamy Sathyanarayanan). - Add Lunar Lake support to the ACPI DPTF driver (Sumeet Pawnikar). - Mark the einj_driver driver's remove callback as __exit because it cannot get unbound via sysfs (Uwe Kleine-König). * acpi-x86: ACPI: Move acpi_blacklisted() declaration to asm/acpi.h ACPI: x86: Add PNP_UART1_SKIP quirk for Lenovo Blade2 tablets ACPI: x86: utils: Mark SMO8810 accel on Dell XPS 15 9550 as always present ACPI: x86: Move LPSS to x86 folder ACPI: x86: Move blacklist to x86 folder ACPI: x86: Move acpi_cmos_rtc to x86 folder ACPI: x86: Introduce a Makefile ACPI: LPSS: Remove nested ifdeffery for CONFIG_PM ACPI: LPSS: Advertise number of chip selects via property * acpi-dptf: ACPI: DPTF: Add Lunar Lake support * acpi-apei: ACPI: APEI: EINJ: mark remove callback as __exit
2024-05-10Merge branch 'thermal-intel'Rafael J. Wysocki
Merge updates of Intel thermal drivers for v6.10: - Add missing MODULE_DESCRIPTION() to multiple files in the int340x_thermal and intel_soc_dts_iosf drivers (Srinivas Pandruvada). - Adjust the update delay and capabilities-per-event values in the Intel HFI thermal driver to prevent it from missing events and allow it to process more data in one go (Ricardo Neri). * thermal-intel: thermal: intel: hfi: Increase the number of CPU capabilities per netlink event thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL thermal: intel: Add missing module description
2024-05-08thermal: intel: hfi: Increase the number of CPU capabilities per netlink eventRicardo Neri
The number of updated CPU capabilities per netlink event is hard-coded to 16. On systems with more than 16 CPUs (a common case), it takes more than one thermal netlink event to relay all the new capabilities after an HFI interrupt. This adds unnecessary overhead to both the kernel and user space entities. Increase the number of CPU capabilities updated per event to 64. Any system with 64 CPUs or less can now update all the capabilities in a single thermal netlink event. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNTRicardo Neri
When processing a hardware update, HFI generates as many thermal netlink events as needed to relay all the updated CPU capabilities to user space. The constant HFI_MAX_THERM_NOTIFY_COUNT is the number of CPU capabilities updated per each of those events. Give this constant a more descriptive name. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08thermal: intel: hfi: Shorten the thermal netlink event delay to 100msRicardo Neri
The delay between an HFI interrupt and its corresponding thermal netlink event has so far been hard-coded to CONFIG_HZ jiffies (1 second). This delay is too long for hardware that generates updates every tens of milliseconds. The HFI driver uses a delayed workqueue to send thermal netlink events. No subsequent events will be sent if there is pending work. As a result, much of the information of consecutive hardware updates will be lost if the workqueue delay is too long. User space entities may act on obsolete data. If the delay is too short, multiple events may overwhelm listeners. Set the delay to 100ms to strike a balance between too many and too few events. Use milliseconds instead of jiffies to improve readability. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08thermal: intel: hfi: Rename HFI_UPDATE_INTERVALRicardo Neri
The name of the constant HFI_UPDATE_INTERVAL is misleading. It is not a periodic interval at which HFI updates are processed. It is the delay in the processing of an HFI update after the arrival of an HFI interrupt. Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-06Merge branch 'thermal-core'Rafael J. Wysocki
This includes a major rework of thermal governors and part of the thermal core interacting with them as well as some fixes and cleanups of the thermal debug code: - Redesign the thermal governor interface to allow the governors to work in a more straightforward way. - Make thermal governors take the current trip point thresholds into account in their computations which allows trip hysteresis to be observed more accurately. - Clean up thermal governors. - Make the thermal core manage passive polling for thermal zones and remove passive polling management from thermal governors. - Improve the handling of cooling device states and thermal mitigation episodes in progress in the thermal debug code. - Avoid excessive updates of trip point statistics and clean up the printing of thermal mitigation episode information. * thermal-core: (27 commits) thermal: core: Move passive polling management to the core thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid thermal: trip: Add missing empty code line thermal/debugfs: Avoid printing zero duration for mitigation events in progress thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add() thermal/debugfs: Create records for cdev states as they get used thermal: core: Introduce thermal_governor_trip_crossed() thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats() thermal/debugfs: Clean up thermal_debug_update_temp() thermal/debugfs: Avoid excessive updates of trip point statistics thermal: core: Relocate critical and hot trip handling thermal: core: Drop the .throttle() governor callback thermal: gov_user_space: Use .trip_crossed() instead of .throttle() thermal: gov_fair_share: Eliminate unnecessary integer divisions thermal: gov_fair_share: Use trip thresholds instead of trip temperatures thermal: gov_fair_share: Use .manage() callback instead of .throttle() thermal: gov_step_wise: Clean up thermal_zone_trip_update() thermal: gov_step_wise: Use trip thresholds instead of trip temperatures thermal: gov_step_wise: Use .manage() callback instead of .throttle() ...
2024-05-06Merge back thermal cotntrol material for v6.10.Rafael J. Wysocki
2024-05-06thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl indexJulien Panis
In 'lvts_should_update_thresh()' and 'lvts_ctrl_start()' functions, the parameter passed to 'lvts_for_each_valid_sensor()' macro is always 'lvts_ctrl->lvts_data->lvts_ctrl'. In other words, the array index 0 is systematically passed as 'struct lvts_ctrl_data' type item, even when another item should be consumed instead. Hence, the 'valid_sensor_mask' value which is selected can be wrong because unrelated to the 'struct lvts_ctrl_data' type item that should be used. Hence, some thermal zone can be registered for a sensor 'i' that does not actually exist. Because of the invalid address used as 'lvts_sensor[i].msr', this situation ends up with a crash in 'lvts_get_temp()' function, where this 'msr' pointer is passed to 'readl_poll_timeout()' function. The following message is output: "Unable to handle kernel NULL pointer dereference at virtual address <msr>", with <msr> = 0. This patch fixes the issue. Fixes: 11e6f4c31447 ("thermal/drivers/mediatek/lvts_thermal: Allow early empty sensor slots") Signed-off-by: Julien Panis <jpanis@baylibre.com> Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240503-mtk-thermal-lvts-ctrl-idx-fix-v1-2-f605c50ca117@baylibre.com
2024-05-06thermal/drivers/mediatek/lvts_thermal: Remove unused members from struct ↵Julien Panis
lvts_ctrl_data In struct lvts_ctrl_data, num_lvts_sensor and cal_offset[] are not used. Signed-off-by: Julien Panis <jpanis@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240503-mtk-thermal-lvts-ctrl-idx-fix-v1-1-f605c50ca117@baylibre.com
2024-05-02thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_dataJulien Panis
Verify that lvts_data is not NULL before using it. Signed-off-by: Julien Panis <jpanis@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20240502-mtk-thermal-lvts-data-v1-1-65f1b0bfad37@baylibre.com
2024-05-02thermal: intel: Add missing module descriptionSrinivas Pandruvada
Fix warnings displayed by "make W=1" build: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/intel_soc_dts_iosf.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rfim.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_req.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_hint.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_power_floor Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-30thermal: core: Move passive polling management to the coreRafael J. Wysocki
Passive polling is enabled by setting the 'passive' field in struct thermal_zone_device to a positive value so long as the 'passive_delay_jiffies' field is greater than zero. It causes the thermal core to actively check the thermal zone temperature periodically which in theory should be done after crossing a passive trip point on the way up in order to allow governors to react more rapidly to temperature changes and adjust mitigation more precisely. However, the 'passive' field in struct thermal_zone_device is currently managed by governors which is quite problematic. First of all, only two governors, Step-Wise and Power Allocator, update that field at all, so the other governors do not benefit from passive polling, although in principle they should. Moreover, if the zone governor is changed from, say, Step-Wise to Fair-Share after 'passive' has been incremented by the former, it is not going to be reset back to zero by the latter even if the zone temperature falls down below all passive trip points. For this reason, make handle_thermal_trip() increment 'passive' to enable passive polling for the given thermal zone whenever a passive trip point is crossed on the way up and decrement it whenever a passive trip point is crossed on the way down. Also remove the 'passive' field updates from governors and additionally clear it in thermal_zone_device_init() to prevent passive polling from being enabled after a system resume just beacuse it was enabled before suspending the system. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com>