aboutsummaryrefslogtreecommitdiff
path: root/drivers/thermal/intel/intel_powerclamp.c
AgeCommit message (Collapse)AuthorFilesLines
2024-02-15x86/cpu/topology: Rename topology_max_die_per_package()Thomas Gleixner1-1/+1
The plural of die is dies. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240213210253.065874205@linutronix.de
2024-01-22thermal: intel: powerclamp: Remove dead code for target mwait valueSrinivas Pandruvada1-32/+0
After conversion of this driver to use powercap idle_inject core, this driver doesn't use target_mwait value. So remove dead code. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-05thermal: intel: powerclamp: fix mismatch in get function for max_idleDavid Arcari1-1/+1
KASAN reported this [ 444.853098] BUG: KASAN: global-out-of-bounds in param_get_int+0x77/0x90 [ 444.853111] Read of size 4 at addr ffffffffc16c9220 by task cat/2105 ... [ 444.853442] The buggy address belongs to the variable: [ 444.853443] max_idle+0x0/0xffffffffffffcde0 [intel_powerclamp] There is a mismatch between the param_get_int and the definition of max_idle. Replacing param_get_int with param_get_byte resolves this issue. Fixes: ebf519710218 ("thermal: intel: powerclamp: Add two module parameters") Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: David Arcari <darcari@redhat.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-04thermal: intel: powerclamp: Fix NULL pointer access issueSrinivas Pandruvada1-0/+4
If cur_state for the powerclamp cooling device is set to the default minimum state of 0, without setting first to cur_state > 0, this results in NULL pointer access. This NULL pointer access happens in the powercap core idle-inject function idle_inject_set_duration() as there is no NULL check for idle_inject_device pointer. This pointer must be allocated by calling idle_inject_register() or idle_inject_register_full(). In the function powerclamp_set_cur_state(), idle_inject_device pointer is allocated only when the cur_state > 0. But setting 0 without changing to any other state, idle_inject_set_duration() will be called with a NULL idle_inject_device pointer. To address this, just return from powerclamp_set_cur_state() if the current cooling device state is the same as the last one. Since the power-up default cooling device state is 0, changing the state to 0 again here will return without calling idle_inject_set_duration(). Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 8526eb7fc75a ("thermal: intel: powerclamp: Use powercap idle-inject feature") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=217386 Tested-by: Risto A. Paju <teknohog@iki.fi> Cc: 6.3+ <stable@kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-30thermal: intel: powerclamp: Fix cpumask and max_idle module parametersDavid Arcari1-1/+8
When cpumask is specified as a module parameter the value is overwritten by the module init routine. This can easily be fixed by checking to see if the mask has already been allocated in the init routine. When max_idle is specified as a module parameter a panic will occur. The problem is that the idle_injection_cpu_mask is not allocated until the module init routine executes. This can easily be fixed by allocating the cpumask if it's not already allocated. Fixes: ebf519710218 ("thermal: intel: powerclamp: Add two module parameters") Signed-off-by: David Arcari <darcari@redhat.com> Reviewed-by: Srinivas Pandruvada<srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09thermal: intel: powerclamp: Add two module parametersSrinivas Pandruvada1-20/+156
In some use cases, it is desirable to only inject idle on certain set of CPUs. For example on Alder Lake systems, it is possible that we force idle only on P-Cores for thermal reasons. Also the idle percent can be more than 50% if we only choose partial set of CPUs in the system. Introduce 2 new module parameters for this purpose. They can be only changed when the cooling device is inactive. cpumask (Read/Write): A bit mask of CPUs to inject idle. The format of this bitmask is same as used in other subsystems like in /proc/irq/*/smp_affinity. The mask is comma separated 32 bit groups. Each CPU is one bit. For example for 256 CPU system the full mask is: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff The rightmost mask is for CPU 0-32. max_idle (Read/Write): Maximum injected idle time to the total CPU time ratio in percent range from 1 to 100. Even if the cooling device max_state is always 100 (100%), this parameter allows to add a max idle percent limit. The default is 50, to match the current implementation of powerclamp driver. Also doesn't allow value more than 75, if the cpumask includes every CPU present in the system. Also when the cpumask doesn't include every CPU, there is no use of compensation using package C-state idle counters. Hence don't start package C-state polling thread even for a single package or a single die system in this case. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09thermal: intel: powerclamp: Fix duration module parameterSrinivas Pandruvada1-6/+18
After the switch to use the powercap/idle-inject framework in the Intel powerclamp driver, the idle duration unit is microsecond. However, the module parameter for idle duration is in milliseconds, so convert it to microseconds in the "set" callback and back to milliseconds in a new "get" callback. While here, also use mutex protection for setting and getting "duration". The other uses of "duration" are already protected by the mutex. Fixes: 8526eb7fc75a ("thermal: intel: powerclamp: Use powercap idle-inject feature") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-07thermal: intel: powerclamp: Return last requested state as cur_stateSrinivas Pandruvada1-11/+1
When the user is reading cur_state from the thermal cooling device for Intel powerclamp device: - It returns the idle ratio from Package C-state counters when there is active idle injection session. - -1, when there is no active idle injection session. This information is not very useful as the package C-state counters vary a lot from read to read. Instead just return the last requested cur_state. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03thermal: intel: powerclamp: Use powercap idle-inject featureSrinivas Pandruvada1-223/+156
There are two idle injection implementation in the Linux kernel. One via intel_powerclamp and the other using powercap/idle_inject. Both implementation end up in calling play_idle* function from a FIFO priority thread. Both can't be used at the same time. It is better to use one idle injection framework for better maintainability. In this way, there is only one caller for play_idle. Here powercap/idle_inject can be used for both per-core and for system wide idle injection. This framework has a well defined interface which allow registry for per-core or for all CPUs (system wide). This reduces code complexity in the intel powerclamp driver as all the per CPU kthreads, delayed work and calls to play_idle can be removed. The changes include: - Remove unneeded include files - Remove per CPU kthread workers: balancing_work and idle_injection_work. - Reuse the compensation related code by moving from previous worker thread to idle_injection callback. - Adjust the idle_duration and runtime by using powercap/idle_inject interface. - Remove all variables, which are not required once powercap/idle_inject is used. - Add mutex to avoid race during removal of idle injection during module unload and user action to change idle inject percent. Also for protection during dynamic adjustment of run and idle time from update() callback. - Remove online/offline callbacks to designate control CPU - Use cpu_present_mask global variable for CPU mask - Remove hot plug locks Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-02thermal: intel: powerclamp: Fix cur_state for multi package systemSrinivas Pandruvada1-4/+16
The powerclamp cooling device cur_state shows actual idle observed by package C-state idle counters. But the implementation is not sufficient for multi package or multi die system. The cur_state value is incorrect. On these systems, these counters must be read from each package/die and somehow aggregate them. But there is no good method for aggregation. It was not a problem when explicit CPU model addition was required to enable intel powerclamp. In this way certain CPU models could have been avoided. But with the removal of CPU model check with the availability of Package C-state counters, the driver is loaded on most of the recent systems. For multi package/die systems, just show the actual target idle state, the system is trying to achieve. In powerclamp this is the user set state minus one. Also there is no use of starting a worker thread for polling package C-state counters and applying any compensation for multiple package or multiple die systems. Fixes: b721ca0d1927 ("thermal/powerclamp: remove cpu whitelist") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-15thermal: intel_powerclamp: Use first online CPU as control_cpuRafael J. Wysocki1-5/+1
Commit 68b99e94a4a2 ("thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash") fixed an issue related to using smp_processor_id() in preemptible context by replacing it with a pair of get_cpu()/put_cpu(), but what is needed there really is any online CPU and not necessarily the one currently running the code. Arguably, getting the one that's running the code in there is confusing. For this reason, simply give the control CPU role to the first online one which automatically will be CPU0 if it is online, so one check can be dropped from the code for an added benefit. Link: https://lore.kernel.org/linux-pm/20221011113646.GA12080@duo.ucw.cz/ Fixes: 68b99e94a4a2 ("thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com>
2022-09-21thermal: intel_powerclamp: Remove accounting for IRQ wakesSrinivas Pandruvada1-18/+3
There is a static variable "idle_wakeup_counter", which accounts for number of wake ups because of IRQs and take actions to compensate idle injection. This is now read and reset to 0, but never incremented. So all the usage of this counter for idle injection has no use. Also another static variable "reduce_irq", which depends on "idle_wakeup_counter", so remove usage of "reduce_irq" also. Commit feb6cd6a0f9f ("thermal/intel_powerclamp: stop sched tick in forced idle") replaced the local use of "mwait_idle_with_hints" with play_idle(). This removed possibility of updating "idle_wakeup_counter" without change in play_idle(). This change was made in Linux 4.10. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-21thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to ↵Srinivas Pandruvada1-2/+4
avoid crash When CPU 0 is offline and intel_powerclamp is used to inject idle, it generates kernel BUG: BUG: using smp_processor_id() in preemptible [00000000] code: bash/15687 caller is debug_smp_processor_id+0x17/0x20 CPU: 4 PID: 15687 Comm: bash Not tainted 5.19.0-rc7+ #57 Call Trace: <TASK> dump_stack_lvl+0x49/0x63 dump_stack+0x10/0x16 check_preemption_disabled+0xdd/0xe0 debug_smp_processor_id+0x17/0x20 powerclamp_set_cur_state+0x7f/0xf9 [intel_powerclamp] ... ... Here CPU 0 is the control CPU by default and changed to the current CPU, if CPU 0 offlined. This check has to be performed under cpus_read_lock(), hence the above warning. Use get_cpu() instead of smp_processor_id() to avoid this BUG. Suggested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-04thermal: intel_powerclamp: don't use bitmap_weight() in end_power_clamp()Yury Norov1-6/+3
Don't call bitmap_weight() if the following code can get by without it. Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-30thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_opsRikard Falkeborn1-1/+1
The only usage of powerclamp_cooling_ops is to pass its address to thermal_cooling_device_register(), which takes a pointer to const struct thermal_cooling_device_ops. Make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20211128214641.30953-1-rikard.falkeborn@gmail.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-05thermal: intel_powerclamp: Use bitmap_zalloc/bitmap_free when applicableChristophe JAILLET1-5/+3
'cpu_clamping_mask' is a bitmap. So use 'bitmap_zalloc()' and 'bitmap_free()' to simplify code, improve the semantic of the code and avoid some open-coded arithmetic in allocator arguments. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-14thermal/drivers/intel_powerclamp: Replace deprecated CPU-hotplug functions.Sebastian Andrzej Siewior1-2/+2
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amitk@kernel.org> Cc: linux-pm@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210803141621.780504-20-bigeasy@linutronix.de
2020-06-15sched,powerclamp: Convert to sched_set_fifo()Peter Zijlstra1-4/+1
Because SCHED_FIFO is a broken scheduler model (see previous patches) take away the priority field, the kernel can't possibly make an informed decision. Effectively no change. Cc: rafael.j.wysocki@intel.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org>
2020-03-24thermal: Convert to new X86 CPU match macrosThomas Gleixner1-1/+1
The new macro set has a consistent namespace and uses C99 initializers instead of the grufty C89 ones. Get rid the of the local QUARK defines and use the proper ones. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/20200320131509.967017771@linutronix.de
2019-09-03cpuidle: play_idle: Increase the resolution to usecDaniel Lezcano1-1/+1
The play_idle resolution is 1ms. The intel_powerclamp bases the idle duration on jiffies. The idle injection API is also using msec based duration but has no user yet. Unfortunately, msec based time does not fit well when we want to inject idle cycle precisely with shallow idle state. In order to set the scene for the incoming idle injection user, move the precision up to usec when calling play_idle. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-07-12Merge tag 'driver-core-5.3-rc1' of ↵Linus Torvalds1-10/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...
2019-06-18thermal: intel_powerclamp: no need to check return value of debugfs_create ↵Greg Kroah-Hartman1-10/+2
functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: Zhang Rui <rui.zhang@intel.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: linux-pm@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 335Thomas Gleixner1-16/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 51 franklin st fifth floor boston ma 02110 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 111 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190530000436.567572064@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-18Merge branches 'fixes' and 'thermal-intel' into nextZhang Rui1-1/+1
2019-03-18thermal/intel_powerclamp: fix truncated kthread nameZhang Rui1-1/+1
kthread name only allows 15 characters (TASK_COMMON_LEN is 16). Thus rename the kthreads created by intel_powerclamp driver from "kidle_inject/ + decimal cpuid" to "kidle_inj/ + decimal cpuid" to avoid truncated kthead name for cpu 100 and later. Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-03-18thermal/intel_powerclamp: fix __percpu declaration of worker_dataLuc Van Oostenryck1-1/+1
This variable is declared as: static struct powerclamp_worker_data * __percpu worker_data; In other words, a percpu pointer to struct ... But this variable not used like so but as a pointer to a percpu struct powerclamp_worker_data. So fix the declaration as: static struct powerclamp_worker_data __percpu *worker_data; This also quiets Sparse's warnings from __verify_pcpu_ptr(), like: 494:49: warning: incorrect type in initializer (different address spaces) 494:49: expected void const [noderef] <asn:3> *__vpp_verify 494:49: got struct powerclamp_worker_data * Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-01-05Merge branch 'next' of ↵Linus Torvalds1-0/+803
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: - Add locking for cooling device sysfs attribute in case the cooling device state is changed by userspace and thermal framework simultaneously. (Thara Gopinath) - Fix a problem that passive cooling is reset improperly after system suspend/resume. (Wei Wang) - Cleanup the driver/thermal/ directory by moving intel and qcom platform specific drivers to platform specific sub-directories. (Amit Kucheria) - Some trivial cleanups. (Lukasz Luba, Wolfram Sang) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal/intel: fixup for Kconfig string parsing tightening up drivers: thermal: Move QCOM_SPMI_TEMP_ALARM into the qcom subdir drivers: thermal: Move various drivers for intel platforms into a subdir thermal: Fix locking in cooling device sysfs update cur_state Thermal: do not clear passive state during system sleep thermal: zx2967_thermal: simplify getting .driver_data thermal: st: st_thermal: simplify getting .driver_data thermal: spear_thermal: simplify getting .driver_data thermal: rockchip_thermal: simplify getting .driver_data thermal: int340x_thermal: int3400_thermal: simplify getting .driver_data thermal: remove unused function parameter
2018-12-07drivers: thermal: Move various drivers for intel platforms into a subdirAmit Kucheria1-0/+815
This cleans up the directory a bit, now that we have several other platforms using platform-specific sub-directories. Compile-tested with ARCH=x86 defconfig and the drivers explicitly enabled with menuconfig. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>