path: root/drivers/cpufreq/cpufreq_ondemand.c
Age | Commit message | Author | Files | Lines
2011-03-16 | [CPUFREQ] Remove deprecated sysfs file sampling_rate_max | Thomas Renninger | 1 | -13/+0
Marked deprecated for quite a while now... Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Dave Jones <[email protected]> CC: [email protected]
2011-03-16 | [CPUFREQ] calculate delay after dbs_check_cpu | Vincent Guittot | 1 | -6/+11
Calculate the ondemand delay after the dbs_check_cpu call, because that call can modify the rate_mult value. Use the freq_lo_jiffies value for the sub-sample period of powersave_bias mode. Signed-off-by: Vincent Guittot <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2011-01-26 | cpufreq: use system_wq instead of dedicated workqueues | Tejun Heo | 1 | -17/+3
With cmwq, there's no reason for cpufreq drivers to use separate workqueues. Remove the dedicated workqueues from cpufreq_conservative and cpufreq_ondemand and use system_wq instead. The work items are already sync canceled on stop, so it's already guaranteed that no work is running on module exit. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Dave Jones <[email protected]> Cc: [email protected]
2010-10-22 | [CPUFREQ] add sampling_down_factor tunable to improve ondemand performance | David C Niemi | 1 | -1/+41
Adds a new global tunable, sampling_down_factor. Set to 1 it makes no changes from existing behavior, but set to greater than 1 (e.g. 100) it acts as a multiplier for the scheduling interval for reevaluating load when the CPU is at its top speed due to high load. This improves performance by reducing the overhead of load evaluation and helping the CPU stay at its top speed when truly busy, rather than shifting back and forth in speed. This tunable has no effect on behavior at lower speeds/lower CPU loads. This patch is against 2.6.36-rc6. This patch should help solve kernel bug 19672 "ondemand is slow". Signed-off-by: David Niemi <[email protected]> Acked-by: Venkatesh Pallipadi <[email protected]> CC: Daniel Hollocher <[email protected]> CC: <[email protected]> CC: <[email protected]> Signed-off-by: Dave Jones <[email protected]>
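The effect of sampling_down_factor described above can be sketched as a multiplier on the sampling interval. This is a minimal user-space illustration, not the governor's code: the function name `next_sample_interval_us` and its parameters are hypothetical, and only the behavior stated in the commit message (factor 1 changes nothing, a larger factor applies only at top speed) is modeled.

```c
#include <assert.h>

/*
 * Hypothetical helper: length of the next load-evaluation interval in
 * microseconds. The multiplier only applies while the CPU is at its top
 * speed; at lower speeds the plain sampling rate is used.
 */
unsigned long next_sample_interval_us(unsigned long sampling_rate_us,
                                      unsigned int sampling_down_factor,
                                      int at_max_freq)
{
    /* A factor of 1 keeps the pre-existing behavior; 0 makes no sense. */
    assert(sampling_down_factor >= 1);

    unsigned int rate_mult = at_max_freq ? sampling_down_factor : 1;
    return sampling_rate_us * rate_mult;
}
```

With a 10 ms sampling rate and a factor of 100, a CPU at top speed would be re-evaluated once per second, while a CPU at a lower speed would still be sampled every 10 ms.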
2010-08-03 | [CPUFREQ] ondemand: don't synchronize sample rate unless multiple cpus present | Jocelyn Falempe | 1 | -2/+6
For UP systems this is not required, and results in a more consistent sample interval. [[email protected]: coding-style fixes] Signed-off-by: Jocelyn Falempe <[email protected]> Signed-off-by: Mike Chan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]>
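One way to picture the change: sample-rate synchronization shortens the next delay so that all CPUs sample on a common jiffy boundary, and on a uniprocessor system that trimming is skipped. The sketch below assumes the alignment is done by subtracting `now % delay`; the function name and parameters are hypothetical stand-ins for kernel state such as the jiffies counter and the number of online CPUs.

```c
#include <assert.h>

/*
 * Hypothetical helper: delay (in jiffies) until the next sample.
 * With more than one CPU online, the delay is trimmed so that samples
 * land on multiples of delay_jiffies; on UP it is returned unchanged,
 * which gives a consistent sample interval.
 */
unsigned long next_delay_jiffies(unsigned long now_jiffies,
                                 unsigned long delay_jiffies,
                                 unsigned int online_cpus)
{
    assert(delay_jiffies > 0);

    if (online_cpus > 1)
        delay_jiffies -= now_jiffies % delay_jiffies;

    return delay_jiffies;
}
```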
2010-08-03 | [CPUFREQ] ondemand: Refactor frequency increase code | Mike Chan | 1 | -13/+12
Make simpler to read and call. *** v3 - Always call when powersave_bias is enabled. Acked-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Mike Chan <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2010-05-18 | Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds | 1 | -29/+13
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, hypervisor: add missing <linux/module.h>
  Modify the VMware balloon driver for the new x86_hyper API
  x86, hypervisor: Export the x86_hyper* symbols
  x86: Clean up the hypervisor layer
  x86, HyperV: fix up the license to mshyperv.c
  x86: Detect running on a Microsoft HyperV system
  x86, cpu: Make APERF/MPERF a normal table-driven flag
  x86, k8: Fix build error when K8_NB is disabled
  x86, cacheinfo: Disable index in all four subcaches
  x86, cacheinfo: Make L3 cache info per node
  x86, cacheinfo: Reorganize AMD L3 cache structure
  x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
  x86, cacheinfo: Unify AMD L3 cache index disable checking
  cpufreq: Unify sysfs attribute definition macros
  powernow-k8: Fix frequency reporting
  x86, cpufreq: Add APERF/MPERF support for AMD processors
  x86: Unify APERF/MPERF support
  powernow-k8: Add core performance boost support
  x86, cpu: Add AMD core boosting feature flag to /proc/cpuinfo
Fix up trivial conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c and drivers/cpufreq/cpufreq_ondemand.c
2010-05-09 | ondemand: Make the iowait-is-busy time a sysfs tunable | Arjan van de Ven | 1 | -1/+46
Pavel Machek pointed out that not all CPUs have an efficient idle at high frequency. Specifically, older Intel and various AMD CPUs would see higher power usage when copying files from USB. Mike Chan pointed out that the same is true for various ARM chips as well. Thomas Renninger suggested making this a sysfs tunable with a reasonable default. This patch adds a sysfs tunable for the new behavior, and uses a very simple function to determine a reasonable default, depending on the CPU vendor/type. Signed-off-by: Arjan van de Ven <[email protected]> Acked-by: Rik van Riel <[email protected]> Acked-by: Pavel Machek <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: [email protected] LKML-Reference: <[email protected]> [ minor tidyup ] Signed-off-by: Ingo Molnar <[email protected]>
2010-05-09 | ondemand: Solve a big performance issue by counting IOWAIT time as busy | Arjan van de Ven | 1 | -2/+28
The ondemand cpufreq governor uses CPU busy time (i.e. not-idle time) as a measure for scaling the CPU frequency up or down. If the CPU is busy, the CPU frequency scales up; if it's idle, the CPU frequency scales down. Effectively, it uses the CPU busy time as a proxy variable for the more nebulous "how critical is performance right now" question. This algorithm falls flat on its face in the light of workloads where you're alternately disk and CPU bound, such as the ever popular "git grep", but also things like startup of programs and maildir-using email clients... much to the chagrin of Andrew Morton. This patch changes the ondemand algorithm to count iowait time as busy, not idle, time. As shown in the breakdown cases above, iowait is often performance critical, and by counting iowait, the proxy variable becomes a more accurate representation of the "how critical is performance" question. The problem and fix are both verified with the "perf timechart" tool. Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Acked-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
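A rough sketch of how counting iowait as busy changes the load figure is given here. It assumes the reported idle time includes iowait, so treating iowait as busy means subtracting it from idle before computing the load. The function name, parameter names, and the `io_is_busy` flag are illustrative assumptions, not the governor's actual interface.

```c
#include <assert.h>

/*
 * Hypothetical helper: load in percent over one sampling window.
 * idle_us is assumed to INCLUDE iowait_us. When io_is_busy is set,
 * iowait is moved from the idle side to the busy side.
 */
unsigned int load_percent(unsigned long long wall_us,
                          unsigned long long idle_us,
                          unsigned long long iowait_us,
                          int io_is_busy)
{
    assert(wall_us > 0);

    if (io_is_busy && idle_us >= iowait_us)
        idle_us -= iowait_us;

    /* Inexact accounting can report more idle than wall time. */
    if (idle_us > wall_us)
        return 0;

    return (unsigned int)(100 * (wall_us - idle_us) / wall_us);
}
```

For a 100 ms window with 80 ms idle, of which 60 ms is iowait, the load is 20% when iowait counts as idle and 80% when it counts as busy, which is the kind of disk-bound case the commit message describes.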
2010-04-09 | cpufreq: Unify sysfs attribute definition macros | Borislav Petkov | 1 | -28/+12
Multiple modules defined these macros with identical functionality, needlessly replicating them among the different cpufreq drivers. Push them into the header and remove the duplication. Signed-off-by: Borislav Petkov <[email protected]> LKML-Reference: <[email protected]> Reviewed-by: Thomas Renninger <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-01-13 | [CPUFREQ] Fix ondemand to not request targets outside policy limits | [email protected] | 1 | -0/+3
Dominik said: target_freq cannot be below policy->min or above policy->max. If it were, the whole cpufreq subsystem is broken. But (answer): I think the "ondemand" governor can ask for a target frequency that is below policy->min. ... A patch such as below may be needed to sanitize the target frequency requested by "ondemand". The "conservative" governor already has this check: Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Dave Jones <[email protected]> # diff -bur x/drivers/cpufreq/cpufreq_ondemand.c.orig y/drivers/cpufreq/cpufreq_ondemand.c
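The sanity check discussed above amounts to clamping the requested frequency into the policy range. The following is a minimal sketch under that reading; `clamp_target_freq` and its parameters are hypothetical names, with `policy_min` and `policy_max` standing in for policy->min and policy->max.

```c
#include <assert.h>

/*
 * Hypothetical helper: force a requested target frequency (kHz) into
 * the [policy_min, policy_max] range before handing it to the driver.
 */
unsigned int clamp_target_freq(unsigned int target_khz,
                               unsigned int policy_min,
                               unsigned int policy_max)
{
    assert(policy_min <= policy_max);

    if (target_khz < policy_min)
        return policy_min;
    if (target_khz > policy_max)
        return policy_max;
    return target_khz;
}
```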
2009-11-17 | [CPUFREQ] Resolve time unit thinko in ondemand/conservative govs | Pallipadi, Venkatesh | 1 | -2/+2
ondemand and conservative governors are messing up time units in the code path where NO_HZ is not enabled and ignore_nice is set. The walltime idletime stored is in jiffies and nice time calculation is happening in microseconds. The problem was reported and diagnosed by Alexander here. http://marc.info/?l=linux-kernel&m=125752550404513&w=2 The patch below fixes this thinko. Reported-by: Alexander Miller <[email protected]> Tested-by: Alexander Miller <[email protected]> Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
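The class of bug described here is mixing a jiffies count with a microsecond count in one sum. The sketch below shows the general remedy of converting to one unit first. It is illustrative only: the helper names are hypothetical, `hz` stands in for the kernel's HZ, and the conversion assumes HZ divides 1000000 evenly.

```c
#include <assert.h>

/* Hypothetical conversion; assumes hz divides 1000000 without remainder. */
unsigned long long usecs_from_jiffies(unsigned long long j, unsigned int hz)
{
    assert(hz > 0 && 1000000 % hz == 0);
    return j * (1000000ULL / hz);
}

/*
 * Hypothetical helper: idle time in microseconds when nice time is to be
 * treated as idle. idle_jiffies is converted before nice_us is added, so
 * both operands of the sum share the same unit.
 */
unsigned long long idle_with_nice_us(unsigned long long idle_jiffies,
                                     unsigned long long nice_us,
                                     unsigned int hz)
{
    return usecs_from_jiffies(idle_jiffies, hz) + nice_us;
}
```

At HZ=250 one jiffy is 4000 us, so adding 20000 us of nice time to a raw count of 10 idle jiffies without conversion would be off by orders of magnitude.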
2009-09-18 | Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq | Linus Torvalds | 1 | -26/+113
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Fix NULL ptr regression in powernow-k8
  [CPUFREQ] Create a blacklist for processors that should not load the acpi-cpufreq module.
  [CPUFREQ] Powernow-k8: Enable more than 2 low P-states
  [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)
  [CPUFREQ] ondemand - Use global sysfs dir for tuning settings
  [CPUFREQ] Introduce global, not per core: /sys/devices/system/cpu/cpufreq
  [CPUFREQ] Bail out of cpufreq_add_dev if the link for a managed CPU got created
  [CPUFREQ] Factor out policy setting from cpufreq_add_dev
  [CPUFREQ] Factor out interface creation from cpufreq_add_dev
  [CPUFREQ] Factor out symlink creation from cpufreq_add_dev
  [CPUFREQ] cleanup up -ENOMEM handling in cpufreq_add_dev
  [CPUFREQ] Reduce scope of cpu_sys_dev in cpufreq_add_dev
  [CPUFREQ] update Doc for cpuinfo_cur_freq and scaling_cur_freq
2009-09-01 | [CPUFREQ] ondemand - Use global sysfs dir for tuning settings | Thomas Renninger | 1 | -26/+113
Ondemand has only global variables for userspace tunings via sysfs. But they were exposed per CPU which wrongly implies to the user that his settings are applied per cpu. Also locking sysfs against concurrent access won't be necessary anymore after deprecation time. This means the ondemand config dir is moved: /sys/devices/system/cpu/cpu*/cpufreq/ondemand -> /sys/devices/system/cpu/cpufreq/ondemand The old files will still exist, but reading or writing to them will result in one (printk_once) deprecation msg to syslog per file. Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-08-14 | Merge branch 'percpu-for-linus' into percpu-for-next | Tejun Heo | 1 | -45/+32
Conflicts:
  arch/sparc/kernel/smp_64.c
  arch/x86/kernel/cpu/perf_counter.c
  arch/x86/kernel/setup_percpu.c
  drivers/cpufreq/cpufreq_ondemand.c
  mm/percpu.c
Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <[email protected]>
2009-07-06 | [CPUFREQ] Cleanup locking in ondemand governor | [email protected] | 1 | -35/+27
Redesign the locking inside ondemand driver. Make dbs_mutex handle all the global state changes inside the driver and invent a new percpu mutex to serialize percpu timer and frequency limit change. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-07-06 | [CPUFREQ] Eliminate the recent lockdep warnings in cpufreq | [email protected] | 1 | -16/+11
Commit b14893a62c73af0eca414cfed505b8c09efc613c although it was very much needed to properly cleanup ondemand timer, opened-up a can of worms related to locking dependencies in cpufreq. Patch here defines the need for dbs_mutex and cleans up its usage in ondemand governor. This also resolves the lockdep warnings reported here http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html and few others.. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-06-24 | percpu: clean up percpu variable definitions | Tejun Heo | 1 | -7/+8
Percpu variable definition is about to be updated such that all percpu symbols including the static ones must be unique. Update percpu variable definitions accordingly.
  * as,cfq: rename ioc_count uniquely
  * cpufreq: rename cpu_dbs_info uniquely
  * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it
  * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and rename it
  * ipv4,6: rename cookie_scratch uniquely
  * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to pmc_irq_entry and nmi_entry to pmc_nmi_entry
  * perf_counter: rename disable_count to perf_disable_count
  * ftrace: rename test_event_disable to ftrace_test_event_disable
  * kmemleak: rename test_pointer to kmemleak_test_pointer
  * mce: rename next_interval to mce_next_interval
[ Impact: percpu usage cleanups, no duplicate static percpu var names ] Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Christoph Lameter <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Dave Jones <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Cc: linux-mm <[email protected]> Cc: David S. Miller <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Li Zefan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Andi Kleen <[email protected]>
2009-06-15 | [CPUFREQ] Only set sampling_rate_max deprecated, sampling_rate_min is useful | Thomas Renninger | 1 | -16/+2
Update the documentation accordingly. Cleanup and use printk_once. Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-06-15 | [CPUFREQ] ondemand: Uncouple minimal sampling rate from HZ in NO_HZ case | Thomas Renninger | 1 | -27/+23
With this patch you have the following minimal sampling rate restrictions:
Kernel restrictions:
  If CONFIG_NO_HZ is set, the limit is 10ms fixed.
  If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the limits depend on the CONFIG_HZ option:
    HZ=1000: min=20000us (20ms)
    HZ=250: min=80000us (80ms)
    HZ=100: min=200000us (200ms)
HW restrictions:
  Do not sample/poll more often than HW latency * 100 exported by the low level cpufreq HW driver.
The higher value of the above restrictions is the minimal sampling rate that can be set (and can be seen via the ondemand/sampling_rate_min sysfs file). The default sampling rate is still HW latency * 1000, but this will now end up in lower values on the latest (Intel and AMD) hardware, as these can switch really fast and the sampling rate was mostly limited to 80ms or 200ms (depending on whether HZ=250 or HZ=100 is used).
Signed-off-by: Thomas Renninger <[email protected]> Cc: Pallipadi Venkatesh <[email protected]> Signed-off-by: Dave Jones <[email protected]>
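The restrictions listed in this commit message can be worked through with a small calculation. The sketch below is illustrative, with hypothetical names; the factor of 20 ticks for the non-NO_HZ case is inferred from the listed values (20000 us at HZ=1000, 80000 us at HZ=250, 200000 us at HZ=100) rather than taken from the source text.

```c
#include <assert.h>

#define NO_HZ_MIN_SAMPLING_RATE_US 10000UL /* 10 ms, fixed when NO_HZ is active */
#define MIN_SAMPLING_TICKS         20UL    /* inferred from the HZ table above */
#define LATENCY_MULTIPLIER         100UL   /* HW latency * 100 */

/*
 * Hypothetical helper: minimal sampling rate in microseconds, taken as
 * the higher of the kernel restriction and the hardware restriction.
 */
unsigned long min_sampling_rate_us(int no_hz, unsigned int hz,
                                   unsigned long hw_latency_us)
{
    assert(hz > 0);

    unsigned long kernel_limit = no_hz
        ? NO_HZ_MIN_SAMPLING_RATE_US
        : MIN_SAMPLING_TICKS * (1000000UL / hz);
    unsigned long hw_limit = hw_latency_us * LATENCY_MULTIPLIER;

    return kernel_limit > hw_limit ? kernel_limit : hw_limit;
}
```

For hardware with a 10 us transition latency, the hardware limit is only 1000 us, so the kernel restriction dominates. With a 500 us latency the hardware limit of 50000 us wins over the 10 ms NO_HZ limit.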
2009-05-26 | [CPUFREQ] fix timer teardown in ondemand governor | Mathieu Desnoyers | 1 | -1/+4
* Rafael J. Wysocki ([email protected]) wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.28 and 2.6.29.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.28 and 2.6.29. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
> Subject : cpufreq timer teardown problem
> Submitter : Mathieu Desnoyers <[email protected]>
> Date : 2009-04-23 14:00 (24 days old)
> References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
> Handled-By : Mathieu Desnoyers <[email protected]>
> Patch : http://patchwork.kernel.org/patch/19754/
> http://patchwork.kernel.org/patch/19753/
>
(updated changelog)
cpufreq fix timer teardown in ondemand governor
The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the workqueue handler to exit. The ondemand governor does not seem to be affected because the "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns immediately without rescheduling the work. The conservative governor in 2.6.30-rc has the same check as the ondemand governor, which makes things usually run smoothly. However, if the governor is quickly stopped and then started, this could lead to the following race: dbs_enable could be reenabled and multiple do_dbs_timer handlers would run. This is why a synchronized teardown is required. The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2.
Depends on patch cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
Signed-off-by: Mathieu Desnoyers <[email protected]> CC: Andrew Morton <[email protected]> CC: [email protected] CC: [email protected] CC: [email protected] CC: Ingo Molnar <[email protected]> CC: [email protected] CC: Ben Slusky <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-02-24 | [CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions | Thomas Renninger | 1 | -10/+18
Limit sampling rate to transition_latency * 100 or kernel limits. If an attempt is made to set sampling_rate too low, the lowest allowed value is set instead. Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-02-24 | [CPUFREQ] ondemand/conservative: deprecate sampling_rate{min,max} | Thomas Renninger | 1 | -0/+17
The same info can be obtained via the transition_latency sysfs file Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-02-24 | [CPUFREQ] checkpatch cleanups for ondemand governor. | Dave Jones | 1 | -16/+13
Signed-off-by: Dave Jones <[email protected]>
2009-02-05 | [CPUFREQ] Make ignore_nice_load setting of ondemand work as expected. | Venkatesh Pallipadi | 1 | -22/+25
ondemand micro-accounting of idle time changes broke ignore_nice_load sysfs setting due to a thinko in the code. The bug entry: http://bugzilla.kernel.org/show_bug.cgi?id=12310 Reported-by: Jim Bray <[email protected]> Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2009-01-06 | cpumask: convert struct cpufreq_policy to cpumask_var_t | Rusty Russell | 1 | -2/+2
Impact: use new cpumask API to reduce memory usage This is part of an effort to reduce structure sizes for machines configured with large NR_CPUS. cpumask_t gets replaced by cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or struct cpumask * (large NR_CPUS). Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Mike Travis <[email protected]> Acked-by: Dave Jones <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-10-09 | [CPUFREQ] Fix BUG: using smp_processor_id() in preemptible code | Andrea Righi | 1 | -1/+4
Use get_cpu()/put_cpu() in cpufreq_ondemand init routine, instead of smp_processor_id() to avoid the following BUG:
[ 35.313118] BUG: using smp_processor_id() in preemptible [00000000] code=: modprobe/4952
[ 35.313132] caller is cpufreq_gov_dbs_init+0xa/0x8f [cpufreq_ondemand]
[ 35.313140] Pid: 4952, comm: modprobe Not tainted 2.6.27-rc5-mm1 #23
[ 35.313145] Call Trace:
[ 35.313158] [<ffffffff80361ff7>] debug_smp_processor_id+0xd7/0xe0
[ 35.313167] [<ffffffffa010800a>] cpufreq_gov_dbs_init+0xa/0x8f [cpufreq_ondemand]
[ 35.313176] [<ffffffff8020903b>] _stext+0x3b/0x160
[ 35.313185] [<ffffffff804768c5>] __mutex_unlock_slowpath+0xe5/0x190
[ 35.313195] [<ffffffff8026236a>] trace_hardirqs_on_caller+0xca/0x140
[ 35.313205] [<ffffffff8026ef4c>] sys_init_module+0xdc/0x210
[ 35.313212] [<ffffffff8020b7cb>] system_call_fastpath+0x16/0x1b
Signed-off-by: Andrea Righi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2008-10-09 | [CPUFREQ] Don't export governors for default governor | Sven Wegener | 1 | -1/+3
We don't need to export the governors for use as the default governor, because the default governor will be built-in anyway and we can access the symbol directly. This also fixes the following sparse warnings:
drivers/cpufreq/cpufreq_conservative.c:578:25: warning: symbol 'cpufreq_gov_conservative' was not declared. Should it be static?
drivers/cpufreq/cpufreq_ondemand.c:582:25: warning: symbol 'cpufreq_gov_ondemand' was not declared. Should it be static?
drivers/cpufreq/cpufreq_performance.c:39:25: warning: symbol 'cpufreq_gov_performance' was not declared. Should it be static?
drivers/cpufreq/cpufreq_powersave.c:38:25: warning: symbol 'cpufreq_gov_powersave' was not declared. Should it be static?
drivers/cpufreq/cpufreq_userspace.c:190:25: warning: symbol 'cpufreq_gov_userspace' was not declared. Should it be static?
Signed-off-by: Sven Wegener <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2008-10-09 | [CPUFREQ][6/6] cpufreq: Add idle microaccounting in ondemand governor | [email protected] | 1 | -1/+44
Use get_cpu_idle_time_us() to get micro-accounted idle information. This enables ondemand to get more accurate idle and busy timings than the jiffy based calculation. As a result, we can decrease the ondemand safety guard band from 80-10 to 95-3. Results in more aggressive power savings. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2008-10-09 | [CPUFREQ][4/6] cpufreq_ondemand: Parameterize down differential | [email protected] | 1 | -2/+9
Use a parameter for down differential, instead of hardcoded 10%. Follow-on patch changes the down-differential dynamically, based on whether we are using idle micro-accounting or not. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
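The role of a parameterized down differential can be illustrated with a three-way decision on the measured load. This is a simplified sketch, not the governor's code: the enum, the function name, and the assumption that the band is applied directly to a load percentage are all illustrative. The value pairs in the usage note (80 with 10, 95 with 3) are the guard bands mentioned in the follow-on idle micro-accounting patch.

```c
#include <assert.h>

enum dbs_action { DBS_KEEP, DBS_RAISE, DBS_LOWER };

/*
 * Hypothetical helper: decide what to do with the frequency given the
 * load in percent. Loads above up_threshold raise the frequency, loads
 * below (up_threshold - down_differential) allow lowering it, and
 * anything in between leaves it alone to avoid oscillation.
 */
enum dbs_action dbs_decide(unsigned int load_pct,
                           unsigned int up_threshold,
                           unsigned int down_differential)
{
    assert(up_threshold > down_differential);

    if (load_pct > up_threshold)
        return DBS_RAISE;
    if (load_pct < up_threshold - down_differential)
        return DBS_LOWER;
    return DBS_KEEP;
}
```

With the 80/10 band, a load of 75% leaves the frequency alone and 60% lowers it. With the tighter 95/3 band, 93% is kept and 91% already permits a lower frequency.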
2008-10-09 | [CPUFREQ][3/6] cpufreq: get_cpu_idle_time() changes in ondemand for idle-microaccounting | [email protected] | 1 | -14/+15
Preparatory changes for doing idle micro-accounting in ondemand governor. get_cpu_idle_time() gets extra parameter and returns idle time and also the wall time that corresponds to the idle time measurement. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2008-10-09 | [CPUFREQ][2/6] cpufreq: Change load calculation in ondemand for software coordination | [email protected] | 1 | -30/+35
Change the load calculation algorithm in ondemand to work well with software coordination of frequency across the dependent cpus. Multiply individual CPU utilization with the average freq of that logical CPU during the measurement interval (using getavg call). And find the max CPU utilization number in terms of CPU freq. That number is then used to get to the target freq for next sampling interval. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
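The calculation described above (utilization times average frequency per CPU, then the maximum across dependent CPUs) can be sketched as follows. The struct and function names are hypothetical, and the per-CPU average frequency is passed in as plain data rather than obtained through a getavg call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU sample for one measurement interval. */
struct cpu_sample {
    unsigned int load_pct;     /* utilization in percent */
    unsigned int avg_freq_khz; /* average frequency over the interval */
};

/*
 * Hypothetical helper: largest (load * average frequency) product among
 * the CPUs that share a frequency policy. The result expresses demand
 * in units of percent * kHz.
 */
unsigned long long max_load_freq(const struct cpu_sample *samples, size_t n)
{
    assert(samples != NULL || n == 0);

    unsigned long long max = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned long long load_freq =
            (unsigned long long)samples[i].load_pct * samples[i].avg_freq_khz;
        if (load_freq > max)
            max = load_freq;
    }
    return max;
}
```

A CPU at 50% load that averaged 2 GHz represents more demand than one at 90% load that averaged 1 GHz, which is why the comparison is made on the product rather than on the raw load.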
2008-10-09 | [CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg() | [email protected] | 1 | -1/+1
Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software cpufreq coordination where policy->cpu may not be same as the CPU on which we want to getavg frequency. A follow-on patch will use this parameter to getavg freq from all cpus in policy->cpus. Change since last patch. Fix the offline/online and suspend/resume oops reported by Youquan Song <[email protected]> Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2008-10-09 | [CPUFREQ] add error handling for cpufreq_register_governor() error | Akinobu Mita | 1 | -1/+7
Add error handling for cpufreq_register_governor() error Signed-off-by: Akinobu Mita <[email protected]> Cc: [email protected] Signed-off-by: Dave Jones <[email protected]>
2008-05-23 | cpufreq: use performance variant for_each_cpu_mask_nr | Mike Travis | 1 | -2/+2
Change references from for_each_cpu_mask to for_each_cpu_mask_nr where appropriate Reviewed-by: Paul Jackson <[email protected]> Reviewed-by: Christoph Lameter <[email protected]> Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-01-17 | cpufreq: Initialise default governor before use | Johannes Weiner | 1 | -1/+4
When the cpufreq driver starts up at boot time, it calls into the default governor which might not be initialised yet. This hurts when the governor's worker function relies on memory that is not yet set up by its init function. This migrates all governors from module_init() to fs_initcall() when being the default, as was already done in cpufreq_performance when it was the only possible choice. The performance governor is always initialized early because it might be used as fallback even when not being the default. Fixes at least one actual oops where ondemand is the default governor and cpufreq_governor_dbs() uses the uninitialised kondemand_wq work-queue during boot-time. Signed-off-by: Johannes Weiner <[email protected]> Cc: Dave Jones <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Venkatesh Pallipadi <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-04 | [CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default | Thomas Renninger | 1 | -13/+9
Depending on the transition latency of the HW for cpufreq switches, the ondemand or conservative governor cannot be used with certain cpufreq drivers. Still the ondemand should be the default governor on a wide range of systems. This patch allows this and lets the governor fallback to the performance governor at cpufreq driver load time, if the driver does not support fast enough frequency switching. Main benefit is that on e.g. installation or other systems without userspace support a working dynamic cpufreq support can be achieved on most systems by simply loading the cpufreq driver. This is especially essential for recent x86(_64) laptop hardware which may rely on working dynamic cpufreq OS support. Signed-off-by: Thomas Renninger <[email protected]> Signed-off-by: Venkatesh Pallipadi <[email protected]> Cc: Russell King <[email protected]> Cc: Bryan Wu <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mundt <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2007-06-21 | [CPUFREQ] ondemand: fix tickless accounting and software coordination bug | Venki Pallipadi | 1 | -7/+18
With a tickless kernel and software coordination of P-states, ondemand can look at wrong idle statistics. This can happen when ondemand sampling is happening on CPU 0 and, due to software coordination, sampling also looks at the utilization of CPU 1. If CPU 1 is in tickless state at that moment, its idle statistics will not be up to date, and CPU 0 thinks CPU 1 has been idle for less time than it actually has. This can be resolved by looking at all the busy times of the CPUs, which are accurate even with tickless, and using them to determine idle time in a roundabout way (total time - busy time). Thanks to Arjan for originally reporting the ondemand bug on Lenovo T61. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2007-06-21 | [CPUFREQ] ondemand: add a check to avoid negative load calculation | Venki Pallipadi | 1 | -2/+3
Due to rounding and inexact jiffy accounting, idle_ticks can sometimes be higher than total_ticks. Make sure those cases are handled as zero load case. Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
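The guard described above can be sketched in a few lines. With unsigned arithmetic, an idle count larger than the total would wrap around to a huge "busy" value, so the case has to be caught before subtracting. The function name and parameters are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: load in percent from tick counts. When rounding
 * makes idle_ticks reach or exceed total_ticks, the interval is treated
 * as zero load instead of letting the subtraction underflow.
 */
unsigned int load_from_ticks(unsigned long total_ticks,
                             unsigned long idle_ticks)
{
    if (total_ticks == 0 || idle_ticks >= total_ticks)
        return 0;

    unsigned int load =
        (unsigned int)(100 * (total_ticks - idle_ticks) / total_ticks);
    assert(load <= 100); /* holds because idle_ticks < total_ticks */
    return load;
}
```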
2007-05-08 | Add a new deferrable delayed work init | Venki Pallipadi | 1 | -1/+1
Add a new deferrable delayed work init. This can be used to schedule work items that are 'unimportant' when the CPU is idle and can be called later, when the CPU eventually comes out of idle. Use this init in cpufreq ondemand governor. Signed-off-by: Venkatesh Pallipadi <[email protected]> Cc: Dave Jones <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-20 | [CPUFREQ] cpufreq_ondemand.c: don't use _WORK_NAR | Oleg Nesterov | 1 | -4/+1
Looks like dbs_timer() is very careful wrt per_cpu(cpu_dbs_info), and it doesn't need the help of WORK_STRUCT_NOAUTOREL. Signed-off-by: Oleg Nesterov <[email protected]> Acked-By: David Howells <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2007-02-10 | [CPUFREQ] Whitespace fixup | Dave Jones | 1 | -1/+1
Signed-off-by: Dave Jones <[email protected]>
2007-02-10 | [CPUFREQ] ondemand governor use new cpufreq rwsem locking in work callback | Venkatesh Pallipadi | 1 | -18/+16
Eliminate flush_workqueue in cpufreq_governor(STOP) callpath. Using flush there has a deadlock potential as in http://uwsg.iu.edu/hypermail/linux/kernel/0611.3/1223.html Also, cleanup the locking issues with do_dbs_timer delayed_work callback. As it changes the CPU frequency using __cpufreq_target, it needs to have policy_rwsem in write mode, which also protects it from hot plug. Signed-off-by: Venkatesh Pallipadi <[email protected]> Cc: Gautham R Shenoy <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2007-02-10 | [CPUFREQ] ondemand governor restructure the work callback | Venkatesh Pallipadi | 1 | -12/+16
Restructure the delayed_work callback in ondemand. This eliminates the need for smp_processor_id in the callback function and also helps in proper locking and avoiding flush_workqueue when stopping the governor (done in subsequent patch). Signed-off-by: Venkatesh Pallipadi <[email protected]> Cc: Gautham R Shenoy <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2007-02-10 | [CPUFREQ] Remove hotplug cpu crap | Dave Jones | 1 | -2/+0
The hotplug CPU locking in cpufreq is horrendous. No-one seems to care enough to fix it, so just remove it so that the 99.9% of the real world users of this code can use cpufreq without being bothered by warnings. Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2006-12-12 | Merge ../linus | Dave Jones | 1 | -11/+17
Conflicts: drivers/cpufreq/cpufreq.c
2006-11-22 | WorkStruct: make allyesconfig | David Howells | 1 | -11/+17
Fix up for make allyesconfig. Signed-Off-By: David Howells <[email protected]>
2006-11-06 | [CPUFREQ] Fix coding style issues in cpufreq. | Gautham R Shenoy | 1 | -4/+8
Clean up cpufreq subsystem to fix coding style issues and to improve the readability. Signed-off-by: Gautham R Shenoy <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2006-10-21 | [CPUFREQ] handle sysfs errors | Jeff Garzik | 1 | -1/+11
Signed-off-by: Jeff Garzik <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Dave Jones <[email protected]>
2006-10-15 | [CPUFREQ][8/8] acpi-cpufreq: Add support for freq feedback from hardware | Venkatesh Pallipadi | 1 | -1/+8
Enable ondemand governor and acpi-cpufreq to use IA32_APERF and IA32_MPERF MSR to get active frequency feedback for the last sampling interval. This will make ondemand take the right frequency decisions when hardware coordination of frequency is going on. Without APERF/MPERF, ondemand can take wrong decisions at times due to underlying hardware coordination or TM2.
Example:
  * CPU 0 and CPU 1 are hardware coordinated.
  * CPU 1 is running at the highest frequency.
  * CPU 0 was running at the highest frequency. Now ondemand reduces it to some intermediate frequency based on utilization.
  * Due to underlying hardware coordination with the other CPU 1, CPU 0 continues to run at the highest frequency (as long as the other CPU is at the highest).
  * When ondemand samples CPU 0 again next time, without actual frequency feedback from APERF/MPERF, it will think that the previous frequency change was successful and can go to a wrong target frequency. This is because it thinks that the utilization it has got for this sampling interval was measured when running at the intermediate frequency, rather than at the actual highest frequency.
More information about IA32_APERF IA32_MPERF MSR: Refer to IA-32 Intel® Architecture Software Developer's Manual at http://developer.intel.com
Signed-off-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Dave Jones <[email protected]>
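The frequency feedback described above can be sketched as a ratio of counter deltas. The sketch assumes that APERF advances at the frequency the core actually ran at while MPERF advances at a fixed reference frequency, so the ratio of their deltas over a sampling interval scales the reference frequency to the effective one. The function name and the idea of passing the deltas in directly (rather than reading MSRs) are illustrative assumptions.

```c
#include <assert.h>

/*
 * Hypothetical helper: effective frequency (kHz) over one sampling
 * interval, given the reference frequency and the APERF/MPERF deltas
 * observed over that interval.
 */
unsigned int effective_freq_khz(unsigned int ref_freq_khz,
                                unsigned long long aperf_delta,
                                unsigned long long mperf_delta)
{
    assert(ref_freq_khz > 0);

    /* No MPERF progress means no usable feedback; fall back. */
    if (mperf_delta == 0)
        return ref_freq_khz;

    /* 64-bit intermediate; fine for kHz values and per-interval deltas. */
    return (unsigned int)((unsigned long long)ref_freq_khz * aperf_delta
                          / mperf_delta);
}
```

In the example from the commit message, CPU 0 would report a ratio near 1 even after ondemand requested an intermediate frequency, so the governor can tell that the measured utilization was obtained at the highest frequency.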