History log of /drivers/cpufreq/cpufreq_conservative.c
Revision Date Author Comments
648616343cdbe904c585a6c12e323d3b3c72e46f 15-Dec-2011 Martin Schwidefsky <schwidefsky@de.ibm.com> [S390] cputime: add sparse checking and cleanup

Make cputime_t and cputime64_t nocast to enable sparse checking to
detect incorrect use of cputime. Drop the cputime macros for simple
scalar operations. The conversion macros are still needed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
3292beb340c76884427faa1f5d6085719477d889 28-Nov-2011 Glauber Costa <glommer@parallels.com> sched/accounting: Change cpustat fields to an array

This patch changes fields in cpustat from a structure, to an
u64 array. Math gets easier, and the code is more flexible.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Tuner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
6beea0cda8ce71c01354e688e5735c47e331e84f 24-Aug-2011 Michal Hocko <mhocko@suse.cz> nohz: Fix update_ts_time_stat idle accounting

update_ts_time_stat currently updates idle time even if we are in
iowait loop at the moment. The only real users of the idle counter
(via get_cpu_idle_time_us) are CPU governors and they expect to get
cumulative time for both idle and iowait times.
The value (idle_sleeptime) is also printed to userspace by print_cpu
but it prints both idle and iowait times so the idle part is misleading.

Let's clean this up and fix update_ts_time_stat to account both counters
properly and update consumers of idle to consider iowait time as well.
If we do this we might use get_cpu_{idle,iowait}_time_us from other
contexts as well and we will get expected values.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
326c86deaed54ad1b364fcafe5073f563671eb58 03-Mar-2011 Thomas Renninger <trenn@suse.de> [CPUFREQ] Remove unneeded locks

There cannot be any concurrent access to these through
different cpu sysfs files anymore, because these tunables
are now all global (not per cpu).

I still have some doubts whether some of these locks
were needed at all. Anyway, let's get rid of them.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org
e8951251b89440644a39f2512b4f265973926b41 03-Mar-2011 Thomas Renninger <trenn@suse.de> [CPUFREQ] Remove old, deprecated per cpu ondemand/conservative sysfs files

Marked deprecated for quite a whilte now...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org
ef598549b28014ec2ea7574d4e793728e0e33d02 03-Mar-2011 Thomas Renninger <trenn@suse.de> [CPUFREQ] Remove deprecated sysfs file sampling_rate_max

Marked deprecated for quite a while now...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org
2feb690c20d52e22c7874a1e090245e6a4344ce6 15-Nov-2010 Joe Perches <joe@perches.com> [CPUFREQ] drivers/cpufreq: Remove unnecessary semicolons

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Jones <davej@redhat.com>
57df5573a56322e6895451f759c19e875252817d 26-Jan-2011 Tejun Heo <tj@kernel.org> cpufreq: use system_wq instead of dedicated workqueues

With cmwq, there's no reason for cpufreq drivers to use separate
workqueues. Remove the dedicated workqueues from cpufreq_conservative
and cpufreq_ondemand and use system_wq instead. The work items are
already sync canceled on stop, so it's already guaranteed that no work
is running on module exit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
6dad2a29646ce3792c40cfc52d77e9b65a7bb143 31-Mar-2010 Borislav Petkov <borislav.petkov@amd.com> cpufreq: Unify sysfs attribute definition macros

Multiple modules used to define those which are with identical
functionality and were needlessly replicated among the different cpufreq
drivers. Push them into the header and remove duplication.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1270065406-1814-7-git-send-email-bp@amd64.org>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
fd187aaf980c45f1d16a94a846faa68e24de03c8 26-Mar-2010 Dominik Brodowski <linux@dominikbrodowski.net> [CPUFREQ] use max load in conservative governor

Instead of using the load of the last CPU in a package, use the
maximum load of all CPUs in a package.

Reported-by: Jean-Christian Goussard <jeanchristian.goussard@sfr.fr>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
49b015ce38edeb484fb2efa09048c23e903f49d6 01-Oct-2009 Thomas Renninger <trenn@suse.de> [CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings

Same adustments that have been added to the ondemand recently.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
54c9a35d9faef06e00e2a941eb8fe674f1886901 12-Nov-2009 Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com> [CPUFREQ] Resolve time unit thinko in ondemand/conservative govs

ondemand and conservative governors are messing up time units in the
code path where NO_HZ is not enabled and ignore_nice is set. The walltime
idletime stored is in jiffies and nice time calculation is happening in
microseconds.

The problem was reported and diagnosed by Alexander here.
http://marc.info/?l=linux-kernel&m=125752550404513&w=2

The patch below fixes this thinko.

Reported-by: Alexander Miller <Miller@fmi.uni-stuttgart.de>
Tested-by: Alexander Miller <Miller@fmi.uni-stuttgart.de>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
26d204afa18f7df177f21bdb3759e0098ca8f7d5 29-Jul-2009 Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com> [CPUFREQ] Fix NULL pointer dereference regression in conservative governor

Commit ee88415caf736b89500f16e0a545614541a45005
introduced this regression when it removed enable bit in cpu_dbs_info_s.
That added a possibility of dbs_cpufreq_notifier getting called for a
CPU that is not yet managed by conservative governor. That will happen
as the transition notifier is set as soon as one CPU switches to
conservative governor and other CPUs can get a NULL pointer dereference
without the enable bit check. Add the enable bit back again.

Reported-by: Lermytte Christophe <Christophe.Lermytte@thomson.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
ee88415caf736b89500f16e0a545614541a45005 03-Jul-2009 venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> [CPUFREQ] Cleanup locking in conservative governor

Redesign the locking inside conservative driver. Make dbs_mutex handle all the
global state changes inside the driver and invent a new percpu mutex
to serialize percpu timer and frequency limit change.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
7d26e2d5e2da37e92c6c7644b26b294dedd8c982 03-Jul-2009 venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com> [CPUFREQ] Eliminate the recent lockdep warnings in cpufreq

Commit b14893a62c73af0eca414cfed505b8c09efc613c although it was very
much needed to properly cleanup ondemand timer, opened-up a can of worms
related to locking dependencies in cpufreq.

Patch here defines the need for dbs_mutex and cleans up its usage in
ondemand governor. This also resolves the lockdep warnings reported here

http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html
http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html

and few others..

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
245b2e70eabd797932adb263a65da0bab3711753 24-Jun-2009 Tejun Heo <tj@kernel.org> percpu: clean up percpu variable definitions

Percpu variable definition is about to be updated such that all percpu
symbols including the static ones must be unique. Update percpu
variable definitions accordingly.

* as,cfq: rename ioc_count uniquely

* cpufreq: rename cpu_dbs_info uniquely

* xen: move nesting_count out of xen_evtchn_do_upcall() and rename it

* mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
rename it

* ipv4,6: rename cookie_scratch uniquely

* x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
pmc_irq_entry and nmi_entry to pmc_nmi_entry

* perf_counter: rename disable_count to perf_disable_count

* ftrace: rename test_event_disable to ftrace_test_event_disable

* kmemleak: rename test_pointer to kmemleak_test_pointer

* mce: rename next_interval to mce_next_interval

[ Impact: percpu usage cleanups, no duplicate static percpu var names ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: linux-mm <linux-mm@kvack.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
4f4d1ad6ee69027f51f9d137f7e7d3c863cbc53d 22-Apr-2009 Thomas Renninger <trenn@suse.de> [CPUFREQ] Only set sampling_rate_max deprecated, sampling_rate_min is useful

Update the documentation accordingly.
Cleanup and use printk_once.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
cef9615a853ebc4972084f7e70b52892557420ac 22-Apr-2009 Thomas Renninger <trenn@suse.de> [CPUFREQ] ondemand: Uncouple minimal sampling rate from HZ in NO_HZ case

With this patch you have following minimal sampling rate restrictions:

Kernel restrictions:
If CONFIG_NO_HZ is set, the limit is 10ms fixed.
If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the
limits depend on the CONFIG_HZ option:
HZ=1000: min=20000us (20ms)
HZ=250: min=80000us (80ms)
HZ=100: min=200000us (200ms)

HW restrictions:
Do not sample/poll more often than HW latency * 100 exported by the low
level cpufreq HW driver

The higher value of above restrictions is the minimal sampling rate
that can be set (and can be seen via ondemand/sampling_rate_min sysfs file)

Default sampling rate still is HW latency * 1000, but this will now end
up in lower values on latest (Intel and AMD) hardware as these can switch
really fast and sampling rate mostly was limited to the 80ms or 200ms
(depending on whether HZ=250 or HZ=1000 is used).

Signed-off-by: Thomas Renninger <trenn@suse.de>
Cc: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
b253d2b2d28ead6fed012feb54694b3d0562839a 17-May-2009 Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> [CPUFREQ] fix timer teardown in conservative governor

* Rafael J. Wysocki (rjw@sisk.pl) wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.28 and 2.6.29.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.28 and 2.6.29. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
> Subject : cpufreq timer teardown problem
> Submitter : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Date : 2009-04-23 14:00 (24 days old)
> References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
> Handled-By : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Patch : http://patchwork.kernel.org/patch/19754/
> http://patchwork.kernel.org/patch/19753/
>

(re-send with updated changelog)

cpufreq fix timer teardown in conservative governor

The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
workqueue handler to exit.

The ondemand governor does not seem to be affected because the
"if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
immediately without rescheduling the work. The conservative governor in
2.6.30-rc has the same check as the ondemand governor, which makes things
usually run smoothly. However, if the governor is quickly stopped and then
started, this could lead to the following race :

dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
This is why a synchronized teardown is required.

Depends on patch
cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call

The following patch applies to 2.6.30-rc2. Stable kernels have a similar
issue which should also be fixed, but the code changed between 2.6.29
and 2.6.30, so this patch only applies to 2.6.30-rc.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: gregkh@suse.de
CC: stable@kernel.org
CC: cpufreq@vger.kernel.org
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
CC: Ben Slusky <sluskyb@paranoiacs.org>
Signed-off-by: Dave Jones <davej@redhat.com>
a75603a084f1dd9a9558c8af95fc2acfc54f1021 13-Feb-2009 Alexander Clouter <alex@digriz.org.uk> [CPUFREQ] conservative: remove 10x from def_sampling_rate

AMD users get particular hit by this issue (bug 8081) as it caps at
typically 90 seconds as the minimum period for a frequency change.
Harsh eh? Years ago I borked this buy puting the 10x in the wrong
place...I fix that by removing it altogether.

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Dave Jones <davej@redhat.com>
8e677ce83bf41ba9c74e5b6d9ee60b07d4e5ed93 13-Feb-2009 Alexander Clouter <alex@digriz.org.uk> [CPUFREQ] conservative: fixup governor to function more like ondemand logic

As conservative is based off ondemand the codebases occasionally need to be
resync'd. This patch, although ugly, does this.

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Dave Jones <davej@redhat.com>
f407a08bb7eff5ddbe0d9173d8717794a910771f 13-Feb-2009 Alexander Clouter <alex@digriz.org.uk> [CPUFREQ] conservative: fix dbs_cpufreq_notifier so freq is not locked

When someone added the dbs_cpufreq_notifier section to the governor the
code ended up causing the frequency to only fall. This is because
requested_freq is tinkered with and that should only modified if it has
an invlaid value due to changes in the available frequency ranges

This should fix #10055.

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Dave Jones <davej@redhat.com>
11a80a9c7668c40c40a03ae15bd2c6b215058b2e 13-Feb-2009 Alexander Clouter <alex@digriz.org.uk> [CPUFREQ] conservative: amend author's email address

Amend author's email address.

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Dave Jones <davej@redhat.com>
112124ab0a9f507a0d7fdbb1e1ed2b9a24f8c4ea 04-Feb-2009 Thomas Renninger <trenn@suse.de> [CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions

Limit sampling rate to transition_latency * 100 or kernel limits.
If sampling_rate is tried to be set too low, set the lowest allowed value.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
9411b4ef7fcb534fe1582fe02738254e398dd931 04-Feb-2009 Thomas Renninger <trenn@suse.de> [CPUFREQ] ondemand/conservative: deprecate sampling_rate{min,max}

The same info can be obtained via the transition_latency sysfs file

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
9acef4875695a7717734f2578666a64822ea6495 18-Jan-2009 Dave Jones <davej@redhat.com> [CPUFREQ] checkpatch cleanups for conservative governor

Signed-off-by: Dave Jones <davej@redhat.com>
835481d9bcd65720b473db6b38746a74a3964218 04-Jan-2009 Rusty Russell <rusty@rustcorp.com.au> cpumask: convert struct cpufreq_policy to cpumask_var_t

Impact: use new cpumask API to reduce memory usage

This is part of an effort to reduce structure sizes for machines
configured with large NR_CPUS. cpumask_t gets replaced by
cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or
struct cpumask * (large NR_CPUS).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
c4d14bc0bb5d13e316890651ae4518b764c3344c 20-Sep-2008 Sven Wegener <sven.wegener@stealer.net> [CPUFREQ] Don't export governors for default governor

We don't need to export the governors for use as the default governor,
because the default governor will be built-in anyway and we can access
the symbol directly.

This also fixes the following sparse warnings:

drivers/cpufreq/cpufreq_conservative.c:578:25: warning: symbol 'cpufreq_gov_conservative' was not declared. Should it be static?
drivers/cpufreq/cpufreq_ondemand.c:582:25: warning: symbol 'cpufreq_gov_ondemand' was not declared. Should it be static?
drivers/cpufreq/cpufreq_performance.c:39:25: warning: symbol 'cpufreq_gov_performance' was not declared. Should it be static?
drivers/cpufreq/cpufreq_powersave.c:38:25: warning: symbol 'cpufreq_gov_powersave' was not declared. Should it be static?
drivers/cpufreq/cpufreq_userspace.c:190:25: warning: symbol 'cpufreq_gov_userspace' was not declared. Should it be static?

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Dave Jones <davej@redhat.com>
8217e4f4c93e5fb59bb3cd1e6135213889349f86 07-Jul-2008 Ben Slusky <sluskyb@paranoiacs.org> [CPUFREQ] use deferrable delayed work init in conservative governor

Venki Pallipadi made a similar change to the ondemand governor a while
back (in commit 28287033e12463c8ff89f1ea8038783d0360391c). It seems to
work just as well in the conservative governor, leading to fewer wakeups
as reported by powertop.

Signed-off-by: Ben Slusky <sluskyb@paranoiacs.org>
Signed-off-by: Dave Jones <davej@redhat.com>
f068c04ba6f308774fdd2ed5e113da7cf4ff2f2b 30-Jul-2008 Dave Jones <davej@redhat.com> [CPUFREQ] Fix -Wshadow warning in conservative governor.

drivers/cpufreq/cpufreq_conservative.c:336:15: warning: symbol 'freq_step' shadows an earlier one

Just rename the local variable.

Signed-off-by: Dave Jones <davej@redhat.com>
068b12772a64c2440ef2f64ac5d780688c06576f 12-May-2008 Mike Travis <travis@sgi.com> cpufreq: use performance variant for_each_cpu_mask_nr

Change references from for_each_cpu_mask to for_each_cpu_mask_nr
where appropriate

Reviewed-by: Paul Jackson <pj@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
6915719b36a97d28fab576c6fa2a20364b435fe6 18-Jan-2008 Johannes Weiner <hannes@saeurebad.de> cpufreq: Initialise default governor before use

When the cpufreq driver starts up at boot time, it calls into the default
governor which might not be initialised yet. This hurts when the
governor's worker function relies on memory that is not yet set up by its
init function.

This migrates all governors from module_init() to fs_initcall() when being
the default, as was already done in cpufreq_performance when it was the
only possible choice. The performance governor is always initialized early
because it might be used as fallback even when not being the default.

Fixes at least one actual oops where ondemand is the default governor and
cpufreq_governor_dbs() uses the uninitialised kondemand_wq work-queue
during boot-time.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
18a7247d1bb2e2dcbab628d7e786d03df5bf1eed 22-Oct-2007 Dave Jones <davej@redhat.com> [CPUFREQ] Fix up whitespace in conservative governor.

Signed-off-by: Dave Jones <davej@redhat.com>
a8d7c3bc2396aff14f9e920677072cb55b016040 22-Oct-2007 Elias Oltmanns <eo@nebensachen.de> [CPUFREQ] Make cpufreq_conservative handle out-of-sync events properly

Make cpufreq_conservative handle out-of-sync events properly

Currently, the cpufreq_conservative governor doesn't get notified when the
actual frequency the cpu is running at differs from what cpufreq thought it
was. As a result the cpu may stay at the maximum frequency after a s2ram /
resume cycle even though the system is idle.

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Dave Jones <davej@redhat.com>
1c2562459faedc35927546cfa5273ec6c2884cce 02-Oct-2007 Thomas Renninger <trenn@suse.de> [CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default

Depending on the transition latency of the HW for cpufreq switches, the
ondemand or conservative governor cannot be used with certain cpufreq
drivers. Still the ondemand should be the default governor on a wide range
of systems. This patch allows this and lets the governor fallback to the
performance governor at cpufreq driver load time, if the driver does not
support fast enough frequency switching.

Main benefit is that on e.g. installation or other systems without
userspace support a working dynamic cpufreq support can be achieved on most
systems by simply loading the cpufreq driver. This is especially essential
for recent x86(_64) laptop hardware which may rely on working dynamic
cpufreq OS support.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <bryan.wu@analog.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
cd354f1ae75e6466a7e31b727faede57a1f89ca5 14-Feb-2007 Tim Schmielau <tim@physik3.uni-rostock.de> [PATCH] remove many unneeded #includes of sched.h

After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there. Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.

To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.

Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm. I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
c120069779e3e35917c15393cf2847fa79811eb6 06-Feb-2007 Dave Jones <davej@redhat.com> [CPUFREQ] Remove hotplug cpu crap

The hotplug CPU locking in cpufreq is horrendous. No-one seems to care
enough to fix it, so just remove it so that the 99.9% of the real world
users of this code can use cpufreq without being bothered by warnings.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
c4028958b6ecad064b1a6303a6a5906d4fe48d73 22-Nov-2006 David Howells <dhowells@redhat.com> WorkStruct: make allyesconfig

Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
e08f5f5bb5dfaaa28d69ffe37eb774533297657f 26-Oct-2006 Gautham R Shenoy <ego@in.ibm.com> [CPUFREQ] Fix coding style issues in cpufreq.

Clean up cpufreq subsystem to fix coding style issues and to improve
the readability.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Dave Jones <davej@redhat.com>
914f7c31b0bea0ccf3bf474d0b99d803f7985097 20-Oct-2006 Jeff Garzik <jeff@garzik.org> [CPUFREQ] handle sysfs errors

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
153d7f3fcae7ed4e19328549aa9467acdfbced10 26-Jul-2006 Arjan van de Ven <arjan@linux.intel.com> [PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizare

The patch below moves the cpu hotplugging higher up in the cpufreq
layering; this is needed to avoid recursive taking of the cpu hotplug
lock and to otherwise detangle the mess.

The new rules are:
1. you must do lock_cpu_hotplug() around the following functions:
__cpufreq_driver_target
__cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only)
__cpufreq_set_policy
2. governer methods (.governer) must NOT take the lock_cpu_hotplug()
lock in any way; they are called with the lock taken already
3. if your governer spawns a thread that does things, like calling
__cpufreq_driver_target, your thread must honor rule #1.
4. the policy lock and other cpufreq internal locks nest within
the lock_cpu_hotplug() lock.

I'm not entirely happy about how the __cpufreq_governor rule ended up
(conditional locking rule depending on the argument) but basically all
callers pass this as a constant so it's not too horrible.

The patch also removes the cpufreq_governor() function since during the
locking audit it turned out to be entirely unused (so no need to fix it)

The patch works on my testbox, but it could use more testing
(otoh... it can't be much worse than the current code)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
138a0128c01899ee8492858a50856b90d0d9d815 23-Jun-2006 Andrew Morton <akpm@osdl.org> [PATCH] cpufreq build fix

drivers/cpufreq/cpufreq_ondemand.c: In function 'do_dbs_timer':
drivers/cpufreq/cpufreq_ondemand.c:374: warning: implicit declaration of function 'lock_cpu_hotplug'
drivers/cpufreq/cpufreq_ondemand.c:381: warning: implicit declaration of function 'unlock_cpu_hotplug'
drivers/cpufreq/cpufreq_conservative.c: In function 'do_dbs_timer':
drivers/cpufreq/cpufreq_conservative.c:425: warning: implicit declaration of function 'lock_cpu_hotplug'
drivers/cpufreq/cpufreq_conservative.c:432: warning: implicit declaration of function 'unlock_cpu_hotplug'

Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4ec223d02f4d5f5a3129edc0e3d22550d6ac8a32 22-Jun-2006 Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> [CPUFREQ] Fix ondemand vs suspend deadlock

Rootcaused the bug to a deadlock in cpufreq and ondemand. Due to non-existent
ordering between cpu_hotplug lock and dbs_mutex. Basically a race condition
between cpu_down() and do_dbs_timer().

cpu_down() flow:
* cpu_down() call for CPU 1
* Takes hot plug lock
* Calls pre down notifier
* cpufreq notifier handler calls cpufreq_driver_target() which takes
cpu_hotplug lock again. OK as cpu_hotplug lock is recursive in same
process context
* CPU 1 goes down
* Calls post down notifier
* cpufreq notifier handler calls ondemand event stop which takes dbs_mutex

So, cpu_hotplug lock is taken before dbs_mutex in this flow.

do_dbs_timer is triggerred by a periodic timer event.
It first takes dbs_mutex and then takes cpu_hotplug lock in
cpufreq_driver_target().
Note the reverse order here compared to above. So, if this timer event happens
at right moment during cpu_down, system will deadlok.

Attached patch fixes the issue for both ondemand and conservative.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
b82fbe6c4232365272bde6f2c3f8fd9dd4dcd73a 02-Apr-2006 Dave Jones <davej@redhat.com> [CPUFREQ] Remove pointless check in conservative governor.

< 0 checks on unsigned variables are pointless.

Signed-off-by: Dave Jones <davej@redhat.com>
c326e27eb79e98050d855e371ac534ff4352e910 27-Mar-2006 Mattia Dongili <malattia@linux.it> [CPUFREQ] cpufreq_conservative: keep ignore_nice_load and freq_step values when reselected

Keep the value of ignore_nice_load and freq_step of the conservative
governor after the governor is deselected and reselected.

Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Dave Jones <davej@redhat.com>
a159b82770ab84e1b5e0306fa65e158188492b16 22-Mar-2006 Alexander Clouter <alex@digriz.org.uk> [PATCH] cpufreq_conservative: alternative initialise approach

Venki, author of cpufreq_ondemand, came up with a neater way to remove the
initialiser code from the main loop of my code and out to the point when the
governor is actually initialised.

Not only does it look but it also feels cleaner, plus its simpler to
understand. It also saves a bunch of pointless conditional statements in the
main loop.

Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
08a28e2e98aa821cf6f15f8a267beb2f33377bb9 22-Mar-2006 Alexander Clouter <alex@digriz.org.uk> [PATCH] cpufreq_conservative: make for_each_cpu() safe

All these changes should make cpufreq_conservative safe in regards to the x86
for_each_cpu cpumask.h changes and whatnot.

Whilst making it safe a number of pointless for loops related to the cpu
mask's were removed. I was never comfortable with all those for loops,
especially as the iteration is over the same data again and again for each
CPU you had in a single poll, an O(n^2) outcome to frequency scaling.

The approach I use is to assume by default no CPU's exist and it sets the
requested_freq to zero as a kind of flag, the reasoning is in the source ;)
If the CPU is queried and requested_freq is zero then it initialises the
variable to current_freq and then continues as if nothing happened which
should be the same net effect as before?

Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
e8a02572252f9115c2b8296c40fd8b985f06f872 22-Mar-2006 Alexander Clouter <alex@digriz.org.uk> [PATCH] cpufreq_conservative: alter default responsiveness

The sensible approach to making conservative less responsive than ondemand :)
As mentioned in patch [1/4]. We do not want conservative to shoot through
all the frequencies, its point (by default) is to slowly move through them.

By default its ten times less responsive.

Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2c906b317b2d9c7e32b0d513e102bd68a2c49112 22-Mar-2006 Alexander Clouter <alex@digriz.org.uk> [PATCH] cpufreq_conservative: aligning of codebase with ondemand

Since the conservative govenor was released its codebase has drifted from the
the direction and updates that have been applied to the ondemand govornor.

This patch addresses the lack of updates in that period and brings
conservative back up to date. The resulting diff file between
cpufreq_ondemand.c and cpufreq_conservative.c is now much smaller and shows
more clearly the differences between the two.

Another reason to do this is ages ago, knowingly, I did a piss poor attempt
at making conservative less responsive by knocking up
DEF_SAMPLING_RATE_LATENCY_MULTIPLIER by two orders of magnitude. I did fix
this ages ago but in my dis-organisation I must have toasted the diff and
left it the way it was. About two weeks ago a user contacted me saying he
was having problems with the conservative governor with his AMD Athlon XP-M
2800+ as /sys/devices/system/cpu/cpu0/cpufreq/conservative showed
sampling_rate_min 9950000
sampling_rate_max 1360065408

Nine seconds to decide about changing the frequency....not too responsive :)

Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
3fc54d37ab64733448faf0185e19a80f070eb9e3 14-Jan-2006 akpm@osdl.org <akpm@osdl.org> [CPUFREQ] Convert drivers/cpufreq semaphores to mutexes.

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
001893cda2f280ab882164737a0b608208524809 01-Dec-2005 Alexander Clouter <alex@digriz.org.uk> [PATCH] cpufreq_conservative/ondemand: invert meaning of 'ignore nice'

The use of the 'ignore_nice' sysfs file is confusing to anyone using it.
This removes the sysfs file 'ignore_nice' and in its place creates a
'ignore_nice_load' entry that defaults to '0'; meaning nice'd processes
_are_ counted towards the 'business' calculation.

WARNING: this obvious breaks any userland tools that expected ignore_nice'
to exist, to draw attention to this fact it was concluded on the mailing
list that the entry should be removed altogether so the userland app breaks
and so the author can build simple to detect workaround. Having said that
it seems currently very few tools even make use of this functionality; all
I could find was a Gentoo Wiki entry.

Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
927321440976d0781a252eefe686ae6b0f236ae2 28-Oct-2005 Dave Jones <davej@redhat.com> [PATCH] cpufreq: SMP fix for conservative governor

Don't try to access not-present CPUs. Conservative governor will always
oops on SMP without this fix.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=4781

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
9c7d269b9b05440dd0fe92d96f4e5d7e73dd7238 01-Jun-2005 Dave Jones <davej@redhat.com> [CPUFREQ] ondemand,conservative governor idle_tick clean-up

[PATCH] [3/5] ondemand,conservative governor idle_tick clean-up

Ondemand and conservative governor clean-up, it factorises the idle ticks
measurement.

Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
790d76fa979f55bfc49a6901bb911778949b582d 01-Jun-2005 Dave Jones <davej@redhat.com> [CPUFREQ] ondemand,conservative governor store the idle ticks for all cpus

[PATCH] [2/5] ondemand,conservative governor store the idle ticks for all cpus

Ondemand, conservative governor did not store prev_cpu_idle_up into
prev_cpu_idle_down for other CPUs than the current CPU.

Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
dac1c1a56279b4545a822ec7bc770003c233e546 01-Jun-2005 Dave Jones <davej@redhat.com> [CPUFREQ] ondemand,conservative minor bug-fix and cleanup

[PATCH] [1/5] ondemand,conservative minor bug-fix and cleanup

Attached patch fixes some minor issues with Alexander's patch and related
cleanup in both ondemand and conservative governor.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
b9170836d1aa4ded7cc1ac1cb8fbc7867061c98c 01-Jun-2005 Dave Jones <davej@redhat.com> [CPUFREQ] Conservative cpufreq governer

A new cpufreq module, based on the ondemand one with my additional patches
just posted. This one is more suitable for battery environments where its
probably more appealing to have the cpu freq gracefully increase and decrease
rather than flip between the min and max freq's.

N.B. Bruno Ducrot pointed out that the amd64's "do have unacceptable latency
between min and max freq transition, due to the step-by-step requirements
(200MHz IIRC)"; so AMD64 users would probably benefit from this too.

Signed-off-by: Alexander Clouter <alex-kernel@digriz.org.uk>
Signed-off-by: Dave Jones <davej@redhat.com>