
History log of /drivers/cpufreq/cpufreq_governor.h
Revision	Date	Author	Comments
c8ae481b9a12f5cea080651ea87736104b111f8e	09-Jun-2014	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governor: remove copy_prev_load from 'struct cpu_dbs_common_info'

    'copy_prev_load' was recently added by commit: 18b46ab (cpufreq: governor: Be friendly towards latency-sensitive bursty workloads).

    It actually is a bit redundant as we also have 'prev_load' which can store any integer value and can be used instead of 'copy_prev_load' by setting it zero.

    True load can also turn out to be zero during long idle intervals (and hence the actual value of 'prev_load' and the overloaded value can clash). However this is not a problem because, if the true load was really zero in the previous interval, it makes sense to evaluate the load afresh for the current interval rather than copying the previous load.

    So, drop 'copy_prev_load' and use 'prev_load' instead. Update comments as well to make it more clear.

    There is another change here which was probably missed by Srivatsa during the last version of updates he made. The unlikely in the 'if' statement was covering only half of the condition and the whole line should actually come under it.

    Also checkpatch is made more silent as it was reporting this (--strict option):

    CHECK: Alignment should match open parenthesis
    +       if (unlikely(wall_time > (2 * sampling_rate) &&
    +                               j_cdbs->prev_load)) {

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
    Acked-by: Pavel Machek <pavel@ucw.cz>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
18b46abd0009516c1973a57ccf4d01b9eaa3422a	07-Jun-2014	Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>	cpufreq: governor: Be friendly towards latency-sensitive bursty workloads

    Cpufreq governors like the ondemand governor calculate the load on the CPU periodically by employing deferrable timers. A deferrable timer won't fire if the CPU is completely idle (and there are no other timers to be run), in order to avoid unnecessary wakeups and thus save CPU power.

    However, the load calculation logic is agnostic to all this, and this can lead to the problem described below.

    Time (ms)   CPU 1
    100         Task-A running
    110         Governor's timer fires, finds load as 100% in the last 10ms interval and increases the CPU frequency.
    110.5       Task-A running
    120         Governor's timer fires, finds load as 100% in the last 10ms interval and increases the CPU frequency.
    125         Task-A went to sleep. With nothing else to do, CPU 1 went completely idle.
    200         Task-A woke up and started running again.
    200.5       Governor's deferred timer (which was originally programmed to fire at time 130) fires now. It calculates load for the time period 120 to 200.5, and finds the load is almost zero. Hence it decreases the CPU frequency to the minimum.
    210         Governor's timer fires, finds load as 100% in the last 10ms interval and increases the CPU frequency.

    So, after the workload woke up and started running, the frequency was suddenly dropped to absolute minimum, and after that, there was an unnecessary delay of 10ms (sampling period) to increase the CPU frequency back to a reasonable value. And this pattern repeats for every wake-up-from-cpu-idle for that workload. This can be quite undesirable for latency- or response-time sensitive bursty workloads. So we need to fix the governor's logic to detect such wake-up-from-cpu-idle scenarios and start the workload at a reasonably high CPU frequency.

    One extreme solution would be to fake a load of 100% in such scenarios. But that might lead to undesirable side-effects such as frequency spikes (which might also need voltage changes) especially if the previous frequency happened to be very low.

    We just want to avoid the stupidity of dropping down the frequency to a minimum and then enduring a needless (and long) delay before ramping it up back again. So, let us simply carry forward the previous load - that is, let us just pretend that the 'load' for the current time-window is the same as the load for the previous window. That way, the frequency and voltage will continue to be set to whatever values they were set at previously. This means that bursty workloads will get a chance to influence the CPU frequency at which they wake up from cpu-idle, based on their past execution history. Thus, they might be able to avoid suffering from slow wakeups and long response-times.

    However, we should take care not to over-do this. For example, such a "copy previous load" logic will benefit cases like this: (where # represents busy and . represents idle)

    ##########.........#########.........###########...........##########........

    but it will be detrimental in cases like the one shown below, because it will retain the high frequency (copied from the previous interval) even in a mostly idle system:

    ##########.........#.................#.....................#...............

    (i.e., the workload finished and the remaining tasks are such that their busy periods are smaller than the sampling interval, which causes the timer to always get deferred. So, this will make the copy-previous-load logic copy the initial high load to subsequent idle periods over and over again, thus keeping the frequency high unnecessarily).

    So, we modify this copy-previous-load logic such that it is used only once upon every wakeup-from-idle. Thus if we have 2 consecutive idle periods, the previous load won't get blindly copied over; cpufreq will freshly evaluate the load in the second idle interval, thus ensuring that the system comes back to its normal state.

    [ The right way to solve this whole problem is to teach the CPU frequency governors to also track load on a per-task basis, not just a per-CPU basis, and then use both the data sources intelligently to set the appropriate frequency on the CPUs. But that involves redesigning the cpufreq subsystem, so this patch should make the situation bearable until then. ]

    Experimental results:
    +-------------------+

    I ran a modified version of ebizzy (called 'sleeping-ebizzy') that sleeps in between its execution such that its total utilization can be a user-defined value, say 10% or 20% (higher the utilization specified, lesser the amount of sleeps injected). This ebizzy was run with a single-thread, tied to CPU 8.

    Behavior observed with tracing (sample taken from 40% utilization runs):
    ------------------------------------------------------------------------

    Without patch:
    ~~~~~~~~~~~~~~
    kworker/8:2-12137 416.335742: cpu_frequency: state=2061000 cpu_id=8
    kworker/8:2-12137 416.335744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40753 416.345741: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    kworker/8:2-12137 416.345744: cpu_frequency: state=4123000 cpu_id=8
    kworker/8:2-12137 416.345746: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40753 416.355738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    <snip> --------------------------------------------------------------------- <snip>
    <...>-40753 416.402202: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
    <idle>-0 416.502130: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
    <...>-40753 416.505738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8
    kworker/8:2-12137 416.505741: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40753 416.515739: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    kworker/8:2-12137 416.515742: cpu_frequency: state=4123000 cpu_id=8
    kworker/8:2-12137 416.515744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy

    Observation: Ebizzy went idle at 416.402202, and started running again at 416.502130. But cpufreq noticed the long idle period, and dropped the frequency at 416.505739, only to increase it back again at 416.515742, realizing that the workload is in-fact CPU bound. Thus ebizzy needlessly ran at the lowest frequency for almost 13 milliseconds (almost 1 full sample period), and this pattern repeats on every sleep-wakeup. This could hurt latency-sensitive workloads quite a lot.

    With patch:
    ~~~~~~~~~~~
    kworker/8:2-29802 464.832535: cpu_frequency: state=2061000 cpu_id=8
    <snip> --------------------------------------------------------------------- <snip>
    kworker/8:2-29802 464.962538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40738 464.972533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    kworker/8:2-29802 464.972536: cpu_frequency: state=4123000 cpu_id=8
    kworker/8:2-29802 464.972538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40738 464.982531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    <snip> --------------------------------------------------------------------- <snip>
    kworker/8:2-29802 465.022533: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40738 465.032531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    kworker/8:2-29802 465.032532: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40738 465.035797: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
    <idle>-0 465.240178: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
    <...>-40738 465.242533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
    kworker/8:2-29802 465.242535: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
    <...>-40738 465.252531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2

    Observation: Ebizzy went idle at 465.035797, and started running again at 465.240178. Since ebizzy was the only real workload running on this CPU, cpufreq retained the frequency at 4.1Ghz throughout the run of ebizzy, no matter how many times ebizzy slept and woke-up in-between. Thus, ebizzy got the 10ms worth of 4.1 Ghz benefit during every sleep-wakeup (as compared to the run without the patch) and this boost gave a modest improvement in total throughput, as shown below.

    Sleeping-ebizzy records-per-second:
    -----------------------------------

    Utilization   Without patch   With patch   Difference (Absolute and % values)
    10%           274767          277046       + 2279 (+0.829%)
    20%           543429          553484       + 10055 (+1.850%)
    40%           1090744         1107959      + 17215 (+1.578%)
    60%           1634908         1662018      + 27110 (+1.658%)

    A rudimentary and somewhat approximately latency-sensitive workload such as sleeping-ebizzy itself showed a consistent, noticeable performance improvement with this patch. Hence, workloads that are truly latency-sensitive will benefit quite a bit from this change. Moreover, this is an overall win-win since this patch does not hurt power-savings at all (because, this patch does not reduce the idle time or idle residency; and the high frequency of the CPU when it goes to cpu-idle does not affect/hurt the power-savings of deep idle states).

    Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
    Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
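The "copy the previous load only once per wakeup-from-idle" rule described in this commit can be sketched in user-space C as follows. This is a simplified illustration, not the kernel code: the struct `cpu_sample_sketch` and the function `eval_load()` are hypothetical names, time values are assumed to be in microseconds, and the one-shot behavior is modeled with a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU bookkeeping for this sketch (not the kernel struct). */
struct cpu_sample_sketch {
    unsigned int prev_load;   /* load (0..100) computed in the last real evaluation */
    bool copy_prev_load;      /* true if the previous load may be carried forward once */
};

/*
 * Evaluate the load for one sampling window.
 * wall_time and idle_time are the elapsed and idle durations of the window;
 * sampling_rate is the nominal sampling period (same unit, assumed microseconds).
 */
static unsigned int eval_load(struct cpu_sample_sketch *s,
                              unsigned int wall_time,
                              unsigned int idle_time,
                              unsigned int sampling_rate)
{
    unsigned int load;

    assert(wall_time > 0 && idle_time <= wall_time);

    if (wall_time > 2 * sampling_rate && s->copy_prev_load) {
        /* The timer was deferred across a long idle period: reuse the old load once. */
        load = s->prev_load;
        s->copy_prev_load = false;
    } else {
        /* Normal case, or a second consecutive long window: evaluate afresh. */
        load = 100 * (wall_time - idle_time) / wall_time;
        s->prev_load = load;
        s->copy_prev_load = true;
    }
    return load;
}
```

With a 10 ms sampling period, a busy window followed by two consecutive long idle windows yields the carried-forward load for the first long window and a freshly computed (near zero) load for the second, which matches the "used only once upon every wakeup-from-idle" behavior described above.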
6f1e4efd882eccca10bac45b77e14bcb4979dc54	03-Jan-2014	Jane Li <jiel@marvell.com>	cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled

    When a CPU is hot removed we'll cancel all the delayed work items via gov_cancel_work(). Sometimes the delayed work function determines that it should adjust the delay for all other CPUs that the policy is managing. If this scenario occurs, the canceling CPU will cancel its own work but queue up the other CPUs works to run.

    Commit 3617f2 (cpufreq: Fix timer/workqueue corruption due to double queueing) has tried to fix this, but reading governor_enabled is not protected by cpufreq_governor_lock. Even though od_dbs_timer() checks governor_enabled before gov_queue_work(), this scenario may occur. For example:

    CPU0                                        CPU1
    ----                                        ----
    cpu_down()
     ...                                        <work runs>
     __cpufreq_remove_dev()                     od_dbs_timer()
      __cpufreq_governor()                       policy->governor_enabled
       policy->governor_enabled = false;
      cpufreq_governor_dbs()
       case CPUFREQ_GOV_STOP:
        gov_cancel_work(dbs_data, policy);
         cpu0 work is canceled
          timer is canceled
          cpu1 work is canceled
          <waits for cpu1>
                                                 gov_queue_work(, , true);
                                                  cpu0 work queued
                                                  cpu1 work queued
                                                  cpu2 work queued
                                                  ...
          cpu1 work is canceled
          cpu2 work is canceled
          ...

    At the end of the GOV_STOP case cpu0 still has a work queued to run although the code is expecting all of the works to be canceled. __cpufreq_remove_dev() will then proceed to re-initialize all the other CPUs works except for the CPU that is going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs() will trample over the queued work and debugobjects will spit out a warning:

    WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
    ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x14
    Modules linked in:
    CPU: 1 PID: 1205 Comm: sh Tainted: G W 3.10.0 #200
    [<c01144f0>] (unwind_backtrace+0x0/0xf8) from [<c0111d98>] (show_stack+0x10/0x14)
    [<c0111d98>] (show_stack+0x10/0x14) from [<c01272cc>] (warn_slowpath_common+0x4c/0x68)
    [<c01272cc>] (warn_slowpath_common+0x4c/0x68) from [<c012737c>] (warn_slowpath_fmt+0x30/0x40)
    [<c012737c>] (warn_slowpath_fmt+0x30/0x40) from [<c034c640>] (debug_print_object+0x94/0xbc)
    [<c034c640>] (debug_print_object+0x94/0xbc) from [<c034c7f8>] (__debug_object_init+0xc8/0x3c0)
    [<c034c7f8>] (__debug_object_init+0xc8/0x3c0) from [<c01360e0>] (init_timer_key+0x20/0x104)
    [<c01360e0>] (init_timer_key+0x20/0x104) from [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c)
    [<c04872ac>] (cpufreq_governor_dbs+0x1dc/0x68c) from [<c04833a8>] (__cpufreq_governor+0x80/0x1b0)
    [<c04833a8>] (__cpufreq_governor+0x80/0x1b0) from [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380)
    [<c0483704>] (__cpufreq_remove_dev.isra.12+0x22c/0x380) from [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c)
    [<c0692f38>] (cpufreq_cpu_callback+0x48/0x5c) from [<c014fb40>] (notifier_call_chain+0x44/0x84)
    [<c014fb40>] (notifier_call_chain+0x44/0x84) from [<c012ae44>] (__cpu_notify+0x2c/0x48)
    [<c012ae44>] (__cpu_notify+0x2c/0x48) from [<c068dd40>] (_cpu_down+0x80/0x258)
    [<c068dd40>] (_cpu_down+0x80/0x258) from [<c068df40>] (cpu_down+0x28/0x3c)
    [<c068df40>] (cpu_down+0x28/0x3c) from [<c068e4c0>] (store_online+0x30/0x74)
    [<c068e4c0>] (store_online+0x30/0x74) from [<c03a7308>] (dev_attr_store+0x18/0x24)
    [<c03a7308>] (dev_attr_store+0x18/0x24) from [<c0256fe0>] (sysfs_write_file+0x100/0x180)
    [<c0256fe0>] (sysfs_write_file+0x100/0x180) from [<c01fec9c>] (vfs_write+0xbc/0x184)
    [<c01fec9c>] (vfs_write+0xbc/0x184) from [<c01ff034>] (SyS_write+0x40/0x68)
    [<c01ff034>] (SyS_write+0x40/0x68) from [<c010e200>] (ret_fast_syscall+0x0/0x48)

    In gov_queue_work(), lock cpufreq_governor_lock before gov_queue_work, and unlock it after __gov_queue_work(). In this way, governor_enabled is guaranteed not changed in gov_queue_work().

    Signed-off-by: Jane Li <jiel@marvell.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
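The idea behind this fix (check the enabled flag and queue the work under one lock, so a concurrent stop cannot slip in between) can be illustrated with a small user-space analogue. This is only a sketch under stated assumptions: it uses a POSIX mutex in place of cpufreq_governor_lock, and `policy_sketch`, `queue_work_locked()`, and `stop_governor()` are hypothetical names, not kernel functions.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the policy state discussed above. */
struct policy_sketch {
    pthread_mutex_t lock;     /* plays the role of cpufreq_governor_lock */
    bool governor_enabled;    /* cleared when the governor is being stopped */
    int queued_works;         /* number of work items currently queued */
};

/*
 * Queue work for ncpus CPUs, but only if the governor is still enabled.
 * The flag is read and the work is queued while holding the same lock,
 * so the flag cannot change between the check and the queueing.
 * Returns true if work was queued.
 */
static bool queue_work_locked(struct policy_sketch *p, int ncpus)
{
    bool queued = false;

    pthread_mutex_lock(&p->lock);
    if (p->governor_enabled) {
        p->queued_works += ncpus;
        queued = true;
    }
    pthread_mutex_unlock(&p->lock);
    return queued;
}

/* Disable the governor and drop all queued work, under the same lock. */
static void stop_governor(struct policy_sketch *p)
{
    pthread_mutex_lock(&p->lock);
    p->governor_enabled = false;
    p->queued_works = 0;
    pthread_mutex_unlock(&p->lock);
}
```

Once `stop_governor()` has returned, any later `queue_work_locked()` call sees the flag cleared and queues nothing, which is the property the commit message describes as governor_enabled being guaranteed not to change inside gov_queue_work().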
0b981e70748861a3e10ea2e2a689bdcee3e15085	02-Oct-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: use cpufreq_driver->flags to mark CPUFREQ_HAVE_GOVERNOR_PER_POLICY Use cpufreq_driver->flags to mark CPUFREQ_HAVE_GOVERNOR_PER_POLICY instead of a separate field within cpufreq_driver. This will save some bytes of memory. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
c4afc410942f9f0675a5431adbdb03cf5908d1df	26-Aug-2013	Stratos Karafotis <stratosk@semaphore.gr>	cpufreq: governor: Fix typos in comments

    - 'Governer' should be 'Governor'.
    - 'S' is used for Siemens (electrical conductance) in SI units, so use small 's' for seconds.

    Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
3a3e9e06d0c11b8efa95933a88c9e67209fa4330	06-Aug-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Give consistent names to cpufreq_policy objects They are called policy, cur_policy, new_policy, data, etc. Just call them policy wherever possible. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5ff0a268037d344f86df690ccb994d8bc015d2d9	06-Aug-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Clean up header files included in the core

    This patch addresses the following issues in the header files in the cpufreq core:

    - Include headers in ascending order, so that we don't add same many times by mistake.
    - <asm/> must be included after <linux/>, so that they override whatever they need to.
    - Remove unnecessary includes.
    - Don't include files already included by cpufreq.h or cpufreq_governor.h.

    [rjw: Changelog]

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
6c4640c3adfd97ce10efed7c07405f52d002b9a8	04-Aug-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: rename ignore_nice as ignore_nice_load This sysfs file was called ignore_nice_load earlier and commit 4d5dcc4 (cpufreq: governor: Implement per policy instances of governors) changed its name to ignore_nice by mistake. Let's get it renamed back to its original name. Reported-by: Martin von Gagern <Martin.vGagern@gmx.net> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
dfa5bb622555d9da0df21b50f46ebdeef390041b	05-Jun-2013	Stratos Karafotis <stratosk@semaphore.gr>	cpufreq: ondemand: Change the calculation of target frequency

    The ondemand governor calculates load in terms of frequency and increases it only if load_freq is greater than up_threshold multiplied by the current or average frequency. This appears to produce oscillations of frequency between min and max because, for example, a relatively small load can easily saturate minimum frequency and lead the CPU to the max. Then, it will decrease back to the min due to small load_freq.

    Change the calculation method of load and target frequency on the basis of the following two observations:

    - Load computation should not depend on the current or average measured frequency. For example, absolute load of 80% at 100MHz is not necessarily equivalent to 8% at 1000MHz in the next sampling interval.
    - It should be possible to increase the target frequency to any value present in the frequency table proportional to the absolute load, rather than to the max only, so that:

          Target frequency = C * load

      where we take C = policy->cpuinfo.max_freq / 100.

    Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait. Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an increase ~1.5% in performance. cpufreq_stats (time_in_state) shows that middle frequencies are used more, with this patch. Highest and lowest frequencies were used less by ~9%.

    [rjw: We have run multiple other tests on kernels with this change applied and in the vast majority of cases it turns out that the resulting performance improvement also leads to reduced consumption of energy. The change is additionally justified by the overall simplification of the code in question.]

    Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
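The proportional rule "Target frequency = C * load, with C = max_freq / 100" from this commit can be worked through with a minimal C sketch. The function name `target_freq_khz()` is hypothetical, frequencies are assumed to be in kHz, and the multiplication is done before the division so that integer truncation does not lose precision.

```c
/*
 * Hypothetical helper: target frequency proportional to the absolute load.
 * max_freq_khz: maximum frequency of the CPU in kHz (assumed unit).
 * load_pct:     absolute load in percent, expected range 0..100.
 */
static unsigned int target_freq_khz(unsigned int max_freq_khz,
                                    unsigned int load_pct)
{
    /* Equivalent to (max_freq / 100) * load, but multiply first to keep precision. */
    return load_pct * max_freq_khz / 100;
}
```

For example, on a CPU with a 3,400,000 kHz maximum, a 50% load gives 1,700,000 kHz; a governor would then still have to map that value onto an entry of the frequency table, which this sketch does not attempt.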
bb176f7d038fee4d46b3293e64e173bfb05ab7b5	19-Jun-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Fix minor formatting issues

    There were a few noticeable formatting issues in core cpufreq code. This cleans them up to make code look better. The changes include:

    - Whitespace cleanup.
    - Rearrangements of code.
    - Multiline comments fixes.
    - Formatting changes to fit 80 columns.

    Copyright information in cpufreq.c is also updated to include my name for 2013.

    [rjw: Changelog]

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
72a4ce340a7ebf39e1c6fdc8f5feb4f974d6c635	17-May-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Move get_cpu_idle_time() to cpufreq.c Governors other than ondemand and conservative can also use get_cpu_idle_time() and they aren't required to compile cpufreq_governor.c. So, move these independent routines to cpufreq.c instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
a97c98adddbe98e824b69e6d7b320c8dc91fe581	30-Apr-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governors: Fix CPUFREQ_GOV_POLICY_{INIT|EXIT} notifiers

    There are two types of INIT/EXIT activities that we need to do for governors:

    - Done only once per governor (doesn't depend how many instances of the governor there are). eg: cpufreq_register_notifier() for conservative governor.
    - Done per governor instance, eg: sysfs_{create|remove}_group().

    There were some corner cases where current code isn't able to handle them separately and so failing for some test cases.

    We use two separate variables now for keeping track of above two requirements.

    - governor->initialized for first one
    - dbs_data->usage_count for per governor instance

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
fb30809efa3edeb692a6b29125a07c9eceb322dc	02-Apr-2013	Jacob Shin <jacob.shin@amd.com>	cpufreq: ondemand: allow custom powersave_bias_target handler to be registered This allows for another [arch specific] driver to hook into existing powersave bias function of the ondemand governor. i.e. This allows AMD specific powersave bias function (in a separate AMD specific driver) to aid ondemand governor's frequency transition decisions. Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Thomas Renninger <trenn@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
beb0ff390e208e71668a1389e3db02631a88cff5	02-Apr-2013	Borislav Petkov <bp@suse.de>	cpufreq: Correct header guards typo It should be "governor". Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
9366d84052e7c5b2eca804c08cfcd00b490f4de2	28-Feb-2013	Stratos Karafotis <stratosk@semaphore.gr>	cpufreq: governors: Calculate iowait time only when necessary Currently we always calculate the CPU iowait time and add it to idle time. If we are in ondemand and we use io_is_busy, we re-calculate iowait time and we subtract it from idle time. With this patch iowait time is calculated only when necessary avoiding the double call to get_cpu_iowait_time_us. We use a parameter in function get_cpu_idle_time to distinguish when the iowait time will be added to idle time or not, without the need of keeping the prev_io_wait. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
031299b3be30f3ecab110fff8faad85af70e1797	26-Feb-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governors: Avoid unnecessary per cpu timer interrupts

    Following patch has introduced per cpu timers or works for ondemand and conservative governors.

        commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe
        Author: Rickard Andersson <rickard.andersson@stericsson.com>
        Date: Thu Dec 27 14:55:38 2012 +0000

            cpufreq: handle SW coordinated CPUs

    This causes additional unnecessary interrupts on all cpus when the load is recently evaluated by any other cpu. i.e. When load is recently evaluated by cpu x, we don't really need any other cpu to evaluate this load again for the next sampling_rate time.

    Some sort of code is present to avoid that but we are still getting timer interrupts for all cpus. A good way of avoiding this would be to modify delays for all cpus (policy->cpus) whenever any cpu has evaluated load.

    This patch does this change and some related code cleanup.

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
98104ee28f451024170a9dfb7bec31bfcb7e7c14	26-Feb-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governor: Set MIN_LATENCY_MULTIPLIER to 20 Currently MIN_LATENCY_MULTIPLIER is defined as 100 and so on a system with transition latency of 1 ms, the minimum sampling time comes to be around 100 ms. That is quite big if you want to get better performance for your system. Redefine MIN_LATENCY_MULTIPLIER to 20 so that we can support 20ms sampling rate for such platforms. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
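The arithmetic in this commit message (minimum sampling time = transition latency multiplied by MIN_LATENCY_MULTIPLIER) can be checked with a short C sketch. The helper `min_sampling_time_us()` is a hypothetical name and the microsecond unit is an assumption made for the illustration.

```c
/*
 * Hypothetical helper: minimum sampling time implied by a latency multiplier.
 * latency_us: transition latency of the platform in microseconds (assumed unit).
 * multiplier: value of MIN_LATENCY_MULTIPLIER (100 before the change, 20 after).
 */
static unsigned int min_sampling_time_us(unsigned int latency_us,
                                         unsigned int multiplier)
{
    return latency_us * multiplier;
}
```

With a 1 ms (1000 us) transition latency, a multiplier of 100 gives 100,000 us (100 ms), while a multiplier of 20 gives 20,000 us (20 ms), matching the figures quoted in the message.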
4d5dcc4211f9def4281eafb54b8ed483862e8135	27-Mar-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governor: Implement per policy instances of governors Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch uses the infrastructure provided by earlier patch and implements init/exit routines for ondemand and conservative governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
e5dde92cb2befb108ae8cfe8db68a954c164d77c	28-Feb-2013	Namhyung Kim <namhyung.kim@lge.com>	cpufreq: Fix a typo in comment Fix a typo in a comment in cpufreq_governor.h. [rjw: Changelog] Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
4bd4e42819c7b5b2d608b353f4d9a7717e586479	06-Feb-2013	Stratos Karafotis <stratosk@semaphore.gr>	cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold In order to avoid the calculation of up_threshold - down_differential every time that the frequency must be decreased, we replace the down_differential tuner with the adj_up_threshold which keeps the difference across multiple checks. Update the adj_up_threshold only when the up_threshold is also updated. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
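The caching idea in this commit (store up_threshold - down_differential once, and refresh it only when up_threshold changes) might look roughly like the following C sketch. The struct `od_tuners_sketch` and the function `set_up_threshold()` are hypothetical names for illustration; the initial values used in the note below are assumptions, not values taken from this log.

```c
/* Hypothetical tunables for this sketch (not the kernel struct). */
struct od_tuners_sketch {
    unsigned int up_threshold;       /* load (percent) above which frequency is raised */
    unsigned int adj_up_threshold;   /* cached up_threshold - down_differential */
};

/*
 * Update up_threshold and shift the cached adj_up_threshold by the same amount,
 * so the difference between the two stays constant without recomputing it
 * on every frequency-decrease check.
 */
static void set_up_threshold(struct od_tuners_sketch *t, unsigned int new_up)
{
    t->adj_up_threshold += new_up;
    t->adj_up_threshold -= t->up_threshold;
    t->up_threshold = new_up;
}
```

Starting from an assumed up_threshold of 80 and a cached value of 70 (a differential of 10), raising up_threshold to 95 leaves the cached value at 85, so the decrease path can compare against `adj_up_threshold` directly.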
4447266b842d27f77b017a59eb9dc38ad7b299f1	31-Jan-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governors: Remove code redundancy between governors

    With the inclusion of following patches:

    9f4eb10 cpufreq: conservative: call dbs_check_cpu only when necessary
    772b4b1 cpufreq: ondemand: call dbs_check_cpu only when necessary

    code redundancy between the conservative and ondemand governors is introduced again, so get rid of it.

    [rjw: Changelog]

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Tested-by: Fabio Baltieri <fabio.baltieri@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8eeed0956615294200be783bb67d851280b5b1b9	31-Jan-2013	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governors: Get rid of dbs_data->enable field CPUFREQ_GOV_START/STOP are called only once for all policy->cpus and hence we don't need to adapt cpufreq_governor_dbs() routine for multiple calls. So, this patch removes dbs_data->enable field entirely. And rearrange code a bit. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2624f90c16413990ecb0414400174a066319a9f5	31-Jan-2013	Fabio Baltieri <fabio.baltieri@linaro.org>	cpufreq: governors: implement generic policy_is_shared Implement a generic helper function policy_is_shared() to replace the current dbs_sw_coordinated_cpus() at cpufreq level, so that it can be used by code other than cpufreq governors. Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
da53d61e21a5869b2e44247bb37deb8be387e063	27-Dec-2012	Fabio Baltieri <fabio.baltieri@linaro.org>	cpufreq: ondemand: call dbs_check_cpu only when necessary Modify ondemand timer to not resample CPU utilization if recently sampled from another SW coordinated core. Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe	27-Dec-2012	Rickard Andersson <rickard.andersson@stericsson.com>	cpufreq: handle SW coordinated CPUs This patch fixes a bug that occurred when we had load on a secondary CPU and the primary CPU was sleeping. Only one sampling timer was spawned and it was spawned as a deferred timer on the primary CPU, so when a secondary CPU had a change in load this was not detected by the cpufreq governor (both ondemand and conservative). This patch make sure that deferred timers are run on all CPUs in the case of software controlled CPUs that run on the same frequency. Signed-off-by: Rickard Andersson <rickard.andersson@stericsson.com> Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
1e7586a18a2ab69a160837c0a4be31f7147cfb5e	26-Oct-2012	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Fix sparse warnings by updating cputime64_t to u64 There were few sparse warnings due to mismatch of type on function arguments. Two types were used u64 and cputime64_t. Both are actually u64, so use u64 only. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
4471a34f9a1f2da220272e823bdb8e8fa83a7661	26-Oct-2012	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: governors: remove redundant code The ondemand governor was written first, and the conservative governor was then written using its code. Much of the ondemand code was copied rather than shared as common routines, which increased code redundancy and made the code difficult to manage. This patch is an attempt to move the common part of both governors to the cpufreq_governor.c file to overcome the above-mentioned issues. This shouldn't change anything from a functionality point of view. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>