History log of /arch/arm/kernel/perf_event.c
Revision Date Author Comments
6888e32a9e0b284c4dcdefcc3158949110699bc2 03-Jun-2014 Nikolay Borisov <Nikolay.Borisov@arm.com> ARM: 8071/1: perf: Make perf use arm_get_current_stackframe

Make the perf backend use the arm_get_current_stackframe API so that it
correctly references the FP when in THUMB2 mode
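
A minimal sketch of the resulting call site (helper and callback names are
taken from the ARM backtrace code and are approximate, not the verbatim
patch):

    struct stackframe fr;

    /* Populate fr from regs; the helper selects the correct frame pointer
     * register for ARM vs Thumb-2 (ARM_fp vs ARM_r7). */
    arm_get_current_stackframe(regs, &fr);
    walk_stackframe(&fr, callchain_trace, entry);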

Signed-off-by: Nikolay Borisov <Nikolay.Borisov@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
4b2974fa6a4a32d390a50e23381a2270a2e0d444 07-Jul-2014 Jean Pihet <jean.pihet@linaro.org> ARM: perf: disable the pagefault handler when reading from user space

Under perf, the fp unwinding scheme requires access to user space memory
and can provoke a pagefault via a call to __copy_from_user_inatomic from
user_backtrace. This unwinding can take place in response to an interrupt
(__perf_event_overflow). This is undesirable as we may already have
mmap_sem held for write. One example is a process that calls mprotect
just as the PMU counters overflow.

An example that can provoke this behaviour:
perf record -e event:tocapture --call-graph fp ./application_to_test

This patch addresses this issue by disabling pagefaults briefly in
user_backtrace (as is done in the other architectures: ARM64, x86, Sparc etc.).
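
A hedged sketch of the approach (simplified; the rest of the frame walking
is unchanged):

    static struct frame_tail __user *
    user_backtrace(struct frame_tail __user *tail,
                   struct perf_callchain_entry *entry)
    {
            struct frame_tail buftail;
            unsigned long err;

            if (!access_ok(VERIFY_READ, tail, sizeof(buftail)))
                    return NULL;

            /* Never fault here: we may already hold mmap_sem. */
            pagefault_disable();
            err = __copy_from_user_inatomic(&buftail, tail, sizeof(buftail));
            pagefault_enable();

            if (err)
                    return NULL;

            /* ... store the entry and return the next frame as before ... */
    }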

Without the patch a deadlock occurs when __perf_event_overflow is called
while reading the data from the user space:

[ INFO: possible recursive locking detected ]
3.16.0-rc2-00038-g0ed7ff6 #46 Not tainted
---------------------------------------------
stress/1634 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<c001dc04>] do_page_fault+0xa8/0x428

but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&mm->mmap_sem);
lock(&mm->mmap_sem);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by stress/1634:
#0: (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8
#1: (rcu_read_lock){......}, at: [<c00c29dc>] __perf_event_overflow+0x120/0x294

stack backtrace:
CPU: 1 PID: 1634 Comm: stress Not tainted 3.16.0-rc2-00038-g0ed7ff6 #46
[<c0017c8c>] (unwind_backtrace) from [<c0012eec>] (show_stack+0x20/0x24)
[<c0012eec>] (show_stack) from [<c04de914>] (dump_stack+0x7c/0x98)
[<c04de914>] (dump_stack) from [<c006a360>] (__lock_acquire+0x1484/0x1cf0)
[<c006a360>] (__lock_acquire) from [<c006b14c>] (lock_acquire+0xa4/0x11c)
[<c006b14c>] (lock_acquire) from [<c04e3880>] (down_read+0x40/0x7c)
[<c04e3880>] (down_read) from [<c001dc04>] (do_page_fault+0xa8/0x428)
[<c001dc04>] (do_page_fault) from [<c00084ec>] (do_DataAbort+0x44/0xa8)
[<c00084ec>] (do_DataAbort) from [<c0013a1c>] (__dabt_svc+0x3c/0x60)
Exception stack(0xed7c5ae0 to 0xed7c5b28)
5ae0: ed7c5b5c b6dadff4 ffffffec 00000000 b6dadff4 ebc08000 00000000 ebc08000
5b00: 0000007e 00000000 ed7c4000 ed7c5b94 00000014 ed7c5b2c c001a438 c0236c60
5b20: 00000013 ffffffff
[<c0013a1c>] (__dabt_svc) from [<c0236c60>] (__copy_from_user+0xa4/0x3a4)

Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
a7cc91001e36a4a4152c3ada6c8fe38adc5badbc 07-Jul-2014 Jean Pihet <jean.pihet@linaro.org> ARM: perf: Check that current->mm is alive before getting user callchain

An event may occur when an mm is already released.

As per commit 20afc60f892d285fde179ead4b24e6a7938c2f1b
'x86, perf: Check that current->mm is alive before getting user callchain'
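
A minimal sketch of the check, mirroring the x86 commit referenced above
(placement within perf_callchain_user is approximate):

    void perf_callchain_user(struct perf_callchain_entry *entry,
                             struct pt_regs *regs)
    {
            perf_callchain_store(entry, regs->ARM_pc);

            /* The task may be exiting; its mm can already be gone. */
            if (!current->mm)
                    return;

            /* ... walk the user stack via user_backtrace() ... */
    }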

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
edcb4d3c36a6429caa03ddfeab4cbb153c7002b2 16-May-2014 Vince Weaver <vincent.weaver@maine.edu> perf/ARM: Use common PMU interrupt disabled code

Make the ARM perf code use the new common PMU interrupt disabled code.

This allows perf to work on ARM machines without a working PMU
interrupt (for example, raspberry pi).

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
[peterz: applied changes suggested by Will]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: devicetree@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161712190.11099@vincent-weaver-1.umelst.maine.edu
[ Small readability tweaks to the code. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5f5092e72cc25a6a5785308270e0085b2b2772cc 11-Feb-2014 Will Deacon <will.deacon@arm.com> ARM: perf: hook up perf_sample_event_took around pmu irq handling

Since we indirect all of our PMU IRQ handling through a dispatcher, it's
trivial to hook up perf_sample_event_took to prevent applications such
as oprofile from generating interrupt storms due to an unrealistically
low sample period.
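
A rough sketch of the dispatcher change; how the arm_pmu is recovered from
the dev cookie is elided behind a hypothetical helper here:

    static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
    {
            struct arm_pmu *armpmu = dev_to_arm_pmu(dev); /* hypothetical */
            u64 start_clock, finish_clock;
            irqreturn_t ret;

            start_clock = sched_clock();
            ret = armpmu->handle_irq(irq, dev);
            finish_clock = sched_clock();

            /* Let perf core throttle the sample rate if handling took too long. */
            perf_sample_event_took(finish_clock - start_clock);
            return ret;
    }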

Reported-by: Robert Richter <rric@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
eab443ef391d18772710dc2c156f7ee05e51f754 07-Feb-2014 Stephen Boyd <sboyd@codeaurora.org> ARM: perf: add hook for event index clearing

On Krait processors we have a many-to-one relationship between
raw CPU events and the event programmed into the PMNx counter.
Two raw CPU events could map to the same value programmed in the
PMNx counter. To avoid this problem, we check for collisions
during the get_event_idx() callback by setting a bit in a bitmap
whenever a certain event is used in a PMNx counter (see the next
patch). Unfortunately, we don't have a hook to clear this bit in
the bitmap when the event is deleted so let's add an optional
clear_event_idx() callback for this purpose.
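
A sketch of the shape of the new hook (signatures approximate):

    struct arm_pmu {
            /* ... */
            int     (*get_event_idx)(struct pmu_hw_events *hw_events,
                                     struct perf_event *event);
            /* New, optional: undo any bookkeeping done by get_event_idx. */
            void    (*clear_event_idx)(struct pmu_hw_events *hw_events,
                                       struct perf_event *event);
            /* ... */
    };

    /* In the event-delete path, call it only if the backend provides one. */
    if (armpmu->clear_event_idx)
            armpmu->clear_event_idx(hw_events, event);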

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
bbd64559376fa25732994c4181c8ec493fa57871 07-Feb-2014 Stephen Boyd <sboyd@codeaurora.org> ARM: perf: support percpu irqs for the CPU PMU

Some CPU PMUs are wired up with one PPI for all the CPUs instead
of with a different SPI for each CPU. Add support for these
devices.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
9450d14fb959336803e5209119eb422b667b96aa 27-Nov-2013 Will Deacon <will.deacon@arm.com> Revert "ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD"

This reverts commit 3581fe0ef37ce12ac7a4f74831168352ae848edc.

Fixes to the handling of PERF_EVENT_IOC_PERIOD in the core code mean
we no longer have to play this horrible game.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1385560479-11014-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2dfcb802d6bd54a2353678c6434846d94b058f2c 09-Oct-2013 Will Deacon <will.deacon@arm.com> ARM: perf: fix group validation for mixed software and hardware groups

Since software events can always be scheduled, perf allows software and
hardware events to be mixed together in the same event group. There are
two ways in which this can come about:

(1) A SW event is added to a HW group. This validates using the HW PMU
of the group leader.

(2) A HW event is added to a SW group. This inserts the SW events and
the new HW event into a HW context, but the SW event remains the
group leader.

When validating the latter case, we would ideally compare the PMU of
each event in the group with the relevant HW PMU. The problem is, in the
face of potentially multiple HW PMUs, we don't have a handle on the
relevant structure. Commit 7b9f72c62ed0 ("ARM: perf: clean up event
group validation") attempting to resolve this issue, but actually made
things *worse* by comparing with the leader PMU. If the leader is a SW
event, then we automatically `pass' all the HW events during validation!

This patch removes the check against the leader PMU. Whilst this will
allow events from multiple HW PMUs to be grouped together, that should
probably be dealt with in perf core as the result of a later patch.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
b88a2595b6d8aedbd275c07dfa784657b4f757eb 08-Aug-2013 Stephen Boyd <sboyd@codeaurora.org> perf/arm: Fix armpmu_map_hw_event()

Fix constraint check in armpmu_map_hw_event().

Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
d9f966357b14e356dbd83b8f4a197a287ab4ff83 08-Aug-2013 Stephen Boyd <sboyd@codeaurora.org> ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event()

Vince Weaver reports an oops in the ARM perf event code while
running his perf_fuzzer tool on a pandaboard running v3.11-rc4.

Unable to handle kernel paging request at virtual address 73fd14cc
pgd = eca6c000
[73fd14cc] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_omap_hdmi omapdss snd_soc_omap_abe_twl6040 snd_soc_twl6040 snd_soc_omap snd_soc_omap_hdmi_card snd_soc_omap_mcpdm snd_soc_omap_mcbsp snd_soc_core snd_compress regmap_spi snd_pcm snd_page_alloc snd_timer snd soundcore
CPU: 1 PID: 2790 Comm: perf_fuzzer Not tainted 3.11.0-rc4 #6
task: eddcab80 ti: ed892000 task.ti: ed892000
PC is at armpmu_map_event+0x20/0x88
LR is at armpmu_event_init+0x38/0x280
pc : [<c001c3e4>] lr : [<c001c17c>] psr: 60000013
sp : ed893e40 ip : ecececec fp : edfaec00
r10: 00000000 r9 : 00000000 r8 : ed8c3ac0
r7 : ed8c3b5c r6 : edfaec00 r5 : 00000000 r4 : 00000000
r3 : 000000ff r2 : c0496144 r1 : c049611c r0 : edfaec00
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c5387d Table: aca6c04a DAC: 00000015
Process perf_fuzzer (pid: 2790, stack limit = 0xed892240)
Stack: (0xed893e40 to 0xed894000)
3e40: 00000800 c001c17c 00000002 c008a748 00000001 00000000 00000000 c00bf078
3e60: 00000000 edfaee50 00000000 00000000 00000000 edfaec00 ed8c3ac0 edfaec00
3e80: 00000000 c073ffac ed893f20 c00bf180 00000001 00000000 c00bf078 ed893f20
3ea0: 00000000 ed8c3ac0 00000000 00000000 00000000 c0cb0818 eddcab80 c00bf440
3ec0: ed893f20 00000000 eddcab80 eca76800 00000000 eca76800 00000000 00000000
3ee0: 00000000 ec984c80 eddcab80 c00bfe68 00000000 00000000 00000000 00000080
3f00: 00000000 ed892000 00000000 ed892030 00000004 ecc7e3c8 ecc7e3c8 00000000
3f20: 00000000 00000048 ecececec 00000000 00000000 00000000 00000000 00000000
3f40: 00000000 00000000 00297810 00000000 00000000 00000000 00000000 00000000
3f60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3f80: 00000002 00000002 000103a4 00000002 0000016c c00128e8 ed892000 00000000
3fa0: 00090998 c0012700 00000002 000103a4 00090ab8 00000000 00000000 0000000f
3fc0: 00000002 000103a4 00000002 0000016c 00090ab0 00090ab8 000107a0 00090998
3fe0: bed92be0 bed92bd0 0000b785 b6e8f6d0 40000010 00090ab8 00000000 00000000
[<c001c3e4>] (armpmu_map_event+0x20/0x88) from [<c001c17c>] (armpmu_event_init+0x38/0x280)
[<c001c17c>] (armpmu_event_init+0x38/0x280) from [<c00bf180>] (perf_init_event+0x108/0x180)
[<c00bf180>] (perf_init_event+0x108/0x180) from [<c00bf440>] (perf_event_alloc+0x248/0x40c)
[<c00bf440>] (perf_event_alloc+0x248/0x40c) from [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc)
[<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc) from [<c0012700>] (ret_fast_syscall+0x0/0x48)
Code: 0a000005 e3540004 0a000016 e3540000 (0791010c)

This is because event->attr.config in armpmu_event_init()
contains a very large number copied directly from userspace and
is never checked against the size of the array indexed in
armpmu_map_hw_event(). Fix the problem by checking the value of
config before indexing the array and rejecting invalid config
values.
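
A hedged sketch of the net effect of this fix together with the follow-up
entry above ("perf/arm: Fix armpmu_map_hw_event()"):

    static int
    armpmu_map_hw_event(const unsigned (*event_map)[PERF_COUNT_HW_MAX],
                        u64 config)
    {
            int mapping;

            /* Reject out-of-range values before indexing the table. */
            if (config >= PERF_COUNT_HW_MAX)
                    return -EINVAL;

            mapping = (*event_map)[config];
            return mapping == HW_OP_UNSUPPORTED ? -ENOENT : mapping;
    }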

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
c95eb3184ea1a3a2551df57190c81da695e2144b 08-Aug-2013 Will Deacon <will.deacon@arm.com> ARM: 7809/1: perf: fix event validation for software group leaders

It is possible to construct an event group with a software event as a
group leader and then subsequently add a hardware event to the group.
This results in the event group being validated by adding all members
of the group to a fake PMU and attempting to allocate each event on
their respective PMU.

Unfortunately, for software events without a corresponding arm_pmu, this
results in a kernel crash attempting to dereference the ->get_event_idx
function pointer.

This patch fixes the problem by checking explicitly for software events
and ignoring those in event validation (since they can always be
scheduled). We will probably want to revisit this for 3.12, since the
validation checks don't appear to work correctly when dealing with
multiple hardware PMUs anyway.
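
A minimal sketch of the idea in the group validation path (details
approximate):

    static int
    validate_event(struct pmu_hw_events *hw_events, struct perf_event *event)
    {
            struct arm_pmu *armpmu = to_arm_pmu(event->pmu);

            /* Software events can always be scheduled; don't touch the
             * (non-existent) arm_pmu callbacks for them. */
            if (is_software_event(event))
                    return 1;

            if (event->state < PERF_EVENT_STATE_OFF)
                    return 1;

            return armpmu->get_event_idx(hw_events, event) >= 0;
    }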

Cc: <stable@vger.kernel.org>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
c5f927a6f62196226915f12194c9d0df4e2210d7 20-Jun-2013 Jed Davis <jld@mozilla.com> ARM: 7765/1: perf: Record the user-mode PC in the call chain.

With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip.

It's possible to partially work around this problem by post-processing
the data to use the PERF_SAMPLE_IP value, but this works only if the CPU
wasn't in the kernel when the sample was taken.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jed Davis <jld@mozilla.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
cb2d8b342aa084d1f3ac29966245dec9163677fb 12-Apr-2013 Will Deacon <will.deacon@arm.com> ARM: 7698/1: perf: fix group validation when using enable_on_exec

Events may be created with attr->disabled == 1 and attr->enable_on_exec
== 1, which confuses the group validation code because events in the
PERF_EVENT_STATE_OFF state are not considered candidates for scheduling, which
may lead to failure at group scheduling time.

This patch fixes the validation check for ARM, so that events in the
OFF state are still considered when enable_on_exec is true.
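
A hedged sketch of the adjusted tail of validate_event() (context
simplified):

    /* Ignore events that can never be scheduled (state below OFF), but
     * still validate OFF events that will be enabled on exec. */
    if (event->state < PERF_EVENT_STATE_OFF)
            return 1;

    if (event->state == PERF_EVENT_STATE_OFF && !event->attr.enable_on_exec)
            return 1;

    return armpmu->get_event_idx(hw_events, event) >= 0;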

Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Reported-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
44d6b1fc3e3c6a3af8e599b724972e881c81e1c9 05-Mar-2013 Stephen Boyd <sboyd@codeaurora.org> ARM: 7667/1: perf: Fix section mismatch on armpmu_init()

WARNING: vmlinux.o(.text+0xfb80): Section mismatch in reference
from the function armpmu_register() to the function
.init.text:armpmu_init()
The function armpmu_register() references
the function __init armpmu_init().
This is often because armpmu_register lacks a __init
annotation or the annotation of armpmu_init is wrong.

Just drop the __init marking on armpmu_init() because
armpmu_register() no longer has an __init marking.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
e595ede6050b1ce982d74f7084f93715bcc32359 28-Feb-2013 Chen Gang <gang.chen@asianux.com> ARM: 7664/1: perf: remove erroneous semicolon from event initialisation

Commit 9dcbf466559f ("ARM: perf: simplify __hw_perf_event_init err
handling") tidied up the error handling code for perf event
initialisation on ARM, but a copy-and-paste error left a dangling
semicolon at the end of an if statement.

This patch removes the broken semicolon, restoring the old group
validation semantics.

Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
9dcbf466559f6f2f55d60eb5a1bbebc8e694b52a 18-Jan-2013 Mark Rutland <Mark.Rutland@arm.com> ARM: perf: simplify __hw_perf_event_init err handling

Currently __hw_perf_event_init has an err variable that's ignored right
until the end, where it's initialised, conditionally set, and then used
as a boolean flag deciding whether to return another error code.

This patch removes the err variable and simplifies the associated error
handling logic.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
8f3b90b585d3e879b03ce2a202da04d59dd5b699 18-Jan-2013 Mark Rutland <Mark.Rutland@arm.com> ARM: perf: remove unnecessary checks for idx < 0

We currently check for hwc->idx < 0 in armpmu_read and armpmu_del
unnecessarily. The only case where hwc->idx < 0 is when armpmu_add
fails, in which case the event's state is set to
PERF_EVENT_STATE_INACTIVE.

The perf core will not attempt to read from an event in
PERF_EVENT_STATE_INACTIVE, and so the check in armpmu_read is
unnecessary. Similarly, if perf core cannot add an event it will not
attempt to delete it, so the WARN_ON in armpmu_del is unnecessary.

This patch removes these two redundant checks.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2ac29a14a8b6b4a37c09c50db88dc893e6e7fc75 25-Oct-2012 Jon Hunter <jon-hunter@ti.com> ARM: PMU: fix runtime PM enable

Commit 7be2958 (ARM: PMU: Add runtime PM Support) updated the ARM PMU code to
use runtime PM which was prototyped and validated on the OMAP devices. In this
commit, there is no call to pm_runtime_enable(), and for OMAP devices
pm_runtime_enable() is currently being called from the OMAP PMU code when the
PMU device is created. However, there are two problems with this:

1. For any other ARM device wishing to use runtime PM for PMU they will need
to call pm_runtime_enable() for runtime PM to work.
2. When booting with device-tree and using device-tree to create the PMU
device, pm_runtime_enable() needs to be called from within the ARM PERF
driver as we are no longer calling any device specific code to create the
device. Hence, PMU does not work on OMAP devices that use the runtime PM
callbacks when using device-tree to create the PMU device.

Therefore, call pm_runtime_enable() directly from the ARM PMU driver when
registering the device. For platforms that do not use runtime PM,
pm_runtime_enable() does nothing and for platforms that do use runtime PM but
may not require it specifically for PMU, this will just add a little overhead
when initialising and uninitialising the PMU device.

Tested with PERF on OMAP2420, OMAP3430 and OMAP4460.

Acked-by: Kevin Hilman <khilman@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
0305230a3d92d6829db89c9e0c096d4d8733f317 21-Sep-2012 Will Deacon <will.deacon@arm.com> ARM: perf: consistently use arm_pmu->name for PMU name

Perf has three ways to name a PMU: either by passing an explicit char *,
reading arm_pmu->name or accessing arm_pmu->pmu.name.

Just use arm_pmu->name consistently in the ARM backend.

Signed-off-by: Will Deacon <will.deacon@arm.com>
ed6f2a522398c26559f4da23a80aa6195e6284c7 30-Jul-2012 Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> ARM: perf: consistently use struct perf_event in arm_pmu functions

The arm_pmu functions have wildly varied parameters which can often be
derived from struct perf_event.

This patch changes the arm_pmu function prototypes so that struct
perf_event pointers are passed in preference to fields that can be
derived from the event.

Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
e50c54189f7c6211a99539156e3978474f0b1a0b 13-Sep-2012 Marc Zyngier <Marc.Zyngier@arm.com> ARM: perf: add guest vs host discrimination

Add minimal guest support to perf, so it can distinguish whether
the PMU interrupt was in the host or the guest, as well as collecting
some very basic information (guest PC, user vs kernel mode).

This is not feature complete though, as it doesn't support backtracing
in the guest.

Based on the x86 implementation, tested with KVM/ARM.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
3581fe0ef37ce12ac7a4f74831168352ae848edc 17-Oct-2012 Will Deacon <will.deacon@arm.com> ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD

The PERF_EVENT_IOC_PERIOD ioctl command can be used to change the
sample period of a running perf_event. Consequently, when calculating
the next event period, the new period will only be considered after the
previous one has overflowed.

This patch changes the calculation of the remaining event ticks so that
they are offset if the period has changed.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Andreas Sandberg <andreas.sandberg@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
051f1b13144dd8553d5a5104dde94c7263ae3ba7 31-Jul-2012 Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> ARM: perf: move irq registration into pmu implementation

This patch moves the CPU-specific IRQ registration and parsing code into
the CPU PMU backend. This is required because a PMU may have more than
one interrupt, which in turn can be either PPI (per-cpu) or SPI
(requiring strict affinity setting at the interrupt distributor).

Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
[will: cosmetic edits and reworked interrupt dispatching]
Signed-off-by: Will Deacon <will.deacon@arm.com>
5505b206ca006d0506d1d3b3c494aa86234f66e2 29-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: perf: move CPU-specific PMU handling code into separate file

This patch moves the CPU-specific PMU handling code out of perf_event.c
and into perf_event_cpu.c.

Signed-off-by: Will Deacon <will.deacon@arm.com>
6dbc00297095122ea89e016ce6affad0b7c0ddac 29-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: perf: prepare for moving CPU PMU code into separate file

The CPU PMU code is tightly coupled with generic ARM PMU handling code.
This makes it cumbersome when trying to add support for other ARM PMUs
(e.g. interconnect, L2 cache controller, bus) as the generic parts of
the code are not readily reusable.

This patch cleans up perf_event.c so that reusable code is exposed via
header files to other potential PMU drivers. The CPU code is
consistently named to identify it as such and also to prepare for moving
it into a separate file.

Signed-off-by: Will Deacon <will.deacon@arm.com>
04236f9fe07462849215c67cae6147661368bfad 28-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: perf: probe devicetree in preference to current CPU

The CPU PMU is probed using the current cpuid information as part of the
early_initcall initialising the architecture perf backend. For
architectures without NMI (such as ARM), this does not need to be
performed early and can be deferred to the driver probe callback. This
also allows us to probe the devicetree in preference to parsing the
current cpuid, which may be invalid on a big.LITTLE multi-cluster
system.

This patch defers the PMU probing and uses the devicetree information
when available.

Signed-off-by: Will Deacon <will.deacon@arm.com>
9f44f9a234020947dd16500a203c9580a66ed67d 28-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: perf: remove mysterious compiler barrier

There's a rather strange compiler barrier in the PMU disabling code
which was presumably placed there by aliens. There's no valid reason for
the barrier and one can only suspect that it's up to no good.

This patch removes it before it has a chance to spread.

Signed-off-by: Will Deacon <will.deacon@arm.com>
f0d1bc47953743aef9d2ed5326bc5973a3db08ab 28-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: pmu: remove unused reservation mechanism

The PMU reservation mechanism was originally intended to allow OProfile
and perf-events to co-ordinate over access to the CPU PMU. Since then,
OProfile for ARM has moved to using perf as its backend, so the
reservation code is no longer used.

This patch removes the reservation code for the CPU PMU on ARM.

Signed-off-by: Will Deacon <will.deacon@arm.com>
50243efde0993f6fe98f27a35692d0e8efdf7a0f 28-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: perf: add devicetree bindings for 11MPcore, A5, A7 and A15 PMUs

This patch adds separate devicetree bindings for 11MPcore and
Cortex-{A5,A7,A15} PMUs in preparation for improved devicetree parsing
in the ARM perf-event CPU PMU driver.

Cc: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
7be2958e97b37256b8016db39ac6cf51f711e390 31-May-2012 Jon Hunter <jon-hunter@ti.com> ARM: PMU: Add runtime PM Support

Add runtime PM support to the ARM PMU driver so that devices such as OMAP
supporting dynamic PM can use the platform->runtime_* hooks to initialise
hardware at runtime. Without having these runtime PM hooks in place any
configuration of the PMU hardware would be lost when low power states are
entered and hence would prevent PMU from working.

This change also replaces the PMU platform functions enable_irq and disable_irq
added by Ming Lei with runtime_resume and runtime_suspend functions. Ming had
added the enable_irq and disable_irq functions as a method to configure the
cross trigger interface on OMAP4 for routing the PMU interrupts. By adding
runtime PM support, we can move the code called by enable_irq and disable_irq
into the runtime PM callbacks runtime_resume and runtime_suspend.

Cc: Ming Lei <ming.lei@canonical.com>
Cc: Benoit Cousson <b-cousson@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Kevin Hilman <khilman@ti.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
4295b898f5a5c7e62ae68e7a4ecc4b414622ffe6 06-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: 7448/1: perf: remove arm_perf_pmu_ids global enumeration

In order to provide PMU name strings compatible with the OProfile
user ABI, an enumeration of all PMUs is currently used by perf to
identify each PMU uniquely. Unfortunately, this does not scale well
in the presence of multiple PMUs and creates a single, global namespace
across all PMUs in the system.

This patch removes the enumeration and instead uses the name string
for the PMU to map onto the OProfile variant. perf_pmu_name is
implemented for CPU PMUs, which is all that OProfile cares about anyway.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
fdeb8e35fd59e79dec385f98eb4b6d2e3398264b 04-Jul-2012 Will Deacon <will.deacon@arm.com> ARM: 7441/1: perf: return -EOPNOTSUPP if requested mode exclusion is unavailable

We currently return -EPERM if the user requests mode exclusion that is
not supported by the CPU. This looks pretty confusing from userspace
and is inconsistent with other architectures (ppc, x86).

This patch returns -EOPNOTSUPP instead.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
d33c88c659d708e7c5d518a05ef9349a36217bb2 03-Feb-2012 Will Deacon <will.deacon@arm.com> ARM: 7315/1: perf: add support for the Cortex-A7 PMU

Cortex-A7 implements an ARMv7-compatible PMU compliant with the PMUv2
architecture specification.

This patch adds support for the PMU to the ARM perf backend.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
a0feb6db0fe03326d7d2c7a4615ce3289615c023 06-Mar-2012 Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com> ARM: 7358/1: perf: add PMU hotplug notifier

When a CPU is taken out of reset, either cold booted or hotplugged in,
some of its PMU registers can contain UNKNOWN values.

This patch adds a hotplug notifier to ARM core perf code so that upon
CPU restart the PMU unit is reset and becomes ready to use again.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
5727347180ebc6b4a866fcbe00dcb39cc03acb37 06-Mar-2012 Will Deacon <will.deacon@arm.com> ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode

On ARM, the PMU does not stop counting after an overflow and therefore
IRQ latency affects the new counter value read by the kernel. This is
significant for non-sampling runs where it is possible for the new value
to overtake the previous one, causing the delta to be out by up to
max_period events.

Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
to IRQ latency") attempted to fix this problem by allowing interrupt
handlers to pass an overflow flag to the event update function, causing
the overflow calculation to assume that the counter passed through zero
when going from prev to new. Unfortunately, this doesn't work when
overflow occurs on the perf_task_tick path because we have the flag
cleared and end up computing a large negative delta.

This patch removes the overflow flag from armpmu_event_update and
instead limits the sample_period to half of the max_period for
non-sampling profiling runs.
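
A minimal sketch of the change in the event-init path (field names as in
the ARM perf code of that era; approximate):

    if (!hwc->sample_period) {
            /*
             * For non-sampling runs, use half of the maximum period so a
             * late IRQ cannot let the new count overtake the previous one.
             */
            hwc->sample_period  = armpmu->max_period >> 1;
            hwc->last_period    = hwc->sample_period;
            local64_set(&hwc->period_left, hwc->sample_period);
    }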

Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2481c5fa6db0237e4f0168f88913178b2b495b7c 09-Feb-2012 Stephane Eranian <eranian@google.com> perf: Disable PERF_SAMPLE_BRANCH_* when not supported

PERF_SAMPLE_BRANCH_* is disabled for:

- SW events (sw counters, tracepoints)
- HW breakpoints
- ALL but Intel x86 architecture
- AMD64 processors

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
6bd054096dce061560cee0e016e292e588dc438f 02-Dec-2011 Will Deacon <will.deacon@arm.com> ARM: 7185/1: perf: don't assign platform_device on unsupported CPUs

In the unlikely case that a platform registers a PMU platform_device
when running on a CPU that is unsupported by perf, we will encounter a
NULL dereference when trying to assign the platform_device to the
cpu_pmu structure.

This patch checks that the CPU is supported by perf before assigning
the platform_device.

Reported-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
e0516a64e7ea9d9522d98f9f5f47aa38f147779f 02-Mar-2011 Ming Lei <ming.lei@canonical.com> arm: pmu: allow platform specific irq enable/disable handling

This patch introduces .enable_irq and .disable_irq into
struct arm_pmu_platdata, so platform specific irq enablement
can be handled after request_irq, and platform specific irq
disablement can be handled before free_irq.

This patch is for support of pmu irq routed from CTI on omap4.

Acked-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
feb45d06ffd7b59f43f1ed8edf53a0cfe3e7ad2a 14-Nov-2011 Will Deacon <will.deacon@arm.com> ARM: perf: remove unused armpmu_get_max_events

armpmu_get_max_events is only called from perf_num_counters, so we can
inline it there. It existed as a separate entity as a hangover from
the original perf-based oprofile implementation.

Signed-off-by: Will Deacon <will.deacon@arm.com>
e5a21327644adba32816f74a415114d11c57f2e9 22-Nov-2011 Will Deacon <will.deacon@arm.com> ARM: perf: check that we have a platform device when reserving PMU

Attempting to use a hardware counter on a platform with a supported PMU
but where the platform_device (defining the interrupts) has not been
registered results in a NULL pointer dereference.

This patch fixes the problem by checking that we actually have a platform
device registered before attempting to grab the interrupts.

Reported-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
bce34d14428d35d9a06ddc10cd46ecef311764c9 17-Nov-2011 Will Deacon <will.deacon@arm.com> ARM: perf: initialise used_mask for fake PMU during validation

When validating an event group, we call pmu->get_event_idx for each
group member in order to check that the group can be scheduled as a
unit on an empty PMU.

As a result of 3fc2c830 ("ARM: perf: remove event limit from
pmu_hw_events"), the used_mask member of struct cpu_hw_events must be
set up explicitly, something which we don't do for the fake cpu_hw_events
used for validation.

This patch sets up an empty used_mask for the fake validation
cpu_hw_events, preventing NULL dereferences when trying to get the event
index.

Reported-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
aa2bc1ade59003a379ffc485d6da2d92ea3370a6 09-Nov-2011 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Don't use -ENOSPC for out of PMU resources

People (Linus) objected to using -ENOSPC to signal not having enough
resources on the PMU to satisfy the request. Use -EINVAL.

Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-xv8geaz2zpbjhlx0svmpp28n@git.kernel.org
[ merged to newer kernel, fixed up MIPS impact ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
ecea4ab6d3d8bb4122522398200f1cd2a06af6d5 22-Jul-2011 Paul Gortmaker <paul.gortmaker@windriver.com> arm: convert core files from module.h to export.h

Many of the core ARM kernel files are not modules, but just
including module.h for exporting symbols. Now these files can
use the lighter footprint export.h for this role.

There are probably lots more, but ARM files of mach-* and plat-*
don't get coverage via a simple yesconfig build. They will have
to be cleaned up and tested via using their respective configs.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
7325eaec439cd0cc8c9b61b59d41d99abace1b23 23-Aug-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: Remove unnecessary armpmu->enable()s

Currently, armpmu_enable iterates through the events for a given
counter set, calling armpmu->enable on each before calling
armpmu->start to start the PMU's counters.

As armpmu->enable is called when each event is added, each event is
already configured in hardware. Due to this, calling armpmu->enable
in armpmu_enable is unnecessary and confusing.

This patch removes the unnecessary calls to armpmu->enable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
0ce47080dfffe71edd433b35dcdada24c61079eb 19-May-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: move arm_pmu into <asm/pmu.h>

Currently, struct arm_pmu and related functions are only visible to
{,arch/arm/}/kernel/perf_event.c. This prevents new drivers from using
the framework.

This patch moves declarations to asm/pmu.h, allowing new PMU drivers
to use the framework.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
8be3f9a2385f91f7bf5c58f351e24b9247898e8f 17-May-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: remove cpu-related misnomers

Currently struct cpu_hw_events stores data on events running on a
PMU associated with a CPU. As this data is general enough to be used
for system PMUs, this name is a misnomer, and may cause confusion when
it is used for system PMUs.

Additionally, 'armpmu' is commonly used as a parameter name for an
instance of struct arm_pmu. The name is also used for a global instance
which represents the CPU's PMU.

As cpu_hw_events is now not tied to CPU PMUs, it is renamed to
pmu_hw_events, with instances of it renamed similarly. As the global
'armpmu' is CPU-specific, it is renamed to cpu_pmu. This should make it
clearer which code is generic, and which is coupled with the CPU.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
3fc2c83087717dc88003428245d97b9d432fff2d 24-Jun-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: remove event limit from pmu_hw_events

Currently the event accounting data in pmu_hw_events is stored in
fixed-sized arrays within the structure.

This patch refactors the accounting data to allow any number of events
to be managed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
8a16b34e21199eb5fcf2c5050d3bc414fc5d6563 28-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: add support for multiple PMUs

Currently, a single static instance of struct pmu is used when
registering an ARM PMU with the main perf subsystem. This limits
the ARM perf code to supporting a single PMU.

This patch replaces the static struct pmu instance with a member
variable on struct arm_pmu. This provides bidirectional mapping
between the two structs, and therefore allows for support of multiple
PMUs. The function 'to_arm_pmu' is provided for convenience.

PMU-generic functions are also updated to use the new mapping, and
PMU-generic initialisation of the member variables is moved into a new
function: armpmu_init.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
e1f431b57ef9e4a68281540933fa74865cbb7a74 28-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: refactor event mapping

Currently mapping an event type to a hardware configuration value
depends on the data being pointed to from struct arm_pmu. These fields
(cache_map, event_map, raw_event_mask) are currently specific to CPU
PMUs, and do not serve the general case well.

This patch replaces the event map pointers on struct arm_pmu with a new
'map_event' function pointer. Small shim functions are used to reuse
the existing common code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
7ae18a5717cbbf1879bdd5b66d7009a9958e5aef 06-Jun-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: add type field to struct arm_pmu

Currently, the ARM perf code assumes all PMUs it will handle are
CPU PMUs, having ARM_PMU_DEVICE_CPU hardcoded when reserving or
releasing hardware. This means that currently, the ARM perf code can't
support system PMUs.

This patch adds a 'type' field to struct arm_pmu, which allows the code
to reserve & release the hardware regardless of the PMU type.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
0f78d2d5ccf72ec834da6901886a40fd8e3b7615 28-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: lock PMU registers per-CPU

Currently, a single lock serialises access to CPU PMU registers. This
global locking is unnecessary as PMU registers are local to the CPU
they monitor.

This patch replaces the global lock with a per-CPU lock. As the lock is
in struct cpu_hw_events, PMUs providing a single cpu_hw_events instance
can be locked globally.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
1b69beb7684c79673995607939d8acab51056b63 08-Aug-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: remove unnecessary armpmu->stop

As armpmu_disable will call armpmu->stop when the last event has been
removed, the extra call to armpmu->stop is pointless and simply adds to the
noise when debugging. Additionally, because this call occurs in a preemptible
context, it is problematic for per-cpu locking of PMU registers (where we
will attempt to access a per-cpu spinlock for use with raw_spin_lock_irqsave).

This patch removes the call to armpmu->stop.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
92f701e1f429e007f9619469d548022061c41ecc 04-May-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: indirect access to cpu_hw_events

Currently, cpu_hw_events is a global per-CPU variable. To enable
support for multiple PMUs, there needs to be a mapping from an instance
of arm_pmu to its cpu_hw_events. Additionally, as system PMUs are not
CPU-affine, they should not have this stored per-CPU.

This patch moves access to the hardware events data behind an accessor
function (arm_pmu::get_hw_events). This allows each instance to have
its own hardware event data, which can be stored per-CPU or globally as
required.
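
A sketch of the accessor indirection (type names follow the pre-rename
cpu_hw_events structure; approximate):

    struct arm_pmu {
            /* ... */
            struct cpu_hw_events *(*get_hw_events)(void);
            /* ... */
    };

    /* CPU PMU implementation: hand back this CPU's per-CPU data. */
    static struct cpu_hw_events *armpmu_get_cpu_events(void)
    {
            return this_cpu_ptr(&cpu_hw_events);
    }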

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
a9356a04fab912289b886824cb4b1d461987a910 04-May-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: move platform device to struct arm_pmu

Currently the ARM perf code supports having a single struct
platform_device to supply IRQ numbers, limiting it to supporting a
single PMU.

This patch makes the platform_device an instance variable on struct arm_pmu.
This should allow for multiple PMUs to be supported in future.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
03b7898d300de62078cc130fbc83b84b1d1e0f8d 27-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: move active_events into struct arm_pmu

This patch moves the active_events counter into struct arm_pmu, in
preparation for supporting multiple PMUs. This also moves
pmu_reserve_mutex, as it is used to guard accesses to active_events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
c47f8684baefa2bf52c4320f894e73db08dc8a0a 19-Jul-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: remove active_mask

Currently, pmu_hw_events::active_mask is used to keep track of which
events are active in hardware. As we can stop counters and their
interrupts, this is unnecessary.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
7b9f72c62ed047a200b1ef8c70bee0b58e880af8 27-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: clean up event group validation

Currently, event group validation compares each event's 'pmu' pointer
against the static 'pmu' pointer. This limits the code to supporting
only 1 PMU.

This patch changes the behaviour to consider an event's group leader's
'pmu' pointer as canonical for validation. This should ease later
generalisation of the code to support multiple PMUs at once.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
48957155f8791964d8567479e6986f88343aba38 27-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: only register a CPU PMU when present

Currently, an "empty" struct pmu is registered as the CPU PMU,
regardless of whether there is a physical PMU. This burdens the
accessor functions with checks to see whether a PMU is actually
present.

This patch changes initialisation to register a PMU only if there is a
supported PMU present, and removes the checks that this change makes
redundant.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
05d22fde3c0b86c8395d8f12ac01fbbc524d73ca 19-Jul-2011 Will Deacon <will.deacon@arm.com> ARM: perf: allow armpmu to implement mode exclusion

Modern PMUs allow for mode exclusion, so we no longer wish to return
-EPERM if it is requested.

This patch provides a hook in the armpmu structure for implementing
mode exclusion. The hw_perf_event initialisation is slightly delayed so
that the backend code can update the structure if required.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
ecf5a893211c26e02b9d4cfd6ba2183473ac0203 19-Jul-2011 Will Deacon <will.deacon@arm.com> ARM: perf: index PMU registers from zero

ARM PMU code used to use 1-based indices for PMU registers. This caused
several data structures (pmu_hw_events::{active_events, used_mask, events})
to have an unused element at index zero. ARMPMU_MAX_HWEVENTS still takes
this indexing into account, and currently equates to 33.

This patch updates the core ARM perf code to use the 0th index again.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
0b390e2126e03b6ec41f96fb0550b1526d00e203 27-Jul-2011 Will Deacon <will.deacon@arm.com> ARM: perf: use cpumask_t to record active IRQs

Commit 5dfc54e0 ("ARM: GIC: avoid routing interrupts to offline CPUs")
prevents the GIC from setting the affinity of an IRQ to a CPU with
id >= nr_cpu_ids. This was previously abused by perf on some platforms
where more IRQs were registered than possible CPUs.

This patch fixes the problem by using a cpumask_t to keep track of the
active (requested) interrupts in perf. The same effect could be achieved
by limiting the number of IRQs to the number of CPUs, but using a mask
instead will be useful for adding extended CPU hotplug support in the
future.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
b0e89590f4f27ea5ff30bdedb9a58ea904a6b353 26-Jul-2011 Will Deacon <will.deacon@arm.com> ARM: PMU: move CPU PMU platform device handling and init into perf

Once upon a time, OProfile and Perf fought hard over who could play with
the PMU. To stop all hell from breaking loose, pmu.c offered an internal
reserve/release API and took care of parsing PMU platform data passed in
from board support code.

Now that Perf has ingested OProfile, let's move the platform device
handling into the Perf driver and out of the PMU locking code.
Unfortunately, the lock has to remain to prevent Perf being bitten by
out-of-tree modules such as LTTng, which still claim a right to the PMU
when Perf isn't looking.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
a6c93afed38c242ccf4ec5bcb5ff26ff2521cf36 15-Apr-2011 Mark Rutland <mark.rutland@arm.com> ARM: perf: de-const struct arm_pmu

This patch removes const qualifiers from instances of struct arm_pmu,
and functions initialising them, in preparation for generalising
arm_pmu usage to system (AKA uncore) PMUs.

This will allow for dynamically modifiable structures (locks,
struct pmu) to be added as members of struct arm_pmu.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
14abd038a7a209193c58ee7dde01ef4bf1523a91 19-Jan-2011 Will Deacon <will.deacon@arm.com> ARM: perf: add support for the Cortex-A15 PMU

This patch adds support for the Cortex-A15 PMU to the ARMv7
perf-event backend.

Signed-off-by: Will Deacon <will.deacon@arm.com>
0c205cbe20654616e2f8389c0c1ff707d9dccb63 03-Jun-2011 Will Deacon <will.deacon@arm.com> ARM: perf: add support for the Cortex-A5 PMU

This patch adds support for the Cortex-A5 PMU to the ARMv7 perf-event
backend.

Signed-off-by: Will Deacon <will.deacon@arm.com>
f4f38430c94c38187db73a2cf3892cc8b12a2713 01-Jul-2011 Will Deacon <will.deacon@arm.com> ARM: 6989/1: perf: do not start the PMU when no events are present

armpmu_enable can be called in situations where no events are present
(for example, from the event rotation tick after a profiled task has
exited). In this case, we currently start the PMU anyway which may
leave it active indefinitely without any events being monitored.

This patch adds a simple check to the enabling code so that we avoid
starting the PMU when no events are present.

Cc: <stable@kernel.org>
Reported-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
f12482c9393da2c1f5cb3217f29aa79c653dd980 22-Jun-2011 Mark Rutland <mark.rutland@arm.com> ARM: 6974/1: pmu: refactor reservation

Currently, PMU platform_device reservation relies on some minor abuse
of the platform_device::id field for determining the type of PMU. This
is problematic for device tree based probing, where the ID cannot be
controlled.

This patch removes reliance on the id field, and depends on each PMU's
platform driver to figure out which type it is. As all PMUs handled by
the current platform_driver name "arm-pmu" are CPU PMUs, this
convention is hardcoded. New PMU types can be supported through the use
of {of,platform}_device_id tables.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
57ce9bb39b476accf8fba6e16aea67ed76ea523d 17-May-2011 Mark Rutland <mark.rutland@arm.com> ARM: 6902/1: perf: Remove erroneous check on active_events

When initialising a PMU, there is a check to protect against races with
other CPUs filling all of the available event slots. Since armpmu_add
checks that an event can be scheduled, we do not need to do this at
initialisation time. Furthermore the current code is broken because it
assumes that atomic_inc_not_zero will unconditionally increment
active_events and then tries to decrement it again on failure.

This patch removes the broken, redundant code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
860ad7823fdc00cd61dc70e7f35e07fb327cc9a4 18-Apr-2011 Sonny Rao <sonnyrao@chromium.org> ARM: 6884/1: Fix infinite loop in ARM user perf_event backtrace code

The ARM user backtrace code can get into an infinite loop if it
runs into an invalid stack frame which points back to itself.
This situation has been observed in practice. Fix it by capping
the number of entries in the backtrace. This is also what other
architectures do in their backtrace code.
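
A minimal sketch of the capped loop in perf_callchain_user (approximate):

    /* Bound the depth so a frame that points back at itself cannot spin
     * forever. */
    while ((entry->nr < PERF_MAX_STACK_DEPTH) &&
           tail && !((unsigned long)tail & 0x3))
            tail = user_backtrace(tail, entry);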

Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
6759788b944139793bffa889761cc3d8d703fdc0 05-Apr-2011 Will Deacon <will.deacon@arm.com> ARM: 6865/1: perf: ensure pass through zero is counted on overflow

Commit a737823d ("ARM: perf: ensure overflows aren't missed due to IRQ
latency") changed the way that event deltas are calculated on overflow
so that we don't miss events when the new count value overtakes the
previous one.

Unfortunately, we forget to count the event that passes through zero so
we end up being off by 1. This patch adds on the correction.

Reported-by: Chris Moore <moore@free.fr>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
a737823d37666255e3e74ce84bc9611a038e0888 25-Mar-2011 Will Deacon <will.deacon@arm.com> ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency

If a counter overflows during a perf stat profiling run it may overtake
the last known value of the counter:

0          prev    new           0xffffffff
|----------|-------|----------------------|

In this case, the number of events that have occurred is
(0xffffffff - prev) + new. Unfortunately, the event update code will
not realise an overflow has occurred and will instead report the event
delta as (new - prev) which may be considerably smaller than the real
count.

This patch adds an extra argument to armpmu_event_update which indicates
whether or not an overflow has occurred. If an overflow has occurred
then we use the maximum period of the counter to calculate the elapsed
events.
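
A hedged sketch of the updated delta calculation in armpmu_event_update
(the "+ 1" is the pass-through-zero correction described in the 6865/1
entry above):

    new_raw_count &= armpmu->max_period;
    prev_raw_count &= armpmu->max_period;

    if (overflow)
            /* Counter wrapped: events up to max_period, the wrap through
             * zero (+1), plus the events counted since the wrap. */
            delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
    else
            delta = new_raw_count - prev_raw_count;

    local64_add(delta, &event->count);
    local64_sub(delta, &hwc->period_left);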

Acked-by: Jamie Iles <jamie@jamieiles.com>
Reported-by: Ashwin Chaugule <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
574b69cbb633037a9c305d2993aeb680f4a8badd 25-Mar-2011 Will Deacon <will.deacon@arm.com> ARM: 6834/1: perf: reset counters on all CPUs during initialisation

ARMv7 dictates that the interrupt-enable and count-enable registers for
each PMU counter are UNKNOWN following core reset.

This patch adds a new (optional) function pointer to struct arm_pmu for
resetting the PMU state during init. The reset function is called on
each CPU via an arch_initcall in the generic ARM perf_event code and
allows the PMU backend to write sane values to any UNKNOWN registers.

Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
0e25a5c98067286fc727cf142fc0dadf95790921 08-Feb-2011 Rabin Vincent <rabin.vincent@stericsson.com> ARM: perf_event: allow platform-specific interrupt handler

Allow a platform-specific IRQ handler to be specified via platform data.
This will be used to implement the single-irq workaround for the DB8500.

Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
cb06199b1df492fcfbaedd2256b5054f944b664f 09-Feb-2011 Rabin Vincent <rabin.vincent@stericsson.com> ARM: 6654/1: perf/oprofile: fix off-by-one in stack check

Since tail is the previous fp - 1, we need to compare the new fp with tail + 1
to ensure that we don't end up passing in the same tail again, in order to
avoid a potential infinite loop in the perf interrupt handler (which has been
observed to occur). A similar fix seems to be needed in the OProfile code.
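
A sketch of the corrected check inside user_backtrace() (types simplified):

    /*
     * The next tail we would pass in is buftail.fp - 1, so require the
     * saved frame pointer to advance strictly beyond tail + 1.
     */
    if (tail + 1 >= buftail.fp)
            return NULL;

    return buftail.fp - 1;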

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2e80a82a49c4c7eca4e35734380f28298ba5db19 17-Nov-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Dynamic pmu types

Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.

Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.

If we want to enumerate the PMUs they need a name, provide one.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.259707703@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
961ec6daa7b14f376c30d447a830fa4783a2112c 02-Dec-2010 Will Deacon <will.deacon@arm.com> ARM: 6521/1: perf: use raw_spinlock_t for pmu_lock

For kernels built with PREEMPT_RT, critical sections protected
by standard spinlocks are preemptible. This is not acceptable
for perf because (a) we may be scheduled onto a different CPU whilst
reading/writing banked PMU registers and (b) the latency when
reading the PMU registers becomes unpredictable.

This patch upgrades the pmu_lock spinlock to a raw_spinlock
instead.
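
The resulting pattern is, in outline (function name assumed):

  #include <linux/spinlock.h>

  /* A raw spinlock remains a true spinning lock under PREEMPT_RT, so the
   * critical section cannot be preempted or migrated to another CPU. */
  static DEFINE_RAW_SPINLOCK(pmu_lock);

  static void pmu_update_sketch(void)
  {
          unsigned long flags;

          raw_spin_lock_irqsave(&pmu_lock, flags);
          /* read/modify/write the banked PMU registers here */
          raw_spin_unlock_irqrestore(&pmu_lock, flags);
  }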

Reported-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
4d6b7a779be34e1df296abc1dc555134a8cf34af 30-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: 6512/1: perf: fix warnings generated by sparse

Russell reported a number of warnings coming from sparse when
checking the ARM perf_event.c files:

| perf_event.c seems to also have problems too:
|
| CHECK arch/arm/kernel/perf_event.c
| arch/arm/kernel/perf_event.c:37:1: warning: symbol 'pmu_lock' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:70:1: warning: symbol 'cpu_hw_events' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:1006:1: warning: symbol 'armv6pmu_enable_event' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:1113:1: warning: symbol 'armv6pmu_stop' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:1956:6: warning: symbol 'armv7pmu_enable_event' was not declared. Should it be static?
| arch/arm/kernel/perf_event.c:3072:14: warning: incorrect type in argument 1 (different address spaces)
| arch/arm/kernel/perf_event.c:3072:14: expected void const volatile [noderef] <asn:1>*<noident>
| arch/arm/kernel/perf_event.c:3072:14: got struct frame_tail *tail
| arch/arm/kernel/perf_event.c:3074:49: warning: incorrect type in argument 2 (different address spaces)
| arch/arm/kernel/perf_event.c:3074:49: expected void const [noderef] <asn:1>*from
| arch/arm/kernel/perf_event.c:3074:49: got struct frame_tail *tail

This patch resolves these issues so we can live in silence
again.

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
004417a6d468e24399e383645c068b498eed84ad 25-Nov-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf, arch: Cleanup perf-pmu init vs lockup-detector

The perf hardware pmu got initialized at various points in the boot,
some before early_initcall(), some after (notably arch_initcall).

The problem is that the NMI lockup detector is run from early_initcall()
and expects the hardware pmu to be present.

Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and moving the lockup detector to an explicit
initcall right after that.

Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
43eab87828fee65f89f4088736b2b7a187390a2f 13-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: perf: separate PMU backends into multiple files

The ARM perf_event.c file contains all PMU backends and, as new PMUs
are introduced, will continue to grow.

This patch follows the example of x86 and splits the PMU implementations
into separate files which are then #included back into the main
file. Compile-time guards are added to each PMU file to avoid compiling
in code that is not relevant for the version of the architecture which
we are targeting.
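
The resulting structure is roughly:

  /* in arch/arm/kernel/perf_event.c, after the shared code */
  #include "perf_event_xscale.c"
  #include "perf_event_v6.c"
  #include "perf_event_v7.c"

Each included file wraps its implementation in a compile-time guard (for
example CONFIG_CPU_V7 for the v7 backend) so that backends for other
architecture versions compile away to stubs.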

Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
629948310e4270e9b32c37b4a65a8cd5d6ebf38a 13-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: perf: encode PMU name in arm_pmu structure

Currently, perf uses the PMU ID as an index into a string table
to look up the name of a given PMU.

This patch encodes the name of a PMU directly into the arm_pmu
structure so that PMU-specific code can be factored out into
separate files.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
3cb314bae2191b432a7e898abf865db880f6d07d 13-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: perf: add _init() functions to PMUs

In preparation for separating the PMU-specific code, this patch adds
self-contained init functions to each PMU, therefore removing any
PMU-specific knowledge from the PMU-agnostic init_hw_perf_events
function.

Acked-by: Jamie Iles <jamie@jamieiles.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
59a98a1e56edea4d7d9c5f4ce9d50e271a04993c 13-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: perf: avoid exposing internal stop function for v6 PMU

Unlike other pmu functions, armv6pmu_pmu_stop is not declared static.
This patch adds the missing keyword.

Acked-by: Jamie Iles <jamie.iles@jamieiles.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
84fee97a026ca085f08381054513f9e24689a303 13-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: perf: consolidate common PMU behaviour

The functions for mapping PMU events (perf, cache and raw) are
common between all PMU types and differ only in the data on which
they operate.

This patch implements common definitions of these mapping functions
and changes the arm_pmu struct to hold pointers to the data which
they require. This is in anticipation of separating out the PMU-specific
code into separate files.
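
A sketch of the shared lookup (names and details illustrative):

  #include <linux/errno.h>
  #include <linux/perf_event.h>

  #define HW_OP_UNSUPPORTED 0xFFFF        /* table entry with no HW event */

  /* Sketch: the backend only supplies the table; the lookup is common. */
  struct arm_pmu_map_sketch {
          const unsigned (*event_map)[PERF_COUNT_HW_MAX];
  };

  static int map_hw_event(struct arm_pmu_map_sketch *pmu, u64 config)
  {
          unsigned mapping;

          if (config >= PERF_COUNT_HW_MAX)
                  return -EINVAL;

          mapping = (*pmu->event_map)[config];
          return mapping == HW_OP_UNSUPPORTED ? -EOPNOTSUPP : (int)mapping;
  }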

Acked-by: Jamie Iles <jamie.iles@jamieiles.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
c3b291d98878a5f25fee56255bcfa420e85dff59 04-Nov-2010 Will Deacon <will.deacon@arm.com> ARM: 6469/1: perf-events: squash compiler warning

armv7_pmnc_counter_has_overflowed can return uninitialised data
if an invalid counter is specified.

This patch fixes the code to return 0 in this case, which squashes
the compiler warning from GCC 4.5.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
e360adbe29241a0194e10e20595360dd7b98a2b3 14-Oct-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> irq_work: Add generic hardirq context callbacks

Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like waking up a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this make do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
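
Typical usage of the new interface looks like this (a minimal sketch;
the handler and the names here are assumptions):

  #include <linux/init.h>
  #include <linux/irq_work.h>

  static struct irq_work drain_work;

  static void drain_buffers(struct irq_work *work)
  {
          /* runs later, in hardirq context, where waking tasks is safe */
  }

  static void nmi_path_sketch(void)
  {
          /* NMI context: too restricted to wake anything directly */
          irq_work_queue(&drain_work);
  }

  static int __init drain_setup(void)
  {
          init_irq_work(&drain_work, drain_buffers);
          return 0;
  }
  early_initcall(drain_setup);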

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
1efeb08d7dd32c0fbd4b784ea9303b53d345bfd0 14-Oct-2010 Ingo Molnar <mingo@elte.hu> perf, ARM: Fix sysfs bits removal build failure

Fix this linux-next build failure that Stephen reported:

arch/arm/kernel/perf_event.c: In function 'armpmu_event_init':
arch/arm/kernel/perf_event.c:543: error: request for member 'num_events' in something not a structure or union

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
LKML-Reference: <20101014164925.4fa16b75.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
3bf101ba42a1c89b5afbc7492e7647dae5e18735 27-Sep-2010 Matt Fleming <matt@console-pimps.org> perf: Add helper function to return number of counters

The number of counters for the registered pmu is needed in a few places,
so provide a helper function that returns this number.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Robert Richter <robert.richter@amd.com>
15ac9a395a753cb28c674e7ea80386ffdff21785 06-Sep-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Remove the sysfs bits

Neither the overcommit nor the reservation sysfs parameter was
actually working; remove them, as they'll only get in the way.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
a4eaf7f14675cb512d69f0c928055e73d0c6d252 16-Jun-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Rework the PMU methods

Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.

The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.

This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).

It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).

The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:

1) We disable the counter:
a) the pmu has per-counter enable bits, we flip that
b) we program a NOP event, preserving the counter state

2) We store the counter state and ignore all read/overflow events
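
In outline, the reworked method set and its flags look like this (a
sketch of the shape, not the full struct pmu):

  #include <linux/perf_event.h>

  /* Flags passed to the new methods:
   *   PERF_EF_START  - ->add() should also start the counter
   *   PERF_EF_RELOAD - ->start() should reprogram the period first
   *   PERF_EF_UPDATE - ->stop() should update the count before stopping
   */
  struct pmu_methods_sketch {
          int  (*add)(struct perf_event *event, int flags);
          void (*del)(struct perf_event *event, int flags);
          void (*start)(struct perf_event *event, int flags);
          void (*stop)(struct perf_event *event, int flags);
  };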

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
33696fc0d141bbbcb12f75b69608ea83282e3117 14-Jun-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Per PMU disable

Changes perf_disable() into perf_pmu_disable().

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
24cd7f54a0d47e1d5b3de29e2456bfbd2d8447b7 11-Jun-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Reduce perf_disable() usage

Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
b0a873ebbf87bf38bf70b5e39a7cadc96099fa13 11-Jun-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Register PMU implementations

Simple registration interface for struct pmu, this provides the
infrastructure for removing all the weak functions.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
51b0fe39549a04858001922919ab355dee9bdfcf 11-Jun-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Deconstify struct pmu

sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
65b4711ff513767341aa1915c822de6ec0de65cb 02-Sep-2010 Will Deacon <will.deacon@arm.com> ARM: 6352/1: perf: fix event validation

The validate_event function in the ARM perf events backend has the
following problems:

1.) Events that are disabled count towards the cost.
2.) Events associated with other PMUs [for example, software events or
breakpoints] do not count towards the cost, but do fail validation,
causing the group to fail.

This patch changes validate_event so that it ignores events in the
PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.
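
The test reduces to something along these lines (sketch; names assumed):

  #include <linux/perf_event.h>
  #include <linux/types.h>

  /* Does this event count against our hardware counters during group
   * validation? Disabled events and events owned by other PMUs do not. */
  static bool counts_against_pmu(struct perf_event *event,
                                 struct pmu *this_pmu)
  {
          if (event->state == PERF_EVENT_STATE_OFF)
                  return false;           /* disabled: costs nothing */
          if (event->pmu != this_pmu)
                  return false;           /* another PMU's problem */
          return true;
  }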

Reported-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
25d3584797a39f57b69cd835722ac7c41113fb9a 16-Aug-2010 Will Deacon <will.deacon@arm.com> ARM: 6330/1: perf: reword comments relating to perf_event_do_pending

This is purely a cosmetic change to the ARM perf backend because the current
comments about the relationship between NMIs, interrupt context and
perf_event_do_pending are misleading.

This patch updates the comments so that they reflect what the code
actually does (which is in line with other architectures).

Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
f72c1a931e311bb7780fee19e41a89ac42cab50e 01-Jul-2010 Frederic Weisbecker <fweisbec@gmail.com> perf: Factorize callchain context handling

Store the kernel and user contexts from the generic layer instead
of in the arch code; this factors out some repetitive code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
56962b4449af34070bb1994621ef4f0265eed4d8 30-Jun-2010 Frederic Weisbecker <fweisbec@gmail.com> perf: Generalize some arch callchain code

- Most archs use one callchain buffer per cpu, except x86, which needs
to deal with NMIs. Provide a default perf_callchain_buffer()
implementation that x86 overrides.

- Centralize all the kernel/user regs handling and invoke the new arch
handlers from there: perf_callchain_user() / perf_callchain_kernel().
This avoids all the user_mode(), current->mm checks and so on.

- Invert some parameters in perf_callchain_*() helpers: entry to the
left, regs to the right, following the traditional (dst, src) ordering.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
70791ce9ba68a5921c9905ef05d23f62a90bc10c 29-Jun-2010 Frederic Weisbecker <fweisbec@gmail.com> perf: Generalize callchain_store()

callchain_store() is the same on every arch; inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.

This removes repetitive code.
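
The helper is small enough to live as an inline in perf_event.h; in
outline it is just:

  static inline void perf_callchain_store(struct perf_callchain_entry *entry,
                                          u64 ip)
  {
          if (entry->nr < PERF_MAX_STACK_DEPTH)
                  entry->ip[entry->nr++] = ip;
  }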

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
c1a65932fd7216fdc9a0db8bbffe1d47842f862c 29-Jun-2010 Frederic Weisbecker <fweisbec@gmail.com> perf: Drop unappropriate tests on arch callchains

Drop the TASK_RUNNING test on user tasks for callchains, as
this check doesn't seem to make any sense.

Also remove the test for !current, which is not supposed to
happen, and the test on current->pid, as this should be handled at
the generic level with the exclude_idle attribute.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
446a5a8b1eb91a6990e5c8fe29f14e7a95b69132 02-Jul-2010 Will Deacon <will.deacon@arm.com> ARM: 6205/1: perf: ensure counter delta is treated as unsigned

Hardware performance counters on ARM are 32 bits wide, but atomic64_t
variables are used to represent counter data in the hw_perf_event structure.

The armpmu_event_update function right-shifts a signed 64-bit delta variable
and adds the result to the event count. This can lead to shifting in sign-bits
if the MSB of the 32-bit counter value is set. This results in perf output
such as:

Performance counter stats for 'sleep 20':

 18446744073460670464  cycles               <-- 0xFFFFFFFFF12A6000
              7783773  instructions         #      0.000 IPC
                  465  context-switches
                  161  page-faults
              1172393  branches

         20.154242147  seconds time elapsed

This patch ensures that the delta value is treated as unsigned so that the
right shift sets the upper bits to zero.
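
The effect is easy to reproduce in isolation. A stand-alone sketch of why
a signed right shift misbehaves once the counter's top bit is set (gcc
performs an arithmetic shift on signed values):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint64_t prev = 0x00001000;         /* small previous value    */
          uint64_t now  = 0xF12A6000;         /* MSB of 32-bit count set */

          int64_t  sdelta = (int64_t)((now << 32) - (prev << 32));
          uint64_t udelta = (now << 32) - (prev << 32);

          sdelta >>= 32;                      /* drags the sign bit down */
          udelta >>= 32;                      /* fills with zeroes       */

          printf("signed:   %llu\n", (unsigned long long)sdelta);
          printf("unsigned: %llu\n", (unsigned long long)udelta);
          return 0;
  }

The signed variant prints a huge 1844... value of the kind shown above,
while the unsigned variant prints the real 32-bit difference.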

Cc: <stable@kernel.org>
Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
e78505958cf123048fb48cb56b79cebb8edd15fb 21-May-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Convert perf_event to local_t

Since now all modifications to event->count (and ->prev_count
and ->period_left) are local to a cpu, change them to local64_t so we
avoid the LOCK'ed ops.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
929f5199448a67d41bb249d58815ef77bcd53622 30-Apr-2010 Will Deacon <will.deacon@arm.com> ARM: 6071/1: perf-events: allow modules to query the number of hardware counters

For OProfile to initialise oprofilefs correctly, it needs to know
the number of counters it can represent.

This patch adds a function to the ARM perf-events backend to return
the number of hardware counters available for the current PMU.

Cc: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
49e6a32f2f0876b6267584d9c7e0e213bca6e2b8 30-Apr-2010 Will Deacon <will.deacon@arm.com> ARM: 6070/1: perf-events: add support for xscale PMUs

The perf-events framework for ARM only supports v6 and v7 cores.

This patch adds support for xscale v1 and v2 PMUs to perf, based on the
OProfile drivers in arch/arm/oprofile/op_model_xscale.c

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
181193f398e7d8da6b1196138f0e219709621743 30-Apr-2010 Will Deacon <will.deacon@arm.com> ARM: 6069/1: perf-events: use numeric ID to identify PMU

The ARM perf-events framework provides support for a number of different
PMUs using struct arm_pmu. The char *name field of this struct can be
used to identify the PMU, but this is cumbersome if used outside of perf.

This patch replaces the name string for a PMU with an enum, which holds
a unique ID for the PMU being represented. This ID can be used to index
an array of names within perf, so no functionality is lost. The presence
of the ID field allows other kernel subsystems [currently oprofile] to
use their own mappings for the PMU name.

Cc: Jean Pihet <jpihet@mvista.com>
Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
49c006b93769a86bec2b32b9234abf016ac0d50e 29-Apr-2010 Will Deacon <will.deacon@arm.com> ARM: 6064/1: pmu: register IRQs at runtime

The current PMU infrastructure for ARM requires that the IRQs for the PMU
device are fixed at compile time and are selected based on the ARCH_ or
MACH_ flags. This has the disadvantage of tying the kernel down to a
particular board as far as profiling is concerned.

This patch replaces the compile-time IRQ registration with a runtime
mechanism which allows the IRQs to be registered with the framework as
a platform_device.

A further advantage of this change is that there is scope for registering
different types of performance counters in the future by changing the id
of the platform_device and attaching different resources to it.
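
A board file can then describe its PMU like any other device. A sketch
(the IRQ number is made up and the device name is assumed; the use of
the id field follows the scheme described above rather than quoting the
patch):

  #include <linux/init.h>
  #include <linux/ioport.h>
  #include <linux/platform_device.h>

  #define IRQ_BOARD_PMU 92                /* assumed board IRQ number */

  static struct resource pmu_resource = {
          .start  = IRQ_BOARD_PMU,
          .end    = IRQ_BOARD_PMU,
          .flags  = IORESOURCE_IRQ,
  };

  static struct platform_device pmu_device = {
          .name           = "arm-pmu",    /* assumed device name */
          .id             = -1,           /* real code uses the id to pick the PMU type */
          .resource       = &pmu_resource,
          .num_resources  = 1,
  };

  static int __init board_pmu_init(void)
  {
          return platform_device_register(&pmu_device);
  }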

Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
d10fca9f39238b07cc670b441d2b423de30359d2 26-Feb-2010 Will Deacon <will.deacon@arm.com> ARM: 5960/1: ARM: perf-events: fix v7 event selection mask

The event selection mask for ARMv7 cores [ARMV7_EVTSEL_MASK]
is incorrectly set to 0x7f. This means that the top bit of an
event ID is ignored, so counting branch misses (id=0x10) gives the
same results as counting ISBs (id=0x90).

This patch sets the event selection mask to the correct value
of 0xff.
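
The change itself is essentially a one-liner:

  -#define ARMV7_EVTSEL_MASK 0x7f  /* top bit of the event ID was lost */
  +#define ARMV7_EVTSEL_MASK 0xff  /* all eight event-ID bits are kept */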

Signed-off-by: Jean Pihet <jpihet@mvista.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ddee87f208b6229d2910dd5930c87089dc56c87e 25-Feb-2010 Will Deacon <will.deacon@arm.com> ARM: 5959/1: ARM: perf-events: request PMU interrupts with IRQF_NOBALANCING

If IRQ balancing is used on a multicore ARM system, PMU interrupt
lines may be relocated onto CPUs other than the one causing the
counter overflow. This can result in misattribution of events to
the wrong core and, in the case that the CPU handling the interrupt
has not experienced a counter overflow, the interrupt can be disabled
because the handler returns IRQ_NONE.

This patch adds the IRQF_NOBALANCING flag to the request_irq call
in perf_events.c.
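
In outline (sketch; the handler, device name and any other flags are
elided):

  #include <linux/interrupt.h>

  /* Keep the PMU IRQ pinned so an overflow is always handled on the CPU
   * whose counters raised it. */
  static int grab_pmu_irq(int irq, irq_handler_t handler)
  {
          return request_irq(irq, handler, IRQF_NOBALANCING,
                             "arm-pmu", NULL);
  }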

Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
dc1d628a67a8f042e711ea5accc0beedc3ef0092 03-Mar-2010 Peter Zijlstra <a.p.zijlstra@chello.nl> perf: Provide generic perf_sample_data initialization

This makes it easier to extend perf_sample_data and fixes a bug on arm
and sparc, which failed to set ->raw to NULL; this can cause crashes
when combined with PERF_SAMPLE_RAW.

It also optimizes PowerPC and tracepoint, because the struct
initialization is forced to zero out the whole structure.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jean Pihet <jpihet@mvista.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Jamie Iles <jamie.iles@picochip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@kernel.org
LKML-Reference: <20100304140100.315416040@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
796d12959ad374cae8eb77faaf4243455a305433 26-Jan-2010 Jean PIHET <jpihet@mvista.com> ARM: 5903/1: arm/perfevents: add support for ARMv7

Adds Performance Events support for ARMv7 processors, using
the PMNC unit in HW.

Supports the following:
- Cortex-A8 and Cortex-A9 processors,
- dynamic detection of the number of available counters,
based on the PMCR value,
- runtime detection of the CPU arch (v6 or v7)
and model (Cortex-A8 or Cortex-A9)

Tested on OMAP3 (Cortex-A8) only.

Signed-off-by: Jean Pihet <jpihet@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1b8873a0c6ec511870c106c80b94658f857c47f2 02-Feb-2010 Jamie Iles <jamie.iles@picochip.com> ARM: 5902/4: arm/perfevents: implement perf event support for ARMv6

This patch implements support for ARMv6 performance counters in the
Linux performance events subsystem. ARMv6 architectures that have the
performance counters should enable HW_PERF_EVENTS to get hardware
performance events support in addition to the software events.

Note: only ARM Ltd ARM cores are supported.

This implementation also provides an ARM PMU abstraction layer to allow
ARMv7 and others to be supported in the future by adding a new
'struct arm_pmu'.

Cc: Jean Pihet <jpihet@mvista.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>