oprofile.xml revision 7a33c86eb98056ef0570c99e713214f8dc56b6ef
1<?xml version="1.0" encoding='ISO-8859-1'?>
2<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
3
4<book id="oprofile-guide">
5<bookinfo>
6	<title>OProfile manual</title>
7 
8	<authorgroup>
9		<author>
10			<firstname>John</firstname>
11			<surname>Levon</surname>
12			<affiliation>
13				<address><email>levon@movementarian.org</email></address>
14			</affiliation>
15		</author>
16	</authorgroup>
17
18	<copyright>
19		<year>2000-2004</year>
20		<holder>Victoria University of Manchester, John Levon and others</holder>
21	</copyright>
22</bookinfo>
23
24<toc></toc>
25
26<chapter id="introduction">
27<title>Introduction</title>
28
29<para>
30This manual applies to OProfile version <oprofileversion />.
OProfile is a profiling system for Linux 2.2/2.4/2.6 systems on a number of architectures. It is capable of profiling
all parts of a running system, from the kernel (including modules and interrupt handlers) to shared libraries
and application binaries. It runs transparently in the background, collecting information at low overhead. These
features make it ideal for profiling entire systems to find bottlenecks under real-world workloads.
35</para>
36<para>
Many CPUs provide "performance counters", hardware registers that can count "events" such as
cache misses or CPU cycles. OProfile builds profiles of code from these events:
each time a certain (configurable) number of events has occurred, the program counter (PC) value is recorded.
This information is aggregated into profiles for each binary image.</para>
41<para>
42Some hardware setups do not allow OProfile to use performance counters: in these cases, no
43events are available, and OProfile operates in timer/RTC mode, as described in later chapters.
44</para>
45<sect1 id="applications">
46<title>Applications of OProfile</title>
47<para>
48OProfile is useful in a number of situations. You might want to use OProfile when you :
49</para>
50<itemizedlist>
51<listitem><para>need low overhead</para></listitem>
52<listitem><para>cannot use highly intrusive profiling methods</para></listitem>
53<listitem><para>need to profile interrupt handlers</para></listitem>
54<listitem><para>need to profile an application and its shared libraries</para></listitem>
55<listitem><para>need to profile dynamically compiled code of supported virtual machines (see <xref linkend="jitsupport"/>)</para></listitem>
<listitem><para>need to capture the performance behaviour of the entire system</para></listitem>
57<listitem><para>want to examine hardware effects such as cache misses</para></listitem>
58<listitem><para>want detailed source annotation</para></listitem>
59<listitem><para>want instruction-level profiles</para></listitem>
60<listitem><para>want call-graph profiles</para></listitem>
61</itemizedlist>
62<para>
63OProfile is not a panacea. OProfile might not be a complete solution when you :
64</para>
65<itemizedlist>
66<listitem><para>require call graph profiles on platforms other than 2.6/x86</para></listitem>
67<listitem><para>don't have root permissions</para></listitem>
68<listitem><para>require 100% instruction-accurate profiles</para></listitem>
69<listitem><para>need function call counts or an interstitial profiling API</para></listitem>
70<listitem><para>cannot tolerate any disturbance to the system whatsoever</para></listitem>
71<listitem><para>need to profile interpreted or dynamically compiled code of non-supported virtual machines</para></listitem>
72</itemizedlist>
73<sect2 id="jitsupport">
74<title>Support for dynamically compiled (JIT) code</title>
75<para>
76Older versions of OProfile were not capable of attributing samples to symbols from dynamically
77compiled code, i.e. "just-in-time (JIT) code". Typical JIT compilers load the JIT code into
78anonymous memory regions. OProfile reported the samples from such code, but the attribution
79provided was simply:
80        <screen>"anon: &lt;tgid&gt;&lt;address range&gt;" </screen>
81Due to this limitation, it wasn't possible to profile applications executed by virtual machines (VMs)
82like the Java Virtual Machine. OProfile now contains an infrastructure to support JITed code.
83A development library is provided to allow developers
84to add support for any VM that produces dynamically compiled code (see the <emphasis>OProfile JIT agent
85developer guide</emphasis>).
86In addition, built-in support is included for the following:</para>
<itemizedlist><listitem><para>JVMTI agent library for Java (1.5 and higher)</para></listitem>
<listitem><para>JVMPI agent library for Java (1.5 and lower)</para></listitem>
89</itemizedlist>
90<para>
91For information on how to use OProfile's JIT support, see <xref linkend="setup-jit"/>.
92</para>
93</sect2>
94</sect1>
95
96<sect1 id="requirements">
97<title>System requirements</title>
98
99<variablelist>
100	<varlistentry>
101		<term>Linux kernel 2.2/2.4/2.6</term>
102		<listitem><para>
			OProfile uses a kernel module that can be compiled for
			2.2.11 or later and for 2.4 kernels. Kernel 2.4.10 or above is required if you use the
			boot-time kernel option <option>nosmp</option>.  2.6 kernels are supported through the in-kernel
			OProfile driver. Note that only 32-bit x86 and IA64 are supported on 2.2/2.4 kernels.
107			</para>
108
109			<para>
110			2.6 kernels are strongly recommended. Under 2.4, OProfile may cause system crashes if power
111			management is used, or the BIOS does not correctly deal with local APICs.
112			</para>
113
114			<para>
			To use OProfile's JIT support, a kernel version 2.6.13 or later is required.
			In earlier kernel versions, the anonymous memory regions are not reported to OProfile, which results
			in profiling reports without any samples in these regions.
118			</para>
119
120			<para>
121			PPC64 processors (Power4/Power5/PPC970, etc.) require a recent (&gt; 2.6.5) kernel with the line 
122			<constant>#define PV_970</constant> present in <filename>include/asm-ppc64/processor.h</filename>.
123<!-- FIXME: do we require always gte 2.4.10 for nosmp ? -->
124                       </para>
			<para>
			Profiling the Cell Broadband Engine PowerPC Processing Element (PPE) requires a kernel version
			of 2.6.18 or more recent.
			Profiling the Cell Broadband Engine Synergistic Processing Element (SPE) requires a kernel version
			of 2.6.22 or more recent.  Additionally, full support of SPE profiling requires a BFD library
			from binutils code dated January 2007 or later.  To ensure the proper BFD support exists, run
			the <code>configure</code> utility with <code>--with-target=cell-be</code>.
			</para>
			<para>
			Profiling the Cell Broadband Engine using SPU events requires a kernel version of 2.6.29-rc1
			or more recent.
			</para>
			<note><para>Attempting to profile SPEs with kernel versions older than 2.6.22 may cause the
			system to crash.</para></note>
139		
140			<para>
			Instruction-Based Sampling (IBS) profiling on AMD family10h processors requires
			kernel version 2.6.28-rc2 or later.
143			</para>
144		</listitem>
145	</varlistentry>
146	<varlistentry>
147		<term>modutils 2.4.6 or above</term>
148		<listitem><para>
			modutils 2.4.6 or higher should be installed (in fact, earlier versions work well in almost all
			cases).
151		</para></listitem>
152	</varlistentry>
153	<varlistentry>
154		<term>Supported architecture</term>
155		<listitem><para>
156			For Intel IA32, a CPU with either a P6 generation or Pentium 4 core is
157			required. In marketing terms this translates to anything
158			between an Intel Pentium Pro (not Pentium Classics) and
159			a Pentium 4 / Xeon, including all Celerons.  The AMD
160			Athlon, Opteron, Phenom, and Turion CPUs are also supported.  Other IA32
161			CPU types only support the RTC mode of OProfile; please
			see later in this manual for details.  Hyper-threaded Pentium 4s
			are not supported on 2.4 kernels. For 2.4 kernels, the Intel
164			IA-64 CPUs are also supported. For 2.6 kernels, there is additionally
165			support for Alpha processors, MIPS, ARM, x86-64, sparc64, ppc64, AVR32, and,
166			in timer mode, PA-RISC and s390.
167		</para></listitem>
168	</varlistentry>
169	<varlistentry>
170		<term>Uniprocessor or SMP</term>
171		<listitem><para>
172			SMP machines are fully supported.
173		</para></listitem>
174	</varlistentry>
175	<varlistentry>
176		<term>Required libraries</term>
177		<listitem><para>
			These libraries are required: <filename>popt</filename>, <filename>bfd</filename>,
			<filename>liberty</filename> (Debian users: libiberty is provided in the binutils-dev package), <filename>dl</filename>,
			plus the standard C++ libraries.
181		</para></listitem>
182	</varlistentry>
183	<varlistentry>
184		<term>Required user account</term>
185		<listitem><para>
186			For secure processing of sample data from JIT virtual machines (e.g., Java),
187			the special user account "oprofile" must exist on the system.  The 'configure'
188			and 'make install' operations will print warning messages if this
189			account is not found.  If you intend to profile JITed code, you must create
190			a group account named 'oprofile' and then create the 'oprofile' user account,
191			setting the default group to 'oprofile'.  A runtime error message is printed to
192			the oprofile daemon log when processing JIT samples if this special user
193			account cannot be found.
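			A minimal sketch of creating the group and the user, assuming the
			<command>groupadd</command> and <command>useradd</command> utilities found on many
			distributions are available (run as root; your distribution may provide different tools):
			<screen>groupadd oprofile
useradd -g oprofile oprofile</screen>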
194		</para></listitem>
195	</varlistentry>
196	<varlistentry>
197		<term>OProfile GUI</term>
198		<listitem><para>
199			The use of the GUI to start the profiler requires the <filename>Qt 2</filename> library. <filename>Qt 3</filename> should
200			also work.
201		</para></listitem>
202	</varlistentry>
203	<varlistentry>
204 		<term><acronym>ELF</acronym></term>
205		<listitem><para>
206			Probably not too strenuous a requirement, but older <acronym>A.OUT</acronym> binaries/libraries are not supported.
207		</para></listitem>
208	</varlistentry>
209	<varlistentry>
210		<term>K&amp;R coding style</term>
211		<listitem><para>
212			OK, so it's not really a requirement, but I wish it was...
213		</para></listitem>
214	</varlistentry>
215</variablelist>
216
217
218</sect1>
219
220<sect1 id="resources">
221<title>Internet resources</title>
222
223<variablelist>
224	<varlistentry>
225		<term>Web page</term>
226		<listitem><para>
227			There is a web page (which you may be reading now) at
228			<ulink url="http://oprofile.sf.net/">http://oprofile.sf.net/</ulink>.
229		</para></listitem>
230	</varlistentry>
231	<varlistentry>
232		<term>Download</term>
233		<listitem><para>
234			You can download a source tarball or get anonymous CVS at the sourceforge page,
235			<ulink url="http://sf.net/projects/oprofile/">http://sf.net/projects/oprofile/</ulink>.
236		</para></listitem>
237	</varlistentry>
238	<varlistentry>
239		<term>Mailing list</term>
240		<listitem><para>
241			There is a low-traffic OProfile-specific mailing list, details at
242			<ulink url="http://sf.net/mail/?group_id=16191">http://sf.net/mail/?group_id=16191</ulink>.
243		</para></listitem>
244	</varlistentry>
245	<varlistentry>
246		<term>Bug tracker</term>
247		<listitem><para>
248			There is a bug tracker for OProfile at SourceForge,
			<ulink url="http://sf.net/tracker/?group_id=16191&amp;atid=116191">http://sf.net/tracker/?group_id=16191&amp;atid=116191</ulink>.
250		</para></listitem>
251	</varlistentry>
252	<varlistentry>
253		<term>IRC channel</term>
254		<listitem><para>
255			Several OProfile developers and users sometimes hang out on channel <command>#oprofile</command>
256			on the <ulink url="http://oftc.net">OFTC</ulink> network. 
257		</para></listitem>
258	</varlistentry>
259</variablelist>
260
261</sect1>
262
263<sect1 id="install">
264<title>Installation</title>
265
266<para>
First you need to build OProfile and install it. <command>./configure</command>, <command>make</command>, <command>make install</command>
is often all you need, but note these arguments to <command>./configure</command>:
269</para>
270<variablelist>
271	<varlistentry>
272		<term><option>--with-linux</option></term>
273		<listitem><para>
274			Use this option to specify the location of the kernel source tree you wish
275			to compile against. The kernel module is built against this source and
276			will only work with a running kernel built from the same source with
			exactly the same options, so it is important you specify this option if you need
278			to.
279		</para></listitem>
280	</varlistentry>
281	<varlistentry>
282		<term><option>--with-java</option></term>
283		<listitem>
284			<para>
285			Use this option if you need to profile Java applications.  Also, see
286			<xref linkend="requirements"/>, "Required user account".  This option
287			is used to specify the location of the Java Development Kit (JDK)
288			source tree you wish to use. This is necessary to get the interface description
289			of the JVMPI (or JVMTI) interface to compile the JIT support code successfully.
290			</para>
291			<note>
292				<para>
293				The Java Runtime Environment (JRE) does not include the development
294				files that are required to compile the JIT support code, so the full
295				JDK must be installed in order to use this option.
296				</para>
297			</note>
298			<para>
			By default, the OProfile JIT support libraries will be installed in
			<filename>&lt;oprof_install_dir&gt;/lib/oprofile</filename>.  To build
			and install OProfile and the JIT support libraries as 64-bit, you can
			do something like the following:
			<screen>
			# CFLAGS="-m64" CXXFLAGS="-m64" ./configure \
			--with-kernel-support --with-java={my_jdk_installdir} \
			--libdir=/usr/local/lib64
			</screen>
308			</para>
309			<note>
310				<para>
311				If you encounter errors building 64-bit, you should
312				install libtool 1.5.26 or later since that release of
313				libtool fixes known problems for certain platforms.
314				If you install libtool into a non-standard location,
315				you'll need to edit the invocation of 'aclocal' in
316				OProfile's autogen.sh as follows (assume an install
317				location of /usr/local):
318				</para>
319				<para>
320				<code>aclocal -I m4 -I /usr/local/share/aclocal</code>
321				</para>
322			</note> 
323		</listitem>
324	</varlistentry>
325	<varlistentry>
326		<term><option>--with-kernel-support</option></term>
327		<listitem><para>
328			Use this option with 2.6 and above kernels to indicate the 
329	    		kernel provides the OProfile device driver.
330		</para></listitem>
331	</varlistentry>
332	<varlistentry>
333		<term><option>--with-qt-dir/includes/libraries</option></term>
334		<listitem><para>
335			Specify the location of Qt headers and libraries. It defaults to searching in
336			<constant>$QTDIR</constant> if these are not specified.
337		</para></listitem>
338	</varlistentry>
339	<varlistentry id="disable-werror">
340		<term><option>--disable-werror</option></term>
341		<listitem><para>
342			Development versions of OProfile build by
343			default with <option>-Werror</option>. This option turns
344			<option>-Werror</option> off.
345		</para></listitem>
346	</varlistentry>
347	<varlistentry id="disable-optimization">
348		<term><option>--disable-optimization</option></term>
349		<listitem><para>
350			Disable the <option>-O2</option> compiler flag
351			(useful if you discover an OProfile bug and want to give a useful
352			back-trace etc.)
353		</para></listitem>
354	</varlistentry>
355</variablelist>
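<para>
As a sketch only, a build on a 2.6 kernel that already provides the OProfile driver might
look like the following. The last step is typically run as root, and the options you actually
need depend on your system.
</para>
<screen>./configure --with-kernel-support
make
make install</screen>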
356<para>
To build the module for 2.4 kernels, you need configured kernel source for the running
kernel. Since distributions ship different kernels, the running kernel is unlikely to match the
configured source you installed; the safest approach is to compile your own kernel, boot it, and
then compile OProfile. On a uniprocessor machine, it is also recommended that you enable
local APIC / IO_APIC support in your kernel (this is automatically enabled for SMP kernels).
With many BIOSes, on uniprocessor kernels &gt;= 2.6.9, enabling the local APIC in the kernel
configuration is not sufficient: you must also turn it on explicitly at boot time by passing
the "lapic" option to the kernel.
</para>
<para>
On machines with power management, such as laptops, power management
must be turned off when using OProfile with 2.4 kernels, because the power management software
in the BIOS cannot handle the non-maskable interrupts (NMIs) used by
OProfile for data collection. If you use the NMI watchdog, be aware that
the watchdog is disabled when profiling starts, and not re-enabled until the
OProfile module is removed (or, in 2.6, when OProfile is not running). If you compile OProfile for
a 2.2 kernel you must be root to compile the module. If you are using
2.6 kernels or higher, you do not need kernel source, as long as the
OProfile driver is enabled; additionally, you should not need to disable
power management.
372</para>
373<para>
374Please note that you must save or have available the <filename>vmlinux</filename> file
375generated during a kernel compile, as OProfile needs it (you can use
376<option>--no-vmlinux</option>, but this will prevent kernel profiling).
377</para>
378
379</sect1>
380
381<sect1 id="uninstall">
382<title>Uninstalling OProfile</title>
383<para>
384You must have the source tree available to uninstall OProfile; a <command>make uninstall</command> will
385remove all installed files except your configuration file in the directory <filename>~/.oprofile</filename>.
386</para>
387</sect1>
388
389</chapter>
390
391<chapter id="overview"> 
392<title>Overview</title>
393
394<sect1 id="getting-started">
395<title>Getting started</title>
396<para>
397Before you can use OProfile, you must set it up. The minimum setup required for this
398is to tell OProfile where the <filename>vmlinux</filename> file corresponding to the
399running kernel is, for example :
400</para>
401<screen>opcontrol --vmlinux=/boot/vmlinux-`uname -r`</screen>
402<para>
403If you don't want to profile the kernel itself,
404you can tell OProfile you don't have a <filename>vmlinux</filename> file :
405</para>
406<screen>opcontrol --no-vmlinux</screen>
407<para>
408Now we are ready to start the daemon (<command>oprofiled</command>) which collects
409the profile data :
410</para>
411<screen>opcontrol --start</screen>
412<para>
When you want to stop profiling, you can do so with:
414</para>
415<screen>opcontrol --shutdown</screen>
416<para>
417Note that unlike <command>gprof</command>, no instrumentation (<option>-pg</option>
418and <option>-a</option> options to <command>gcc</command>)
419is necessary.
420</para>
421<para>
422Periodically (or on <command>opcontrol --shutdown</command> or <command>opcontrol --dump</command>)
423the profile data is written out into the $SESSION_DIR/samples directory (by default at <filename>/var/lib/oprofile/samples</filename>).
424These profile files cover shared libraries, applications, the kernel (vmlinux), and kernel modules.
425You can clear the profile data (at any time) with <command>opcontrol --reset</command>.
426</para>
427<para>
To place these sample database files in a specific directory instead of the default location (<filename>/var/lib/oprofile</filename>), use the <option>--session-dir=dir</option> option. You must also pass <option>--session-dir</option> to the other tools so that they continue to use this directory. (In the future, we should allow this to be specified in an environment variable.) For example:
429</para>
430<screen>opcontrol --no-vmlinux --session-dir=/home/me/tmpsession</screen>
431<screen>opcontrol --start --session-dir=/home/me/tmpsession</screen>
432<para>
433You can get summaries of this data in a number of ways at any time. To get a summary of
434data across the entire system for all of these profiles, you can do :
435</para>
436<screen>opreport [--session-dir=dir]</screen>
437<para>
438Or to get a more detailed summary, for a particular image, you can do something like :
439</para>
440<screen>opreport -l /boot/vmlinux-`uname -r`</screen>
441<para>
442There are also a number of other ways of presenting the data, as described later in this manual.
443Note that OProfile will choose a default profiling setup for you. However, there are a number
444of options you can pass to <command>opcontrol</command> if you need to change something,
445also detailed later.
446</para>
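<para>
Putting the steps above together, a complete session might look like the following sketch.
It uses only the commands introduced in this section; <filename>./my_workload</filename> is a
hypothetical program standing in for whatever you want to profile, and the commands assume
you are running as root with the default session directory.
</para>
<screen>opcontrol --vmlinux=/boot/vmlinux-`uname -r`
opcontrol --start
./my_workload                          # hypothetical workload to profile
opcontrol --dump                       # flush collected samples
opreport                               # system-wide summary
opreport -l /boot/vmlinux-`uname -r`   # per-symbol detail for the kernel
opcontrol --shutdown</screen>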
447
448</sect1>
449
450<sect1 id="tools-overview">
451<title>Tools summary</title>
452<para>
453This section gives a brief description of the available OProfile utilities and their purpose.
454</para>
455<variablelist>
456<varlistentry>
457	<term><filename>ophelp</filename></term>
458	<listitem><para>
459		This utility lists the available events and short descriptions.
460	</para></listitem>
461</varlistentry>
462	
463<varlistentry>
464	<term><filename>opcontrol</filename></term>
465	<listitem><para>
466		Used for controlling the OProfile data collection, discussed in <xref linkend="controlling" />.
467	</para></listitem>
468</varlistentry>
469
470<varlistentry>
471	<term><filename>agent libraries</filename></term>
472	<listitem><para>
473			Used by virtual machines (like the Java VM) to record information about JITed code being profiled. See <xref linkend="setup-jit" />.
474		</para></listitem>
475</varlistentry>
476
477<varlistentry>
478	<term><filename>opreport</filename></term>
479	<listitem><para>
480		This is the main tool for retrieving useful profile data, described in
481		<xref linkend="opreport" />.
482	</para></listitem>
483</varlistentry>
484
485<varlistentry>
486	<term><filename>opannotate</filename></term>
487	<listitem><para>
488		This utility can be used to produce annotated source, assembly or mixed source/assembly.
489		Source level annotation is available only if the application was compiled with 
490		debugging symbols. See <xref linkend="opannotate" />.
491	</para></listitem>
492</varlistentry>
493
494<varlistentry>
495	<term><filename>opgprof</filename></term>
496	<listitem><para>
497		This utility can output gprof-style data files for a binary, for use with
498		<command>gprof -p</command>. See <xref linkend="opgprof" />.
499	</para></listitem>
500</varlistentry>
501
502<varlistentry>
503	<term><filename>oparchive</filename></term>
504	<listitem><para>
505		This utility can be used to collect executables, debuginfo,
506		and sample files and copy the files into an archive.
507		The archive is self-contained and can be moved to another
508		machine for further analysis.
509		See <xref linkend="oparchive" />.
510	</para></listitem>
511</varlistentry>
512
513<varlistentry>
514	<term><filename>opimport</filename></term>
515	<listitem><para>
		This utility converts sample database files from a foreign binary format (ABI) to
517		the native format. This is useful only when moving sample files between hosts,
518		for analysis on platforms other than the one used for collection.
519		See <xref linkend="opimport" />.
520	</para></listitem>
521</varlistentry>
522
523</variablelist>
524</sect1>
525	
526</chapter>
527 
528<chapter id="controlling">
529<title>Controlling the profiler</title>
530
531<sect1 id="controlling-daemon">
532<title>Using <command>opcontrol</command></title>
533<para>
534In this section we describe the configuration and control of the profiling system
535with opcontrol in more depth.
536The <command>opcontrol</command> script has a default setup, but you
537can alter this with the options given below. In particular,
538if your hardware supports performance counters, you can configure them.
539There are a number of counters (for example, counter 0 and counter 1
540on the Pentium III). Each of these counters can be programmed with
541an event to count, such as cache misses or MMX operations. The event
542chosen for each counter is reflected in the profile data collected
543by OProfile: functions and binaries at the top of the profiles reflect
544that most of the chosen events happened within that code.
545</para>
546<para>
547Additionally, each counter has a "count" value: this corresponds to how
548detailed the profile is. The lower the value, the more frequently profile
549samples are taken. A counter can choose to sample only kernel code, user-space code,
550or both (both is the default). Finally, some events have a "unit mask"
551- this is a value that further restricts the types of event that are counted. 
552The event types and unit masks for your CPU are listed by <command>opcontrol
553--list-events</command>.
554</para>
555<para>
556The <command>opcontrol</command> script provides the following actions :
557</para>
558<variablelist>
559	<varlistentry>
560		<term><option>--init</option></term>
561		<listitem><para>
562		Loads the OProfile module if required and makes the OProfile driver
563		interface available.
564		</para></listitem>
565	</varlistentry>
566	<varlistentry>
567		<term><option>--setup</option></term>
568		<listitem><para>
		    Followed by a list of arguments for profiling setup. The arguments are
		    saved in <filename>/root/.oprofile/daemonrc</filename>.
		    Giving this option is not necessary; you can just directly pass one
		    of the setup options, e.g. <command>opcontrol --no-vmlinux</command>.
573		  </para></listitem>
574	</varlistentry>
575	<varlistentry>
576		<term><option>--status</option></term>
577		<listitem><para>
578		Show configuration information.
579		</para></listitem>
580	</varlistentry>
581	<varlistentry>
582		<term><option>--start-daemon</option></term>
583		<listitem><para>
584		    Start the oprofile daemon without starting actual profiling. The profiling
585		can then be started using <option>--start</option>. This is useful for avoiding
586		measuring the cost of daemon startup, as <option>--start</option> is a simple
587		write to a file in oprofilefs. Not available in 2.2/2.4 kernels.
588		</para></listitem>
589	</varlistentry>
590	<varlistentry>
591		<term><option>--start</option></term>
592		<listitem><para>
		    Start data collection with either arguments provided by <option>--setup</option>
		or information saved in <filename>/root/.oprofile/daemonrc</filename>. Additionally
		specifying <option>--verbose</option> makes the daemon generate lots of debug data
		whilst it is running.
597		</para></listitem>
598	</varlistentry>
599	<varlistentry>
600		<term><option>--dump</option></term>
601		<listitem><para>
602		    Force a flush of the collected profiling data to the daemon.
603		</para></listitem>
604	</varlistentry>
605	<varlistentry>
606		<term><option>--stop</option></term>
607		<listitem><para>
608		    Stop data collection (this separate step is not possible with 2.2 or 2.4 kernels).
609		</para></listitem>
610	</varlistentry>
611	<varlistentry>
612		<term><option>--shutdown</option></term>
613		<listitem><para>
614		    Stop data collection and kill the daemon.
615		</para></listitem>
616	</varlistentry>
617	<varlistentry>
618		<term><option>--reset</option></term>
619		<listitem><para>
620		    Clears out data from current session, but leaves saved sessions.
621		</para></listitem>
622	</varlistentry>
623	<varlistentry>
624		<term><option>--save=</option>session_name</term>
625		<listitem><para>
626		    Save data from current session to session_name.
627		</para></listitem>
628	</varlistentry>
629	<varlistentry>
630		<term><option>--deinit</option></term>
631		<listitem><para>
		Shut down the daemon. Unload the OProfile module and oprofilefs.
633		</para></listitem>
634	</varlistentry>
635	<varlistentry>
636		<term><option>--list-events</option></term>
637		<listitem><para>
638		    List event types and unit masks.
639		</para></listitem>
640	</varlistentry>
641	<varlistentry>
642		<term><option>--help</option></term>
643		<listitem><para>
644		    Generate usage messages.
645		</para></listitem>
646	</varlistentry>
647</variablelist>
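<para>
As an illustration of how these actions combine, the sketch below starts the daemon ahead of
time so that its startup cost is not measured, profiles one run of a hypothetical
<filename>./my_workload</filename> program, and saves the result under the assumed session
name <filename>my_session</filename>. It relies on <option>--start-daemon</option> and
<option>--stop</option>, so it applies to 2.6 kernels only.
</para>
<screen>opcontrol --no-vmlinux
opcontrol --reset              # clear data from the current session
opcontrol --start-daemon       # daemon running, profiling not yet started
opcontrol --start
./my_workload                  # hypothetical workload to profile
opcontrol --stop
opcontrol --dump
opcontrol --save=my_session    # session name is an arbitrary example
opcontrol --shutdown</screen>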
648
649<para>
There are a number of possible settings, of which only
<option>--vmlinux</option> (or <option>--no-vmlinux</option>)
is required. These settings are stored in <filename>~/.oprofile/daemonrc</filename>.
653</para>
654<variablelist>
655	<varlistentry>
656		<term><option>--buffer-size=</option>num</term>
657		<listitem><para>
		Number of samples in the kernel buffer. When using a 2.6 kernel, the
		buffer watershed needs to be adjusted when changing this value.
660		</para></listitem>
661	</varlistentry>
662	<varlistentry>
663		<term><option>--buffer-watershed=</option>num</term>
664		<listitem><para>
		Set kernel buffer watershed to num samples (2.6 only). When only
		buffer-size - buffer-watershed free entries remain in the kernel buffer, data is
		flushed to the daemon. The most useful values are in the range [0.25 - 0.5] * buffer-size.
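		As an illustration only (the sizes here are arbitrary assumptions, not recommended
		values), a watershed of one quarter of the buffer size could be set like this:
		<screen>opcontrol --buffer-size=65536 --buffer-watershed=16384</screen>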
668		</para></listitem>
669	</varlistentry>
670	<varlistentry>
671		<term><option>--cpu-buffer-size=</option>num</term>
672		<listitem><para>
		Number of samples in the kernel per-CPU buffer (2.6 only). If you
		profile at a high rate, it can help to increase this if the log
		file shows an excessive count of samples lost to CPU buffer overflow.
676		</para></listitem>
677	</varlistentry>
678	<varlistentry>
679		<term><option>--event=</option>[eventspec]</term>
680		<listitem><para>
681		Use the given performance counter event to profile.
682		See <xref linkend="eventspec" /> below.
683		</para></listitem>
684	</varlistentry>
685	<varlistentry>
686		<term><option>--session-dir=</option>dir_path</term>
687		<listitem><para>
		    Create and use the sample database in the directory <filename>dir_path</filename> instead of
		the default location (/var/lib/oprofile).
690		</para></listitem>
691	</varlistentry>
692	<varlistentry>
693		<term><option>--separate=</option>[none,lib,kernel,thread,cpu,all]</term>
694		<listitem><para>
		By default, every profile is stored in a single file. Thus, for example,
		samples in the C library are all credited to the <filename>/lib/libc.o</filename>
		profile. However, you can choose to create separate sample files by specifying
		one of the options below.
699		</para>
700		<informaltable frame="all">
701		<tgroup cols='2'> 
702		<tbody>
703		<row><entry><option>none</option></entry><entry>No profile separation (default)</entry></row>
704		<row><entry><option>lib</option></entry><entry>Create per-application profiles for libraries</entry></row>
705		<row><entry><option>kernel</option></entry><entry>Create per-application profiles for the kernel and kernel modules</entry></row>
706		<row><entry><option>thread</option></entry><entry>Create profiles for each thread and each task</entry></row>
707		<row><entry><option>cpu</option></entry><entry>Create profiles for each CPU</entry></row>
708		<row><entry><option>all</option></entry><entry>All of the above options</entry></row>
709		</tbody>
710		</tgroup>
711		</informaltable>
712		<para>
		Note that <option>--separate=kernel</option> also turns on <option>--separate=lib</option>.
714		<!-- FIXME: update if this change -->
715		When using <option>--separate=kernel</option>, samples in hardware interrupts, soft-irqs, or other
716		asynchronous kernel contexts are credited to the task currently running. This means you will see
717		seemingly nonsense profiles such as <filename>/bin/bash</filename> showing samples for the PPP modules,
718		etc.
719		</para>
720		<para>
721		On 2.2/2.4 only kernel threads already started when profiling begins are correctly profiled;
722		newly started kernel thread samples are credited to the vmlinux (kernel) profile.
723		</para>
724		<para>
725		Using <option>--separate=thread</option> creates a lot
726		of sample files if you leave OProfile running for a while; it's most
727		useful when used for short sessions, or when using image filtering.
728		</para>
729		</listitem>
730	</varlistentry>
731	<varlistentry>
732		<term><option>--callgraph=</option>#depth</term>
		<listitem><para>
		Enable call-graph sample collection with a maximum depth. Use 0 to disable
		call-graph profiling. Note that call-graph support is currently available on
		a limited number of platforms, for example:
		</para>
		<itemizedlist>
		<listitem><para>x86 with recent 2.6 kernel</para></listitem>
		<listitem><para>ARM with recent 2.6 kernel</para></listitem>
		<listitem><para>PowerPC with 2.6.17 kernel</para></listitem>
		</itemizedlist>
		</listitem>
745	</varlistentry>
746	<varlistentry>
747		<term><option>--image=</option>image,[images]|"all"</term>
748		<listitem><para>
749		Image filtering. If you specify one or more absolute
750		paths to binaries, OProfile will only produce profile results for those
751		binary images. This is useful for restricting the sometimes voluminous
752		output you may get otherwise, especially with
		<option>--separate=thread</option>. Note that if you are using
		<option>--separate=lib</option> or
		<option>--separate=kernel</option> and you specify an
756		application binary, the shared libraries and kernel code
757		<emphasis>are</emphasis> included. Specify the value
758		"all" to profile everything (the default).
759		</para></listitem>
760	</varlistentry>
761	<varlistentry>
762		<term><option>--vmlinux=</option>file</term>
763		<listitem><para>
		The vmlinux kernel image file to use when profiling the kernel.
765		</para></listitem>
766	</varlistentry>
767	<varlistentry>
768		<term><option>--no-vmlinux</option></term>
769		<listitem><para>
770		Use this when you don't have a kernel vmlinux file, and you don't want
771		to profile the kernel. This still counts the total number of kernel samples,
772		but can't give symbol-based results for the kernel or any modules.
773		</para></listitem>
774	</varlistentry>
775</variablelist>
776
777<sect2 id="opcontrolexamples">
778<title>Examples</title>
779
780<sect3 id="examplesperfctr">
781<title>Intel performance counter setup</title>
782<para>
783Here, we have a Pentium III running at 800MHz, and we want to look at where data memory
784references are happening most, and also get results for CPU time.
785</para>
786<screen>
787# opcontrol --event=CPU_CLK_UNHALTED:400000 --event=DATA_MEM_REFS:10000
788# opcontrol --vmlinux=/boot/2.6.0/vmlinux
789# opcontrol --start
790</screen>
791</sect3>
792
793<sect3 id="examplesrtc">
794<title>RTC mode</title>
795<para>
Here, we have an Intel laptop without support for performance counters, running a 2.4 kernel.
797</para>
798<screen>
799# ophelp -r
800CPU with RTC device
801# opcontrol --vmlinux=/boot/2.4.13/vmlinux --event=RTC_INTERRUPTS:1024
802# opcontrol --start
803</screen>
804</sect3>
805
806<sect3 id="examplesstartdaemon">
807<title>Starting the daemon separately</title>
808<para>
809If we're running 2.6 kernels, we can use <option>--start-daemon</option> to avoid
810the profiler startup affecting results.
811</para>
812<screen>
813# opcontrol --vmlinux=/boot/2.6.0/vmlinux
814# opcontrol --start-daemon
815# my_favourite_benchmark --init
816# opcontrol --start ; my_favourite_benchmark --run ; opcontrol --stop
817</screen>
818</sect3>
819
820<sect3 id="exampleseparate">
821<title>Separate profiles for libraries and the kernel</title>
822<para>
823Here, we want to see a profile of the OProfile daemon itself, including when
824it was running inside the kernel driver, and its use of shared libraries.
825</para>
826<screen>
827# opcontrol --separate=kernel --vmlinux=/boot/2.6.0/vmlinux
828# opcontrol --start
829# my_favourite_stress_test --run
830# opreport -l -p /lib/modules/2.6.0/kernel /usr/local/bin/oprofiled
831</screen>
832</sect3>
833
834<sect3 id="examplessessions">
835<title>Profiling sessions</title>
836<para>
837It can often be useful to split up profiling data into several different
838time periods. For example, you may want to collect data on an application's
839startup separately from the normal runtime data. You can use the simple
command <command>opcontrol --save</command> to do this. For example:
841</para>
842<screen>
843# opcontrol --save=blah
844</screen>
845<para>
846will create a sub-directory in <filename>$SESSION_DIR/samples</filename> containing the samples
847up to that point (the current session's sample files are moved into this
848directory). You can then pass this session name as a parameter to the post-profiling
849analysis tools, to only get data up to the point you named the
850session. If you do not want to save a session, you can do
851<command>rm -rf $SESSION_DIR/samples/sessionname</command> or, for the
852current session, <command>opcontrol --reset</command>.
853</para>
854</sect3>
855</sect2> 
856
857<sect2 id="eventspec">
858<title>Specifying performance counter events</title>
859<para>
The <option>--event</option> option to <command>opcontrol</command>
takes a specification that indicates how each
hardware performance counter should be set up. If you want to
revert to OProfile's default setting (<option>--event</option>
is strictly optional), use <option>--event=default</option>. Use of this
option overrides all previous event selections.
866</para>
867<para>
868You can pass multiple event specifications. OProfile will allocate
869hardware counters as necessary. Note that some combinations are not
870allowed by the CPU; running <command>opcontrol --list-events</command> gives the details
871of each event. The event specification is a colon-separated string
872of the form <option><emphasis>name</emphasis>:<emphasis>count</emphasis>:<emphasis>unitmask</emphasis>:<emphasis>kernel</emphasis>:<emphasis>user</emphasis></option> as described in this table:
873</para>
874<informaltable frame="all">
875<tgroup cols='2'> 
876<tbody>
877<row><entry><option>name</option></entry><entry>The symbolic event name, e.g. <constant>CPU_CLK_UNHALTED</constant></entry></row>
878<row><entry><option>count</option></entry><entry>The counter reset value, e.g. 100000</entry></row>
879<row><entry><option>unitmask</option></entry><entry>The unit mask, as given in the events list, e.g. 0x0f</entry></row>
880<row><entry><option>kernel</option></entry><entry>Whether to profile kernel code</entry></row>
881<row><entry><option>user</option></entry><entry>Whether to profile userspace code</entry></row>
882</tbody>
883</tgroup>
884</informaltable>
885<para>
The last three values are optional; if you omit them (e.g. <option>--event=DATA_MEM_REFS:30000</option>),
887they will be set to the default values (a unit mask of 0, and profiling both kernel and
888userspace code). Note that some events require a unit mask.
889</para>
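<para>
As a rough illustration of the format described above, the following shell sketch
assembles a full five-field specification from its parts and prints the resulting
<command>opcontrol</command> invocation. The variable names and the chosen values
(counting <constant>DATA_MEM_REFS</constant> in userspace only) are assumptions made
for the example, not recommended settings.
</para>
<screen>
# Illustrative values only; substitute an event from opcontrol --list-events.
name=DATA_MEM_REFS   # symbolic event name
count=30000          # counter reset value
unitmask=0           # default unit mask
kernel=0             # do not profile kernel code
user=1               # profile userspace code
spec="${name}:${count}:${unitmask}:${kernel}:${user}"
echo "opcontrol --event=${spec}"
</screen>
<para>
Running the sketch prints <command>opcontrol --event=DATA_MEM_REFS:30000:0:0:1</command>;
leaving off the last three fields would select the defaults instead.
</para>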
890<note><para>
891For the PowerPC platforms, all events specified must be in the same group; i.e., the group number
892appended to the event name (e.g. <constant>&lt;<emphasis>some-event-name</emphasis>&gt;_GRP9</constant>) must be the same.
893</para></note>
894<para>
895If OProfile is using RTC mode, and you want to alter the default counter value,
you can use something like <option>--event=RTC_INTERRUPTS:2048</option>. Note that the last
three values are ignored in this case.
If OProfile is using timer-interrupt mode, no configuration is possible.
899</para>
900<para>
901The table below lists the events selected by default
902(<option>--event=default</option>) for the various computer architectures:
903</para>
904<informaltable frame="all">
905<tgroup cols='3'> 
<thead>
<row><entry>Processor</entry><entry>cpu_type</entry><entry>Default event</entry></row>
</thead>
<tbody>
908<row><entry>Alpha EV4</entry><entry>alpha/ev4</entry><entry>CYCLES:100000:0:1:1</entry></row>
909<row><entry>Alpha EV5</entry><entry>alpha/ev5</entry><entry>CYCLES:100000:0:1:1</entry></row>
910<row><entry>Alpha PCA56</entry><entry>alpha/pca56</entry><entry>CYCLES:100000:0:1:1</entry></row>
911<row><entry>Alpha EV6</entry><entry>alpha/ev6</entry><entry>CYCLES:100000:0:1:1</entry></row>
912<row><entry>Alpha EV67</entry><entry>alpha/ev67</entry><entry>CYCLES:100000:0:1:1</entry></row>
913<row><entry>ARM/XScale PMU1</entry><entry>arm/xscale1</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
914<row><entry>ARM/XScale PMU2</entry><entry>arm/xscale2</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
915<row><entry>ARM/MPCore</entry><entry>arm/mpcore</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
916<row><entry>AVR32</entry><entry>avr32</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
917<row><entry>Athlon</entry><entry>i386/athlon</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
918<row><entry>Pentium Pro</entry><entry>i386/ppro</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
919<row><entry>Pentium II</entry><entry>i386/pii</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
920<row><entry>Pentium III</entry><entry>i386/piii</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
921<row><entry>Pentium M (P6 core)</entry><entry>i386/p6_mobile</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
922<row><entry>Pentium 4 (non-HT)</entry><entry>i386/p4</entry><entry>GLOBAL_POWER_EVENTS:100000:1:1:1</entry></row>
923<row><entry>Pentium 4 (HT)</entry><entry>i386/p4-ht</entry><entry>GLOBAL_POWER_EVENTS:100000:1:1:1</entry></row>
924<row><entry>Hammer</entry><entry>x86-64/hammer</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
925<row><entry>Family10h</entry><entry>x86-64/family10</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
926<row><entry>Family11h</entry><entry>x86-64/family11h</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
927<row><entry>Itanium</entry><entry>ia64/itanium</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
928<row><entry>Itanium 2</entry><entry>ia64/itanium2</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
929<row><entry>TIMER_INT</entry><entry>timer</entry><entry>None selectable</entry></row>
930<row><entry>IBM iseries</entry><entry>PowerPC 4/5/970</entry><entry>CYCLES:10000:0:1:1</entry></row>
931<row><entry>IBM pseries</entry><entry>PowerPC 4/5/970/Cell</entry><entry>CYCLES:10000:0:1:1</entry></row>
932<row><entry>IBM s390</entry><entry>timer</entry><entry>None selectable</entry></row>
933<row><entry>IBM s390x</entry><entry>timer</entry><entry>None selectable</entry></row>
934</tbody>
935</tgroup>
936</informaltable>
937
938</sect2>
939
940</sect1>
941 
942<sect1 id="setup-jit">
943	<title>Setting up the JIT profiling feature</title>
944	<para>
945		To gather information about JITed code from a virtual machine,
946		it needs to be instrumented with an agent library. We use the
947		agent libraries for Java in the following example. To use the
948		Java profiling feature, you must build OProfile with the "--with-java" option
949                (<xref linkend="install" />).
950
951	</para>
952
953	<sect2 id="setup-jit-jvm">
954		<title>JVM instrumentation</title>
955		<para>
956			Add this to the startup parameters of the JVM (for JVMTI):
957
958			<screen><option>-agentpath:&lt;libdir&gt;/libjvmti_oprofile.so[=&lt;options&gt;]</option> </screen>
959			or
960			<screen><option>-agentlib:jvmti_oprofile[=&lt;options&gt;]</option> </screen>
961		</para>
962		<para>
963			The JVMPI agent implementation is enabled with the command line option
964			<screen><option>-Xrunjvmpi_oprofile[:&lt;options&gt;]</option> </screen>
965		</para>
966		<para>
967			Currently, there is just one option available -- <option>debug</option>. For JVMPI,
968			the convention for specifying an option is <option>option_name=[yes|no]</option>.
969			For JVMTI, the option specification is simply the option name, implying
970			"yes"; no option specified implies "no".
971		</para>
972                <para>
973                        The agent library (installed in <filename>&lt;oprof_install_dir&gt;/lib/oprofile</filename>)
974                        needs to be in the library search path (e.g. add the library directory
                        to <constant>LD_LIBRARY_PATH</constant>). If the command line of
                        the JVM is not directly accessible, for example because it is buried within
                        shell scripts or a launcher program, it may be possible to set an environment
                        variable to add the instrumentation instead.
                        For Sun JVMs this is <constant>JAVA_TOOL_OPTIONS</constant>. Please check
                        your JVM documentation for
                        further information on the agent startup options.
982                </para>
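                <para>
                        The environment-variable approach might look like the following shell sketch.
                        The install prefix <filename>/usr/local</filename> and the variable name
                        <varname>oprof_libdir</varname> are assumptions for illustration; use the
                        directory where your OProfile build actually installed the agent library.
                </para>
<screen>
# Hypothetical install prefix; adjust to where OProfile was installed.
oprof_libdir=/usr/local/lib/oprofile

# Make the agent library findable by the dynamic linker.
LD_LIBRARY_PATH="${oprof_libdir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export LD_LIBRARY_PATH

# Ask the JVM to load the JVMTI agent; append "=debug" to enable the debug option.
JAVA_TOOL_OPTIONS="-agentlib:jvmti_oprofile"
export JAVA_TOOL_OPTIONS
</screen>
                <para>
                        Any JVM started from this shell afterwards should pick up the agent without
                        its own command line being edited, provided the JVM honours
                        <constant>JAVA_TOOL_OPTIONS</constant>.
                </para>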
983
984	</sect2>
985</sect1>
986
987<sect1 id="oprofile-gui">
988<title>Using <command>oprof_start</command></title>
989<para>
990The <command>oprof_start</command> application provides a convenient way to start the profiler.
991Note that <command>oprof_start</command> is just a wrapper around the <command>opcontrol</command> script,
992so it does not provide more services than the script itself.
993</para>
994<para>
995After <command>oprof_start</command> is started you can select the event type for each counter;
996the sampling rate and other related parameters are explained in <xref linkend="controlling-daemon" />.
997The "Configuration" section allows you to set general parameters such as the buffer size, kernel filename
998etc. The counter setup interface should be self-explanatory; <xref linkend="hardware-counters" /> and related 
999links contain information on using unit masks.
1000</para>
1001<para>
A status line shows the current status of the profiler: how long it has been running, the average
number of interrupts received per second, and the total number of interrupts, over all processors.
1004Note that quitting <command>oprof_start</command> does not stop the profiler.
1005</para>
1006<para>
1007Your configuration is saved in the same file as <command>opcontrol</command> uses; that is,
1008<filename>~/.oprofile/daemonrc</filename>.
1009</para>
1010
1011</sect1>
1012
1013<sect1 id="detailed-parameters">
1014<title>Configuration details</title>
1015
1016<sect2 id="hardware-counters">
1017<title>Hardware performance counters</title>
1018<note>
1019<para>
1020Your CPU type may not include the requisite support for hardware performance counters, in which case
1021you must use OProfile in RTC mode in 2.4 (see <xref linkend="rtc" />), or timer mode in 2.6 (see <xref linkend="timer" />). 
1022You do not really need to read this section unless you are interested in using 
1023events other than the default event chosen by OProfile.
1024</para>
1025</note>
1026<para>
1027The Intel hardware performance counters are detailed in the Intel IA-32 Architecture Manual, Volume 3, available
1028from <ulink url="http://developer.intel.com/">http://developer.intel.com/</ulink>. 
1029The AMD Athlon/Opteron/Phenom/Turion implementation is detailed in <ulink
1030url="http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf">
1031http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf</ulink>.
1032For PowerPC64 processors in IBM iSeries, pSeries, and blade server systems, processor documentation
1033is available at <ulink url="http://www-01.ibm.com/chips/techlib/techlib.nsf/productfamilies/PowerPC/">
1034http://www-01.ibm.com/chips/techlib/techlib.nsf/productfamilies/PowerPC</ulink>.  (For example, the
1035specific publication containing information on the performance monitor unit for the PowerPC970 is 
1036"IBM PowerPC 970FX RISC Microprocessor User's Manual.")
1037These processors are capable of delivering an interrupt when a counter overflows.
1038This is the basic mechanism on which OProfile is based. The delivery mode is <acronym>NMI</acronym>,
1039so blocking interrupts in the kernel does not prevent profiling. When the interrupt handler is called,
1040the current <acronym>PC</acronym> value and the current task are recorded into the profiling structure.
1041This allows the overflow event to be attached to a specific assembly instruction in a binary image.
1042The daemon receives this data from the kernel, and writes it to the sample files.
1043</para>
1044<para>
1045If we use an event such as <constant>CPU_CLK_UNHALTED</constant> or <constant>INST_RETIRED</constant>
1046(<constant>GLOBAL_POWER_EVENTS</constant> or <constant>INSTR_RETIRED</constant>, respectively, on the Pentium 4), we can
1047use the overflow counts as an estimate of actual time spent in each part of code. Alternatively we can profile interesting
1048data such as the cache behaviour of routines with the other available counters.
1049</para>
1050<para>
1051However there are several caveats. First, there are those issues listed in the Intel manual. There is a delay
1052between the counter overflow and the interrupt delivery that can skew results on a small scale - this means
1053you cannot rely on the profiles at the instruction level as being perfectly accurate.
If you are using an "event-mode" counter such as the cache counters, a count registered against an instruction doesn't mean
that the instruction is responsible for that event. However, it implies that the counter overflowed in the dynamic
1056vicinity of that instruction, to within a few instructions. Further details on this problem can be found in 
1057<xref linkend="interpreting" /> and also in the Digital paper "ProfileMe: A Hardware Performance Counter".
1058</para>
1059<para>
1060Each counter has several configuration parameters.
1061First, there is the unit mask: this simply further specifies what to count.
Second, there is the counter value, discussed below. Third, there is a parameter that controls whether to increment counts
1063whilst in kernel or user space. You can configure these separately for each counter.
1064</para>
1065<para>
After each overflow event, the counter is re-initialized
so that another overflow occurs once the number of events given by the counter value has been counted. Thus, higher
1068values mean less-detailed profiling, and lower values mean more detail, but higher overhead.
1069Picking a good value for this
1070parameter is, unfortunately, somewhat of a black art. It is of course dependent on the event
1071you have chosen.
1072Specifying too large a value will mean not enough interrupts are generated
1073to give a realistic profile (though this problem can be ameliorated by profiling for <emphasis>longer</emphasis>).
1074Specifying too small a value can lead to higher performance overhead.
1075</para>
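<para>
To get a feel for this trade-off, it can help to estimate the interrupt rate that a
given counter value implies. The sketch below assumes a cycle-counting event such as
<constant>CPU_CLK_UNHALTED</constant> on a CPU that is never halted, and borrows the
800MHz clock rate from the Pentium III example earlier in this chapter; the helper
name <function>samples_per_second</function> is hypothetical.
</para>
<screen>
# Estimate samples per second for a cycle-counting event: clock rate / count.
# Assumes the CPU is busy (unhalted) for the whole second.
samples_per_second() {
    clock_hz=$1
    count=$2
    echo $(( clock_hz / count ))
}

clock_hz=800000000   # 800MHz, an assumed clock rate
for count in 400000 100000 10000; do
    echo "count=${count}: about $(samples_per_second "$clock_hz" "$count") samples/s"
done
</screen>
<para>
Under these assumptions the three counts give roughly 2000, 8000 and 80000 samples per
second. For events that are not tied to the clock, the rate depends on how often the
workload triggers the event, so an estimate like this is only a starting point.
</para>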
1076
1077</sect2>
1078
1079<sect2 id="rtc">
1080<title>OProfile in RTC mode</title>
1081<note><para>
1082This section applies to 2.2/2.4 kernels only.
1083</para></note>
1084<para>
1085Some CPU types do not provide the needed hardware support to use the hardware performance counters. This includes
1086some laptops, classic Pentiums, and other CPU types not yet supported by OProfile (such as Cyrix). 
1087On these machines, OProfile falls
back to using the real-time clock interrupt to collect samples. This interrupt is also used by the <command>rtc</command>
module, so you cannot use OProfile in this mode while the rtc module is loaded or while RTC support is compiled into the kernel.
1090</para>
1091<para>
1092RTC mode is less capable than the hardware counters mode; in particular, it is unable to profile sections of
1093the kernel where interrupts are disabled. There is just one available event, "RTC interrupts", and its value 
1094corresponds to the number of interrupts generated per second (that is, a higher number means a better profiling
1095resolution, and higher overhead). The current implementation of the real-time clock supports only power-of-two
1096sampling rates from 2 to 4096 per second.  Other values within this range are rounded to the nearest power of
1097two.
1098</para>
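<para>
The rounding described above can be sketched in shell to predict which rate a requested
value is likely to end up at. This is only an approximation of the behaviour, not the
driver's code: the function name <function>nearest_rtc_rate</function> is hypothetical,
and keeping the lower value on an exact tie is an assumption.
</para>
<screen>
# Pick the power of two between 2 and 4096 closest to the requested rate.
# On a tie this sketch keeps the lower value, which is an assumption.
nearest_rtc_rate() {
    requested=$1
    best=2
    rate=2
    while [ "$rate" -le 4096 ]; do
        d_rate=$(( requested - rate ))
        if [ "$d_rate" -lt 0 ]; then d_rate=$(( 0 - d_rate )); fi
        d_best=$(( requested - best ))
        if [ "$d_best" -lt 0 ]; then d_best=$(( 0 - d_best )); fi
        if [ "$d_rate" -lt "$d_best" ]; then best=$rate; fi
        rate=$(( rate * 2 ))
    done
    echo "$best"
}

echo "opcontrol --event=RTC_INTERRUPTS:$(nearest_rtc_rate 300)"
</screen>
<para>
For a requested rate of 300 the sketch prints
<command>opcontrol --event=RTC_INTERRUPTS:256</command>. Passing a power of two directly
avoids relying on the rounding at all.
</para>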
1099<para>
1100You can force use of the RTC interrupt with the <option>force_rtc=1</option> module parameter.
1101</para>
1102<para>
1103Setting the value from the GUI should be straightforward. On the command line, you need to specify the
event to <command>opcontrol</command>, e.g.:
1105</para>
1106<para><command>opcontrol --event=RTC_INTERRUPTS:256</command></para>
1107</sect2>
1108
1109<sect2 id="timer">
1110<title>OProfile in timer interrupt mode</title>
1111<note><para>
1112This section applies to 2.6 kernels and above only.
1113</para></note>
1114<para>
1115In 2.6 kernels on CPUs without OProfile support for the hardware performance counters, the driver
1116falls back to using the timer interrupt for profiling. Like the RTC mode in 2.4 kernels, this is not able to
1117profile code that has interrupts disabled. Note that there are no configuration parameters for
1118setting this, unlike the RTC and hardware performance counter setup.
1119</para>
1120<para>
1121You can force use of the timer interrupt by using the <option>timer=1</option> module
1122parameter (or <option>oprofile.timer=1</option> on the boot command line if OProfile is
1123built-in).
1124</para>
1125</sect2>
1126
1127<sect2 id="p4">
1128<title>Pentium 4 support</title>
1129<para>
1130The Pentium 4 / Xeon performance counters are organized around 3 types of model specific registers (MSRs): 45 event
1131selection control registers (ESCRs), 18 counter configuration control registers (CCCRs) and 18 counters. ESCRs describe a
1132particular set of events which are to be recorded, and CCCRs bind ESCRs to counters and configure their
1133operation. Unfortunately the relationship between these registers is quite complex; they cannot all be used with one
1134another at any time. There is, however, a subset of 8 counters, 8 ESCRs, and 8 CCCRs which can be used independently of
1135one another, so OProfile only accesses those registers, treating them as a bank of 8 "normal" counters, similar
1136to those in the P6 or Athlon/Opteron/Phenom/Turion families of CPU.
1137</para>
1138<para>
1139There is currently no support for Precision Event-Based Sampling (PEBS), nor any advanced uses of the Debug Store
1140(DS). Current support is limited to the conservative extension of OProfile's existing interrupt-based model described
1141above.  Performance monitoring hardware on Pentium 4 / Xeon processors with Hyperthreading enabled (multiple logical
1142processors on a single die) is not supported in 2.4 kernels (you can use OProfile if you disable hyper-threading,
1143though).
1144</para>
1145</sect2>
1146
1147<sect2 id="ia64">
1148<title>Intel Itanium 2 support</title>
1149<para>
1150The Itanium 2 performance monitoring unit (PMU) organizes the counters as four
1151pairs of performance event monitoring registers. Each pair is composed of a
1152Performance Monitoring Configuration (PMC) register and Performance Monitoring
1153Data (PMD) register.  The PMC selects the performance event being monitored and
1154the PMD determines the sampling interval. The IA64 Performance Monitoring Unit
1155(PMU) triggers sampling with maskable interrupts. Thus, samples will not occur
1156in sections of the IA64 kernel where interrupts are disabled.
1157</para>
1158<para>
None of the advanced features of the Itanium 2 performance monitoring unit
1160such as opcode matching, address range matching, or precise event sampling are
1161supported by this version of OProfile.  The Itanium 2 support only maps OProfile's
1162existing interrupt-based model to the PMU hardware.
1163</para>
1164</sect2>
1165
1166<sect2 id="ppc64">
1167<title>PowerPC64 support</title>
1168<para>
1169The performance monitoring unit (PMU) for the IBM PowerPC 64-bit processors 
1170consists of between 4 and 8 counters (depending on the model), plus three
1171special purpose registers used for programming the counters -- MMCR0, MMCR1,
1172and MMCRA.  Advanced features such as instruction matching and thresholding are
1173not supported by this version of OProfile.
<note><para>Later versions of the IBM POWER5+ processor (beginning with revision 3.0)
run the performance monitor unit in POWER6 mode, effectively removing OProfile's
access to counters 5 and 6.  These two counters are dedicated to counting
instructions completed and cycles, respectively.  In POWER6 mode, however, the
counters do not generate an interrupt on overflow and so are unusable by
OProfile.  Kernel versions 2.6.23 and higher will recognize this mode
and export "ppc64/power5++" as the cpu_type to the oprofilefs pseudo filesystem.
OProfile userspace responds to this cpu_type by removing these counters from
the list of potential events to count.  Without this kernel support, attempts
to profile using an event from one of these counters will yield incorrect
results -- typically, zero (or near zero) samples in the generated report.
</para></note>
1186</para>
1187
1188</sect2>
1189
1190<sect2 id="cell-be">
1191<title>Cell Broadband Engine support</title>
1192<para>
1193The Cell Broadband Engine (CBE) processor core consists of a PowerPC Processing
1194Element (PPE) and 8 Synergistic Processing Elements (SPE).  PPEs and SPEs each
1195consist of a processing unit (PPU and SPU, respectively) and other hardware
1196components, such as memory controllers.
1197</para>
1198<para>
1199A PPU has two hardware threads (aka "virtual CPUs").  The performance monitor
1200unit of the CBE collects event information on one hardware thread at a time.
1201Therefore, when profiling PPE events,
1202OProfile collects the profile based on the selected events by time slicing the
1203performance counter hardware between the two threads.   The user must ensure the
1204collection interval is long enough so that the time spent collecting data for
1205each PPU is sufficient to obtain a good profile.
1206</para>
1207<para>
1208To profile an SPU application, the user should specify the SPU_CYCLES event.
1209When starting OProfile with SPU_CYCLES, the opcontrol script enforces certain
1210separation parameters (separate=cpu,lib) to ensure that sufficient information
1211is collected in the sample data in order to generate a complete report.  The
1212--merge=cpu option can be used to obtain a more readable report if analyzing
1213the performance of each separate SPU is not necessary.
1214</para>
1215<para>
1216Profiling with an SPU event (events 4100 through 4163) is not compatible with any other
1217event.  Further more, only one SPU event can be specified at a time.  The hardware only
1218supports profiling on one SPU per node at a time.  The OProfile kernel code time slices
1219between the eight SPUs to collect data on all SPUs.
1220</para>
1221<para>
1222SPU profile reports have some unique characteristics compared to reports for
1223standard architectures:
1224</para>
<itemizedlist>
<listitem><para>Typically no "app name" column.  This is really standard OProfile behavior
when the report contains samples for just a single application, which is
commonly the case when profiling SPUs.</para></listitem>
<listitem><para>"CPU" equates to "SPU"</para></listitem>
<listitem><para>Specifying '--long-filenames' on the opreport command does not always result
in long filenames.  This happens when the SPU application code is embedded in
the PPE executable or shared library.  The embedded SPU ELF data contains only the
short filename (i.e., no path information) for the SPU binary file that was used as
the source for embedding.  Only the short filename is used because
the original SPU binary file may not exist or be accessible at runtime.  The performance
analyst must have sufficient knowledge of the application to be able to correlate the
SPU binary image names found in the report to the application's source files.</para>
<note><para>
Compile the application with -g and generate the OProfile report
with -g to facilitate finding the right source file(s) on which to focus.
</para></note>
</listitem>
</itemizedlist>
1244
1245</sect2>
1246
1247<sect2 id="amd-ibs-support">
1248<title>AMD64 (x86_64) Instruction-Based Sampling (IBS) support</title>
1249
1250<para>
1251Instruction-Based Sampling (IBS) is a new performance measurement technique
1252available on AMD Family 10h processors. Traditional performance counter
1253sampling is not precise enough to isolate performance issues to individual
1254instructions. IBS, however, precisely identifies instructions which are not
1255making the best use of the processor pipeline and memory hierarchy.
1256For more information, please refer to the "Instruction-Based Sampling:
1257A New Performance Analysis Technique for AMD Family 10h Processors" (
1258<ulink url="http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf">
1259http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf</ulink>).
There are two IBS profile types, described in the following sections.
1261</para>
1262
1263<sect3 id="ibs-fetch">
1264<title>IBS Fetch</title>
1265
1266<para>
1267IBS fetch sampling is a statistical sampling method which counts completed
1268fetch operations. When the number of completed fetch operations reaches the
1269maximum fetch count (the sampling period), IBS tags the fetch operation and
1270monitors that operation until it either completes or aborts. When a tagged
1271fetch completes or aborts, a sampling interrupt is generated and an IBS fetch
1272sample is taken. An IBS fetch sample contains a timestamp, the identifier of
1273the interrupted process, the virtual fetch address, and several event flags
1274and values that describe what happened during the fetch operation. 
1275</para>
1276
1277</sect3>
1278
1279<sect3 id="ibs-op">
1280<title>IBS Op</title>
1281
1282<para>
1283IBS op sampling selects, tags, and monitors macro-ops as issued from AMD64
1284instructions. Two options are available for selecting ops for sampling:
1285</para>
1286
<itemizedlist>
<listitem><para>
Cycles-based selection counts CPU clock cycles. The op is tagged and monitored
when the count reaches a threshold (the sampling period) and a valid op is
available.
</para></listitem>

<listitem><para>
Dispatched op-based selection counts dispatched macro-ops.
When the count reaches a threshold, the next valid op is tagged and monitored.
</para></listitem>
</itemizedlist>
1299
1300<para>
1301In both cases, an IBS sample is generated only if the tagged op retires.
1302Thus, IBS op event information does not measure speculative execution activity.
1303The execution stages of the pipeline monitor the tagged macro-op. When the
1304tagged macro-op retires, a sampling interrupt is generated and an IBS op
1305sample is taken. An IBS op sample contains a timestamp, the identifier of
1306the interrupted process, the virtual address of the AMD64 instruction from
1307which the op was issued, and several event flags and values that describe
1308what happened when the macro-op executed.
1309</para>
1310
1311</sect3>
1312
1313<para>
IBS profiling is enabled simply by specifying IBS performance events
through the <option>--event=</option> option. These events are listed in the output of
<command>opcontrol --list-events</command>.
1317</para>
1318
1319<screen>
1320opcontrol --event=IBS_FETCH_XXX:&lt;count&gt;:&lt;um&gt;:&lt;kernel&gt;:&lt;user&gt;
1321opcontrol --event=IBS_OP_XXX:&lt;count&gt;:&lt;um&gt;:&lt;kernel&gt;:&lt;user&gt;
1322
Note: * All IBS fetch events must have the same event count and unitmask,
        as must all IBS op events.
1325</screen>
1326
1327</sect2>
1328
1329
1330<sect2 id="misuse">
1331<title>Dangerous counter settings</title>
1332<para>
OProfile is a low-level profiler which allows continuous profiling at a low overhead cost.
If too low a count reset value is set for a counter, the system can become overloaded with counter
interrupts and appear to be frozen. Whilst some validation is done, it
is not foolproof.
1337</para>
1338<note><para>
This can happen as follows: when the profiler count
reaches zero, an NMI handler is called which stores the sample values in an internal buffer, then resets the counter
to its original value. If the count is very low, the next NMI can become pending before the NMI handler has
completed. Due to the priority of the NMI, the local APIC delivers the pending interrupt immediately after
completion of the previous interrupt handler, and control never returns to other parts of the system.
In this way the system seems to be frozen.
1345</para></note>
<para>If this happens, it will be impossible to bring the system back to a workable state.
There is no real protection against this, other than making sure to use a reasonable value
for the counter reset. For example, setting the <constant>CPU_CLK_UNHALTED</constant> event with a ridiculously low reset count (e.g. 500)
is likely to freeze the system.
1350</para>
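<para>
A rough feel for what counts as "too low" comes from simple arithmetic: for an
event that fires about once per clock cycle, the interrupt rate is approximately the
clock frequency divided by the reset count. The Python sketch below is only an
illustration; the function name is hypothetical, the 863.195 MHz clock is borrowed from
the example output later in this manual, and the result is an upper bound for a busy CPU,
not a statement about where any particular system starts to freeze.
</para>
<screen>
```python
def interrupts_per_second(clock_hz, reset_count):
    """Approximate NMI rate for an event that fires once per clock cycle."""
    if not reset_count > 0:
        raise ValueError("reset count must be positive")
    return clock_hz / reset_count


# Assumed clock speed (863.195 MHz), taken from the sample opreport output.
CLOCK_HZ = 863_195_000

for count in (500, 23150, 100000):
    rate = interrupts_per_second(CLOCK_HZ, count)
    print(f"count={count:>6}: about {rate:,.0f} interrupts/s per busy CPU")
```
</screen>
<para>
With a count of 500 this works out to well over a million interrupts per second,
which leaves very little time for anything other than the NMI handler.
</para>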
1351<para>
1352In short : <command>Don't try a foolish sample count value</command>. Unfortunately the definition of a foolish value
1353is really dependent on the event type - if ever in doubt, e-mail </para>
1354<address><email>oprofile-list@lists.sf.net</email>.</address>
1355</sect2>
1356
1357</sect1>
1358 
1359</chapter>
1360
1361<chapter id="results">
1362<title>Obtaining results</title>
1363<para>
1364OK, so the profiler has been running, but it's not much use unless we can get some data out. Fairly often,
1365OProfile does a little <emphasis>too</emphasis> good a job of keeping overhead low, and no data reaches
1366the profiler. This can happen on lightly-loaded machines. Remember you can force a dump at any time with :
1367</para>
1368<para><command>opcontrol --dump</command></para>
1369<para>Remember to do this before complaining there is no profiling data !
1370Now that we've got some data, it has to be processed. That's the job of <command>opreport</command>,
1371<command>opannotate</command>, or <command>opgprof</command>.
1372</para>
1373
1374<sect1 id="profile-spec">
1375<title>Profile specifications</title>
1376
1377<para>
1378All of the analysis tools take a <emphasis>profile specification</emphasis>.
1379This is a set of definitions that describe which actual profiles should be
1380examined. The simplest profile specification is empty: this will match all
1381the available profile files for the current session (this is what happens
1382when you do <command>opreport</command>).
1383</para>
1384<para>
1385Specification parameters are of the form <option>name:value[,value]</option>.
1386For example, if I wanted to get a combined symbol summary for
1387<filename>/bin/myprog</filename> and <filename>/bin/myprog2</filename>,
1388I could do <command>opreport -l image:/bin/myprog,/bin/myprog2</command>.
1389As a special case, you don't actually need to specify the <option>image:</option>
1390part here: anything left on the command line is assumed to be an
1391<option>image:</option> name. Similarly, if no <option>session:</option>
1392is specified, then <option>session:current</option> is assumed ("current"
1393is a special name of the current / last profiling session).
1394</para>
1395<para>
1396In addition to the comma-separated list shown above, some of the 
1397specification parameters can take <command>glob</command>-style
1398values. For example, if I want to see image summaries for all
1399binaries profiled in <filename>/usr/bin/</filename>, I could do
1400<command>opreport image:/usr/bin/\*</command>. Note the necessity
1401to escape the special character from the shell.
1402</para>
1403<para>
1404For <command>opreport</command>, profile specifications can be used to
1405define two profiles, giving differential output. This is done by
1406enclosing each of the two specifications within curly braces, as shown
1407in the examples below. Any specifications outside of curly braces are
1408shared across both.
1409</para>
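<para>
The rules above can be sketched in a few lines of Python. This is a minimal
illustration of the <option>name:value[,value]</option> form, not the parser the
OProfile tools actually use: the function names are hypothetical, the input is
assumed to be the argument list as the tool receives it, a backslash is assumed
to escape the following character, and curly-brace differential specifications
and glob expansion are not handled.
</para>
<screen>
```python
# Tag names taken from the list of profile specification parameters.
KNOWN_TAGS = {
    "archive", "session", "session-exclude", "image", "image-exclude",
    "lib-image", "lib-image-exclude", "event", "count", "unit-mask",
    "cpu", "tgid", "tid",
}


def split_values(value):
    """Split on commas, treating a backslash-escaped comma as literal."""
    parts, current, chars = [], [], iter(value)
    for ch in chars:
        if ch == "\\":
            # Keep the escaped character (for example a comma) verbatim.
            current.append(next(chars, "\\"))
        elif ch == ",":
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_profile_spec(args):
    """Turn command-line words into a {tag: [values]} dictionary."""
    spec = {}
    for arg in args:
        name, sep, value = arg.partition(":")
        if not sep or name not in KNOWN_TAGS:
            # Anything that is not tag:value is assumed to be an image name.
            name, value = "image", arg
        spec.setdefault(name, []).extend(split_values(value))
    # No session given means the current session.
    spec.setdefault("session", ["current"])
    return spec


print(parse_profile_spec(["/bin/myprog,/bin/myprog2"]))
print(parse_profile_spec(["session:stresstest", "event:DATA_MEM_REFS"]))
```
</screen>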
1410
1411<sect2 id="profile-spec-examples">
1412<title>Examples</title>
1413
1414<para>
1415Image summaries for all profiles with <constant>DATA_MEM_REFS</constant>
1416samples in the saved session called "stresstest" :
1417</para>
1418<screen>
1419# opreport session:stresstest event:DATA_MEM_REFS
1420</screen>
1421
1422<para>
1423Symbol summary for the application called "test_sym53c8xx,9xx". Note the
1424escaping is necessary as <option>image:</option> takes a comma-separated list.
1425</para>
1426<screen>
1427# opreport -l /test/test_sym53c8xx\,9xx
1428</screen>
1429
1430<para>
1431Image summaries for all binaries in the <filename>test</filename> directory,
1432excepting <filename>boring-test</filename> :
1433</para>
1434<screen>
1435# opreport image:/test/\* image-exclude:/test/boring-test
1436</screen>
1437
1438<para>
1439Differential profile of a binary stored in two archives :
1440</para>
1441<screen>
1442# opreport -l /bin/bash { archive:/orig } { archive:/new }
1443</screen>
1444
1445<para>
1446Differential profile of an archived binary with the current session :
1447</para>
1448<screen>
1449# opreport -l /bin/bash { archive:/orig } { }
1450</screen>
1451
1452</sect2> <!-- profile spec examples -->
1453
1454<sect2 id="profile-spec-details">
1455<title>Profile specification parameters</title>
1456
1457<variablelist>
1458	<varlistentry>
1459		<term><option>archive:</option><emphasis>archivepath</emphasis></term>
1460		<listitem><para>
1461		A path to an archive made with <command>oparchive</command>.
1462		Absence of this tag, unlike others, means "the current system",
1463		equivalent to specifying "archive:".
1464		</para></listitem>
1465	</varlistentry>
1466	<varlistentry>
1467		<term><option>session:</option><emphasis>sessionlist</emphasis></term>
1468		<listitem><para>
1469		A comma-separated list of session names to resolve in. Absence of this
1470		tag, unlike others, means "the current session", equivalent to
1471		specifying "session:current".
1472		</para></listitem>
1473	</varlistentry>
1474	<varlistentry>
1475		<term><option>session-exclude:</option><emphasis>sessionlist</emphasis></term>
1476		<listitem><para>
1477                A comma-separated list of sessions to exclude.
1478		</para></listitem>
1479	</varlistentry>
1480	<varlistentry>
1481		<term><option>image:</option><emphasis>imagelist</emphasis></term>
1482		<listitem><para>
                A comma-separated list of image names to resolve. Each entry may be a relative
                path, a <command>glob</command>-style name, or a full path, e.g.</para>
1485		<screen>opreport 'image:/usr/bin/oprofiled,*op*,/opreport'</screen>
1486		</listitem>
1487	</varlistentry>
1488
1489	<varlistentry>
1490		<term><option>image-exclude:</option><emphasis>imagelist</emphasis></term>
1491		<listitem><para>
1492		Same as <option>image:</option>, but the matching images are excluded.
1493		</para></listitem>
1494	</varlistentry>
1495
1496	<varlistentry>
1497		<term><option>lib-image:</option><emphasis>imagelist</emphasis></term>
1498		<listitem><para>
1499		Same as <option>image:</option>, but only for images that are for
1500		a particular primary binary image (namely, an application). This only
1501		makes sense to use if you're using <option>--separate</option>.
1502		This includes kernel modules and the kernel when using
1503		<option>--separate=kernel</option>.
1504		</para></listitem>
1505	</varlistentry>
1506
1507	<varlistentry>
1508		<term><option>lib-image-exclude:</option><emphasis>imagelist</emphasis></term>
1509		<listitem><para>
1510		Same as <option>lib-image:</option>, but the matching images
1511		are excluded.
1512		</para></listitem>
1513	</varlistentry>
1514
1515	<varlistentry>
1516		<term><option>event:</option><emphasis>eventlist</emphasis></term>
1517		<listitem><para>
1518		The symbolic event name to match on, e.g. <option>event:DATA_MEM_REFS</option>.
1519		You can pass a list of events for side-by-side comparison with <command>opreport</command>.
1520		When using the timer interrupt, the event is always "TIMER".
1521		</para></listitem>
1522	</varlistentry>
1523
1524	<varlistentry>
1525		<term><option>count:</option><emphasis>eventcountlist</emphasis></term>
1526		<listitem><para>
1527		The event count to match on, e.g. <option>event:DATA_MEM_REFS count:30000</option>.
1528		Note that this value refers to the setting used for <command>opcontrol</command>
1529		only, and has nothing to do with the sample counts in the profile data
1530		itself.
		You can pass a list of counts for side-by-side comparison with <command>opreport</command>.
1532		When using the timer interrupt, the count is always 0 (indicating it cannot be set).
1533		</para></listitem>
1534	</varlistentry>
1535
1536	<varlistentry>
1537		<term><option>unit-mask:</option><emphasis>masklist</emphasis></term>
1538		<listitem><para>
1539		The unit mask value of the event to match on, e.g. <option>unit-mask:1</option>.
		You can pass a list of unit masks for side-by-side comparison with <command>opreport</command>.
1541		</para></listitem>
1542	</varlistentry>
1543
1544	<varlistentry>
1545		<term><option>cpu:</option><emphasis>cpulist</emphasis></term>
1546		<listitem><para>
1547		Only consider profiles for the given numbered CPU (starting from zero).
1548		This is only useful when using CPU profile separation.
1549		</para></listitem>
1550	</varlistentry>
1551
1552	<varlistentry>
1553		<term><option>tgid:</option><emphasis>pidlist</emphasis></term>
1554		<listitem><para>
1555		Only consider profiles for the given task groups. Unless some program
1556		is using threads, the task group ID of a process is the same
1557		as its process ID. This option corresponds to the POSIX
1558		notion of a thread group.
1559		This is only useful when using per-process profile separation.
1560		</para></listitem>
1561	</varlistentry>
1562
1563	<varlistentry>
1564		<term><option>tid:</option><emphasis>tidlist</emphasis></term>
1565		<listitem><para>
1566		Only consider profiles for the given threads. When using
1567		recent thread libraries, all threads in a process share the
1568		same task group ID, but have different thread IDs. You can
1569		use this option in combination with <option>tgid:</option> to
1570		restrict the results to particular threads within a process.
1571		This is only useful when using per-process profile separation.
1572		</para></listitem>
1573	</varlistentry>
1574</variablelist>
1575
1576</sect2>
1577
1578<sect2 id="locating-and-managing-binary-images">
1579<title>Locating and managing binary images</title>
1580<para>
1581Each session's sample files can be found in the $SESSION_DIR/samples/ directory (default: <filename>/var/lib/oprofile/samples/</filename>).
1582These are used, along with the binary image files, to produce human-readable data.
1583In some circumstances (kernel modules in an initrd, or modules on 2.6 kernels), OProfile
1584will not be able to find the binary images. All the tools have an <option>--image-path</option>
1585option to which you can pass a comma-separated list of alternate paths to search. For example,
1586I can let OProfile find my 2.6 modules by using <command>--image-path /lib/modules/2.6.0/kernel/</command>.
1587It is your responsibility to ensure that the correct images are found when using this
1588option.
1589</para>
1590<para>
1591Note that if a binary image changes after the sample file was created, you won't be able to get useful
1592symbol-based data out. This situation is detected for you. If you replace a binary, you should
1593make sure to save the old binary if you need to do comparative profiles.
1594</para>
1595
1596</sect2>
1597
1598<sect2 id="no-results">
1599<title>What to do when you don't get any results</title>
1600<para>
1601When attempting to get output, you may see the error :
1602</para>
1603<screen>
1604error: no sample files found: profile specification too strict ?
1605</screen>
1606<para>
1607What this is saying is that the profile specification you passed in,
1608when matched against the available sample files, resulted in no matches.
1609There are a number of reasons this might happen:
1610</para>
1611<variablelist>
1612<varlistentry><term>spelling</term><listitem><para>
1613You specified a binary name, but spelt it wrongly. Check your spelling !
1614</para></listitem></varlistentry>
1615<varlistentry><term>profiler wasn't running</term><listitem><para>
1616Make very sure that OProfile was actually up and running when you ran
1617the binary.
1618</para></listitem></varlistentry>
1619<varlistentry><term>binary didn't run long enough</term><listitem><para>
1620Remember OProfile is a statistical profiler - you're not guaranteed to
1621get samples for short-running programs. You can help this by using a
1622lower count for the performance counter, so there are a lot more samples
1623taken per second.
1624</para></listitem></varlistentry>
1625<varlistentry><term>binary spent most of its time in libraries</term><listitem><para>
1626Similarly, if the binary spends little time in the main binary image
1627itself, with most of it spent in shared libraries it uses, you might
1628not see any samples for the binary image itself. You can check this
1629by using <command>opcontrol --separate=lib</command> before the
1630profiling session, so <command>opreport</command> and friends show
1631the library profiles on a per-application basis.
1632</para></listitem></varlistentry>
1633<varlistentry><term>specification was really too strict</term><listitem><para>
1634For example, you specified something like <option>tgid:3433</option>,
1635but no task with that group ID ever ran the code.
1636</para></listitem></varlistentry>
1637<varlistentry><term>binary didn't generate any events</term><listitem><para>
1638If you're using a particular event counter, for example counting MMX
1639operations, the code might simply have not generated any events in the
1640first place. Verify the code you're profiling does what you expect it
1641to.
1642</para></listitem></varlistentry>
1643<varlistentry><term>you didn't specify kernel module name correctly</term><listitem><para>
1644If you're using 2.6 kernels, and trying to get reports for a kernel
1645module, make sure to use the <option>-p</option> option, and specify the
1646module name <emphasis>with</emphasis> the <filename>.ko</filename>
1647extension. Check if the module is one loaded from initrd.
1648</para></listitem></varlistentry>
1649</variablelist>
1650
1651</sect2>
1652
1653</sect1> <!-- profile-spec -->
1654
1655<sect1 id="opreport">
1656<title>Image summaries and symbol summaries (<command>opreport</command>)</title>
1657<para>
1658The <command>opreport</command> utility is the primary utility you will use for 
1659getting formatted data out of OProfile. It produces two types of data: image summaries
1660and symbol summaries. An image summary lists the number of samples for individual
1661binary images such as libraries or applications. Symbol summaries provide per-symbol
1662profile data. In the following example, we're getting an image summary for the whole
1663system:
1664</para>
1665<screen>
$ opreport --long-filenames
CPU: PIII, speed 863.195 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 23150
   905898 59.7415 /usr/lib/gcc-lib/i386-redhat-linux/3.2/cc1plus
   214320 14.1338 /boot/2.6.0/vmlinux
   103450  6.8222 /lib/i686/libc-2.3.2.so
    60160  3.9674 /usr/local/bin/madplay
    31769  2.0951 /usr/local/oprofile-pp/bin/oprofiled
    26550  1.7509 /usr/lib/libartsflow.so.1.0.0
    23906  1.5765 /usr/bin/as
    18770  1.2378 /oprofile
    15528  1.0240 /usr/lib/qt-3.0.5/lib/libqt-mt.so.3.0.5
    11979  0.7900 /usr/X11R6/bin/XFree86
    11328  0.7471 /bin/bash
    ...
1681</screen>
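<para>
If you want to post-process an image summary in a script, the three-column
layout shown above is easy to split. The following Python sketch is one
possible approach, with a hypothetical function name; it assumes the
single-event layout shown above (sample count, percentage, image name) and
skips every line that does not fit, such as the header lines.
</para>
<screen>
```python
def parse_image_summary(text):
    """Return (samples, percent, image) tuples from image summary output."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Data lines have three fields: an integer, a percentage and a name.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        try:
            rows.append((int(fields[0]), float(fields[1]), fields[2]))
        except ValueError:
            continue  # second column was not a number; not a data line
    return rows


SAMPLE = """\
CPU: PIII, speed 863.195 MHz (estimated)
   905898 59.7415 /usr/lib/gcc-lib/i386-redhat-linux/3.2/cc1plus
   214320 14.1338 /boot/2.6.0/vmlinux
   103450  6.8222 /lib/i686/libc-2.3.2.so
    ...
"""

rows = parse_image_summary(SAMPLE)
for samples, percent, image in rows:
    print(f"{image}: {samples} samples ({percent}%)")
```
</screen>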
1682<para>
1683If we had specified <option>--symbols</option> in the previous command, we would have
1684gotten a symbol summary of all the images across the entire system. We can restrict this to only
1685part of the system profile; for example,
1686below is a symbol summary of the OProfile daemon. Note that as we used
1687<command>opcontrol --separate=kernel</command>, symbols from images that <command>oprofiled</command>
1688has used are also shown.
1689</para>
1690<screen>
$ opreport -l `which oprofiled` 2>/dev/null | more
CPU: PIII, speed 863.195 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 23150
vma      samples  %           image name               symbol name
0804be10 14971    28.1993     oprofiled                odb_insert
0804afdc 7144     13.4564     oprofiled                pop_buffer_value
c01daea0 6113     11.5144     vmlinux                  __copy_to_user_ll
0804b060 2816      5.3042     oprofiled                opd_put_sample
0804b4a0 2147      4.0441     oprofiled                opd_process_samples
0804acf4 1855      3.4941     oprofiled                opd_put_image_sample
0804ad84 1766      3.3264     oprofiled                opd_find_image
0804a5ec 1084      2.0418     oprofiled                opd_find_module
0804ba5c 741       1.3957     oprofiled                odb_hash_add_node
...
1705</screen>
1706
1707<para>
1708These are the two basic ways you are most likely to use regularly, but <command>opreport</command>
1709can do a lot more than that, as described below.
1710</para>
1711
1712<sect2 id="opreport-merging">
1713<title>Merging separate profiles</title>
1714
<para>
If you have used one of the <option>--separate=</option> options
whilst profiling, there can be several separate profiles for
a single binary image within a session. Normally the output
will keep these images separated (so, for example, the image summary
output shows library image summaries on a per-application basis,
when using <option>--separate=lib</option>).
Sometimes it can be useful to merge these results back together
before getting results. The <option>--merge</option> option allows
you to do that.
</para>
1724</sect2>
1725
1726<sect2 id="opreport-comparison">
1727<title>Side-by-side multiple results</title>
<para>
If you have used multiple events when profiling, by default you get
side-by-side results of each event's sample values from <command>opreport</command>.
You can restrict which events are listed by using the
<option>event:</option> profile specification.
</para>
1732</sect2>
1733
1734<sect2 id="opreport-callgraph">
1735<title>Callgraph output</title>
1736<para>
1737This section provides details on how to use the OProfile callgraph feature.
1738</para>
1739<sect3 id="op-cg1">
1740<title>Callgraph details</title>
1741<para>
1742When using the <option>opcontrol --callgraph</option> option, you can see what
1743functions are calling other functions in the output. Consider the
1744following program:
1745</para>
1746<screen>
#define _GNU_SOURCE     /* strfry() is a GNU extension */
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;

#define SIZE 500000

/* qsort() passes pointers to the array elements, i.e. char ** here */
static int compare(const void *s1, const void *s2)
{
        return strcmp(*(char * const *)s1, *(char * const *)s2);
}

static void repeat(void)
{
        int i;
        char *strings[SIZE];
        char str[] = "abcdefghijklmnopqrstuvwxyz";

        for (i = 0; i &lt; SIZE; ++i) {
                strings[i] = strdup(str);
                strfry(strings[i]);
        }

        qsort(strings, SIZE, sizeof(char *), compare);

        /* release the copies so the endless loop in main() does not leak */
        for (i = 0; i &lt; SIZE; ++i)
                free(strings[i]);
}

int main()
{
        while (1)
                repeat();
}
1777</screen>
1778<para>
1779When running with the call-graph option, OProfile will
1780record the function stack every time it takes a sample.
1781<command>opreport --callgraph</command> outputs an entry for each
1782function, where each entry looks similar to:
1783</para>
1784<screen>
samples  %        image name               symbol name
  197       0.1548  cg                       main
  127036   99.8452  cg                       repeat
84590    42.5084  libc-2.3.2.so            strfry
  84590    66.4838  libc-2.3.2.so            strfry [self]
  39169    30.7850  libc-2.3.2.so            random_r
  3475      2.7312  libc-2.3.2.so            __i686.get_pc_thunk.bx
-------------------------------------------------------------------------------
1793</screen>
1794<para>
1795Here the non-indented line is the function we're focussing upon
1796(<function>strfry()</function>). This
1797line is the same as you'd get from a normal <command>opreport</command>
1798output.
1799</para>
1800<para>
1801Above the non-indented line we find the functions that called this
1802function (for example, <function>repeat()</function> calls
1803<function>strfry()</function>). The samples and percentage values here
1804refer to the number of times we took a sample where this call was found
1805in the stack; the percentage is relative to all other callers of the
1806function we're focussing on. Note that these values are
1807<emphasis>not</emphasis> call counts; they only reflect the call stack
1808every time a sample is taken; that is, if a call is found in the stack
1809at the time of a sample, it is recorded in this count.
1810</para>
1811<para>
1812Below the line are functions that are called by
1813<function>strfry()</function> (called <emphasis>callees</emphasis>).
1814It's clear here that <function>strfry()</function> calls
1815<function>random_r()</function>. We also see a special entry with a
1816"[self]" marker. This records the normal samples for the function, but
1817the percentage becomes relative to all callees. This allows you to
1818compare time spent in the function itself compared to functions it
1819calls. Note that if a function calls itself, then it will appear in the
1820list of callees of itself, but without the "[self]" marker; so recursive
1821calls are still clearly separable.
1822</para>
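<para>
The percentages in the caller and callee lines of the
<function>strfry()</function> entry can be reproduced from the sample counts
alone: each group is normalised against its own total. The short Python sketch
below does just that with the numbers from the example; the function name is
hypothetical and the code is only meant to illustrate the arithmetic.
</para>
<screen>
```python
def percentages(samples):
    """Map each name to its share of the group total, in percent."""
    total = sum(samples.values())
    return {name: 100.0 * count / total for name, count in samples.items()}


# Sample counts copied from the strfry() callgraph entry.
callers = {"main": 197, "repeat": 127036}
callees = {
    "strfry [self]": 84590,
    "random_r": 39169,
    "__i686.get_pc_thunk.bx": 3475,
}

for group in (callers, callees):
    for name, pct in percentages(group).items():
        print(f"{pct:8.4f}  {name}")
```
</screen>
<para>
The caller percentages sum to 100 across <function>main()</function> and
<function>repeat()</function>, and the callee percentages sum to 100 across
the "[self]" line and the two called functions.
</para>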
1823<para>
1824You may have noticed that the output lists <function>main()</function>
1825as calling <function>strfry()</function>, but it's clear from the source
1826that this doesn't actually happen. See <xref
1827linkend="interpreting-callgraph" /> for an explanation.
1828</para>
1829</sect3>
1830<sect3 id="cg-with-jitsupport">
1831<title>Callgraph and JIT support</title>
1832<para>
1833Callgraph output where anonymously mapped code is in the callstack can sometimes be misleading.
1834For all such code, the samples for the anonymously mapped code are stored in a samples subdirectory
1835named <filename>{anon:anon}/&lt;tgid&gt;.&lt;begin_addr&gt;.&lt;end_addr&gt;</filename>.
1836As stated earlier, if this anonymously mapped code is JITed code from a supported VM like Java,
1837OProfile creates an ELF file to provide a (somewhat) permanent backing file for the code.
1838However, when viewing callgraph output, any anonymously mapped code in the callstack
will be attributed to <filename>anon (tgid:&lt;tgid&gt; range:&lt;begin_addr&gt;-&lt;end_addr&gt;)</filename>,
1840even if a <filename>.jo</filename> ELF file had been created for it.  See the example below.
1841</para>
1842<screen>
-------------------------------------------------------------------------------
  1         2.2727  libj9ute23.so            java.bin                 traceV
  2         4.5455  libj9ute23.so            java.bin                 utsTraceV
  4         9.0909  libj9trc23.so            java.bin                 fillInUTInterfaces
  37       84.0909  libj9trc23.so            java.bin                 twGetSequenceCounter
8         0.0154  libj9prt23.so            java.bin                 j9time_hires_clock
  27       61.3636  anon (tgid:10014 range:0x100000-0x103000) java.bin                 (no symbols)
  9        20.4545  libc-2.4.so              java.bin                 gettimeofday
  8        18.1818  libj9prt23.so            java.bin                 j9time_hires_clock [self]
-------------------------------------------------------------------------------
1853</screen>
1854<para>
1855The output shows that "anon (tgid:10014 range:0x100000-0x103000)" was a callee of
1856<code>j9time_hires_clock</code>, even though the ELF file <filename>10014.jo</filename> was
1857created for this profile run.  Unfortunately, there is currently no way to correlate
1858that anonymous callgraph entry with its corresponding <filename>.jo</filename> file.
1859</para>
1860</sect3>
1861
1862
1863</sect2> <!-- opreport-callgraph -->
1864
1865<sect2 id="opreport-diff">
1866<title>Differential profiles with <command>opreport</command></title>
1867
1868<para>
1869Often, we'd like to be able to compare two profiles. For example, when
1870analysing the performance of an application, we'd like to make code
1871changes and examine the effect of the change. This is supported in
1872<command>opreport</command> by giving a profile specification that
identifies two different profiles. The general form is:
1874</para>
1875<screen>
1876$ opreport &lt;shared-spec&gt; { &lt;first-profile&gt; } { &lt;second-profile&gt; }
1877</screen>
1878<note><para>
1879We lost our Dragon book down the back of the sofa, so you have to be
1880careful to have spaces around those braces, or things will get
1881hopelessly confused. We can only apologise.
1882</para></note>
1883<para>
1884For each of the profiles, the shared section is prefixed, and then the
1885specification is analysed. The usual parameters work both within the
1886shared section, and in the sub-specification within the curly braces.
1887</para>
1888<para>
1889A typical way to use this feature is with archives created with
1890<command>oparchive</command>. Let's look at an example:
1891</para>
1892<screen>
$ ./a
$ oparchive -o orig ./a
$ opcontrol --reset
  # edit and recompile a
$ ./a
  # now compare the current profile of a with the archived profile
$ opreport -xl ./a { archive:/orig } { }
CPU: PIII, speed 863.233 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a
unit mask of 0x00 (No unit mask) count 100000
samples  %        diff %    symbol name
92435    48.5366  +0.4999   a
54226    ---      ---       c
49222    25.8459  +++       d
48787    25.6175  -2.2e-01  b
1908</screen>
1909<para>
1910Note that we specified an empty second profile in the curly braces, as
1911we wanted to use the current session; alternatively, we could
1912have specified another archive, or a tgid etc. We specified the binary
1913<command>a</command> in the shared section, so we matched that in both
1914the profiles we're diffing.
1915</para>
1916<para>
1917As in the normal output, the results are sorted by the number of
1918samples, and the percentage field represents the relative percentage of
1919the symbol's samples in the second profile.
1920</para>
1921<para>
1922Notice the new column in the output. This value represents the
1923percentage change of the relative percent between the first and the
1924second profile: roughly, "how much more important this symbol is".
1925Looking at the symbol <function>a()</function>, we can see that it took
1926roughly the same amount of the total profile in both the first and the
1927second profile. The function <function>c()</function> was not in the new
1928profile, so has been marked with <function>---</function>. Note that the
1929sample value is the number of samples in the first profile; since we're
1930displaying results for the second profile, we don't list a percentage
1931value for it, as it would be meaningless. <function>d()</function> is
1932new in the second profile, and consequently marked with
1933<function>+++</function>.
1934</para>
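<para>
One way to read the diff column is as the relative change of a symbol's
percentage from the first profile to the second. The Python sketch below
illustrates that reading under stated assumptions: the function name and the
per-symbol percentages are hypothetical, the formula is inferred from the
description above and not taken from the <command>opreport</command> source,
and the number formatting does not attempt to match the real output.
</para>
<screen>
```python
def diff_percent(first_pct, second_pct):
    """Relative change, in percent, from the first profile to the second."""
    if first_pct is None or first_pct == 0:
        return "+++"   # symbol only has samples in the second profile
    if second_pct is None:
        return "---"   # symbol only has samples in the first profile
    return f"{100.0 * (second_pct - first_pct) / first_pct:+.4f}"


# Hypothetical per-symbol percentages: (first profile, second profile).
profiles = {
    "a": (40.0, 50.0),
    "c": (30.0, None),
    "d": (None, 25.0),
    "b": (30.0, 25.0),
}

for symbol, (first, second) in profiles.items():
    print(f"{symbol}: {diff_percent(first, second)}")
```
</screen>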
1935<para>
1936When comparing profiles between different binaries, it should be clear
1937that functions can change in terms of VMA and size. To avoid this
1938problem, <command>opreport</command> considers a symbol to be the same
1939if the symbol name, image name, and owning application name all match;
1940any other factors are ignored. Note that the check for application name
1941means that trying to compare library profiles between two different
1942applications will not work as you might expect: each symbol will be
1943considered different.
1944</para>
1945
1946</sect2> <!-- opreport-diff -->
1947
1948<sect2 id="opreport-anon">
1949<title>Anonymous executable mappings</title>
1950<para>
1951Many applications, typically ones involving dynamic compilation into
1952machine code (just-in-time, or "JIT", compilation), have executable mappings that
1953are not backed by an ELF file. <command>opreport</command> has basic support for showing the
1954samples taken in these regions; for example:
1955<screen>
$ opreport /usr/bin/mono -l
CPU: ppc64 POWER5, speed 1654.34 MHz (estimated)
Counted CYCLES events (Processor Cycles using continuous sampling) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name                                    symbol name
47       58.7500  mono                                          (no symbols)
14       17.5000  anon (tgid:3189 range:0xf72aa000-0xf72fa000)  (no symbols)
9        11.2500  anon (tgid:3189 range:0xf6cca000-0xf6dd9000)  (no symbols)
.        .        .                                             .
1964</screen>
1965</para>
1966<para>
1967Note that, since such mappings are dependent upon individual invocations of
1968a binary, these mappings are always listed as a dependent image,
1969even when using <option>--separate=none</option>.
1970Equally, the results are not affected by the <option>--merge</option>
1971option.
1972</para>
1973<para>
1974As shown in the opreport output above, OProfile is unable to attribute the samples to any
1975symbol(s) because there is no ELF file for this code.
1976Enhanced support for JITed code is now available for some virtual machines; 
1977e.g., the Java Virtual Machine.  For details about OProfile output for
1978JITed code, see <xref linkend="getting-jit-reports" />.
1979</para>
1980<para>For more information about JIT support in OProfile, see <xref linkend="jitsupport"/>.
1981</para>
1982</sect2> <!-- opreport-anon -->
1983
1984<sect2 id="opreport-xml">
1985<title>XML formatted output</title>
1986<para>
The <option>--xml</option> option can be used to generate XML instead of the usual
text format.  This allows <command>opreport</command> to eliminate some of the constraints
dictated by the two-dimensional text format.  For example, it is possible
to separate the sample data across multiple events, CPUs and threads.  The XML
schema implemented by <command>opreport</command> is found in <filename>doc/opreport.xsd</filename>. It contains
more detailed comments about the structure of the XML generated by <command>opreport</command>.
1993</para>
1994<para>
Since XML is consumed by a client program rather than a user, its structure
is fairly static.  In particular, the <option>--sort</option> option is incompatible with the
<option>--xml</option> option.  Percentages are not displayed in the XML, so the options related
to percentages will have no effect.  Full pathnames are always displayed in
the XML, so <option>--long-filenames</option> is not necessary.  The <option>--details</option> option will cause
all of the individual sample data to be included in the XML, as well as the
instruction byte stream for each symbol (for doing disassembly), and can result
in very large XML files.
2003</para>
2004</sect2> <!-- opreport-xml -->
2005
2006<sect2 id="opreport-options">
2007<title>Options for <command>opreport</command></title>
2008
2009<variablelist>
2010<varlistentry><term><option>--accumulated / -a</option></term><listitem><para>
2011Accumulate sample and percentage counts in the symbol list.
2012</para></listitem></varlistentry>
2013<varlistentry><term><option>--callgraph / -c</option></term><listitem><para>
2014Show callgraph information.
2015</para></listitem></varlistentry>
2016<varlistentry><term><option>--debug-info / -g</option></term><listitem><para>
2017Show source file and line for each symbol.
2018</para></listitem></varlistentry>
2019<varlistentry><term><option>--demangle / -D none|normal|smart</option></term><listitem><para>
none: no demangling. normal: use default demangler (default). smart: use
pattern-matching to make C++ symbol demangling more readable.
2022</para></listitem></varlistentry>
2023<varlistentry><term><option>--details / -d</option></term><listitem><para>
2024Show per-instruction details for all selected symbols. Note that, for
2025binaries without symbol information, the VMA values shown are raw file
2026offsets for the image binary.
2027</para></listitem></varlistentry>
2028<varlistentry><term><option>--exclude-dependent / -x</option></term><listitem><para>
2029Do not include application-specific images for libraries, kernel modules
2030and the kernel. This option only makes sense if the profile session
2031used --separate.
2032</para></listitem></varlistentry>
2033<varlistentry><term><option>--exclude-symbols / -e [symbols]</option></term><listitem><para>
2034Exclude all the symbols in the given comma-separated list.
2035</para></listitem></varlistentry>
2036<varlistentry><term><option>--global-percent / -%</option></term><listitem><para>
2037Make all percentages relative to the whole profile.
2038</para></listitem></varlistentry>
2039<varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
2040Show help message.
2041</para></listitem></varlistentry>
2042<varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
2043Comma-separated list of additional paths to search for binaries.
2044This is needed to find modules in kernels 2.6 and upwards.
2045</para></listitem></varlistentry>
2046<varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
2047A path to a filesystem to search for additional binaries.
2048</para></listitem></varlistentry>
2049<varlistentry><term><option>--include-symbols / -i [symbols]</option></term><listitem><para>
2050Only include symbols in the given comma-separated list.
2051</para></listitem></varlistentry>
2052<varlistentry><term><option>--long-filenames / -f</option></term><listitem><para>
2053Output full paths instead of basenames.
2054</para></listitem></varlistentry>
2055<varlistentry><term><option>--merge / -m [lib,cpu,tid,tgid,unitmask,all]</option></term><listitem><para>
2056Merge any profiles separated in a --separate session.
2057</para></listitem></varlistentry>
2058<varlistentry><term><option>--no-header</option></term><listitem><para>
2059Don't output a header detailing profiling parameters.
2060</para></listitem></varlistentry>
2061<varlistentry><term><option>--output-file / -o [file]</option></term><listitem><para>
2062Output to the given file instead of stdout.
2063</para></listitem></varlistentry>
2064<varlistentry><term><option>--reverse-sort / -r</option></term><listitem><para>
2065Reverse the sort from the default.
2066</para></listitem></varlistentry>
2067<varlistentry><term><option>--session-dir=</option>dir_path</term><listitem><para>
2068Use sample database out of directory <filename>dir_path</filename> 
2069instead of the default location (/var/lib/oprofile).
2070</para></listitem></varlistentry>
2071<varlistentry><term><option>--show-address / -w</option></term><listitem><para>
2072Show the VMA address of each symbol (off by default).
2073</para></listitem></varlistentry>
2074<varlistentry><term><option>--sort / -s [vma,sample,symbol,debug,image]</option></term><listitem><para>
2075Sort the list of symbols by, respectively, symbol address,
2076number of samples, symbol name, debug filename and line number,
2077binary image filename.
2078</para></listitem></varlistentry>
2079<varlistentry><term><option>--symbols / -l</option></term><listitem><para>
2080List per-symbol information instead of a binary image summary.
2081</para></listitem></varlistentry>
2082<varlistentry><term><option>--threshold / -t [percentage]</option></term><listitem><para>
2083Only output data for symbols that have more than the given percentage
2084of total samples.
2085</para></listitem></varlistentry>
2086<varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
2087Give verbose debugging output.
2088</para></listitem></varlistentry>
2089<varlistentry><term><option>--version / -v</option></term><listitem><para>
2090Show version.
2091</para></listitem></varlistentry>
2092<varlistentry><term><option>--xml / -X</option></term><listitem><para>
2093Generate XML output.
2094</para></listitem></varlistentry>
2095</variablelist>
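<para>
Several of these options are commonly combined. As an illustrative sketch (the
binary path is only an example, reused from elsewhere in this manual), the
following lists per-symbol information with source file and line, and leaves out
symbols that have 1% or less of the total samples:
</para>
<screen>
$ opreport --symbols --debug-info --threshold 1 /usr/local/oprofile-pp/bin/oprofiled
</screen>
<para>
The equivalent short form is <command>opreport -l -g -t 1</command> followed by
the same binary path.
</para>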
2096
2097</sect2>
2098
2099</sect1> <!-- opreport -->
2100
2101<sect1 id="opannotate">
2102<title>Outputting annotated source (<command>opannotate</command>)</title>
2103<para>
2104The <command>opannotate</command> utility generates annotated source files or assembly listings, optionally
2105mixed with source.
2106If you want to see the source file, the profiled application needs to have debug information, and the source
2107must be available through this debug information. For GCC, you must use the <option>-g</option> option
2108when you are compiling.
2109If the binary doesn't contain sufficient debug information, you can still
2110use <command>opannotate <option>--assembly</option></command> to get annotated assembly.
2111</para>
2112<para>
2113Note that for the reason explained in <xref linkend="hardware-counters" /> the results can be
2114inaccurate. The debug information itself can add other problems; for example, the line number for a symbol can be
2115incorrect. Assembly instructions can be re-ordered and moved by the compiler, and this can lead to
2116crediting source lines with samples not really "owned" by this line. Also see
2117<xref linkend="interpreting" />.
2118</para>
2119<para>
You can output the annotation to one single file, containing all the source found, using the
<option>--source</option> option. You can use this in conjunction with <option>--assembly</option>
to get combined source/assembly output.
2123</para>
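<para>
For example, assuming <command>opannotate</command> writes to stdout when no
output directory is given, a combined source/assembly listing can be captured in
a single file with ordinary shell redirection. The output file name is a placeholder:
</para>
<screen>
$ opannotate --source --assembly /usr/local/oprofile-pp/bin/oprofiled &gt; oprofiled.annotated
</screen>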
2124<para>
2125You can also output a directory of annotated source files that maintains the structure of
2126the original sources. Each line in the annotated source is prepended with the samples
2127for that line. Additionally, each symbol is annotated giving details for the symbol
2128as a whole. An example:
2129</para>
2130<screen>
2131$ opannotate --source --output-dir=annotated /usr/local/oprofile-pp/bin/oprofiled
2132$ ls annotated/home/moz/src/oprofile-pp/daemon/
2133opd_cookie.h  opd_image.c  opd_kernel.c  opd_sample_files.c  oprofiled.c
2134</screen>
2135<para>
2136Line numbers are maintained in the source files, but each file has
2137a footer appended describing the profiling details. The actual annotation
looks something like this:
2139</para>
2140<screen>
2141...
2142               :static uint64_t pop_buffer_value(struct transient * trans)
2143 11510  1.9661 :{ /* pop_buffer_value total:  89901 15.3566 */
2144               :        uint64_t val;
2145               :
2146 10227  1.7469 :        if (!trans->remaining) {
2147               :                fprintf(stderr, "BUG: popping empty buffer !\n");
2148               :                exit(EXIT_FAILURE);
2149               :        }
2150               :
2151               :        val = get_buffer_value(trans->buffer, 0);
2152  2281  0.3896 :        trans->remaining--;
2153  2296  0.3922 :        trans->buffer += kernel_pointer_size;
2154               :        return val;
2155 10454  1.7857 :}
2156...
2157</screen>
2158
2159<para>
2160The first number on each line is the number of samples, whilst the second is
2161the relative percentage of total samples.
2162</para>
2163
2164<sect2 id="opannotate-finding-source">
2165<title>Locating source files</title>
2166<para>
2167Of course, <command>opannotate</command> needs to be able to locate the source files
2168for the binary image(s) in order to produce output. Some binary images have debug
2169information where the given source file paths are relative, not absolute. You can
2170specify search paths to look for these files (similar to <command>gdb</command>'s
2171<option>dir</option> command) with the <option>--search-dirs</option> option.
2172</para>
2173<para>
2174Sometimes you may have a binary image which gives absolute paths for the source files,
2175but you have the actual sources elsewhere (commonly, you've installed an SRPM for
2176a binary on your system and you want annotation from an existing profile). You can
2177use the <option>--base-dirs</option> option to redirect OProfile to look somewhere
else for source files. For example, imagine you have a binary generated from a source
file that is given in the debug information as <filename>/tmp/build/libfoo/foo.c</filename>,
and you have the source tree matching that binary installed in <filename>/home/user/libfoo/</filename>.
You can redirect OProfile to find <filename>foo.c</filename> correctly like this:
2182</para>
2183<screen>
2184$ opannotate --source --base-dirs=/tmp/build/libfoo/ --search-dirs=/home/user/libfoo/ --output-dir=annotated/ /lib/libfoo.so
2185</screen>
2186<para>
2187You can specify multiple (comma-separated) paths to both options.
2188</para>
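<para>
A sketch with two prefixes and two search directories might look like the
following. The <filename>libbar</filename> paths are hypothetical, and serve only
to show the comma-separated form:
</para>
<screen>
$ opannotate --source --base-dirs=/tmp/build/libfoo/,/tmp/build/libbar/ --search-dirs=/home/user/libfoo/,/home/user/libbar/ --output-dir=annotated/ /lib/libfoo.so
</screen>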
2189</sect2>
2190
2191<sect2 id="opannotate-details">
2192<title>Usage of <command>opannotate</command></title>
2193
2194<variablelist>
2195<varlistentry><term><option>--assembly / -a</option></term><listitem><para>
2196Output annotated assembly. If this is combined with --source, then mixed
2197source / assembly annotations are output.
2198</para></listitem></varlistentry>
<varlistentry><term><option>--base-dirs / -b [paths]</option></term><listitem><para>
2200Comma-separated list of path prefixes. This can be used to point OProfile to a
2201different location for source files when the debug information specifies an
2202absolute path on your system for the source that does not exist. The prefix
2203is stripped from the debug source file paths, then searched in the search dirs
2204specified by <option>--search-dirs</option>.
2205</para></listitem></varlistentry>
2206<varlistentry><term><option>--demangle / -D none|normal|smart</option></term><listitem><para>
none: no demangling. normal: use default demangler (default). smart: use
pattern-matching to make C++ symbol demangling more readable.
2209</para></listitem></varlistentry>
2210<varlistentry><term><option>--exclude-dependent / -x</option></term><listitem><para>
2211Do not include application-specific images for libraries, kernel modules
2212and the kernel. This option only makes sense if the profile session
2213used --separate.
2214</para></listitem></varlistentry>
2215<varlistentry><term><option>--exclude-file [files]</option></term><listitem><para>
2216Exclude all files in the given comma-separated list of glob patterns.
2217</para></listitem></varlistentry>
2218<varlistentry><term><option>--exclude-symbols / -e [symbols]</option></term><listitem><para>
2219Exclude all the symbols in the given comma-separated list.
2220</para></listitem></varlistentry>
2221<varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
2222Show help message.
2223</para></listitem></varlistentry>
2224<varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
2225Comma-separated list of additional paths to search for binaries.
2226This is needed to find modules in kernels 2.6 and upwards.
2227</para></listitem></varlistentry>
2228<varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
2229A path to a filesystem to search for additional binaries.
2230</para></listitem></varlistentry>
2231<varlistentry><term><option>--include-file [files]</option></term><listitem><para>
2232Only include files in the given comma-separated list of glob patterns.
2233</para></listitem></varlistentry>
2234<varlistentry><term><option>--include-symbols / -i [symbols]</option></term><listitem><para>
2235Only include symbols in the given comma-separated list.
2236</para></listitem></varlistentry>
2237<varlistentry><term><option>--objdump-params [params]</option></term><listitem><para>
2238Pass the given parameters as extra values when calling objdump.
2239</para></listitem></varlistentry>
2240<varlistentry><term><option>--output-dir / -o [dir]</option></term><listitem><para>
2241Output directory. This makes opannotate output one annotated file for each
2242source file. This option can't be used in conjunction with --assembly.
2243</para></listitem></varlistentry>
2244<varlistentry><term><option>--search-dirs / -d [paths]</option></term><listitem><para>
2245Comma-separated list of paths to search for source files. This is useful to find
2246source files when the debug information only contains relative paths.
2247</para></listitem></varlistentry>
2248<varlistentry><term><option>--source / -s</option></term><listitem><para>
2249Output annotated source. This requires debugging information to be available
2250for the binaries.
2251</para></listitem></varlistentry>
2252<varlistentry><term><option>--threshold / -t [percentage]</option></term><listitem><para>
2253Only output data for symbols that have more than the given percentage
2254of total samples.
2255</para></listitem></varlistentry>
2256<varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
2257Give verbose debugging output.
2258</para></listitem></varlistentry>
2259<varlistentry><term><option>--version / -v</option></term><listitem><para>
2260Show version.
2261</para></listitem></varlistentry>
2262</variablelist>
2263
2264
2265</sect2> <!-- opannotate-details -->
2266
2267</sect1> <!-- opannotate -->
2268
2269<sect1 id="getting-jit-reports">
2270	<title>OProfile results with JIT samples</title>
2271	<para>
2272		After profiling a Java (or other supported VM) application, the command
2273		<screen><command>"opcontrol --dump"</command> </screen>
2274		flushes the sample buffers and creates ELF binaries from the
2275		intermediate files that were written by the agent library.
2276		The ELF binaries are named <filename>&lt;tgid&gt;.jo</filename>.
2277		With the symbol information stored in these ELF files, it is
2278		possible to map samples to the appropriate symbols.
2279	</para>
2280	<para>
2281		The usual analysis tools (<command>opreport</command> and/or 
2282		<command>opannotate</command>) can now be used
2283		to get symbols and assembly code for the instrumented VM processes.
2284	</para>
2285<para>
2286Below is an example of a profile report of a Java application that has been
2287instrumented with the provided agent library.
2288<screen>
2289$ opreport -l /usr/lib/jvm/jre-1.5.0-ibm/bin/java
2290CPU: Core Solo / Duo, speed 2167 MHz (estimated)
2291Counted CPU_CLK_UNHALTED events (Unhalted clock cycles) with a unit mask of 0x00 (Unhalted core cycles) count 100000
2292samples  %        image name               symbol name
2293186020   50.0523  no-vmlinux               no-vmlinux               (no symbols)
229434333     9.2380  7635.jo                  java                     void test.f1()
229519022     5.1182  libc-2.5.so              libc-2.5.so              _IO_file_xsputn@@GLIBC_2.1
229618762     5.0483  libc-2.5.so              libc-2.5.so              vfprintf
229716408     4.4149  7635.jo                  java                     void test$HelloThread.run()
229816250     4.3724  7635.jo                  java                     void test$test_1.f2(int)
229915303     4.1176  7635.jo                  java                     void test.f2(int, int)
230013252     3.5657  7635.jo                  java                     void test.f2(int)
23015165      1.3897  7635.jo                  java                     void test.f4()
2302955       0.2570  7635.jo                  java                     void test$HelloThread.run()~
2303
2304</screen>
2305</para>
2306<note><para>
2307	  Depending on the JVM that is used, certain options of opreport and opannotate
2308	  do NOT work since they rely on debug information (e.g. source code line number)
2309	  that is not always available. The Sun JVM does provide the necessary debug
2310	  information via the JVMTI[PI] interface,
2311	  but other JVMs do not.
2312  </para></note>
2313	<para>
2314		As you can see in the opreport output, the JIT support agent for Java
2315		generates symbols to include the class and method signature.
2316		A symbol with the suffix &tilde;&lt;n&gt; (e.g.
2317		<code>void test$HelloThread.run()&tilde;1</code>) means that this is
2318		the &lt;n&gt;th occurrence of the identical name. This happens if a method is re-JITed.
		A symbol with the suffix %&lt;n&gt; means that the address space of this symbol
2320		was reused during the sample session (see <xref linkend="overlapping-symbols" />).
2321		The value &lt;n&gt; is the percentage of time that this symbol/code was present in
2322		relation to the total lifetime of all overlapping other symbols. A symbol of the form
2323		<code>&lt;return_val&gt; &lt;class_name&gt;$&lt;method_sig&gt;</code> denotes an
2324		inner class.
2325	</para>
2326</sect1>
2327
2328<sect1 id="opgprof">
2329<title><command>gprof</command>-compatible output (<command>opgprof</command>)</title>
2330<para>
2331If you're familiar with the output produced by <command>GNU gprof</command>,
2332you may find <command>opgprof</command> useful. It takes a single binary
2333as an argument, and produces a <filename>gmon.out</filename> file for use
2334with <command>gprof -p</command>. If call-graph profiling is enabled,
2335then this is also included.
2336</para>
2337<screen>
2338$ opgprof `which oprofiled` # generates gmon.out file
2339$ gprof -p `which oprofiled` | head
2340Flat profile:
2341
2342Each sample counts as 1 samples.
2343  %   cumulative   self              self     total
2344 time   samples   samples    calls  T1/call  T1/call  name
2345 33.13 206237.00 206237.00                             odb_insert
2346 22.67 347386.00 141149.00                             pop_buffer_value
2347  9.56 406881.00 59495.00                             opd_put_sample
2348  7.34 452599.00 45718.00                             opd_find_image
2349  7.19 497327.00 44728.00                             opd_process_samples
2350</screen>
2351
2352<sect2 id="opgprof-details">
2353<title>Usage of <command>opgprof</command></title>
2354
2355<variablelist>
2356<varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
2357Show help message.
2358</para></listitem></varlistentry>
2359<varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
2360Comma-separated list of additional paths to search for binaries.
2361This is needed to find modules in kernels 2.6 and upwards.
2362</para></listitem></varlistentry>
2363<varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
2364A path to a filesystem to search for additional binaries.
2365</para></listitem></varlistentry>
2366<varlistentry><term><option>--output-filename / -o [file]</option></term><listitem><para>
Output to the given file instead of the default, gmon.out.
2368</para></listitem></varlistentry>
2369<varlistentry><term><option>--threshold / -t [percentage]</option></term><listitem><para>
2370Only output data for symbols that have more than the given percentage
2371of total samples.
2372</para></listitem></varlistentry>
2373<varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
2374Give verbose debugging output.
2375</para></listitem></varlistentry>
2376<varlistentry><term><option>--version / -v</option></term><listitem><para>
2377Show version.
2378</para></listitem></varlistentry>
2379</variablelist>
2380
2381</sect2> <!-- opgprof-details -->
2382
2383</sect1> <!-- opgprof -->
2384
2385<sect1 id="oparchive">
2386<title>Archiving measurements (<command>oparchive</command>)</title>
2387<para>
2388	The <command>oparchive</command> utility generates a directory populated
2389	with executable, debug, and oprofile sample files. This directory can be
2390	moved to another machine via <command>tar</command> and analyzed without
2391	further use of the data collection machine.
2392</para>
2393
2394<para>
2395	The following command would collect the sample files, the executables
2396	associated with the sample files, and the debuginfo files associated
2397	with the executables and copy them into
2398	<filename>/tmp/current_data</filename>:
2399</para>
2400
2401<screen>
2402# oparchive -o /tmp/current_data
2403</screen>
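<para>
The resulting directory can then be packed for transfer to the analysis machine.
The following is only a sketch using standard <command>tar</command> options; the
archive name is a placeholder:
</para>
<screen>
# tar -C /tmp -czf current_data.tar.gz current_data
</screen>
<para>
On the analysis machine, <command>tar -xzf current_data.tar.gz</command> unpacks
the directory again.
</para>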
2404
2405<sect2 id="oparchive-details">
2406<title>Usage of <command>oparchive</command></title>
2407
2408<variablelist>
2409<varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
2410Show help message.
2411</para></listitem></varlistentry>
2412<varlistentry><term><option>--exclude-dependent / -x</option></term><listitem><para>
2413Do not include application-specific images for libraries, kernel modules
2414and the kernel. This option only makes sense if the profile session
2415used --separate.
2416</para></listitem></varlistentry>
2417<varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
2418Comma-separated list of additional paths to search for binaries.
2419This is needed to find modules in kernels 2.6 and upwards.
2420</para></listitem></varlistentry>
2421<varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
2422A path to a filesystem to search for additional binaries.
2423</para></listitem></varlistentry>
2424<varlistentry><term><option>--output-directory / -o [directory]</option></term><listitem><para>
2425Output to the given directory. There is no default. This must be specified.
2426</para></listitem></varlistentry>
2427<varlistentry><term><option>--list-files / -l</option></term><listitem><para>
2428Only list the files that would be archived, don't copy them.
2429</para></listitem></varlistentry>
2430<varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
2431Give verbose debugging output.
2432</para></listitem></varlistentry>
2433<varlistentry><term><option>--version / -v</option></term><listitem><para>
2434Show version.
2435</para></listitem></varlistentry>
2436</variablelist>
2437
2438</sect2> <!-- oparchive-details -->
2439
2440</sect1> <!-- oparchive -->
2441
2442<sect1 id="opimport">
2443<title>Converting sample database files (<command>opimport</command>)</title>
2444<para>
2445	This utility converts sample database files from a foreign binary format (abi) to
2446	the native format. This is useful only when moving sample files between hosts,
2447	for analysis on platforms other than the one used for collection. The abi format
2448	of the file to be imported is described in a text file located in <filename>$SESSION_DIR/abi</filename>.
2449</para>
2450
2451<para>
	The following command converts the input sample files to the
	output sample files, using the given abi file as a binary description
	of the input file and the current platform abi as a binary description
	of the output file.
2456</para>
2457
2458<screen>
2459# opimport -a /var/lib/oprofile/abi -o /tmp/current/.../GLOBAL_POWER_EVENTS.200000.1.all.all.all /var/lib/.../mprime/GLOBAL_POWER_EVENTS.200000.1.all.all.all
2460</screen>
2461
2462<sect2 id="opimport-details">
2463<title>Usage of <command>opimport</command></title>
2464
2465<variablelist>
2466<varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
2467Show help message.
2468</para></listitem></varlistentry>
2469<varlistentry><term><option>--abi / -a [filename]</option></term><listitem><para>
2470Input abi file description location.
2471</para></listitem></varlistentry>
2472<varlistentry><term><option>--force / -f</option></term><listitem><para>
2473Force conversion even if the input and output abi are identical.
2474</para></listitem></varlistentry>
2475<varlistentry><term><option>--output / -o [filename]</option></term><listitem><para>
Specify the output filename. If the output file already exists, it is
not overwritten; instead, the imported data are accumulated into it. Sample filenames are
meaningful to the post-profile tools and must be kept identical: in other words, the pathname
from the first path component containing a '{' onwards must be kept as is in the
output filename.
2481</para></listitem></varlistentry>
2482<varlistentry><term><option>--verbose / -V</option></term><listitem><para>
2483Give verbose debugging output.
2484</para></listitem></varlistentry>
2485<varlistentry><term><option>--version / -v</option></term><listitem><para>
2486Show version.
2487</para></listitem></varlistentry>
2488</variablelist>
2489
2490</sect2> <!-- opimport-details -->
2491
2492</sect1> <!-- opimport -->
2493
2494</chapter>
2495
2496<chapter id="interpreting">
2497<title>Interpreting profiling results</title>
2498<para>
2499The standard caveats of profiling apply in interpreting the results from OProfile:
2500profile realistic situations, profile different scenarios, profile
for as long a time as possible, avoid system-specific artifacts, don't trust
2502the profile data too much. Also bear in mind the comments on the performance
2503counters above - you <emphasis>cannot</emphasis> rely on totally accurate
2504instruction-level profiling.  However, for almost all circumstances the data
2505can be useful. Ideally a utility such as Intel's VTUNE would be available to
2506allow careful instruction-level analysis; go hassle Intel for this, not me ;)
2507</para>
2508<sect1 id="irq-latency">
2509<title>Profiling interrupt latency</title>
2510<para>
2511This is an example of how the latency of delivery of profiling interrupts
2512can impact the reliability of the profiling data. This is pretty much a 
2513worst-case-scenario example: these problems are fairly rare.
2514</para>
2515<screen>
2516double fun(double a, double b, double c)
2517{
2518 double result = 0;
2519 for (int i = 0 ; i &lt; 10000; ++i) {
2520  result += a;
2521  result *= b;
2522  result /= c;
2523 }
2524 return result;
2525}
2526</screen>
2527<para>
Here the last instruction of the loop is very costly, and you would expect the result
to reflect that, but (showing only the instructions inside the loop):
2530</para>
2531<screen>
2532$ opannotate -a -t 10 /a.out
2533
2534     88 15.38% : 8048337:       fadd   %st(3),%st
2535     48 8.391% : 8048339:       fmul   %st(2),%st
2536     68 11.88% : 804833b:       fdiv   %st(1),%st
2537    368 64.33% : 804833d:       inc    %eax
2538               : 804833e:       cmp    $0x270f,%eax
2539               : 8048343:       jle    8048337
2540</screen>
2541<para>
The problem comes from the x86 hardware; when the counter overflows the IRQ
is asserted, but the hardware has features that can delay the NMI interrupt:
x86 hardware is synchronous (i.e. cannot interrupt during an instruction);
there is also a latency when the IRQ is asserted, and the multiple
execution units and the out-of-order model of modern x86 CPUs also cause
problems. This is the same function, with source annotation:
2548</para>
2549<screen>
2550$ opannotate -s -t 10 /a.out
2551
2552               :double fun(double a, double b, double c)
2553               :{ /* _Z3funddd total:     572 100.0% */
2554               : double result = 0;
2555    368 64.33% : for (int i = 0 ; i &lt; 10000; ++i) {
2556     88 15.38% :  result += a;
2557     48 8.391% :  result *= b;
2558     68 11.88% :  result /= c;
2559               : }
2560               : return result;
2561               :}
2562</screen>
2563<para>
The conclusion: don't trust samples coming at the end of a loop,
particularly if the last instruction generated by the compiler is costly. This
case can also occur for branches. Always bear in mind that samples
can be delayed by a few cycles from their real position. That's a hardware
problem and OProfile can do nothing about it.
2569</para>
2570</sect1>
2571<sect1 id="kernel-profiling">
2572<title>Kernel profiling</title>
2573<sect2 id="irq-masking">
2574<title>Interrupt masking</title>
2575<para>
OProfile uses non-maskable interrupts (NMI) on the P6 generation, Pentium 4,
Athlon, Opteron, Phenom, and Turion processors. These interrupts can occur even in sections of the
Linux kernel where interrupts are disabled, allowing collection of samples in virtually
all executable code.  The RTC, timer interrupt mode, and Itanium 2 collection mechanisms
use maskable interrupts. Thus, the RTC and Itanium 2 data collection mechanisms have "sample
shadows", or blind spots: regions where no samples will be collected. Typically, the samples
will be attributed to the code immediately after the interrupts are re-enabled.
2583</para>
2584</sect2>
2585<sect2 id="idle">
2586<title>Idle time</title>
2587<para>
2588Your kernel is likely to support halting the processor when a CPU is idle. As
2589the typical hardware events like <constant>CPU_CLK_UNHALTED</constant> do not
2590count when the CPU is halted, the kernel profile will not reflect the actual
2591amount of time spent idle. You can change this behaviour by booting with
2592the <option>idle=poll</option> option, which uses a different idle routine. This
2593will appear as <function>poll_idle()</function> in your kernel profile.
2594</para>
2595</sect2>
2596<sect2 id="kernel-modules">
2597<title>Profiling kernel modules</title>
2598<para>
2599OProfile profiles kernel modules by default. However, there are a couple of problems
2600you may have when trying to get results. First, you may have booted via an initrd;
2601this means that the actual path for the module binaries cannot be determined automatically.
2602To get around this, you can use the <option>-p</option> option to the profiling tools
2603to specify where to look for the kernel modules.
2604</para>
2605<para>
2606In 2.6, the information on where kernel module binaries are located has been removed.
2607This means OProfile needs guiding with the <option>-p</option> option to find your
2608modules. Normally, you can just use your standard module top-level directory for this.
2609Note that due to this problem, OProfile cannot check that the modification times match;
2610it is your responsibility to make sure you do not modify a binary after a profile
2611has been created.
2612</para>
2613<para>
2614If you have run <command>insmod</command> or <command>modprobe</command> to insert a module
2615in a particular directory, it is important that you specify this directory with the 
2616<option>-p</option> option first, so that it over-rides an older module binary that might
2617exist in other directories you've specified with <option>-p</option>. It is up to you
to make sure that these values are correct: 2.6 kernels simply do not provide enough
information for OProfile to work this out itself.
2620</para>
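<para>
As a sketch, assuming your distribution keeps its modules under
<filename>/lib/modules/</filename> and that you inserted a locally built module
from the hypothetical directory <filename>/home/user/mymodule</filename>, list
the local directory first so that it takes precedence:
</para>
<screen>
$ opreport -l -p /home/user/mymodule,/lib/modules/`uname -r`
</screen>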
2621</sect2>
2622</sect1>
2623
2624<sect1 id="interpreting-callgraph">
2625<title>Interpreting call-graph profiles</title>
2626<para>
2627Sometimes the results from call-graph profiles may be different to what
2628you expect to see. The first thing to check is whether the target
binaries were compiled with frame pointers enabled (if the binary was
2630compiled using <command>gcc</command>'s
2631<option>-fomit-frame-pointer</option> option, you will not get
2632meaningful results). Note that as of this writing, the GCC developers
2633plan to disable frame pointers by default. The Linux kernel is built
2634without frame pointers by default; there is a configuration option you
2635can use to turn it on under the "Kernel Hacking" menu.
2636</para>
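<para>
A minimal sketch of building a program so that frame pointers are kept, then
viewing its call-graph report. The source and binary names are placeholders, and
the profile session is assumed to have been collected with call-graph profiling
enabled:
</para>
<screen>
$ gcc -g -fno-omit-frame-pointer -o /home/user/bin/myprog myprog.c
$ opreport --callgraph /home/user/bin/myprog
</screen>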
2637<para>
2638Often you may see a caller of a function that does not actually directly
2639call the function you're looking at (e.g. if <function>a()</function>
2640calls <function>b()</function>, which in turn calls
2641<function>c()</function>, you may see an entry for
2642<function>a()->c()</function>).  What's actually occurring is that we
2643are taking samples at the very start (or the very end) of
2644<function>c()</function>; at these few instructions, we haven't yet
2645created the new function's frame, so it appears as if
2646<function>a()</function> is calling directly into
2647<function>c()</function>. Be careful not to be misled by these
2648entries.
2649</para>
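<para>
As a minimal sketch of the situation just described, consider the following hypothetical
call chain. The function bodies are invented for illustration, and
<literal>__attribute__((noinline))</literal> is a GCC extension used here only to keep each call real.
The comments mark the points at which a sample would be expected to produce the misleading entry.
</para>
<screen>
#define NOINLINE __attribute__((noinline))  /* GCC extension */

NOINLINE int c(int x)
{   /* sampled here, c() has no frame yet: the trace skips b() */
    return x * 3;
}   /* sampled here, c()'s frame is already gone: same effect */

NOINLINE int b(int x)
{
    return c(x) + 1;    /* the only direct caller of c() */
}

NOINLINE int a(int x)
{
    return b(x) + 1;    /* a() never calls c() directly */
}
</screen>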
2650<para>
2651Like the rest of OProfile, call-graph profiling uses a statistical
2652approach; this means that sometimes a backtrace sample is truncated, or
2653even partially wrong. Bear this in mind when examining results.
2654</para>
2655<!--  FIXME: what do we need here ? -->
2656</sect1>
2657
2658<sect1 id="debug-info">
2659<title>Inaccuracies in annotated source</title>
2660<sect2 id="effect-of-optimizations">
2661<title>Side effects of optimizations</title>
<para>
The compiler can introduce some pitfalls in the annotated source output.
The optimizer can move pieces of code in such a manner that two lines of code
are interleaved (instruction scheduling). The debug info generated by the compiler
can also show strange behavior. This is especially true for complex expressions, e.g. inside
an if statement:
</para>
<screen>
	if (a &amp;&amp; ..
	    b &amp;&amp; ..
	    c &amp;&amp;)
</screen>
<para>
Here the problem comes from the position of the line number. The available debug
info does not give enough detail for the if condition, so all samples are
accumulated at the position of the closing parenthesis of the expression. Using
<command>opannotate <option>-a</option></command> can help to show the real
samples at an assembly level.
</para>
2681</sect2>
2682<sect2 id="prologues">
2683<title>Prologues and epilogues</title>
2684<para>
2685The compiler generally needs to generate "glue" code across function calls, dependent
2686on the particular function call conventions used. Additionally other things
2687need to happen, like stack pointer adjustment for the local variables; this
2688code is known as the function prologue. Similar code is needed at function return,
2689and is known as the function epilogue. This will show up in annotations as
2690samples at the very start and end of a function, where there is no apparent
2691executable code in the source.
2692</para>
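<para>
For instance, in the hypothetical function below (invented for this illustration), samples
taken in the prologue and epilogue would typically be annotated against the lines holding
the opening and closing braces, even though neither line contains a statement.
</para>
<screen>
int sum_of_squares(int n)
{   /* prologue: registers saved, stack space reserved for sq[] */
    int sq[16];
    int i, total = 0;

    for (i = 0; i != 16; ++i)
        sq[i] = (i + n) * (i + n);
    for (i = 0; i != 16; ++i)
        total += sq[i];
    return total;
}   /* epilogue: stack and registers restored, then return */
</screen>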
2693</sect2>
2694<sect2 id="inlined-function">
2695<title>Inlined functions</title>
<para>
You may see that a function is credited with a certain number of samples, but
the listing does not add up to the correct total. To pick a real example:
</para>
<screen>
               :internal_sk_buff_alloc_security(struct sk_buff *skb)
 353 2.342%    :{ /* internal_sk_buff_alloc_security total: 1882 12.48% */
               :
               :        sk_buff_security_t *sksec;
  15 0.0995%   :        int rc = 0;
               :
  10 0.06633%  :        sksec = skb-&gt;lsm_security;
 468 3.104%    :        if (sksec &amp;&amp; sksec-&gt;magic == DSI_MAGIC) {
               :                goto out;
               :        }
               :
               :        sksec = (sk_buff_security_t *) get_sk_buff_memory(skb);
   3 0.0199%   :        if (!sksec) {
  38 0.2521%   :                rc = -ENOMEM;
               :                goto out;
  10 0.06633%  :        }
               :        memset(sksec, 0, sizeof (sk_buff_security_t));
  44 0.2919%   :        sksec-&gt;magic = DSI_MAGIC;
  32 0.2123%   :        sksec-&gt;skb = skb;
  45 0.2985%   :        sksec-&gt;sid = DSI_SID_NORMAL;
  31 0.2056%   :        skb-&gt;lsm_security = sksec;
               :
               :      out:
               :
 146 0.9685%   :        return rc;
               :
  98 0.6501%   :}
</screen>
<para>
Here, the function is credited with 1,882 samples, but the per-line annotations
above add up to only 1,293. This is usually because of inline functions:
the compiler marks such code with debug entries for the inline function
definition, and this is where <command>opannotate</command> annotates
such samples. In the case above, <function>memset</function> is the most
likely candidate for this problem. Examining the mixed source/assembly
output can help identify such results.
</para>
<para>
This problem is more visible when there is no source file available. In the
following example, the total for the file (109 samples) is clearly less than
the sum of the per-symbol totals (125 samples). The difference is accounted
for by inline functions.
</para>
<screen>
/*
 * Total samples for file : "arch/i386/kernel/process.c"
 *
 *    109  2.4616
 */

 /* default_idle total:     84  1.8970 */
 /* cpu_idle total:         21  0.4743 */
 /* flush_thread total:      1  0.0226 */
 /* prepare_to_copy total:   1  0.0226 */
 /* __switch_to total:      18  0.4065 */
</screen>
<para>
The missing samples are not lost; they are credited to the source
location where the inlined function is defined. Samples from every call site
of an inlined function are merged in that one place in the annotated
source file, so there is no way to see which call site the samples
for an inlined function came from.
</para>
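<para>
The following sketch shows the shape of code that leads to this. The helper
<function>clear_words()</function> and its caller are hypothetical, and the comments describe
what you would expect to see if the compiler does inline the helper.
</para>
<screen>
/* Hypothetical helper; n is assumed to be non-negative. */
static inline void clear_words(int *p, int n)
{
    int i;

    for (i = 0; i != n; ++i)
        p[i] = 0;       /* samples from both call sites are credited here */
}

int init_pair(int *first, int *second, int n)
{   /* the symbol total still includes the inlined samples ... */
    clear_words(first, n);      /* ... but these lines show few or none */
    clear_words(second, n);
    return 2 * n;
}
</screen>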
2764<para>
2765When running <command>opannotate</command>, you may get a warning
2766"some functions compiled without debug information may have incorrect source line attributions".
2767In some rare cases, OProfile is not able to verify that the derived source line
2768is correct (when some parts of the binary image are compiled without debugging
2769information). Be wary of results if this warning appears.
2770</para>
2771<para>
2772Furthermore, for some languages the compiler can implicitly generate functions,
2773such as default copy constructors. Such functions are labelled by the compiler
2774as having a line number of 0, which means the source annotation can be confusing.
2775</para>
2776<!-- FIXME so what *actually* happens to those samples ? ignored ? -->
2777</sect2>
2778<sect2 id="wrong-linenr-info">
2779<title>Inaccuracy in line number information</title>
<para>
Depending on your compiler, you may run into the following problem:
</para>
<screen>
struct big_object { int a[500]; };

int main()
{
	big_object a, b;
	for (int i = 0 ; i != 1000 * 1000; ++i)
		b = a;
	return 0;
}
</screen>
<para>
Compiled with <command>gcc</command> 3.0.4 the annotated source is clearly inaccurate:
</para>
<screen>
               :int main()
               :{  /* main total: 7871 100% */
               :        big_object a, b;
               :        for (int i = 0 ; i != 1000 * 1000; ++i)
               :                b = a;
 7871 100%     :        return 0;
               :}
</screen>
<para>
The problem here is distinct from the IRQ latency problem: the debug line number
information is not precise enough. Again, looking at the output of <command>opannotate -as</command> can help.
</para>
<screen>
               :int main()
               :{
               :        big_object a, b;
               :        for (int i = 0 ; i != 1000 * 1000; ++i)
               : 80484c0:       push   %ebp
               : 80484c1:       mov    %esp,%ebp
               : 80484c3:       sub    $0xfac,%esp
               : 80484c9:       push   %edi
               : 80484ca:       push   %esi
               : 80484cb:       push   %ebx
               :                b = a;
               : 80484cc:       lea    0xfffff060(%ebp),%edx
               : 80484d2:       lea    0xfffff830(%ebp),%eax
               : 80484d8:       mov    $0xf423f,%ebx
               : 80484dd:       lea    0x0(%esi),%esi
               :        return 0;
    3 0.03811% : 80484e0:       mov    %edx,%edi
               : 80484e2:       mov    %eax,%esi
    1 0.0127%  : 80484e4:       cld
    8 0.1016%  : 80484e5:       mov    $0x1f4,%ecx
 7850 99.73%   : 80484ea:       repz movsl %ds:(%esi),%es:(%edi)
    9 0.1143%  : 80484ec:       dec    %ebx
               : 80484ed:       jns    80484e0
               : 80484ef:       xor    %eax,%eax
               : 80484f1:       pop    %ebx
               : 80484f2:       pop    %esi
               : 80484f3:       pop    %edi
               : 80484f4:       leave
               : 80484f5:       ret
</screen>
<para>
So here it's clear that the copying is correctly credited with all of the samples, but the
line number information is misplaced. <command>objdump -dS</command> exposes the
same problem. Note that it is difficult for compilers to maintain accurate debug information when optimizing, so this problem is not surprising.
The accuracy of debug information
also depends on the binutils version used; some BFD library versions
contain a work-around for known problems of <command>gcc</command>, whilst others do not. This is unfortunate but we must live with it,
since profiling is pointless when you disable optimisation (which would give better debugging entries).
</para>
2851</sect2>
2852</sect1>
2853<sect1 id="symbol-without-debug-info">
2854<title>Assembly functions</title>
<para>
Often the assembler cannot generate debug information automatically.
This means that you cannot get a source report unless
you manually define the necessary debug information; read your assembler documentation for how you might
do that. The only
debugging info currently needed by OProfile is the line-number/filename-VMA association. When profiling assembly
without debugging info you can always get a report for symbols, and optionally for VMAs, through <command>opreport -l</command>
or <command>opreport -d</command>, but this works only for symbols with the right attributes.
For <command>gas</command> you can get this with
</para>
<screen>
.globl foo
	.type	foo,@function
</screen>
<para>
whilst for <command>nasm</command> you must use
</para>
<screen>
	  GLOBAL foo:function		; [1]
</screen>
<para>
Note that OProfile does not need the global attribute, only the function attribute.
</para>
2878</sect1>
2879<!-- 
2880
2881FIXME: I commented this bit out until we've written something ...
2882
2883improve this ? but look first why this file is special 
2884<sect2 id="small-functions">
2885<title>Small functions</title>
2886<para>
2887Very small functions can show strange behavior. The file in your source
2888directory of OProfile <filename>$SRC/test-oprofile/understanding/puzzle.c</filename>
2889show such example
2890</para>
2891</sect2>
2892--> 
2893
2894<sect1 id="overlapping-symbols">
2895	<title>Overlapping symbols in JITed code</title>
	<para>
	Some virtual machines (e.g., Java) may re-JIT a method, so that space previously
	allocated to one piece of compiled code is reused. This means that, at one distinct
	code address, multiple symbols/methods may be present during the run time of the application.
	</para>
	<para>
	Since OProfile samples are buffered and don't have timing information, there is no way
	to correlate samples with the (possibly) varying address ranges in which the code for a symbol
	may reside.
	An alternative would be flushing the OProfile sampling buffer when we get an unload event,
	but this could result in high overhead.
	</para>
	<para>
	To moderate the problem of overlapping symbols, OProfile tries to select the symbol that was
	present at this address range most of the time. Additionally, other overlapping symbols
	are truncated in the overlapping area.
	This gives reasonable results because, in reality, address reuse typically takes place
	during phase changes of the application, in particular during application startup.
	Thus, for optimum profiling results, start the sampling session after application startup
	and burn-in.
	</para>
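	<para>
	As a rough illustration of this policy (it is not OProfile's actual implementation), the
	sketch below resolves one overlap between two symbols. The structure
	<literal>jit_sym</literal>, its fields and the function
	<function>resolve_overlap()</function> are hypothetical names invented for the example. It
	assumes that a lifetime is known for each symbol and that the second symbol starts inside the first.
	</para>
<screen>
/* Hypothetical record for one JITed symbol; not OProfile's own layout. */
struct jit_sym {
    unsigned long start;     /* first address of the code */
    unsigned long end;       /* one past the last address */
    unsigned long lifetime;  /* how long the symbol occupied this range */
};

/*
 * Assumes y starts inside x: x->start is at or before y->start,
 * and y->start is before x->end. The longer-lived symbol keeps its
 * range; the other one is truncated in the overlapping area.
 */
void resolve_overlap(struct jit_sym *x, struct jit_sym *y)
{
    if (x->lifetime >= y->lifetime) {
        /* x wins: y loses the part that lies inside x */
        y->start = (y->end > x->end) ? x->end : y->end;
    } else {
        /* y wins: x is cut off where y begins */
        x->end = y->start;
    }
}
</screen>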
2917</sect1>
2918
2919<sect1 id="hidden-cost">
2920<title>Other discrepancies</title>
<para>
Another cause of apparent problems is the hidden cost of instructions. A very
common example is two memory reads, one served from the L1 cache and the other from main memory:
the second read is likely to have more samples.
There are many other causes of hidden cost of instructions. A non-exhaustive
list: mis-predicted branches, TLB misses, partial register stalls,
partial register dependencies, memory mismatch stalls, re-executed micro-ops. If you want to write
programs at the assembly level, be sure to take a look at the Intel and
AMD documentation at <ulink url="http://developer.intel.com/">http://developer.intel.com/</ulink>
and <ulink url="http://developer.amd.com/devguides.jsp/">http://developer.amd.com/devguides.jsp</ulink>.
</para>
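<para>
A rough way to see the memory-read effect for yourself is sketched below. The names, the table size
and the stride are assumptions chosen for illustration; the table is simply assumed to be
much larger than the CPU's data caches. Both walks execute the same number of loads from
the same line of source, but the strided walk should miss the cache far more often, so you
would expect that line to collect noticeably more samples for
<literal>walk(STRIDE)</literal> than for <literal>walk(1)</literal>.
</para>
<screen>
#define TABLE_WORDS (4 * 1024 * 1024)   /* assumed to exceed the data caches */
#define STRIDE      4099                /* odd, so the walk visits every slot once */

static int table[TABLE_WORDS];

/* Read every slot of the table exactly once, stepping by stride. */
long walk(unsigned int stride)
{
    long sum = 0;
    unsigned int i, idx = 0;

    for (i = 0; i != TABLE_WORDS; ++i) {
        sum += table[idx];                      /* the load being compared */
        idx = (idx + stride) % TABLE_WORDS;
    }
    return sum;
}
</screen>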
2932</sect1>
2933</chapter>
2934
2935
2936<chapter id="ack">
2937<title>Acknowledgments</title>
2938<para>
Thanks to (in no particular order): Arjan van de Ven, Rik van Riel, Juan Quintela, Philippe Elie,
2940Phillipp Rumpf, Tigran Aivazian, Alex Brown, Alisdair Rawsthorne, Bob Montgomery, Ray Bryant, H.J. Lu,
2941Jeff Esper, Will Cohen, Graydon Hoare, Cliff Woolley, Alex Tsariounov, Al Stone, Jason Yeh,
2942Randolph Chung, Anton Blanchard, Richard Henderson, Andries Brouwer, Bryan Rittmeyer,
2943Maynard P. Johnson,
2944Richard Reich (rreich@rdrtech.com), Zwane Mwaikambo, Dave Jones, Charles Filtness; and finally Pulp, for "Intro".
2945</para>
2946</chapter>
2947
2948</book>
2949