1\input texinfo   @c -*-texinfo-*-
2@c %**start of header
3@setfilename netperf.info
4@settitle Care and Feeding of Netperf 2.6.X
5@c %**end of header
6
7@copying
8This is Rick Jones' feeble attempt at a Texinfo-based manual for the
9netperf benchmark. 
10
11Copyright @copyright{} 2005-2012 Hewlett-Packard Company
12@quotation
13Permission is granted to copy, distribute and/or modify this document
14per the terms of the netperf source license, a copy of which can be
15found in the file @file{COPYING} of the basic netperf distribution.
16@end quotation
17@end copying
18
19@titlepage
20@title Care and Feeding of Netperf
21@subtitle Versions 2.6.0 and Later
22@author Rick Jones @email{rick.jones2@@hp.com}
23@c this is here to start the copyright page
24@page
25@vskip 0pt plus 1filll
26@insertcopying
27@end titlepage
28
29@c begin with a table of contents
30@contents
31
32@ifnottex
33@node Top, Introduction, (dir), (dir)
34@top Netperf Manual
35
36@insertcopying
37@end ifnottex
38
39@menu
40* Introduction::                An introduction to netperf - what it
41is and what it is not.
42* Installing Netperf::          How to go about installing netperf.
43* The Design of Netperf::       
44* Global Command-line Options::  
45* Using Netperf to Measure Bulk Data Transfer::  
46* Using Netperf to Measure Request/Response ::  
47* Using Netperf to Measure Aggregate Performance::  
48* Using Netperf to Measure Bidirectional Transfer::  
49* The Omni Tests::              
50* Other Netperf Tests::         
51* Address Resolution::          
52* Enhancing Netperf::           
53* Netperf4::                    
54* Concept Index::               
55* Option Index::                
56@end menu
57
58@node Introduction, Installing Netperf, Top, Top
59@chapter Introduction
60
61@cindex Introduction
62
63Netperf is a benchmark that can be use to measure various aspect of
64networking performance.  The primary foci are bulk (aka
65unidirectional) data transfer and request/response performance using
66either TCP or UDP and the Berkeley Sockets interface.  As of this
67writing, the tests available either unconditionally or conditionally
68include:
69
70@itemize @bullet
71@item
72TCP and UDP unidirectional transfer and request/response over IPv4 and
73IPv6 using the Sockets interface.
74@item
75TCP and UDP unidirectional transfer and request/response over IPv4
76using the XTI interface.
77@item
78Link-level unidirectional transfer and request/response using the DLPI
79interface. 
80@item
81Unix domain sockets
82@item
83SCTP unidirectional transfer and request/response over IPv4 and IPv6
84using the sockets interface.
85@end itemize
86
87While not every revision of netperf will work on every platform
88listed, the intention is that at least some version of netperf will
89work on the following platforms:
90
91@itemize @bullet
92@item
93Unix - at least all the major variants.
94@item
95Linux
96@item
97Windows
98@item
99Others
100@end itemize
101
102Netperf is maintained and informally supported primarily by Rick
103Jones, who can perhaps be best described as Netperf Contributing
104Editor.  Non-trivial and very appreciated assistance comes from others
105in the network performance community, who are too numerous to mention
106here. While it is often used by them, netperf is NOT supported via any
107of the formal Hewlett-Packard support channels.  You should feel free
108to make enhancements and modifications to netperf to suit your
109nefarious porpoises, so long as you stay within the guidelines of the
110netperf copyright.  If you feel so inclined, you can send your changes
111to
112@email{netperf-feedback@@netperf.org,netperf-feedback} for possible
113inclusion into subsequent versions of netperf.
114
115It is the Contributing Editor's belief that the netperf license walks
116like open source and talks like open source. However, the license was
117never submitted for ``certification'' as an open source license.  If
118you would prefer to make contributions to a networking benchmark using
119a certified open source license, please consider netperf4, which is
120distributed under the terms of the GPLv2.
121
122The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
123available to discuss the care and feeding of netperf with others who
124share your interest in network performance benchmarking. The
125netperf-talk mailing list is a closed list (to deal with spam) and you
126must first subscribe by sending email to
127@email{netperf-talk-request@@netperf.org,netperf-talk-request}.
128
129
130@menu
131* Conventions::                 
132@end menu
133
134@node Conventions,  , Introduction, Introduction
135@section Conventions
136
137A @dfn{sizespec} is a one or two item, comma-separated list used as an
138argument to a command-line option that can set one or two, related
139netperf parameters.  If you wish to set both parameters to separate
140values, items should be separated by a comma:
141
142@example
143parameter1,parameter2
144@end example
145
146If you wish to set the first parameter without altering the value of
147the second from its default, you should follow the first item with a
148comma:
149
150@example
151parameter1,
152@end example
153
154
155Likewise, precede the item with a comma if you wish to set only the
156second parameter:
157
158@example
159,parameter2
160@end example
161
162An item with no commas:
163
164@example
165parameter1and2
166@end example
167
168will set both parameters to the same value.  This last mode is one of
169the most frequently used.
170
171There is another variant of the comma-separated, two-item list called
172a @dfn{optionspec} which is like a sizespec with the exception that a
173single item with no comma:
174
175@example
176parameter1
177@end example
178
179will only set the value of the first parameter and will leave the
180second parameter at its default value.
181
182Netperf has two types of command-line options.  The first are global
183command line options.  They are essentially any option not tied to a
184particular test or group of tests.  An example of a global
185command-line option is the one which sets the test type - @option{-t}.
186
187The second type of options are test-specific options.  These are
188options which are only applicable to a particular test or set of
189tests.  An example of a test-specific option would be the send socket
190buffer size for a TCP_STREAM test.
191
192Global command-line options are specified first with test-specific
193options following after a @code{--} as in:
194
195@example
196netperf <global> -- <test-specific>
197@end example
198
199
200@node Installing Netperf, The Design of Netperf, Introduction, Top
201@chapter Installing Netperf
202
203@cindex Installation
204
205Netperf's primary form of distribution is source code.  This allows
206installation on systems other than those to which the authors have
207ready access and thus the ability to create binaries.  There are two
208styles of netperf installation.  The first runs the netperf server
209program - netserver - as a child of inetd.  This requires the
210installer to have sufficient privileges to edit the files
211@file{/etc/services} and @file{/etc/inetd.conf} or their
212platform-specific equivalents.
213
214The second style is to run netserver as a standalone daemon.  This
215second method does not require edit privileges on @file{/etc/services}
216and @file{/etc/inetd.conf} but does mean you must remember to run the
217netserver program explicitly after every system reboot.
218
219This manual assumes that those wishing to measure networking
220performance already know how to use anonymous FTP and/or a web
221browser. It is also expected that you have at least a passing
222familiarity with the networking protocols and interfaces involved. In
223all honesty, if you do not have such familiarity, likely as not you
224have some experience to gain before attempting network performance
225measurements.  The excellent texts by authors such as Stevens, Fenner
226and Rudoff and/or Stallings would be good starting points. There are
227likely other excellent sources out there as well.
228
229@menu
230* Getting Netperf Bits::        
231* Installing Netperf Bits::     
232* Verifying Installation::      
233@end menu
234
235@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
236@section Getting Netperf Bits
237
238Gzipped tar files of netperf sources can be retrieved via 
239@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
240for ``released'' versions of the bits.  Pre-release versions of the
241bits can be retrieved via anonymous FTP from the
242@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.
243
244For convenience and ease of remembering, a link to the download site
245is provided via the 
246@uref{http://www.netperf.org/, NetperfPage}
247
248The bits corresponding to each discrete release of netperf are
249@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
250via subversion.  For example, there is a tag for the first version
251corresponding to this version of the manual - 
252@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0,netperf
2532.6.0}.  Those wishing to be on the bleeding edge of netperf
254development can use subversion to grab the
255@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}.  When
256fixing bugs or making enhancements, patches against the top-of-trunk
257are preferred.
258
259There are likely other places around the Internet from which one can
260download netperf bits.  These may be simple mirrors of the main
261netperf site, or they may be local variants on netperf.  As with
262anything one downloads from the Internet, take care to make sure it is
263what you really wanted and isn't some malicious Trojan or whatnot.
264Caveat downloader.
265
266As a general rule, binaries of netperf and netserver are not
267distributed from ftp.netperf.org.  From time to time a kind soul or
268souls has packaged netperf as a Debian package available via the
269apt-get mechanism or as an RPM.  I would be most interested in
270learning how to enhance the makefiles to make that easier for people.
271
272@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
273@section Installing Netperf
274
275Once you have downloaded the tar file of netperf sources onto your
276system(s), it is necessary to unpack the tar file, cd to the netperf
277directory, run configure and then make.  Most of the time it should be
278sufficient to just:
279
280@example
281gzcat netperf-<version>.tar.gz | tar xf -
282cd netperf-<version>
283./configure
284make
285make install
286@end example
287
288Most of the ``usual'' configure script options should be present
289dealing with where to install binaries and whatnot.  
290@example
291./configure --help
292@end example
293should list all of those and more.  You may find the @code{--prefix}
294option helpful in deciding where the binaries and such will be put
295during the @code{make install}.
296
297@vindex --enable-cpuutil, Configure
298If the netperf configure script does not know how to automagically
299detect which CPU utilization mechanism to use on your platform you may
300want to add a @code{--enable-cpuutil=mumble} option to the configure
301command.   If you have knowledge and/or experience to contribute to
302that area, feel free to contact @email{netperf-feedback@@netperf.org}.
303
304@vindex --enable-xti, Configure
305@vindex --enable-unixdomain, Configure
306@vindex --enable-dlpi, Configure
307@vindex --enable-sctp, Configure
308Similarly, if you want tests using the XTI interface, Unix Domain
309Sockets, DLPI or SCTP it will be necessary to add one or more
310@code{--enable-[xti|unixdomain|dlpi|sctp]=yes} options to the configure
311command.  As of this writing, the configure script will not include
312those tests automagically.
313
314@vindex --enable-omni, Configure
315Starting with version 2.5.0, netperf began migrating most of the
316``classic'' netperf tests found in @file{src/nettest_bsd.c} to the
317so-called ``omni'' tests (aka ``two routines to run them all'') found
318in @file{src/nettest_omni.c}.  This migration enables a number of new
319features such as greater control over what output is included, and new
320things to output.  The ``omni'' test is enabled by default in 2.5.0
321and a number of the classic tests are migrated - you can tell if a
322test has been migrated
323from the presence of @code{MIGRATED} in the test banner.  If you
324encounter problems with either the omni or migrated tests, please
325first attempt to obtain resolution via
326@email{netperf-talk@@netperf.org} or
327@email{netperf-feedback@@netperf.org}.  If that is unsuccessful, you
328can add a @code{--enable-omni=no} to the configure command and the
329omni tests will not be compiled-in and the classic tests will not be
330migrated.
331
332Starting with version 2.5.0, netperf includes the ``burst mode''
333functionality in a default compilation of the bits.  If you encounter
334problems with this, please first attempt to obtain help via
335@email{netperf-talk@@netperf.org} or
336@email{netperf-feedback@@netperf.org}.  If that is unsuccessful, you
337can add a @code{--enable-burst=no} to the configure command and the
338burst mode functionality will not be compiled-in.
339
340On some platforms, it may be necessary to precede the configure
341command with a CFLAGS and/or LIBS variable as the netperf configure
342script is not yet smart enough to set them itself.  Whenever possible,
343these requirements will be found in @file{README.@var{platform}} files.
344Expertise and assistance in making that more automagic in the
345configure script would be most welcome.
346
347@cindex Limiting Bandwidth
348@cindex Bandwidth Limitation
349@vindex --enable-intervals, Configure
350@vindex --enable-histogram, Configure
351Other optional configure-time settings include
352@code{--enable-intervals=yes} to give netperf the ability to ``pace''
353its _STREAM tests and @code{--enable-histogram=yes} to have netperf
354keep a histogram of interesting times.  Each of these will have some
355effect on the measured result.  If your system supports
356@code{gethrtime()} the effect of the histogram measurement should be
357minimized but probably still measurable.  For example, the histogram
358of a netperf TCP_RR test will be of the individual transaction times:
359@example
360netperf -t TCP_RR -H lag -v 2
361TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
362Local /Remote
363Socket Size   Request  Resp.   Elapsed  Trans.
364Send   Recv   Size     Size    Time     Rate         
365bytes  Bytes  bytes    bytes   secs.    per sec   
366
36716384  87380  1        1       10.00    3538.82   
36832768  32768 
369Alignment      Offset
370Local  Remote  Local  Remote
371Send   Recv    Send   Recv
372    8      0       0      0
373Histogram of request/response times
374UNIT_USEC     :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
375TEN_USEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
376HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
377UNIT_MSEC     :    0:   60:   50:   51:   44:   44:   72:  119:  100:  101
378TEN_MSEC      :    0:  105:    0:    0:    0:    0:    0:    0:    0:    0
379HUNDRED_MSEC  :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
380UNIT_SEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
381TEN_SEC       :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
382>100_SECS: 0
383HIST_TOTAL:      35391
384@end example
385
386The histogram you see above is basically a base-10 log histogram where
387we can see that most of the transaction times were on the order of one
388hundred to one-hundred, ninety-nine microseconds, but they were
389occasionally as long as ten to nineteen milliseconds
390
391The @option{--enable-demo=yes} configure option will cause code to be
392included to report interim results during a test run.  The rate at
393which interim results are reported can then be controlled via the
394global @option{-D} option.  Here is an example of @option{-D} output:
395
396@example
397$ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
398MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
399Interim result:    5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
400Interim result:   11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
401Interim result:   16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
402Interim result:   20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
403Interim result:   22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
404Interim result:   23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
405Interim result:   23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
406Recv   Send    Send                          
407Socket Socket  Message  Elapsed              
408Size   Size    Size     Time     Throughput  
409bytes  bytes   bytes    secs.    MBytes/sec  
410
411 87380  16384  16384    10.06      17.81   
412@end example
413
414Notice how the units of the interim result track that requested by the
415@option{-f} option.  Also notice that sometimes the interval will be
416longer than the value specified in the @option{-D} option.  This is
417normal and stems from how demo mode is implemented not by relying on
418interval timers or frequent calls to get the current time, but by
419calculating how many units of work must be performed to take at least
420the desired interval.
421
422Those familiar with this option in earlier versions of netperf will
423note the addition of the ``ending at'' text.  This is the time as
424reported by a @code{gettimeofday()} call (or its emulation) with a
425@code{NULL} timezone pointer.  This addition is intended to make it
426easier to insert interim results into an
427@uref{http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html,rrdtool}
428Round-Robin Database (RRD).  A likely bug-riddled example of doing so
429can be found in @file{doc/examples/netperf_interim_to_rrd.sh}.  The
430time is reported out to milliseconds rather than microseconds because
431that is the most rrdtool understands as of the time of this writing.
432
433As of this writing, a @code{make install} will not actually update the
434files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
435platform-specific equivalents.  It remains necessary to perform that
436bit of installation magic by hand.  Patches to the makefile sources to
437effect an automagic editing of the necessary files to have netperf
438installed as a child of inetd would be most welcome.
439
440Starting the netserver as a standalone daemon should be as easy as:
441@example
442$ netserver
443Starting netserver at port 12865
444Starting netserver at hostname 0.0.0.0 port 12865 and family 0
445@end example
446
447Over time the specifics of the messages netserver prints to the screen
448may change but the gist will remain the same.
449
450If the compilation of netperf or netserver happens to fail, feel free
451to contact @email{netperf-feedback@@netperf.org} or join and ask in
452@email{netperf-talk@@netperf.org}.  However, it is quite important
453that you include the actual compilation errors and perhaps even the
454configure log in your email.  Otherwise, it will be that much more
455difficult for someone to assist you.
456
457@node Verifying Installation,  , Installing Netperf Bits, Installing Netperf
458@section Verifying Installation
459
460Basically, once netperf is installed and netserver is configured as a
461child of inetd, or launched as a standalone daemon, simply typing:
462@example
463netperf
464@end example
465should result in output similar to the following:
466@example
467$ netperf
468TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
469Recv   Send    Send                          
470Socket Socket  Message  Elapsed              
471Size   Size    Size     Time     Throughput  
472bytes  bytes   bytes    secs.    10^6bits/sec  
473
474 87380  16384  16384    10.00    2997.84   
475@end example
476
477
478@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
479@chapter The Design of Netperf
480
481@cindex Design of Netperf
482
483Netperf is designed around a basic client-server model.  There are
484two executables - netperf and netserver.  Generally you will only
485execute the netperf program, with the netserver program being invoked
486by the remote system's inetd or having been previously started as its
487own standalone daemon.
488
489When you execute netperf it will establish a ``control connection'' to
490the remote system.  This connection will be used to pass test
491configuration information and results to and from the remote system.
492Regardless of the type of test to be run, the control connection will
493be a TCP connection using BSD sockets.  The control connection can use
494either IPv4 or IPv6.
495
496Once the control connection is up and the configuration information
497has been passed, a separate ``data'' connection will be opened for the
498measurement itself using the API's and protocols appropriate for the
499specified test.  When the test is completed, the data connection will
500be torn-down and results from the netserver will be passed-back via the
501control connection and combined with netperf's result for display to
502the user.
503
504Netperf places no traffic on the control connection while a test is in
505progress.  Certain TCP options, such as SO_KEEPALIVE, if set as your
506systems' default, may put packets out on the control connection while
507a test is in progress.  Generally speaking this will have no effect on
508the results.
509
510@menu
511* CPU Utilization::             
512@end menu
513
514@node CPU Utilization,  , The Design of Netperf, The Design of Netperf
515@section CPU Utilization
516@cindex CPU Utilization
517
518CPU utilization is an important, and alas all-too infrequently
519reported component of networking performance.  Unfortunately, it can
520be one of the most difficult metrics to measure accurately and
521portably.  Netperf will do its level best to report accurate
522CPU utilization figures, but some combinations of processor, OS and
523configuration may make that difficult.
524
525CPU utilization in netperf is reported as a value between 0 and 100%
526regardless of the number of CPUs involved.  In addition to CPU
527utilization, netperf will report a metric called a @dfn{service
528demand}.  The service demand is the normalization of CPU utilization
529and work performed.  For a _STREAM test it is the microseconds of CPU
530time consumed to transfer on KB (K == 1024) of data.  For a _RR test
531it is the microseconds of CPU time consumed processing a single
532transaction.   For both CPU utilization and service demand, lower is
533better. 
534
535Service demand can be particularly useful when trying to gauge the
536effect of a performance change.  It is essentially a measure of
537efficiency, with smaller values being more efficient and thus
538``better.''
539
540Netperf is coded to be able to use one of several, generally
541platform-specific CPU utilization measurement mechanisms.  Single
542letter codes will be included in the CPU portion of the test banner to
543indicate which mechanism was used on each of the local (netperf) and
544remote (netserver) system.
545
546As of this writing those codes are:
547
548@table @code
549@item U
550The CPU utilization measurement mechanism was unknown to netperf or
551netperf/netserver was not compiled to include CPU utilization
552measurements. The code for the null CPU utilization mechanism can be
553found in @file{src/netcpu_none.c}.
554@item I
555An HP-UX-specific CPU utilization mechanism whereby the kernel
556incremented a per-CPU counter by one for each trip through the idle
557loop. This mechanism was only available on specially-compiled HP-UX
558kernels prior to HP-UX 10 and is mentioned here only for the sake of
559historical completeness and perhaps as a suggestion to those who might
560be altering other operating systems. While rather simple, perhaps even
561simplistic, this mechanism was quite robust and was not affected by
562the concerns of statistical methods, or methods attempting to track
563time in each of user, kernel, interrupt and idle modes which require
564quite careful accounting.  It can be thought-of as the in-kernel
565version of the looper @code{L} mechanism without the context switch
566overhead. This mechanism required calibration.
567@item P
568An HP-UX-specific CPU utilization mechanism whereby the kernel
569keeps-track of time (in the form of CPU cycles) spent in the kernel
570idle loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel keeps
571track of time spent in idle, user, kernel and interrupt processing
572(HP-UX 11.23 and later).  The former requires calibration, the latter
573does not.  Values in either case are retrieved via one of the pstat(2)
574family of calls, hence the use of the letter @code{P}.  The code for
575these mechanisms is found in @file{src/netcpu_pstat.c} and
576@file{src/netcpu_pstatnew.c} respectively.
577@item K
578A Solaris-specific CPU utilization mechanism whereby the kernel keeps
579track of ticks (eg HZ) spent in the idle loop.  This method is
580statistical and is known to be inaccurate when the interrupt rate is
581above epsilon as time spent processing interrupts is not subtracted
582from idle.  The value is retrieved via a kstat() call - hence the use
583of the letter @code{K}.  Since this mechanism uses units of ticks (HZ)
584the calibration value should invariably match HZ. (Eg 100) The code
585for this mechanism is implemented in @file{src/netcpu_kstat.c}.
586@item M
587A Solaris-specific mechanism available on Solaris 10 and latter which
588uses the new microstate accounting mechanisms.  There are two, alas,
589overlapping, mechanisms.  The first tracks nanoseconds spent in user,
590kernel, and idle modes. The second mechanism tracks nanoseconds spent
591in interrupt.  Since the mechanisms overlap, netperf goes through some
592hand-waving to try to ``fix'' the problem.  Since the accuracy of the
593handwaving cannot be completely determined, one must presume that
594while better than the @code{K} mechanism, this mechanism too is not
595without issues.  The values are retrieved via kstat() calls, but the
596letter code is set to @code{M} to distinguish this mechanism from the
597even less accurate @code{K} mechanism.  The code for this mechanism is
598implemented in @file{src/netcpu_kstat10.c}.
599@item L
600A mechanism based on ``looper''or ``soaker'' processes which sit in
601tight loops counting as fast as they possibly can. This mechanism
602starts a looper process for each known CPU on the system.  The effect
603of processor hyperthreading on the mechanism is not yet known.  This
604mechanism definitely requires calibration.  The code for the
605``looper''mechanism can be found in @file{src/netcpu_looper.c}
606@item N
607A Microsoft Windows-specific mechanism, the code for which can be
608found in @file{src/netcpu_ntperf.c}.  This mechanism too is based on
609what appears to be a form of micro-state accounting and requires no
610calibration.  On laptops, or other systems which may dynamically alter
611the CPU frequency to minimize power consumption, it has been suggested
612that this mechanism may become slightly confused, in which case using
613BIOS/uEFI settings to disable the power saving would be indicated.
614
615@item S
616This mechanism uses @file{/proc/stat} on Linux to retrieve time
617(ticks) spent in idle mode.  It is thought but not known to be
618reasonably accurate.  The code for this mechanism can be found in
619@file{src/netcpu_procstat.c}.
620@item C
621A mechanism somewhat similar to @code{S} but using the sysctl() call
622on BSD-like Operating systems (*BSD and MacOS X).  The code for this
623mechanism can be found in @file{src/netcpu_sysctl.c}.
624@item Others
625Other mechanisms included in netperf in the past have included using
626the times() and getrusage() calls.  These calls are actually rather
627poorly suited to the task of measuring CPU overhead for networking as
628they tend to be process-specific and much network-related processing
629can happen outside the context of a process, in places where it is not
630a given it will be charged to the correct, or even a process.  They
631are mentioned here as a warning to anyone seeing those mechanisms used
632in other networking benchmarks.  These mechanisms are not available in
633netperf 2.4.0 and later.
634@end table
635
636For many platforms, the configure script will chose the best available
637CPU utilization mechanism.  However, some platforms have no
638particularly good mechanisms.  On those platforms, it is probably best
639to use the ``LOOPER'' mechanism which is basically some number of
640processes (as many as there are processors) sitting in tight little
641loops counting as fast as they can.  The rate at which the loopers
642count when the system is believed to be idle is compared with the rate
643when the system is running netperf and the ratio is used to compute
644CPU utilization.
645
646In the past, netperf included some mechanisms that only reported CPU
647time charged to the calling process.  Those mechanisms have been
648removed from netperf versions 2.4.0 and later because they are
649hopelessly inaccurate.  Networking can and often results in CPU time
650being spent in places - such as interrupt contexts - that do not get
651charged to a or the correct process.
652
653In fact, time spent in the processing of interrupts is a common issue
654for many CPU utilization mechanisms.  In particular, the ``PSTAT''
655mechanism was eventually known to have problems accounting for certain
656interrupt time prior to HP-UX 11.11 (11iv1).  HP-UX 11iv2 and later
657are known/presumed to be good. The ``KSTAT'' mechanism is known to
658have problems on all versions of Solaris up to and including Solaris
65910.  Even the microstate accounting available via kstat in Solaris 10
660has issues, though perhaps not as bad as those of prior versions.
661
662The /proc/stat mechanism under Linux is in what the author would
663consider an ``uncertain'' category as it appears to be statistical,
664which may also have issues with time spent processing interrupts.
665
666In summary, be sure to ``sanity-check'' the CPU utilization figures
667with other mechanisms.  However, platform tools such as top, vmstat or
668mpstat are often based on the same mechanisms used by netperf.
669
670@menu
671* CPU Utilization in a Virtual Guest::  
672@end menu
673
674@node CPU Utilization in a Virtual Guest,  , CPU Utilization, CPU Utilization
675@subsection CPU Utilization in a Virtual Guest
676
677The CPU utilization mechanisms used by netperf are ``inline'' in that
678they are run by the same netperf or netserver process as is running
679the test itself.  This works just fine for ``bare iron'' tests but
680runs into a problem when using virtual machines.
681
682The relationship between virtual guest and hypervisor can be thought
683of as being similar to that between a process and kernel in a bare
684iron system.  As such, (m)any CPU utilization mechanisms used in the
685virtual guest are similar to ``process-local'' mechanisms in a bare
686iron situation.  However, just as with bare iron and process-local
687mechanisms, much networking processing happens outside the context of
688the virtual guest.  It takes place in the hypervisor, and is not
689visible to mechanisms running in the guest(s).  For this reason, one
690should not really trust CPU utilization figures reported by netperf or
691netserver when running in a virtual guest.
692
693If one is looking to measure the added overhead of a virtualization
694mechanism, rather than rely on CPU utilization, one can rely instead
695on netperf _RR tests - path-lengths and overheads can be a significant
696fraction of the latency, so increases in overhead should appear as
697decreases in transaction rate.  Whatever you do, @b{DO NOT} rely on
698the throughput of a _STREAM test.  Achieving link-rate can be done via
699a multitude of options that mask overhead rather than eliminate it.
700
701@node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top
702@chapter Global Command-line Options
703
704This section describes each of the global command-line options
705available in the netperf and netserver binaries.  Essentially, it is
706an expanded version of the usage information displayed by netperf or
707netserver when invoked with the @option{-h} global command-line
708option.
709
710@menu
711* Command-line Options Syntax::  
712* Global Options::              
713@end menu
714
715@node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options
716@comment  node-name,  next,  previous,  up
717@section Command-line Options Syntax
718
719Revision 1.8 of netperf introduced enough new functionality to overrun
720the English alphabet for mnemonic command-line option names, and the
721author was not and is not quite ready to switch to the contemporary
722@option{--mumble} style of command-line options. (Call him a Luddite
723if you wish :).
724
725For this reason, the command-line options were split into two parts -
726the first are the global command-line options.  They are options that
727affect nearly any and every test type of netperf.  The second type are
728the test-specific command-line options.  Both are entered on the same
729command line, but they must be separated from one another by a @code{--}
730for correct parsing.  Global command-line options come first, followed
731by the @code{--} and then test-specific command-line options.  If there
732are no test-specific options to be set, the @code{--} may be omitted.  If
733there are no global command-line options to be set, test-specific
734options must still be preceded by a @code{--}.  For example:
735@example
736netperf <global> -- <test-specific>
737@end example
738sets both global and test-specific options:
739@example
740netperf <global>
741@end example
742sets just global options and:
743@example
744netperf -- <test-specific>
745@end example
746sets just test-specific options.
747
748@node Global Options,  , Command-line Options Syntax, Global Command-line Options
749@comment  node-name,  next,  previous,  up
750@section Global Options
751
752@table @code
753@vindex -a, Global
754@item -a <sizespec>
755This option allows you to alter the alignment of the buffers used in
756the sending and receiving calls on the local system.. Changing the
757alignment of the buffers can force the system to use different copy
758schemes, which can have a measurable effect on performance.  If the
759page size for the system were 4096 bytes, and you want to pass
760page-aligned buffers beginning on page boundaries, you could use
761@samp{-a 4096}.  By default the units are bytes, but suffix of ``G,''
762``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20 (MB) or
7632^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k'' will specify
764units of 10^9, 10^6 or 10^3 bytes respectively. [Default: 8 bytes]
765
766@vindex -A, Global
767@item -A <sizespec>
768This option is identical to the @option{-a} option with the difference
769being it affects alignments for the remote system.
770
771@vindex -b, Global
772@item -b <size>
773This option is only present when netperf has been configure with
774--enable-intervals=yes prior to compilation.  It sets the size of the
775burst of send calls in a _STREAM test.  When used in conjunction with
776the @option{-w} option it can cause the rate at which data is sent to
777be ``paced.''
778
779@vindex -B, Global
780@item -B <string>
781This option will cause @option{<string>} to be appended to the brief
782(see -P) output of netperf.
783
784@vindex -c, Global
785@item -c [rate]
786This option will ask that CPU utilization and service demand be
787calculated for the local system.  For those CPU utilization mechanisms
788requiring calibration, the options rate parameter may be specified to
789preclude running another calibration step, saving 40 seconds of time.
790For those CPU utilization mechanisms requiring no calibration, the
791optional rate parameter will be utterly and completely ignored.
792[Default: no CPU measurements]
793
794@vindex -C, Global
795@item -C [rate]
796This option requests CPU utilization and service demand calculations
797for the remote system.  It is otherwise identical to the @option{-c}
798option.
799
800@vindex -d, Global
801@item -d
802Each instance of this option will increase the quantity of debugging
803output displayed during a test.  If the debugging output level is set
804high enough, it may have a measurable effect on performance.
805Debugging information for the local system is printed to stdout.
806Debugging information for the remote system is sent by default to the
807file @file{/tmp/netperf.debug}. [Default: no debugging output]
808
809@vindex -D, Global
810@item -D [interval,units]
811This option is only available when netperf is configured with
812--enable-demo=yes.  When set, it will cause netperf to emit periodic
813reports of performance during the run.  [@var{interval},@var{units}]
814follow the semantics of an optionspec. If specified,
815@var{interval} gives the minimum interval in real seconds, it does not
816have to be whole seconds.  The @var{units} value can be used for the
817first guess as to how many units of work (bytes or transactions) must
818be done to take at least @var{interval} seconds. If omitted,
819@var{interval} defaults to one second and @var{units} to values
820specific to each test type.
821
822@vindex -f, Global
823@item -f G|M|K|g|m|k|x
824This option can be used to change the reporting units for _STREAM
825tests.  Arguments of ``G,'' ``M,'' or ``K'' will set the units to
8262^30, 2^20 or 2^10 bytes/s respectively (EG power of two GB, MB or
827KB).  Arguments of ``g,'' ``,m'' or ``k'' will set the units to 10^9,
82810^6 or 10^3 bits/s respectively.  An argument of ``x'' requests the
829units be transactions per second and is only meaningful for a
830request-response test. [Default: ``m'' or 10^6 bits/s]
831
832@vindex -F, Global
833@item -F <fillfile>
834This option specified the file from which send which buffers will be
835pre-filled .  While the buffers will contain data from the specified
836file, the file is not fully transferred to the remote system as the
837receiving end of the test will not write the contents of what it
838receives to a file.  This can be used to pre-fill the send buffers
839with data having different compressibility and so is useful when
840measuring performance over mechanisms which perform compression. 
841
842While previously required for a TCP_SENDFILE test, later versions of
843netperf removed that restriction, creating a temporary file as
844needed.  While the author cannot recall exactly when that took place,
845it is known to be unnecessary in version 2.5.0 and later.
846
847@vindex -h, Global
848@item -h
849This option causes netperf to display its ``global'' usage string and
850exit to the exclusion of all else.
851
852@vindex -H, Global
853@item -H <optionspec>
854This option will set the name of the remote system and or the address
855family used for the control connection.  For example:
856@example
857-H linger,4
858@end example
859will set the name of the remote system to ``linger'' and tells netperf to
860use IPv4 addressing only.
861@example
862-H ,6
863@end example
864will leave the name of the remote system at its default, and request
865that only IPv6 addresses be used for the control connection.
866@example
867-H lag
868@end example
869will set the name of the remote system to ``lag'' and leave the
870address family to AF_UNSPEC which means selection of IPv4 vs IPv6 is
871left to the system's address resolution.  
872
873A value of ``inet'' can be used in place of ``4'' to request IPv4 only
874addressing.  Similarly, a value of ``inet6'' can be used in place of
875``6'' to request IPv6 only addressing.  A value of ``0'' can be used
876to request either IPv4 or IPv6 addressing as name resolution dictates.
877
878By default, the options set with the global @option{-H} option are
879inherited by the test for its data connection, unless a test-specific
880@option{-H} option is specified.
881
882If a @option{-H} option follows either the @option{-4} or @option{-6}
883options, the family setting specified with the -H option will override
884the @option{-4} or @option{-6} options for the remote address
885family. If no address family is specified, settings from a previous
886@option{-4} or @option{-6} option will remain.  In a nutshell, the
887last explicit global command-line option wins.
888
889[Default:  ``localhost'' for the remote name/IP address and ``0'' (eg
890AF_UNSPEC) for the remote address family.]
891
892@vindex -I, Global
893@item -I <optionspec>
894This option enables the calculation of confidence intervals and sets
895the confidence and width parameters with the first half of the
896optionspec being either 99 or 95 for 99% or 95% confidence
897respectively.  The second value of the optionspec specifies the width
898of the desired confidence interval.  For example
899@example
900-I 99,5
901@end example
902asks netperf to be 99% confident that the measured mean values for
903throughput and CPU utilization are within +/- 2.5% of the ``real''
904mean values.  If the @option{-i} option is specified and the
905@option{-I} option is omitted, the confidence defaults to 99% and the
906width to 5% (giving +/- 2.5%)
907
908If classic netperf test calculates that the desired confidence
909intervals have not been met, it emits a noticeable warning that cannot
910be suppressed with the @option{-P} or @option{-v} options:
911
912@example
913netperf -H tardy.cup -i 3 -I 99,5
914TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @ 99% conf.
915!!! WARNING
916!!! Desired confidence was not achieved within the specified iterations.
917!!! This implies that there was variability in the test environment that
918!!! must be investigated before going further.
919!!! Confidence intervals: Throughput      :  6.8%
920!!!                       Local CPU util  :  0.0%
921!!!                       Remote CPU util :  0.0%
922
923Recv   Send    Send                          
924Socket Socket  Message  Elapsed              
925Size   Size    Size     Time     Throughput  
926bytes  bytes   bytes    secs.    10^6bits/sec  
927
928 32768  16384  16384    10.01      40.23   
929@end example
930
931In the example above we see that netperf did not meet the desired
932confidence intervals.  Instead of being 99% confident it was within
933+/- 2.5% of the real mean value of throughput it is only confident it
934was within +/-3.4%.  In this example, increasing the @option{-i}
935option (described below) and/or increasing the iteration length with
936the @option{-l} option might resolve the situation.
937
938In an explicit ``omni'' test, failure to meet the confidence intervals
939will not result in netperf emitting a warning.  To verify the hitting,
940or not, of the confidence intervals one will need to include them as
941part of an @ref{Omni Output Selection,output selection} in the
942test-specific @option{-o}, @option{-O} or @option{k} output selection
943options.  The warning about not hitting the confidence intervals will
944remain in a ``migrated'' classic netperf test.
945
946@vindex -i, Global
947@item -i <sizespec>
948This option enables the calculation of confidence intervals and sets
949the minimum and maximum number of iterations to run in attempting to
950achieve the desired confidence interval.  The first value sets the
951maximum number of iterations to run, the second, the minimum.  The
952maximum number of iterations is silently capped at 30 and the minimum
953is silently floored at 3.  Netperf repeats the measurement the minimum
954number of iterations and continues until it reaches either the
955desired confidence interval, or the maximum number of iterations,
956whichever comes first.  A classic or migrated netperf test will not
957display the actual number of iterations run. An @ref{The Omni
958Tests,omni test} will emit the number of iterations run if the
959@code{CONFIDENCE_ITERATION} output selector is included in the
960@ref{Omni Output Selection,output selection}.
961
962If the @option{-I} option is specified and the @option{-i} option
963omitted the maximum number of iterations is set to 10 and the minimum
964to three.
965
966Output of a warning upon not hitting the desired confidence intervals
967follows the description provided for the @option{-I} option.
968
969The total test time will be somewhere between the minimum and maximum
970number of iterations multiplied by the test length supplied by the
971@option{-l} option.
972
973@vindex -j, Global
974@item -j
975This option instructs netperf to keep additional timing statistics
976when explicitly running an @ref{The Omni Tests,omni test}.  These can
977be output when the test-specific @option{-o}, @option{-O} or
978@option{-k} @ref{Omni Output Selectors,output selectors} include one
979or more of:
980
981@itemize
982@item MIN_LATENCY
983@item MAX_LATENCY
984@item P50_LATENCY
985@item P90_LATENCY
986@item P99_LATENCY
987@item MEAN_LATENCY
988@item STDDEV_LATENCY
989@end itemize
990
991These statistics will be based on an expanded (100 buckets per row
992rather than 10) histogram of times rather than a terribly long list of
993individual times.  As such, there will be some slight error thanks to
994the bucketing. However, the reduction in storage and processing
995overheads is well worth it.  When running a request/response test, one
996might get some idea of the error by comparing the @ref{Omni Output
997Selectors,@code{MEAN_LATENCY}} calculated from the histogram with the
998@code{RT_LATENCY} calculated from the number of request/response
999transactions and the test run time.
1000
1001In the case of a request/response test the latencies will be
1002transaction latencies.  In the case of a receive-only test they will
1003be time spent in the receive call.  In the case of a send-only test
1004they will be time spent in the send call. The units will be
1005microseconds. Added in netperf 2.5.0.
1006
1007@vindex -l, Global
1008@item -l testlen
1009This option controls the length of any @b{one} iteration of the requested
1010test.  A positive value for @var{testlen} will run each iteration of
1011the test for at least @var{testlen} seconds.  A negative value for
1012@var{testlen} will run each iteration for the absolute value of
1013@var{testlen} transactions for a _RR test or bytes for a _STREAM test.
1014Certain tests, notably those using UDP can only be timed, they cannot
1015be limited by transaction or byte count.  This limitation may be
1016relaxed in an @ref{The Omni Tests,omni} test.
1017
1018In some situations, individual iterations of a test may run for longer
1019for the number of seconds specified by the @option{-l} option.  In
1020particular, this may occur for those tests where the socket buffer
1021size(s) are significantly longer than the bandwidthXdelay product of
1022the link(s) over which the data connection passes, or those tests
1023where there may be non-trivial numbers of retransmissions.
1024
1025If confidence intervals are enabled via either @option{-I} or
1026@option{-i} the total length of the netperf test will be somewhere
1027between the minimum and maximum iteration count multiplied by
1028@var{testlen}.
1029
1030@vindex -L, Global
1031@item -L <optionspec>
1032This option is identical to the @option{-H} option with the difference
1033being it sets the _local_ hostname/IP and/or address family
1034information.  This option is generally unnecessary, but can be useful
1035when you wish to make sure that the netperf control and data
1036connections go via different paths.  It can also come-in handy if one
1037is trying to run netperf through those evil, end-to-end breaking
1038things known as firewalls.
1039
1040[Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the
1041local name.  AF_UNSPEC for the local address family.]
1042
1043@vindex -n, Global
1044@item -n numcpus
1045This option tells netperf how many CPUs it should ass-u-me are active
1046on the system running netperf.  In particular, this is used for the
1047@ref{CPU Utilization,CPU utilization} and service demand calculations.
1048On certain systems, netperf is able to determine the number of CPU's
1049automagically. This option will override any number netperf might be
1050able to determine on its own.
1051
1052Note that this option does _not_ set the number of CPUs on the system
1053running netserver.  When netperf/netserver cannot automagically
1054determine the number of CPUs that can only be set for netserver via a
1055netserver @option{-n} command-line option.
1056
1057As it is almost universally possible for netperf/netserver to
1058determine the number of CPUs on the system automagically, 99 times out
1059of 10 this option should not be necessary and may be removed in a
1060future release of netperf.
1061
1062@vindex -N, Global
1063@item -N
1064This option tells netperf to forgo establishing a control
1065connection. This makes it is possible to run some limited netperf
1066tests without a corresponding netserver on the remote system.
1067
1068With this option set, the test to be run is to get all the addressing
1069information it needs to establish its data connection from the command
1070line or internal defaults.  If not otherwise specified by
1071test-specific command line options, the data connection for a
1072``STREAM'' or ``SENDFILE'' test will be to the ``discard'' port, an
1073``RR'' test will be to the ``echo'' port, and a ``MEARTS'' test will
1074be to the chargen port.  
1075
1076The response size of an ``RR'' test will be silently set to be the
1077same as the request size.  Otherwise the test would hang if the
1078response size was larger than the request size, or would report an
1079incorrect, inflated transaction rate if the response size was less
1080than the request size.
1081
1082Since there is no control connection when this option is specified, it
1083is not possible to set ``remote'' properties such as socket buffer
1084size and the like via the netperf command line. Nor is it possible to
1085retrieve such interesting remote information as CPU utilization.
1086These items will be displayed as values which should make it
1087immediately obvious that was the case.
1088
1089The only way to change remote characteristics such as socket buffer
1090size or to obtain information such as CPU utilization is to employ
1091platform-specific methods on the remote system.  Frankly, if one has
1092access to the remote system to employ those methods one aught to be
1093able to run a netserver there.  However, that ability may not be
1094present in certain ``support'' situations, hence the addition of this
1095option.
1096
1097Added in netperf 2.4.3.
1098
1099@vindex -o, Global
1100@item -o <sizespec>
1101The value(s) passed-in with this option will be used as an offset
1102added to the alignment specified with the @option{-a} option.  For
1103example:
1104@example
1105-o 3 -a 4096
1106@end example
1107will cause the buffers passed to the local (netperf) send and receive
1108calls to begin three bytes past an address aligned to 4096
1109bytes. [Default: 0 bytes]
1110
1111@vindex -O, Global
1112@item -O <sizespec>
1113This option behaves just as the @option{-o} option but on the remote
1114(netserver) system and in conjunction with the @option{-A}
1115option. [Default: 0 bytes]
1116
1117@vindex -p, Global
1118@item -p <optionspec>
1119The first value of the optionspec passed-in with this option tells
1120netperf the port number at which it should expect the remote netserver
1121to be listening for control connections.  The second value of the
1122optionspec will request netperf to bind to that local port number
1123before establishing the control connection.  For example
1124@example
1125-p 12345
1126@end example
1127tells netperf that the remote netserver is listening on port 12345 and
1128leaves selection of the local port number for the control connection
1129up to the local TCP/IP stack whereas
1130@example
1131-p ,32109
1132@end example
1133leaves the remote netserver port at the default value of 12865 and
1134causes netperf to bind to the local port number 32109 before
1135connecting to the remote netserver.
1136
1137In general, setting the local port number is only necessary when one
1138is looking to run netperf through those evil, end-to-end breaking
1139things known as firewalls.
1140
1141@vindex -P, Global
1142@item -P 0|1
1143A value of ``1'' for the @option{-P} option will enable display of
1144the test banner.  A value of ``0'' will disable display of the test
1145banner. One might want to disable display of the test banner when
1146running the same basic test type (eg TCP_STREAM) multiple times in
1147succession where the test banners would then simply be redundant and
1148unnecessarily clutter the output. [Default: 1 - display test banners]
1149
1150@vindex -s, Global
1151@item -s <seconds>
1152This option will cause netperf to sleep @samp{<seconds>} before
1153actually transferring data over the data connection.  This may be
1154useful in situations where one wishes to start a great many netperf
1155instances and do not want the earlier ones affecting the ability of
1156the later ones to get established.
1157
1158Added somewhere between versions 2.4.3 and 2.5.0.
1159
1160@vindex -S, Global
1161@item -S
1162This option will cause an attempt to be made to set SO_KEEPALIVE on
1163the data socket of a test using the BSD sockets interface.  The
1164attempt will be made on the netperf side of all tests, and will be
1165made on the netserver side of an @ref{The Omni Tests,omni} or
1166@ref{Migrated Tests,migrated} test.  No indication of failure is given
1167unless debug output is enabled with the global @option{-d} option.
1168
1169Added in version 2.5.0.
1170
1171@vindex -t, Global
1172@item -t testname
1173This option is used to tell netperf which test you wish to run.  As of
1174this writing, valid values for @var{testname} include:
1175@itemize
1176@item
1177@ref{TCP_STREAM}, @ref{TCP_MAERTS}, @ref{TCP_SENDFILE}, @ref{TCP_RR}, @ref{TCP_CRR}, @ref{TCP_CC}
1178@item
1179@ref{UDP_STREAM}, @ref{UDP_RR}
1180@item
1181@ref{XTI_TCP_STREAM},  @ref{XTI_TCP_RR}, @ref{XTI_TCP_CRR}, @ref{XTI_TCP_CC}
1182@item
1183@ref{XTI_UDP_STREAM}, @ref{XTI_UDP_RR}
1184@item
1185@ref{SCTP_STREAM}, @ref{SCTP_RR}
1186@item
1187@ref{DLCO_STREAM}, @ref{DLCO_RR},  @ref{DLCL_STREAM}, @ref{DLCL_RR}
1188@item
1189@ref{Other Netperf Tests,LOC_CPU}, @ref{Other Netperf Tests,REM_CPU}
1190@item
1191@ref{The Omni Tests,OMNI}
1192@end itemize
1193Not all tests are always compiled into netperf.  In particular, the
1194``XTI,'' ``SCTP,'' ``UNIXDOMAIN,'' and ``DL*'' tests are only included in
1195netperf when configured with
1196@option{--enable-[xti|sctp|unixdomain|dlpi]=yes}.
1197
1198Netperf only runs one type of test no matter how many @option{-t}
1199options may be present on the command-line.  The last @option{-t}
1200global command-line option will determine the test to be
1201run. [Default: TCP_STREAM]
1202
1203@vindex -T, Global
1204@item -T <optionspec>
1205This option controls the CPU, and probably by extension memory,
1206affinity of netperf and/or netserver.
1207@example
1208netperf -T 1
1209@end example
1210will bind both netperf and netserver to ``CPU 1'' on their respective
1211systems.
1212@example
1213netperf -T 1,
1214@end example
1215will bind just netperf to ``CPU 1'' and will leave netserver unbound.
1216@example
1217netperf -T ,2
1218@end example
1219will leave netperf unbound and will bind netserver to ``CPU 2.''
1220@example
1221netperf -T 1,2
1222@end example
1223will bind netperf to ``CPU 1'' and netserver to ``CPU 2.''
1224
1225This can be particularly useful when investigating performance issues
1226involving where processes run relative to where NIC interrupts are
1227processed or where NICs allocate their DMA buffers.
1228
1229@vindex -v, Global
1230@item -v verbosity
1231This option controls how verbose netperf will be in its output, and is
1232often used in conjunction with the @option{-P} option. If the
1233verbosity is set to a value of ``0'' then only the test's SFM (Single
1234Figure of Merit) is displayed.  If local @ref{CPU Utilization,CPU
1235utilization} is requested via the @option{-c} option then the SFM is
1236the local service demand.  Othersise, if remote CPU utilization is
1237requested via the @option{-C} option then the SFM is the remote
1238service demand.  If neither local nor remote CPU utilization are
1239requested the SFM will be the measured throughput or transaction rate
1240as implied by the test specified with the @option{-t} option.
1241
1242If the verbosity level is set to ``1'' then the ``normal'' netperf
1243result output for each test is displayed.
1244
1245If the verbosity level is set to ``2'' then ``extra'' information will
1246be displayed.  This may include, but is not limited to the number of
1247send or recv calls made and the average number of bytes per send or
1248recv call, or a histogram of the time spent in each send() call or for
1249each transaction if netperf was configured with
1250@option{--enable-histogram=yes}. [Default: 1 - normal verbosity]
1251
1252In an @ref{The Omni Tests,omni} test the verbosity setting is largely
1253ignored, save for when asking for the time histogram to be displayed.
1254In version 2.5.0 and later there is no @ref{Omni Output Selectors,output
1255selector} for the histogram and so it remains displayed only when the
1256verbosity level is set to 2.
1257
1258@vindex -V, Global
1259@item -V
1260This option displays the netperf version and then exits.
1261
1262Added in netperf 2.4.4.
1263
1264@vindex -w, Global
1265@item -w time
1266If netperf was configured with @option{--enable-intervals=yes} then
1267this value will set the inter-burst time to time milliseconds, and the
1268@option{-b} option will set the number of sends per burst.  The actual
1269inter-burst time may vary depending on the system's timer resolution.
1270
1271@vindex -W, Global
1272@item -W <sizespec>
1273This option controls the number of buffers in the send (first or only
1274value) and or receive (second or only value) buffer rings.  Unlike
1275some benchmarks, netperf does not continuously send or receive from a
1276single buffer.  Instead it rotates through a ring of
1277buffers. [Default: One more than the size of the send or receive
1278socket buffer sizes (@option{-s} and/or @option{-S} options) divided
1279by the send @option{-m} or receive @option{-M} buffer size
1280respectively]
1281
1282@vindex -4, Global
1283@item -4
1284Specifying this option will set both the local and remote address
1285families to AF_INET - that is use only IPv4 addresses on the control
1286connection.  This can be overridden by a subsequent @option{-6},
1287@option{-H} or @option{-L} option.  Basically, the last option
1288explicitly specifying an address family wins.  Unless overridden by a
1289test-specific option, this will be inherited for the data connection
1290as well.
1291
1292@vindex -6, Global
1293@item -6
1294Specifying this option will set both local and and remote address
1295families to AF_INET6 - that is use only IPv6 addresses on the control
1296connection.  This can be overridden by a subsequent @option{-4},
1297@option{-H} or @option{-L} option.  Basically, the last address family
1298explicitly specified wins.  Unless overridden by a test-specific
1299option, this will be inherited for the data connection as well.
1300
1301@end table
1302
1303
1304@node Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Request/Response , Global Command-line Options, Top
1305@chapter Using Netperf to Measure Bulk Data Transfer
1306
1307The most commonly measured aspect of networked system performance is
1308that of bulk or unidirectional transfer performance.  Everyone wants
1309to know how many bits or bytes per second they can push across the
1310network. The classic netperf convention for a bulk data transfer test
1311name is to tack a ``_STREAM'' suffix to a test name.
1312
1313@menu
1314* Issues in Bulk Transfer::     
1315* Options common to TCP UDP and SCTP tests::  
1316@end menu
1317
1318@node Issues in Bulk Transfer, Options common to TCP UDP and SCTP tests, Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Bulk Data Transfer
1319@comment  node-name,  next,  previous,  up
1320@section Issues in Bulk Transfer
1321
1322There are any number of things which can affect the performance of a
1323bulk transfer test.  
1324
1325Certainly, absent compression, bulk-transfer tests can be limited by
1326the speed of the slowest link in the path from the source to the
1327destination.  If testing over a gigabit link, you will not see more
1328than a gigabit :) Such situations can be described as being
1329@dfn{network-limited} or @dfn{NIC-limited}.
1330
1331CPU utilization can also affect the results of a bulk-transfer test.
1332If the networking stack requires a certain number of instructions or
1333CPU cycles per KB of data transferred, and the CPU is limited in the
1334number of instructions or cycles it can provide, then the transfer can
1335be described as being @dfn{CPU-bound}.  
1336
1337A bulk-transfer test can be CPU bound even when netperf reports less
1338than 100% CPU utilization.  This can happen on an MP system where one
1339or more of the CPUs saturate at 100% but other CPU's remain idle.
1340Typically, a single flow of data, such as that from a single instance
1341of a netperf _STREAM test cannot make use of much more than the power
1342of one CPU. Exceptions to this generally occur when netperf and/or
1343netserver run on CPU(s) other than the CPU(s) taking interrupts from
1344the NIC(s). In that case, one might see as much as two CPUs' worth of
1345processing being used to service the flow of data.
1346
1347Distance and the speed-of-light can affect performance for a
1348bulk-transfer; often this can be mitigated by using larger windows.
1349One common limit to the performance of a transport using window-based
1350flow-control is:
1351@example
1352Throughput <= WindowSize/RoundTripTime
1353@end example
1354As the sender can only have a window's-worth of data outstanding on
1355the network at any one time, and the soonest the sender can receive a
1356window update from the receiver is one RoundTripTime (RTT).  TCP and
1357SCTP are examples of such protocols.
1358
1359Packet losses and their effects can be particularly bad for
1360performance.  This is especially true if the packet losses result in
1361retransmission timeouts for the protocol(s) involved.  By the time a
1362retransmission timeout has happened, the flow or connection has sat
1363idle for a considerable length of time.
1364
1365On many platforms, some variant on the @command{netstat} command can
1366be used to retrieve statistics about packet loss and
1367retransmission. For example:
1368@example
1369netstat -p tcp
1370@end example
1371will retrieve TCP statistics on the HP-UX Operating System.  On other
1372platforms, it may not be possible to retrieve statistics for a
1373specific protocol and something like:
1374@example
1375netstat -s
1376@end example
1377would be used instead.
1378
1379Many times, such network statistics are keep since the time the stack
1380started, and we are only really interested in statistics from when
1381netperf was running.  In such situations something along the lines of:
1382@example
1383netstat -p tcp > before
1384netperf -t TCP_mumble...
1385netstat -p tcp > after
1386@end example
1387is indicated.  The
1388@uref{ftp://ftp.cup.hp.com/dist/networking/tools/,beforeafter} utility
1389can be used to subtract the statistics in @file{before} from the
1390statistics in @file{after}:
1391@example
1392beforeafter before after > delta
1393@end example
1394and then one can look at the statistics in @file{delta}.  Beforeafter
1395is distributed in source form so one can compile it on the platform(s)
1396of interest. 
1397
1398If running a version 2.5.0 or later ``omni'' test under Linux one can
1399include either or both of:
1400@itemize
1401@item LOCAL_TRANSPORT_RETRANS
1402@item REMOTE_TRANSPORT_RETRANS
1403@end itemize
1404
1405in the values provided via a test-specific @option{-o}, @option{-O},
1406or @option{-k} output selction option and netperf will report the
1407retransmissions experienced on the data connection, as reported via a
1408@code{getsockopt(TCP_INFO)} call.  If confidence intervals have been
1409requested via the global @option{-I} or @option{-i} options, the
1410reported value(s) will be for the last iteration.  If the test is over
1411a protocol other than TCP, or on a platform other than Linux, the
1412results are undefined.
1413
1414While it was written with HP-UX's netstat in mind, the
1415@uref{ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt,annotated
1416netstat} writeup may be helpful with other platforms as well.
1417
1418@node Options common to TCP UDP and SCTP tests,  , Issues in Bulk Transfer, Using Netperf to Measure Bulk Data Transfer
1419@comment  node-name,  next,  previous,  up
1420@section Options common to TCP UDP and SCTP tests
1421
1422Many ``test-specific'' options are actually common across the
1423different tests.  For those tests involving TCP, UDP and SCTP, whether
1424using the BSD Sockets or the XTI interface those common options
1425include:
1426
1427@table @code
1428@vindex -h, Test-specific
1429@item -h
1430Display the test-suite-specific usage string and exit.  For a TCP_ or
1431UDP_ test this will be the usage string from the source file
1432nettest_bsd.c.  For an XTI_ test, this will be the usage string from
1433the source file nettest_xti.c.  For an SCTP test, this will be the
1434usage string from the source file nettest_sctp.c.
1435
1436@item -H <optionspec>
1437Normally, the remote hostname|IP and address family information is
1438inherited from the settings for the control connection (eg global
1439command-line @option{-H}, @option{-4} and/or @option{-6} options).
1440The test-specific @option{-H} will override those settings for the
1441data (aka test) connection only.  Settings for the control connection
1442are left unchanged.
1443
1444@vindex -L, Test-specific
1445@item -L <optionspec>
1446The test-specific @option{-L} option is identical to the test-specific
1447@option{-H} option except it affects the local hostname|IP and address
1448family information.  As with its global command-line counterpart, this
1449is generally only useful when measuring though those evil, end-to-end
1450breaking things called firewalls.
1451
1452@vindex -m, Test-specific
1453@item -m bytes
1454Set the size of the buffer passed-in to the ``send'' calls of a
1455_STREAM test.  Note that this may have only an indirect effect on the
1456size of the packets sent over the network, and certain Layer 4
1457protocols do _not_ preserve or enforce message boundaries, so setting
1458@option{-m} for the send size does not necessarily mean the receiver
1459will receive that many bytes at any one time. By default the units are
1460bytes, but suffix of ``G,'' ``M,'' or ``K'' will specify the units to
1461be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,''
1462``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
1463respectively. For example:
1464@example
1465@code{-m 32K}
1466@end example
1467will set the size to 32KB or 32768 bytes. [Default: the local send
1468socket buffer size for the connection - either the system's default or
1469the value set via the @option{-s} option.]
1470
1471@vindex -M, Test-specific
1472@item -M bytes
1473Set the size of the buffer passed-in to the ``recv'' calls of a
1474_STREAM test.  This will be an upper bound on the number of bytes
1475received per receive call. By default the units are bytes, but suffix
1476of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20
1477(MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m'' or ``k''
1478will specify units of 10^9, 10^6 or 10^3 bytes respectively. For
1479example:
1480@example
1481@code{-M 32K}
1482@end example
1483will set the size to 32KB or 32768 bytes. [Default: the remote receive
1484socket buffer size for the data connection - either the system's
1485default or the value set via the @option{-S} option.]
1486
1487@vindex -P, Test-specific
1488@item -P <optionspec>
1489Set the local and/or remote port numbers for the data connection.
1490
1491@vindex -s, Test-specific
1492@item -s <sizespec>
1493This option sets the local (netperf) send and receive socket buffer
1494sizes for the data connection to the value(s) specified.  Often, this
1495will affect the advertised and/or effective TCP or other window, but
1496on some platforms it may not. By default the units are bytes, but
1497suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
1498(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
1499or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
1500respectively. For example:
1501@example
1502@code{-s 128K}
1503@end example
1504Will request the local send and receive socket buffer sizes to be
1505128KB or 131072 bytes. 
1506
1507While the historic expectation is that setting the socket buffer size
1508has a direct effect on say the TCP window, today that may not hold
1509true for all stacks. Further, while the historic expectation is that
1510the value specified in a @code{setsockopt()} call will be the value returned
1511via a @code{getsockopt()} call, at least one stack is known to deliberately
1512ignore history.  When running under Windows a value of 0 may be used
1513which will be an indication to the stack the user wants to enable a
1514form of copy avoidance. [Default: -1 - use the system's default socket
1515buffer sizes]
1516
1517@vindex -S Test-specific
1518@item -S <sizespec>
1519This option sets the remote (netserver) send and/or receive socket
1520buffer sizes for the data connection to the value(s) specified.
1521Often, this will affect the advertised and/or effective TCP or other
1522window, but on some platforms it may not. By default the units are
1523bytes, but suffix of ``G,'' ``M,'' or ``K'' will specify the units to
1524be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,''
1525``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
1526respectively.  For example:
1527@example
1528@code{-S 128K}
1529@end example
1530Will request the remote send and receive socket buffer sizes to be
1531128KB or 131072 bytes. 
1532
1533While the historic expectation is that setting the socket buffer size
1534has a direct effect on say the TCP window, today that may not hold
1535true for all stacks.  Further, while the historic expectation is that
1536the value specified in a @code{setsockopt()} call will be the value returned
1537via a @code{getsockopt()} call, at least one stack is known to deliberately
1538ignore history.  When running under Windows a value of 0 may be used
1539which will be an indication to the stack the user wants to enable a
1540form of copy avoidance. [Default: -1 - use the system's default socket
1541buffer sizes]
1542
1543@vindex -4, Test-specific
1544@item -4
1545Set the local and remote address family for the data connection to
1546AF_INET - ie use IPv4 addressing only.  Just as with their global
1547command-line counterparts the last of the @option{-4}, @option{-6},
1548@option{-H} or @option{-L} option wins for their respective address
1549families.
1550
1551@vindex -6, Test-specific
1552@item -6
1553This option is identical to its @option{-4} cousin, but requests IPv6
1554addresses for the local and remote ends of the data connection.
1555
1556@end table
1557
1558
1559@menu
1560* TCP_STREAM::                  
1561* TCP_MAERTS::                  
1562* TCP_SENDFILE::                
1563* UDP_STREAM::                  
1564* XTI_TCP_STREAM::              
1565* XTI_UDP_STREAM::              
1566* SCTP_STREAM::                 
1567* DLCO_STREAM::                 
1568* DLCL_STREAM::                 
1569* STREAM_STREAM::               
1570* DG_STREAM::                   
1571@end menu
1572
1573@node TCP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests, Options common to TCP UDP and SCTP tests
1574@subsection TCP_STREAM
1575
1576The TCP_STREAM test is the default test in netperf.  It is quite
1577simple, transferring some quantity of data from the system running
1578netperf to the system running netserver.  While time spent
1579establishing the connection is not included in the throughput
1580calculation, time spent flushing the last of the data to the remote at
1581the end of the test is.  This is how netperf knows that all the data
1582it sent was received by the remote.  In addition to the @ref{Options
1583common to TCP UDP and SCTP tests,options common to STREAM tests}, the
1584following test-specific options can be included to possibly alter the
1585behavior of the test:
1586
1587@table @code
1588@item -C
1589This option will set TCP_CORK mode on the data connection on those
1590systems where TCP_CORK is defined (typically Linux).  A full
1591description of TCP_CORK is beyond the scope of this manual, but in a
1592nutshell it forces sub-MSS sends to be buffered so every segment sent
1593is Maximum Segment Size (MSS) unless the application performs an
1594explicit flush operation or the connection is closed.  At present
1595netperf does not perform any explicit flush operations.  Setting
1596TCP_CORK may improve the bitrate of tests where the ``send size''
1597(@option{-m} option) is smaller than the MSS.  It should also improve
1598(make smaller) the service demand.
1599
1600The Linux tcp(7) manpage states that TCP_CORK cannot be used in
1601conjunction with TCP_NODELAY (set via the @option{-d} option), however
1602netperf does not validate command-line options to enforce that.
1603
1604@item -D
1605This option will set TCP_NODELAY on the data connection on those
1606systems where TCP_NODELAY is defined.  This disables something known
1607as the Nagle Algorithm, which is intended to make the segments TCP
1608sends as large as reasonably possible.  Setting TCP_NODELAY for a
1609TCP_STREAM test should either have no effect when the send size
1610(@option{-m} option) is larger than the MSS or will decrease reported
1611bitrate and increase service demand when the send size is smaller than
1612the MSS.  This stems from TCP_NODELAY causing each sub-MSS send to be
1613its own TCP segment rather than being aggregated with other small
1614sends.  This means more trips up and down the protocol stack per KB of
1615data transferred, which means greater CPU utilization.
1616
1617If setting TCP_NODELAY with @option{-D} affects throughput and/or
1618service demand for tests where the send size (@option{-m}) is larger
1619than the MSS it suggests the TCP/IP stack's implementation of the
1620Nagle Algorithm _may_ be broken, perhaps interpreting the Nagle
1621Algorithm on a segment by segment basis rather than the proper user
1622send by user send basis.  However, a better test of this can be
1623achieved with the @ref{TCP_RR} test.
1624
1625@end table
1626
1627Here is an example of a basic TCP_STREAM test, in this case from a
1628Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
1629system:
1630
1631@example
1632$ netperf -H lag
1633TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1634Recv   Send    Send                          
1635Socket Socket  Message  Elapsed              
1636Size   Size    Size     Time     Throughput  
1637bytes  bytes   bytes    secs.    10^6bits/sec  
1638
1639 32768  16384  16384    10.00      80.42   
1640@end example
1641
1642We see that the default receive socket buffer size for the receiver
1643(lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
1644size for the sender (Debian 2.6 kernel) is 16384 bytes, however Linux
1645does ``auto tuning'' of socket buffer and TCP window sizes, which
1646means the send socket buffer size may be different at the end of the
1647test than it was at the beginning.  This is addressed in the @ref{The
1648Omni Tests,omni tests} added in version 2.5.0 and @ref{Omni Output
1649Selection,output selection}.  Throughput is expressed as 10^6 (aka
1650Mega) bits per second, and the test ran for 10 seconds.  IPv4
1651addresses (AF_INET) were used.
1652
1653@node TCP_MAERTS, TCP_SENDFILE, TCP_STREAM, Options common to TCP UDP and SCTP tests
1654@comment  node-name,  next,  previous,  up
1655@subsection TCP_MAERTS
1656
1657A TCP_MAERTS (MAERTS is STREAM backwards) test is ``just like'' a
1658@ref{TCP_STREAM} test except the data flows from the netserver to the
1659netperf. The global command-line @option{-F} option is ignored for
1660this test type.  The test-specific command-line @option{-C} option is
1661ignored for this test type.
1662
1663Here is an example of a TCP_MAERTS test between the same two systems
1664as in the example for the @ref{TCP_STREAM} test.  This time we request
1665larger socket buffers with @option{-s} and @option{-S} options:
1666
1667@example
1668$ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
1669TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1670Recv   Send    Send                          
1671Socket Socket  Message  Elapsed              
1672Size   Size    Size     Time     Throughput  
1673bytes  bytes   bytes    secs.    10^6bits/sec  
1674
1675221184 131072 131072    10.03      81.14   
1676@end example
1677
1678Where we see that Linux, unlike HP-UX, may not return the same value
1679in a @code{getsockopt()} as was requested in the prior @code{setsockopt()}.
1680
1681This test is included more for benchmarking convenience than anything
1682else.
1683
1684@node TCP_SENDFILE, UDP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests
1685@comment  node-name,  next,  previous,  up
1686@subsection TCP_SENDFILE
1687
1688The TCP_SENDFILE test is ``just like'' a @ref{TCP_STREAM} test except
1689netperf the platform's @code{sendfile()} call instead of calling
1690@code{send()}.  Often this results in a @dfn{zero-copy} operation
1691where data is sent directly from the filesystem buffer cache.  This
1692_should_ result in lower CPU utilization and possibly higher
1693throughput.  If it does not, then you may want to contact your
1694vendor(s) because they have a problem on their hands.
1695
1696Zero-copy mechanisms may also alter the characteristics (size and
1697number of buffers per) of packets passed to the NIC.  In many stacks,
1698when a copy is performed, the stack can ``reserve'' space at the
1699beginning of the destination buffer for things like TCP, IP and Link
1700headers.  This then has the packet contained in a single buffer which
1701can be easier to DMA to the NIC.  When no copy is performed, there is
1702no opportunity to reserve space for headers and so a packet will be
1703contained in two or more buffers.
1704
1705As of some time before version 2.5.0, the @ref{Global Options,global
1706@option{-F} option} is no longer required for this test.  If it is not
1707specified, netperf will create a temporary file, which it will delete
1708at the end of the test.  If the @option{-F} option is specified it
1709must reference a file of at least the size of the send ring
1710(@xref{Global Options,the global @option{-W} option}.) multiplied by
1711the send size (@xref{Options common to TCP UDP and SCTP tests,the
1712test-specific @option{-m} option}.).  All other TCP-specific options
1713remain available and optional.
1714
1715In this first example:
1716@example
1717$ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
1718TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1719alloc_sendfile_buf_ring: specified file too small.
1720file must be larger than send_width * send_size
1721@end example
1722
1723we see what happens when the file is too small.  Here:
1724
1725@example
1726$ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
1727TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1728Recv   Send    Send                          
1729Socket Socket  Message  Elapsed              
1730Size   Size    Size     Time     Throughput  
1731bytes  bytes   bytes    secs.    10^6bits/sec  
1732
1733131072 221184 221184    10.02      81.83   
1734@end example
1735
1736we resolve that issue by selecting a larger file.
1737
1738
1739@node UDP_STREAM, XTI_TCP_STREAM, TCP_SENDFILE, Options common to TCP UDP and SCTP tests
1740@subsection UDP_STREAM
1741
1742A UDP_STREAM test is similar to a @ref{TCP_STREAM} test except UDP is
1743used as the transport rather than TCP.
1744
1745@cindex Limiting Bandwidth
1746A UDP_STREAM test has no end-to-end flow control - UDP provides none
1747and neither does netperf.  However, if you wish, you can configure
1748netperf with @code{--enable-intervals=yes} to enable the global
1749command-line @option{-b} and @option{-w} options to pace bursts of
1750traffic onto the network.
1751
1752This has a number of implications.
1753
1754The biggest of these implications is the data which is sent might not
1755be received by the remote.  For this reason, the output of a
1756UDP_STREAM test shows both the sending and receiving throughput.  On
1757some platforms, it may be possible for the sending throughput to be
1758reported as a value greater than the maximum rate of the link.  This
1759is common when the CPU(s) are faster than the network and there is no
1760@dfn{intra-stack} flow-control.
1761
1762Here is an example of a UDP_STREAM test between two systems connected
1763by a 10 Gigabit Ethernet link:
1764@example
1765$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
1766UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
1767Socket  Message  Elapsed      Messages                
1768Size    Size     Time         Okay Errors   Throughput
1769bytes   bytes    secs            #      #   10^6bits/sec
1770
1771124928   32768   10.00      105672      0    2770.20
1772135168           10.00      104844           2748.50
1773
1774@end example
1775
1776The first line of numbers are statistics from the sending (netperf)
1777side. The second line of numbers are from the receiving (netserver)
1778side.  In this case, 105672 - 104844 or 828 messages did not make it
1779all the way to the remote netserver process.
1780
1781If the value of the @option{-m} option is larger than the local send
1782socket buffer size (@option{-s} option) netperf will likely abort with
1783an error message about how the send call failed:
1784
1785@example
1786netperf -t UDP_STREAM -H 192.168.2.125
1787UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
1788udp_send: data send error: Message too long
1789@end example
1790
1791If the value of the @option{-m} option is larger than the remote
1792socket receive buffer, the reported receive throughput will likely be
1793zero as the remote UDP will discard the messages as being too large to
1794fit into the socket buffer.
1795
1796@example
1797$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
1798UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
1799Socket  Message  Elapsed      Messages                
1800Size    Size     Time         Okay Errors   Throughput
1801bytes   bytes    secs            #      #   10^6bits/sec
1802
1803124928   65000   10.00       53595      0    2786.99
1804 65536           10.00           0              0.00
1805@end example
1806
1807The example above was between a pair of systems running a ``Linux''
1808kernel. Notice that the remote Linux system returned a value larger
1809than that passed-in to the @option{-S} option.  In fact, this value
1810was larger than the message size set with the @option{-m} option.
1811That the remote socket buffer size is reported as 65536 bytes would
1812suggest to any sane person that a message of 65000 bytes would fit,
1813but the socket isn't _really_ 65536 bytes, even though Linux is
1814telling us so.  Go figure.
1815
1816@node XTI_TCP_STREAM, XTI_UDP_STREAM, UDP_STREAM, Options common to TCP UDP and SCTP tests
1817@subsection XTI_TCP_STREAM
1818
1819An XTI_TCP_STREAM test is simply a @ref{TCP_STREAM} test using the XTI
1820rather than BSD Sockets interface.  The test-specific @option{-X
1821<devspec>} option can be used to specify the name of the local and/or
1822remote XTI device files, which is required by the @code{t_open()} call
1823made by netperf XTI tests.
1824
1825The XTI_TCP_STREAM test is only present if netperf was configured with
1826@code{--enable-xti=yes}.  The remote netserver must have also been
1827configured with @code{--enable-xti=yes}.
1828
1829@node XTI_UDP_STREAM, SCTP_STREAM, XTI_TCP_STREAM, Options common to TCP UDP and SCTP tests
1830@subsection XTI_UDP_STREAM
1831
1832An XTI_UDP_STREAM test is simply a @ref{UDP_STREAM} test using the XTI
1833rather than BSD Sockets Interface.  The test-specific @option{-X
1834<devspec>} option can be used to specify the name of the local and/or
1835remote XTI device files, which is required by the @code{t_open()} call
1836made by netperf XTI tests.
1837
1838The XTI_UDP_STREAM test is only present if netperf was configured with
1839@code{--enable-xti=yes}. The remote netserver must have also been
1840configured with @code{--enable-xti=yes}.
1841
1842@node SCTP_STREAM, DLCO_STREAM, XTI_UDP_STREAM, Options common to TCP UDP and SCTP tests
1843@subsection SCTP_STREAM
1844
1845An SCTP_STREAM test is essentially a @ref{TCP_STREAM} test using the SCTP
1846rather than TCP.  The @option{-D} option will set SCTP_NODELAY, which
1847is much like the TCP_NODELAY option for TCP.  The @option{-C} option
1848is not applicable to an SCTP test as there is no corresponding
1849SCTP_CORK option.  The author is still figuring-out what the
1850test-specific @option{-N} option does :)
1851
1852The SCTP_STREAM test is only present if netperf was configured with
1853@code{--enable-sctp=yes}. The remote netserver must have also been
1854configured with @code{--enable-sctp=yes}.
1855
1856@node DLCO_STREAM, DLCL_STREAM, SCTP_STREAM, Options common to TCP UDP and SCTP tests
1857@subsection DLCO_STREAM
1858
1859A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar
1860in concept to a @ref{TCP_STREAM} test.  Both use reliable,
1861connection-oriented protocols.  The DLPI test differs from the TCP
1862test in that its protocol operates only at the link-level and does not
1863include TCP-style segmentation and reassembly.  This last difference
1864means that the value  passed-in  with the @option{-m} option must be
1865less than the interface MTU.  Otherwise, the @option{-m} and
1866@option{-M} options are just like their TCP/UDP/SCTP counterparts.
1867
1868Other DLPI-specific options include:
1869
1870@table @code
1871@item -D <devspec>
1872This option is used to provide the fully-qualified names for the local
1873and/or remote DLPI device files.  The syntax is otherwise identical to
1874that of a @dfn{sizespec}.
1875@item -p <ppaspec>
1876This option is used to specify the local and/or remote DLPI PPA(s).
1877The PPA is used to identify the interface over which traffic is to be
1878sent/received. The syntax of a @dfn{ppaspec} is otherwise the same as
1879a @dfn{sizespec}.
1880@item -s sap 
1881This option specifies the 802.2 SAP for the test.  A SAP is somewhat
1882like either the port field of a TCP or UDP header or the protocol
1883field of an IP header.  The specified SAP should not conflict with any
1884other active SAPs on the specified PPA's (@option{-p} option).
1885@item -w <sizespec>
1886This option specifies the local send and receive window sizes in units
1887of frames on those platforms which support setting such things.
1888@item -W <sizespec>
1889This option specifies the remote send and receive window sizes in
1890units of frames on those platforms which support setting such things.
1891@end table
1892
1893The DLCO_STREAM test is only present if netperf was configured with
1894@code{--enable-dlpi=yes}. The remote netserver must have also been
1895configured with @code{--enable-dlpi=yes}.
1896
1897
1898@node DLCL_STREAM, STREAM_STREAM, DLCO_STREAM, Options common to TCP UDP and SCTP tests
1899@subsection DLCL_STREAM
1900
1901A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a
1902@ref{UDP_STREAM} test in that both make use of unreliable/best-effort,
1903connection-less transports.  The DLCL_STREAM test differs from the
1904@ref{UDP_STREAM} test in that the message size (@option{-m} option) must
1905always be less than the link MTU as there is no IP-like fragmentation
1906and reassembly available and netperf does not presume to provide one.
1907
1908The test-specific command-line options for a DLCL_STREAM test are the
1909same as those for a @ref{DLCO_STREAM} test.
1910
1911The DLCL_STREAM test is only present if netperf was configured with
1912@code{--enable-dlpi=yes}. The remote netserver must have also been
1913configured with @code{--enable-dlpi=yes}.
1914
1915@node STREAM_STREAM, DG_STREAM, DLCL_STREAM, Options common to TCP UDP and SCTP tests
1916@comment  node-name,  next,  previous,  up
1917@subsection STREAM_STREAM
1918
1919A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
1920concept to a @ref{TCP_STREAM} test, but using Unix Domain sockets.  It is,
1921naturally, limited to intra-machine traffic.  A STREAM_STREAM test
1922shares the @option{-m}, @option{-M}, @option{-s} and @option{-S}
1923options of the other _STREAM tests.  In a STREAM_STREAM test the
1924@option{-p} option sets the directory in which the pipes will be
1925created rather than setting a port number.  The default is to create
1926the pipes in the system default for the @code{tempnam()} call.
1927
1928The STREAM_STREAM test is only present if netperf was configured with
1929@code{--enable-unixdomain=yes}. The remote netserver must have also been
1930configured with @code{--enable-unixdomain=yes}.
1931
1932@node DG_STREAM,  , STREAM_STREAM, Options common to TCP UDP and SCTP tests
1933@comment  node-name,  next,  previous,  up
1934@subsection DG_STREAM
1935
1936A Unix Domain Datagram Socket Stream test (SG_STREAM) is very much
1937like a @ref{TCP_STREAM} test except that message boundaries are preserved.
1938In this way, it may also be considered similar to certain flavors of
1939SCTP test which can also preserve message boundaries.
1940
1941All the options of a @ref{STREAM_STREAM} test are applicable to a DG_STREAM
1942test. 
1943
1944The DG_STREAM test is only present if netperf was configured with
1945@code{--enable-unixdomain=yes}. The remote netserver must have also been
1946configured with @code{--enable-unixdomain=yes}.
1947
1948
1949@node Using Netperf to Measure Request/Response , Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bulk Data Transfer, Top
1950@chapter Using Netperf to Measure Request/Response 
1951
1952Request/response performance is often overlooked, yet it is just as
1953important as bulk-transfer performance.  While things like larger
1954socket buffers and TCP windows, and stateless offloads like TSO and
1955LRO can cover a multitude of latency and even path-length sins, those
1956sins cannot easily hide from a request/response test.  The convention
1957for a request/response test is to have a _RR suffix.  There are
1958however a few ``request/response'' tests that have other suffixes.
1959
1960A request/response test, particularly synchronous, one transaction at
1961a time test such as those found by default in netperf, is particularly
1962sensitive to the path-length of the networking stack.  An _RR test can
1963also uncover those platforms where the NICs are strapped by default
1964with overbearing interrupt avoidance settings in an attempt to
1965increase the bulk-transfer performance (or rather, decrease the CPU
1966utilization of a bulk-transfer test).  This sensitivity is most acute
1967for small request and response sizes, such as the single-byte default
1968for a netperf _RR test.
1969
1970While a bulk-transfer test reports its results in units of bits or
1971bytes transferred per second, by default a mumble_RR test reports
1972transactions per second where a transaction is defined as the
1973completed exchange of a request and a response.  One can invert the
1974transaction rate to arrive at the average round-trip latency.  If one
1975is confident about the symmetry of the connection, the average one-way
1976latency can be taken as one-half the average round-trip latency. As of
1977version 2.5.0 (actually slightly before) netperf still does not do the
1978latter, but will do the former if one sets the verbosity to 2 for a
1979classic netperf test, or includes the appropriate @ref{Omni Output
1980Selectors,output selector} in an @ref{The Omni Tests,omni test}.  It
1981will also allow the user to switch the throughput units from
1982transactions per second to bits or bytes per second with the global
1983@option{-f} option.
1984
1985@menu
1986* Issues in Request/Response::  
1987* Options Common to TCP UDP and SCTP _RR tests::  
1988@end menu
1989
1990@node Issues in Request/Response, Options Common to TCP UDP and SCTP _RR tests, Using Netperf to Measure Request/Response , Using Netperf to Measure Request/Response
1991@comment  node-name,  next,  previous,  up
1992@section Issues in Request/Response
1993
1994Most if not all the @ref{Issues in Bulk Transfer} apply to
1995request/response.  The issue of round-trip latency is even more
1996important as netperf generally only has one transaction outstanding at
1997a time.
1998
1999A single instance of a one transaction outstanding _RR test should
2000_never_ completely saturate the CPU of a system.  If testing between
2001otherwise evenly matched systems, the symmetric nature of a _RR test
2002with equal request and response sizes should result in equal CPU
2003loading on both systems. However, this may not hold true on MP
2004systems, particularly if one CPU binds the netperf and netserver
2005differently via the global @option{-T} option.
2006
2007For smaller request and response sizes packet loss is a bigger issue
2008as there is no opportunity for a @dfn{fast retransmit} or
2009retransmission prior to a retransmission timer expiring.
2010
2011Virtualization may considerably increase the effective path length of
2012a networking stack.  While this may not preclude achieving link-rate
2013on a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM
2014test, it can show-up as measurably fewer transactions per second on an
2015_RR test.  However, this may still be masked by interrupt coalescing
2016in the NIC/driver.
2017
2018Certain NICs have ways to minimize the number of interrupts sent to
2019the host.  If these are strapped badly they can significantly reduce
2020the performance of something like a single-byte request/response test.
2021Such setups are distinguished by seriously low reported CPU utilization
2022and what seems like a low (even if in the thousands) transaction per
2023second rate.  Also, if you run such an OS/driver combination on faster
2024or slower hardware and do not see a corresponding change in the
2025transaction rate, chances are good that the driver is strapping the
2026NIC with aggressive interrupt avoidance settings.  Good for bulk
2027throughput, but bad for latency.
2028
2029Some drivers may try to automagically adjust the interrupt avoidance
2030settings.  If they are not terribly good at it, you will see
2031considerable run-to-run variation in reported transaction rates.
2032Particularly if you ``mix-up'' _STREAM and _RR tests.
2033
2034
2035@node Options Common to TCP UDP and SCTP _RR tests,  , Issues in Request/Response, Using Netperf to Measure Request/Response
2036@comment  node-name,  next,  previous,  up
2037@section Options Common to TCP UDP and SCTP _RR tests
2038
2039Many ``test-specific'' options are actually common across the
2040different tests.  For those tests involving TCP, UDP and SCTP, whether
2041using the BSD Sockets or the XTI interface those common options
2042include:
2043
2044@table @code
2045@vindex -h, Test-specific
2046@item -h
2047Display the test-suite-specific usage string and exit.  For a TCP_ or
2048UDP_ test this will be the usage string from the source file
2049@file{nettest_bsd.c}.  For an XTI_ test, this will be the usage string
2050from the source file @file{src/nettest_xti.c}.  For an SCTP test, this
2051will be the usage string from the source file
2052@file{src/nettest_sctp.c}.
2053
2054@vindex -H, Test-specific
2055@item -H <optionspec>
2056Normally, the remote hostname|IP and address family information is
2057inherited from the settings for the control connection (eg global
2058command-line @option{-H}, @option{-4} and/or @option{-6} options.
2059The test-specific @option{-H} will override those settings for the
2060data (aka test) connection only.  Settings for the control connection
2061are left unchanged.  This might be used to cause the control and data
2062connections to take different paths through the network.
2063
2064@vindex -L, Test-specific
2065@item -L <optionspec>
2066The test-specific @option{-L} option is identical to the test-specific
2067@option{-H} option except it affects the local hostname|IP and address
2068family information.  As with its global command-line counterpart, this
2069is generally only useful when measuring though those evil, end-to-end
2070breaking things called firewalls.
2071
2072@vindex -P, Test-specific
2073@item -P <optionspec>
2074Set the local and/or remote port numbers for the data connection.
2075
2076@vindex -r, Test-specific
2077@item -r <sizespec>
2078This option sets the request (first value) and/or response (second
2079value) sizes for an _RR test. By default the units are bytes, but a
2080suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
2081(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
2082or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
2083respectively. For example:
2084@example
2085@code{-r 128,16K}
2086@end example
2087Will set the request size to 128 bytes and the response size to 16 KB
2088or 16384 bytes. [Default: 1 - a single-byte request and response ]
2089
2090@vindex -s, Test-specific
2091@item -s <sizespec>
2092This option sets the local (netperf) send and receive socket buffer
2093sizes for the data connection to the value(s) specified.  Often, this
2094will affect the advertised and/or effective TCP or other window, but
2095on some platforms it may not. By default the units are bytes, but a
2096suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
2097(GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of ``g,'' ``m''
2098or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
2099respectively. For example:
2100@example
2101@code{-s 128K}
2102@end example
2103Will request the local send (netperf) and receive socket buffer sizes
2104to be 128KB or 131072 bytes.
2105
2106While the historic expectation is that setting the socket buffer size
2107has a direct effect on say the TCP window, today that may not hold
2108true for all stacks.  When running under Windows a value of 0 may be
2109used which will be an indication to the stack the user wants to enable
2110a form of copy avoidance. [Default: -1 - use the system's default
2111socket buffer sizes]
2112
2113@vindex -S, Test-specific
2114@item -S <sizespec>
2115This option sets the remote (netserver) send and/or receive socket
2116buffer sizes for the data connection to the value(s) specified.
2117Often, this will affect the advertised and/or effective TCP or other
2118window, but on some platforms it may not. By default the units are
2119bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units
2120to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of
2121``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
2122respectively.  For example:
2123@example
2124@code{-S 128K}
2125@end example
2126Will request the remote (netserver) send and receive socket buffer
2127sizes to be 128KB or 131072 bytes.
2128
2129While the historic expectation is that setting the socket buffer size
2130has a direct effect on say the TCP window, today that may not hold
2131true for all stacks.  When running under Windows a value of 0 may be
2132used which will be an indication to the stack the user wants to enable
2133a form of copy avoidance.  [Default: -1 - use the system's default
2134socket buffer sizes]
2135
2136@vindex -4, Test-specific
2137@item -4
2138Set the local and remote address family for the data connection to
2139AF_INET - ie use IPv4 addressing only.  Just as with their global
2140command-line counterparts the last of the @option{-4}, @option{-6},
2141@option{-H} or @option{-L} option wins for their respective address
2142families.
2143
2144@vindex -6 Test-specific
2145@item -6
2146This option is identical to its @option{-4} cousin, but requests IPv6
2147addresses for the local and remote ends of the data connection.
2148
2149@end table
2150
2151@menu
2152* TCP_RR::                      
2153* TCP_CC::                      
2154* TCP_CRR::                     
2155* UDP_RR::                      
2156* XTI_TCP_RR::                  
2157* XTI_TCP_CC::                  
2158* XTI_TCP_CRR::                 
2159* XTI_UDP_RR::                  
2160* DLCL_RR::                     
2161* DLCO_RR::                     
2162* SCTP_RR::                     
2163@end menu
2164
2165@node TCP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests, Options Common to TCP UDP and SCTP _RR tests
2166@subsection TCP_RR
2167@cindex Measuring Latency
2168@cindex Latency, Request-Response
2169
2170A TCP_RR (TCP Request/Response) test is requested by passing a value
2171of ``TCP_RR'' to the global @option{-t} command-line option.  A TCP_RR
2172test can be thought-of as a user-space to user-space @code{ping} with
2173no think time - it is by default a synchronous, one transaction at a
2174time, request/response test.
2175
2176The transaction rate is the number of complete transactions exchanged
2177divided by the length of time it took to perform those transactions.
2178
2179If the two Systems Under Test are otherwise identical, a TCP_RR test
2180with the same request and response size should be symmetric - it
2181should not matter which way the test is run, and the CPU utilization
2182measured should be virtually the same on each system.  If not, it
2183suggests that the CPU utilization mechanism being used may have some,
2184well, issues measuring CPU utilization completely and accurately.
2185
2186Time to establish the TCP connection is not counted in the result.  If
2187you want connection setup overheads included, you should consider the
2188@ref{TCP_CC,TPC_CC} or @ref{TCP_CRR,TCP_CRR} tests.
2189
2190If specifying the @option{-D} option to set TCP_NODELAY and disable
2191the Nagle Algorithm increases the transaction rate reported by a
2192TCP_RR test, it implies the stack(s) over which the TCP_RR test is
2193running have a broken implementation of the Nagle Algorithm.  Likely
2194as not they are interpreting Nagle on a segment by segment basis
2195rather than a user send by user send basis.  You should contact your
2196stack vendor(s) to report the problem to them.
2197
2198Here is an example of two systems running a basic TCP_RR test over a
219910 Gigabit Ethernet link:
2200
2201@example
2202netperf -t TCP_RR -H 192.168.2.125
2203TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
2204Local /Remote
2205Socket Size   Request  Resp.   Elapsed  Trans.
2206Send   Recv   Size     Size    Time     Rate         
2207bytes  Bytes  bytes    bytes   secs.    per sec   
2208
220916384  87380  1        1       10.00    29150.15   
221016384  87380 
2211@end example
2212
2213In this example the request and response sizes were one byte, the
2214socket buffers were left at their defaults, and the test ran for all
2215of 10 seconds.  The transaction per second rate was rather good for
2216the time :)
2217
2218@node TCP_CC, TCP_CRR, TCP_RR, Options Common to TCP UDP and SCTP _RR tests
2219@subsection TCP_CC
2220@cindex Connection Latency
2221@cindex Latency, Connection Establishment
2222
2223A TCP_CC (TCP Connect/Close) test is requested by passing a value of
2224``TCP_CC'' to the global @option{-t} option.  A TCP_CC test simply
2225measures how fast the pair of systems can open and close connections
2226between one another in a synchronous (one at a time) manner.  While
2227this is considered an _RR test, no request or response is exchanged
2228over the connection.
2229
2230@cindex Port Reuse
2231@cindex TIME_WAIT
2232The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
2233Basically, TIME_WAIT reuse is when a pair of systems churn through
2234connections fast enough that they wrap the 16-bit port number space in
2235less time than the length of the TIME_WAIT state.  While it is indeed
2236theoretically possible to ``reuse'' a connection in TIME_WAIT, the
2237conditions under which such reuse is possible are rather rare.  An
2238attempt to reuse a connection in TIME_WAIT can result in a non-trivial
2239delay in connection establishment.
2240
2241Basically, any time the connection churn rate approaches:
2242
2243Sizeof(clientportspace) / Lengthof(TIME_WAIT)
2244
2245there is the risk of TIME_WAIT reuse.  To minimize the chances of this
2246happening, netperf will by default select its own client port numbers
2247from the range of 5000 to 65535.  On systems with a 60 second
2248TIME_WAIT state, this should allow roughly 1000 transactions per
2249second.  The size of the client port space used by netperf can be
2250controlled via the test-specific @option{-p} option, which takes a
2251@dfn{sizespec} as a value setting the minimum (first value) and
2252maximum (second value) port numbers used by netperf at the client end.
2253
2254Since no requests or responses are exchanged during a TCP_CC test,
2255only the @option{-H}, @option{-L}, @option{-4} and @option{-6} of the
2256``common'' test-specific options are likely to have an effect, if any,
2257on the results.  The @option{-s} and @option{-S} options _may_ have
2258some effect if they alter the number and/or type of options carried in
2259the TCP SYNchronize segments, such as Window Scaling or Timestamps.
2260The @option{-P} and @option{-r} options are utterly ignored.
2261
2262Since connection establishment and tear-down for TCP is not symmetric,
2263a TCP_CC test is not symmetric in its loading of the two systems under
2264test.
2265
2266@node TCP_CRR, UDP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests
2267@subsection TCP_CRR
2268@cindex Latency, Connection Establishment
2269@cindex Latency, Request-Response
2270
2271The TCP Connect/Request/Response (TCP_CRR) test is requested by
2272passing a value of ``TCP_CRR'' to the global @option{-t} command-line
2273option.  A TCP_CRR test is like a merger of a @ref{TCP_RR} and
2274@ref{TCP_CC} test which measures the performance of establishing a
2275connection, exchanging a single request/response transaction, and
2276tearing-down that connection.  This is very much like what happens in
2277an HTTP 1.0 or HTTP 1.1 connection when HTTP Keepalives are not used.
2278In fact, the TCP_CRR test was added to netperf to simulate just that.
2279
2280Since a request and response are exchanged the @option{-r},
2281@option{-s} and @option{-S} options can have an effect on the
2282performance.
2283
2284The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
2285does for the TCP_CC test.  Similarly, since connection establishment
2286and tear-down is not symmetric, a TCP_CRR test is not symmetric even
2287when the request and response sizes are the same.
2288
2289@node UDP_RR, XTI_TCP_RR, TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
2290@subsection UDP_RR
2291@cindex Latency, Request-Response
2292@cindex Packet Loss
2293
2294A UDP Request/Response (UDP_RR) test is requested by passing a value
2295of ``UDP_RR'' to a global @option{-t} option.  It is very much the
2296same as a TCP_RR test except UDP is used rather than TCP.
2297
2298UDP does not provide for retransmission of lost UDP datagrams, and
2299netperf does not add anything for that either.  This means that if
2300_any_ request or response is lost, the exchange of requests and
2301responses will stop from that point until the test timer expires.
2302Netperf will not really ``know'' this has happened - the only symptom
2303will be a low transaction per second rate.  If @option{--enable-burst}
2304was included in the @code{configure} command and a test-specific
2305@option{-b} option used, the UDP_RR test will ``survive'' the loss of
2306requests and responses until the sum is one more than the value passed
2307via the @option{-b} option. It will though almost certainly run more
2308slowly.
2309
2310The netperf side of a UDP_RR test will call @code{connect()} on its
2311data socket and thenceforth use the @code{send()} and @code{recv()}
2312socket calls.  The netserver side of a UDP_RR test will not call
2313@code{connect()} and will use @code{recvfrom()} and @code{sendto()}
2314calls.  This means that even if the request and response sizes are the
2315same, a UDP_RR test is _not_ symmetric in its loading of the two
2316systems under test.
2317
2318Here is an example of a UDP_RR test between two otherwise
2319identical two-CPU systems joined via a 1 Gigabit Ethernet network:
2320
2321@example
2322$ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
2323UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
2324Local /Remote
2325Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
2326Send   Recv   Size    Size   Time    Rate     local  remote local   remote
2327bytes  bytes  bytes   bytes  secs.   per sec  % I    % I    us/Tr   us/Tr
2328
232965535  65535  1       1      10.01   15262.48   13.90  16.11  18.221  21.116
233065535  65535 
2331@end example
2332
2333This example includes the @option{-c} and @option{-C} options to
2334enable CPU utilization reporting and shows the asymmetry in CPU
2335loading.  The @option{-T} option was used to make sure netperf and
2336netserver ran on a given CPU and did not move around during the test.
2337
2338@node XTI_TCP_RR, XTI_TCP_CC, UDP_RR, Options Common to TCP UDP and SCTP _RR tests
2339@subsection XTI_TCP_RR
2340@cindex Latency, Request-Response
2341
2342An XTI_TCP_RR test is essentially the same as a @ref{TCP_RR} test only
2343using the XTI rather than BSD Sockets interface. It is requested by
2344passing a value of ``XTI_TCP_RR'' to the @option{-t} global
2345command-line option.
2346
2347The test-specific options for an XTI_TCP_RR test are the same as those
2348for a TCP_RR test with the addition of the @option{-X <devspec>} option to
2349specify the names of the local and/or remote XTI device file(s).
2350
2351@node XTI_TCP_CC, XTI_TCP_CRR, XTI_TCP_RR, Options Common to TCP UDP and SCTP _RR tests
2352@comment  node-name,  next,  previous,  up
2353@subsection XTI_TCP_CC
2354@cindex Latency, Connection Establishment
2355
2356An XTI_TCP_CC test is essentially the same as a @ref{TCP_CC,TCP_CC}
2357test, only using the XTI rather than BSD Sockets interface.
2358
2359The test-specific options for an XTI_TCP_CC test are the same as those
2360for a TCP_CC test with the addition of the @option{-X <devspec>} option to
2361specify the names of the local and/or remote XTI device file(s).
2362
2363@node XTI_TCP_CRR, XTI_UDP_RR, XTI_TCP_CC, Options Common to TCP UDP and SCTP _RR tests
2364@comment  node-name,  next,  previous,  up
2365@subsection XTI_TCP_CRR
2366@cindex Latency, Connection Establishment
2367@cindex Latency, Request-Response
2368
2369The XTI_TCP_CRR test is essentially the same as a
2370@ref{TCP_CRR,TCP_CRR} test, only using the XTI rather than BSD Sockets
2371interface.
2372
2373The test-specific options for an XTI_TCP_CRR test are the same as those
2374for a TCP_RR test with the addition of the @option{-X <devspec>} option to
2375specify the names of the local and/or remote XTI device file(s).
2376
2377@node XTI_UDP_RR, DLCL_RR, XTI_TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
2378@subsection XTI_UDP_RR
2379@cindex Latency, Request-Response
2380
2381An XTI_UDP_RR test is essentially the same as a UDP_RR test only using
2382the XTI rather than BSD Sockets interface.  It is requested by passing
2383a value of ``XTI_UDP_RR'' to the @option{-t} global command-line
2384option.
2385
2386The test-specific options for an XTI_UDP_RR test are the same as those
2387for a UDP_RR test with the addition of the @option{-X <devspec>}
2388option to specify the name of the local and/or remote XTI device
2389file(s).
2390
2391@node DLCL_RR, DLCO_RR, XTI_UDP_RR, Options Common to TCP UDP and SCTP _RR tests
2392@comment  node-name,  next,  previous,  up
2393@subsection DLCL_RR
2394@cindex Latency, Request-Response
2395
2396@node DLCO_RR, SCTP_RR, DLCL_RR, Options Common to TCP UDP and SCTP _RR tests
2397@comment  node-name,  next,  previous,  up
2398@subsection DLCO_RR
2399@cindex Latency, Request-Response
2400
2401@node SCTP_RR,  , DLCO_RR, Options Common to TCP UDP and SCTP _RR tests
2402@comment  node-name,  next,  previous,  up
2403@subsection SCTP_RR
2404@cindex Latency, Request-Response
2405
2406@node Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Request/Response , Top
2407@comment  node-name,  next,  previous,  up
2408@chapter Using Netperf to Measure Aggregate Performance
2409@cindex Aggregate Performance
2410@vindex --enable-burst, Configure
2411
2412Ultimately, @ref{Netperf4,Netperf4} will be the preferred benchmark to
2413use when one wants to measure aggregate performance because netperf
2414has no support for explicit synchronization of concurrent tests. Until
2415netperf4 is ready for prime time, one can make use of the heuristics
2416and procedures mentioned here for the 85% solution.
2417
2418There are a few ways to measure aggregate performance with netperf.
2419The first is to run multiple, concurrent netperf tests and can be
2420applied to any of the netperf tests.  The second is to configure
2421netperf with @code{--enable-burst} and is applicable to the TCP_RR
2422test. The third is a variation on the first.
2423
2424@menu
2425* Running Concurrent Netperf Tests::  
2426* Using --enable-burst::        
2427* Using --enable-demo::         
2428@end menu
2429
2430@node  Running Concurrent Netperf Tests, Using --enable-burst, Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Aggregate Performance
2431@comment  node-name,  next,  previous,  up
2432@section Running Concurrent Netperf Tests
2433
2434@ref{Netperf4,Netperf4} is the preferred benchmark to use when one
2435wants to measure aggregate performance because netperf has no support
2436for explicit synchronization of concurrent tests.  This leaves
2437netperf2 results vulnerable to @dfn{skew} errors.
2438
2439However, since there are times when netperf4 is unavailable it may be
2440necessary to run netperf. The skew error can be minimized by making
2441use of the confidence interval functionality.  Then one simply
2442launches multiple tests from the shell using a @code{for} loop or the
2443like:
2444
2445@example
2446for i in 1 2 3 4
2447do
2448netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
2449done
2450@end example
2451
2452which will run four, concurrent @ref{TCP_STREAM,TCP_STREAM} tests from
2453the system on which it is executed to tardy.cup.hp.com.  Each
2454concurrent netperf will iterate 10 times thanks to the @option{-i}
2455option and will omit the test banners (option @option{-P}) for
2456brevity.  The output looks something like this:
2457
2458@example
2459 87380  16384  16384    10.03     235.15   
2460 87380  16384  16384    10.03     235.09   
2461 87380  16384  16384    10.03     235.38   
2462 87380  16384  16384    10.03     233.96
2463@end example
2464
2465We can take the sum of the results and be reasonably confident that
2466the aggregate performance was 940 Mbits/s.  This method does not need
2467to be limited to one system speaking to one other system.  It can be
2468extended to one system talking to N other systems.  It could be as simple as:
2469@example
2470for host in 'foo bar baz bing'
2471do
2472netperf -t TCP_STREAM -H $hosts -i 10 -P 0 &
2473done
2474@end example
2475A more complicated/sophisticated example can be found in
2476@file{doc/examples/runemomniagg2.sh} where.
2477
2478If you see warnings about netperf not achieving the confidence
2479intervals, the best thing to do is to increase the number of
2480iterations with @option{-i} and/or increase the run length of each
2481iteration with @option{-l}.
2482
2483You can also enable local (@option{-c}) and/or remote (@option{-C})
2484CPU utilization:
2485
2486@example
2487for i in 1 2 3 4
2488do
2489netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
2490done
2491
249287380  16384  16384    10.03       235.47   3.67     5.09     10.226  14.180 
249387380  16384  16384    10.03       234.73   3.67     5.09     10.260  14.225 
249487380  16384  16384    10.03       234.64   3.67     5.10     10.263  14.231 
249587380  16384  16384    10.03       234.87   3.67     5.09     10.253  14.215
2496@end example
2497
2498If the CPU utilizations reported for the same system are the same or
2499very very close you can be reasonably confident that skew error is
2500minimized.  Presumably one could then omit @option{-i} but that is
2501not advised, particularly when/if the CPU utilization approaches 100
2502percent.  In the example above we see that the CPU utilization on the
2503local system remains the same for all four tests, and is only off by
25040.01 out of 5.09 on the remote system.  As the number of CPUs in the
2505system increases, and so too the odds of saturating a single CPU, the
2506accuracy of similar CPU utilization implying little skew error is
2507diminished.  This is also the case for those increasingly rare single
2508CPU systems if the utilization is reported as 100% or very close to
2509it.
2510
2511@quotation
2512@b{NOTE: It is very important to remember that netperf is calculating
2513system-wide CPU utilization.  When calculating the service demand
2514(those last two columns in the output above) each netperf assumes it
2515is the only thing running on the system.  This means that for
2516concurrent tests the service demands reported by netperf will be
2517wrong.  One has to compute service demands for concurrent tests by
2518hand.}
2519@end quotation
2520
2521If you wish you can add a unique, global @option{-B} option to each
2522command line to append the given string to the output:
2523
2524@example
2525for i in 1 2 3 4
2526do
2527netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
2528done
2529
253087380  16384  16384    10.03     234.90   this is test 4
253187380  16384  16384    10.03     234.41   this is test 2
253287380  16384  16384    10.03     235.26   this is test 1
253387380  16384  16384    10.03     235.09   this is test 3
2534@end example
2535
2536You will notice that the tests completed in an order other than they
2537were started from the shell.  This underscores why there is a threat
2538of skew error and why netperf4 will eventually be the preferred tool
2539for aggregate tests.  Even if you see the Netperf Contributing Editor
2540acting to the contrary!-)
2541
2542@menu
2543* Issues in Running Concurrent Tests::  
2544@end menu
2545
2546@node Issues in Running Concurrent Tests,  , Running Concurrent Netperf Tests, Running Concurrent Netperf Tests
2547@subsection Issues in Running Concurrent Tests
2548
2549In addition to the aforementioned issue of skew error, there can be
2550other issues to consider when running concurrent netperf tests.
2551
2552For example, when running concurrent tests over multiple interfaces,
2553one is not always assured that the traffic one thinks went over a
2554given interface actually did so.  In particular, the Linux networking
2555stack takes a particularly strong stance on its following the so
2556called @samp{weak end system model}.  As such, it is willing to answer
2557ARP requests for any of its local IP addresses on any of its
2558interfaces.  If multiple interfaces are connected to the same
2559broadcast domain, then even if they are configured into separate IP
2560subnets there is no a priori way of knowing which interface was
2561actually used for which connection(s).  This can be addressed by
2562setting the @samp{arp_ignore} sysctl before configuring interfaces.
2563
2564As it is quite important, we will repeat that it is very important to
2565remember that each concurrent netperf instance is calculating
2566system-wide CPU utilization.  When calculating the service demand each
2567netperf assumes it is the only thing running on the system.  This
2568means that for concurrent tests the service demands reported by
2569netperf @b{will be wrong}.  One has to compute service demands for
2570concurrent tests by hand
2571
2572Running concurrent tests can also become difficult when there is no
2573one ``central'' node.  Running tests between pairs of systems may be
2574more difficult, calling for remote shell commands in the for loop
2575rather than netperf commands.  This introduces more skew error, which
2576the confidence intervals may not be able to sufficiently mitigate.
2577One possibility is to actually run three consecutive netperf tests on
2578each node - the first being a warm-up, the last being a cool-down.
2579The idea then is to ensure that the time it takes to get all the
2580netperfs started is less than the length of the first netperf command
2581in the sequence of three.  Similarly, it assumes that all ``middle''
2582netperfs will complete before the first of the ``last'' netperfs
2583complete.
2584
2585@node  Using --enable-burst, Using --enable-demo, Running Concurrent Netperf Tests, Using Netperf to Measure Aggregate Performance
2586@comment  node-name,  next,  previous,  up
2587@section Using - -enable-burst
2588
2589Starting in version 2.5.0 @code{--enable-burst=yes} is the default,
2590which means one no longer must:
2591
2592@example
2593configure --enable-burst
2594@end example
2595
2596To have burst-mode functionality present in netperf.  This enables a
2597test-specific @option{-b num} option in @ref{TCP_RR,TCP_RR},
2598@ref{UDP_RR,UDP_RR} and @ref{The Omni Tests,omni} tests.
2599
2600Normally, netperf will attempt to ramp-up the number of outstanding
2601requests to @option{num} plus one transactions in flight at one time.
2602The ramp-up is to avoid transactions being smashed together into a
2603smaller number of segments when the transport's congestion window (if
2604any) is smaller at the time than what netperf wants to have
2605outstanding at one time. If, however, the user specifies a negative
2606value for @option{num} this ramp-up is bypassed and the burst of sends
2607is made without consideration of transport congestion window.
2608
2609This burst-mode is used as an alternative to or even in conjunction
2610with multiple-concurrent _RR tests and as a way to implement a
2611single-connection, bidirectional bulk-transfer test.  When run with
2612just a single instance of netperf, increasing the burst size can
2613determine the maximum number of transactions per second which can be
2614serviced by a single process:
2615
2616@example
2617for b in 0 1 2 4 8 16 32
2618do 
2619 netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
2620done
2621
26229457.59 -b 0
26239975.37 -b 1
262410000.61 -b 2
262520084.47 -b 4
262629965.31 -b 8
262771929.27 -b 16
2628109718.17 -b 32
2629@end example
2630
2631The global @option{-v} and @option{-P} options were used to minimize
2632the output to the single figure of merit which in this case the
2633transaction rate.  The global @code{-B} option was used to more
2634clearly label the output, and the test-specific @option{-b} option
2635enabled by @code{--enable-burst} increase the number of transactions
2636in flight at one time.
2637
2638Now, since the test-specific @option{-D} option was not specified to
2639set TCP_NODELAY, the stack was free to ``bundle'' requests and/or
2640responses into TCP segments as it saw fit, and since the default
2641request and response size is one byte, there could have been some
2642considerable bundling even in the absence of transport congestion
2643window issues.  If one wants to try to achieve a closer to
2644one-to-one correspondence between a request and response and a TCP
2645segment, add the test-specific @option{-D} option:
2646
2647@example
2648for b in 0 1 2 4 8 16 32
2649do
2650 netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
2651done
2652
2653 8695.12 -b 0 -D
2654 19966.48 -b 1 -D
2655 20691.07 -b 2 -D
2656 49893.58 -b 4 -D
2657 62057.31 -b 8 -D
2658 108416.88 -b 16 -D
2659 114411.66 -b 32 -D
2660@end example
2661
2662You can see that this has a rather large effect on the reported
2663transaction rate.  In this particular instance, the author believes it
2664relates to interactions between the test and interrupt coalescing
2665settings in the driver for the NICs used.
2666
2667@quotation
2668@b{NOTE: Even if you set the @option{-D} option that is still not a
2669guarantee that each transaction is in its own TCP segments.  You
2670should get into the habit of verifying the relationship between the
2671transaction rate and the packet rate via other means.}
2672@end quotation
2673
2674You can also combine @code{--enable-burst} functionality with
2675concurrent netperf tests.  This would then be an ``aggregate of
2676aggregates'' if you like:
2677
2678@example
2679
2680for i in 1 2 3 4
2681do
2682 netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
2683done
2684
2685 46668.38 aggregate 4 -b 8 -D
2686 44890.64 aggregate 2 -b 8 -D
2687 45702.04 aggregate 1 -b 8 -D
2688 46352.48 aggregate 3 -b 8 -D
2689
2690@end example
2691
2692Since each netperf did hit the confidence intervals, we can be
2693reasonably certain that the aggregate transaction per second rate was
2694the sum of all four concurrent tests, or something just shy of 184,000
2695transactions per second.  To get some idea if that was also the packet
2696per second rate, we could bracket that @code{for} loop with something
2697to gather statistics and run the results through
2698@uref{ftp://ftp.cup.hp.com/dist/networking/tools,beforeafter}:
2699
2700@example
2701/usr/sbin/ethtool -S eth2 > before
2702for i in 1 2 3 4
2703do
2704 netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
2705done
2706wait
2707/usr/sbin/ethtool -S eth2 > after
2708
2709 52312.62 aggregate 2 -b 8 -D
2710 50105.65 aggregate 4 -b 8 -D
2711 50890.82 aggregate 1 -b 8 -D
2712 50869.20 aggregate 3 -b 8 -D
2713
2714beforeafter before after > delta
2715
2716grep packets delta
2717     rx_packets: 12251544
2718     tx_packets: 12251550
2719
2720@end example
2721
2722This example uses @code{ethtool} because the system being used is
2723running Linux.  Other platforms have other tools - for example HP-UX
2724has lanadmin:
2725
2726@example
2727lanadmin -g mibstats <ppa>
2728@end example
2729
2730and of course one could instead use @code{netstat}.
2731
2732The @code{wait} is important because we are launching concurrent
2733netperfs in the background.  Without it, the second ethtool command
2734would be run before the tests finished and perhaps even before the
2735last of them got started!
2736
2737The sum of the reported transaction rates is 204178 over 60 seconds,
2738which is a total of 12250680 transactions.  Each transaction is the
2739exchange of a request and a response, so we multiply that by 2 to
2740arrive at 24501360.
2741
2742The sum of the ethtool stats is 24503094 packets which matches what
2743netperf was reporting very well. 
2744
2745Had the request or response size differed, we would need to know how
2746it compared with the @dfn{MSS} for the connection.
2747
2748Just for grins, here is the exercise repeated, using @code{netstat}
2749instead of @code{ethtool}
2750
2751@example
2752netstat -s -t > before
2753for i in 1 2 3 4
2754do
2755 netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D & done
2756wait
2757netstat -s -t > after
2758
2759 51305.88 aggregate 4 -b 8 -D
2760 51847.73 aggregate 2 -b 8 -D
2761 50648.19 aggregate 3 -b 8 -D
2762 53605.86 aggregate 1 -b 8 -D
2763
2764beforeafter before after > delta
2765
2766grep segments delta
2767    12445708 segments received
2768    12445730 segments send out
2769    1 segments retransmited
2770    0 bad segments received.
2771@end example
2772
2773The sums are left as an exercise to the reader :)
2774
2775Things become considerably more complicated if there are non-trvial
2776packet losses and/or retransmissions.
2777
2778Of course all this checking is unnecessary if the test is a UDP_RR
2779test because UDP ``never'' aggregates multiple sends into the same UDP
2780datagram, and there are no ACKnowledgements in UDP.  The loss of a
2781single request or response will not bring a ``burst'' UDP_RR test to a
2782screeching halt, but it will reduce the number of transactions
2783outstanding at any one time.  A ``burst'' UDP_RR test @b{will} come to a
2784halt if the sum of the lost requests and responses reaches the value
2785specified in the test-specific @option{-b} option.
2786
2787@node Using --enable-demo,  , Using --enable-burst, Using Netperf to Measure Aggregate Performance
2788@section Using - -enable-demo
2789
2790One can
2791@example
2792configure --enable-demo
2793@end example
2794and compile netperf to enable netperf to emit ``interim results'' at
2795semi-regular intervals.  This enables a global @code{-D} option which
2796takes a reporting interval as an argument.  With that specified, the
2797output of netperf will then look something like
2798
2799@example
2800$ src/netperf -D 1.25
2801MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
2802Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
2803Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
2804Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
2805Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
2806Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
2807Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
2808Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
2809Recv   Send    Send                          
2810Socket Socket  Message  Elapsed              
2811Size   Size    Size     Time     Throughput  
2812bytes  bytes   bytes    secs.    10^6bits/sec  
2813
2814 87380  16384  16384    10.00    25375.66   
2815@end example
2816The units of the ``Interim result'' lines will follow the units
2817selected via the global @code{-f} option.  If the test-specific
2818@code{-o} option is specified on the command line, the format will be
2819CSV:
2820@example
2821...
28222978.81,MBytes/s,1.25,1327962298.035
2823...
2824@end example
2825If the test-specific @code{-k} option is used the format will be
2826keyval with each keyval being given an index:
2827@example
2828...
2829NETPERF_INTERIM_RESULT[2]=25.00
2830NETPERF_UNITS[2]=10^9bits/s
2831NETPERF_INTERVAL[2]=1.25
2832NETPERF_ENDING[2]=1327962357.249
2833...
2834@end example
2835The expectation is it may be easier to utilize the keyvals if they
2836have indices.
2837
2838But how does this help with aggregate tests?  Well, what one can do is
2839start the netperfs via a script, giving each a Very Long (tm) run
2840time.  Direct the output to a file per instance.  Then, once all the
2841netperfs have been started, take a timestamp and wait for some desired
2842test interval.  Once that interval expires take another timestamp and
2843then start terminating the netperfs by sending them a SIGALRM signal
2844via the likes of the @code{kill} or @code{pkill} command.  The
2845netperfs will terminate and emit the rest of the ``usual'' output, and
2846you can then bring the files to a central location for post
2847processing to find the aggregate performance over the ``test interval.''  
2848
2849This method has the advantage that it does not require advance
2850knowledge of how long it takes to get netperf tests started and/or
2851stopped.  It does though require sufficiently synchronized clocks on
2852all the test systems.
2853
2854While calls to get the current time can be inexpensive, that neither
2855has been nor is universally true.  For that reason netperf tries to
2856minimize the number of such ``timestamping'' calls (eg
2857@code{gettimeofday}) calls it makes when in demo mode.  Rather than
2858take a timestamp after each @code{send} or @code{recv} call completes
2859netperf tries to guess how many units of work will be performed over
2860the desired interval.  Only once that many units of work have been
2861completed will netperf check the time.  If the reporting interval has
2862passed, netperf will emit an ``interim result.''  If the interval has
2863not passed, netperf will update its estimate for units and continue.
2864
2865After a bit of thought one can see that if things ``speed-up'' netperf
2866will still honor the interval.  However, if things ``slow-down''
2867netperf may be late with an ``interim result.''  Here is an example of
2868both of those happening during a test - with the interval being
2869honored while throughput increases, and then about half-way through
2870when another netperf (not shown) is started we see things slowing down
2871and netperf not hitting the interval as desired.
2872@example
2873$ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
2874MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
2875Interim result:   36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
2876Interim result:   59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
2877Interim result:   73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
2878Interim result:   84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
2879Interim result:   75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
2880Interim result:   55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
2881Interim result:   70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
2882Interim result:   80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
2883Interim result:   86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
2884Recv   Send    Send                          
2885Socket Socket  Message  Elapsed              
2886Size   Size    Size     Time     Throughput  
2887bytes  bytes   bytes    secs.    10^6bits/sec  
2888
2889 87380  16384  16384    20.34      68.87   
2890@end example
2891So long as your post-processing mechanism can account for that, there
2892should be no problem.  As time passes there may be changes to try to
2893improve the netperf's honoring the interval but one should not
2894ass-u-me it will always do so.  One should not assume the precision
2895will remain fixed - future versions may change it - perhaps going
2896beyond tenths of seconds in reporting the interval length etc.
2897
2898@node Using Netperf to Measure Bidirectional Transfer, The Omni Tests, Using Netperf to Measure Aggregate Performance, Top
2899@comment  node-name,  next,  previous,  up
2900@chapter Using Netperf to Measure Bidirectional Transfer
2901
2902There are two ways to use netperf to measure the performance of
2903bidirectional transfer.  The first is to run concurrent netperf tests
2904from the command line.  The second is to configure netperf with
2905@code{--enable-burst} and use a single instance of the
2906@ref{TCP_RR,TCP_RR} test.
2907
2908While neither method is more ``correct'' than the other, each is doing
2909so in different ways, and that has possible implications.  For
2910instance, using the concurrent netperf test mechanism means that
2911multiple TCP connections and multiple processes are involved, whereas
2912using the single instance of TCP_RR there is only one TCP connection
2913and one process on each end.  They may behave differently, especially
2914on an MP system.
2915
2916@menu
2917* Bidirectional Transfer with Concurrent Tests::  
2918* Bidirectional Transfer with TCP_RR::  
2919* Implications of Concurrent Tests vs Burst Request/Response::  
2920@end menu
2921
2922@node  Bidirectional Transfer with Concurrent Tests, Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Bidirectional Transfer
2923@comment  node-name,  next,  previous,  up
2924@section Bidirectional Transfer with Concurrent Tests
2925
2926If we had two hosts Fred and Ethel, we could simply run a netperf
2927@ref{TCP_STREAM,TCP_STREAM} test on Fred pointing at Ethel, and a
2928concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but
2929since there are no mechanisms to synchronize netperf tests and we
2930would be starting tests from two different systems, there is a
2931considerable risk of skew error.
2932
2933Far better would be to run simultaneous TCP_STREAM and
2934@ref{TCP_MAERTS,TCP_MAERTS} tests from just @b{one} system, using the
2935concepts and procedures outlined in @ref{Running Concurrent Netperf
2936Tests,Running Concurrent Netperf Tests}. Here then is an example:
2937
2938@example
2939for i in 1
2940do
2941 netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
2942   -- -s 256K -S 256K &
2943 netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound"  -i 10 -P 0 -v 0 \
2944   -- -s 256K -S 256K &
2945done
2946
2947 892.66 outbound
2948 891.34 inbound
2949@end example
2950
2951We have used a @code{for} loop in the shell with just one iteration
2952because that will be @b{much} easier to get both tests started at more or
2953less the same time than doing it by hand.  The global @option{-P} and
2954@option{-v} options are used because we aren't interested in anything
2955other than the throughput, and the global @option{-B} option is used
2956to tag each output so we know which was inbound and which outbound
2957relative to the system on which we were running netperf.  Of course
2958that sense is switched on the system running netserver :)  The use of
2959the global @option{-i} option is explained in @ref{Running Concurrent
2960Netperf Tests,Running Concurrent Netperf Tests}.
2961
2962Beginning with version 2.5.0 we can accomplish a similar result with
2963the @ref{The Omni Tests,the omni tests} and @ref{Omni Output
2964Selectors,output selectors}:
2965
2966@example
2967for i in 1
2968do
2969  netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
2970    -d stream -s 256K -S 256K -o throughput,direction &
2971  netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
2972    -d maerts -s 256K -S 256K -o throughput,direction &
2973done
2974
2975805.26,Receive
2976828.54,Send
2977@end example
2978
2979@node  Bidirectional Transfer with TCP_RR, Implications of Concurrent Tests vs Burst Request/Response, Bidirectional Transfer with Concurrent Tests, Using Netperf to Measure Bidirectional Transfer
2980@comment  node-name,  next,  previous,  up
2981@section Bidirectional Transfer with TCP_RR
2982
2983Starting with version 2.5.0 the @code{--enable-burst} configure option
2984defaults to @code{yes}, and starting some time before version 2.5.0
2985but after 2.4.0 the global @option{-f} option would affect the
2986``throughput'' reported by request/response tests.  If one uses the
2987test-specific @option{-b} option to have several ``transactions'' in
2988flight at one time and the test-specific @option{-r} option to
2989increase their size, the test looks more and more like a
2990single-connection bidirectional transfer than a simple
2991request/response test.
2992
2993So, putting it all together one can do something like:
2994
2995@example
2996netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -S 256K -S 256K
2997MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
2998Local /Remote
2999Socket Size   Request  Resp.   Elapsed  
3000Send   Recv   Size     Size    Time     Throughput 
3001bytes  Bytes  bytes    bytes   secs.    10^6bits/sec   
3002
300316384  87380  32768    32768   10.00    1821.30   
3004524288 524288
3005Alignment      Offset         RoundTrip  Trans    Throughput
3006Local  Remote  Local  Remote  Latency    Rate     10^6bits/s
3007Send   Recv    Send   Recv    usec/Tran  per sec  Outbound   Inbound
3008    8      0       0      0   2015.402   3473.252 910.492    910.492
3009@end example
3010
3011to get a bidirectional bulk-throughput result. As one can see, the -v
30122 output will include a number of interesting, related values.
3013
3014@quotation
3015@b{NOTE: The logic behind @code{--enable-burst} is very simple, and there
3016are no calls to @code{poll()} or @code{select()} which means we want
3017to make sure that the @code{send()} calls will never block, or we run
3018the risk of deadlock with each side stuck trying to call @code{send()}
3019and neither calling @code{recv()}.}
3020@end quotation
3021
3022Fortunately, this is easily accomplished by setting a ``large enough''
3023socket buffer size with the test-specific @option{-s} and @option{-S}
3024options.  Presently this must be performed by the user.  Future
3025versions of netperf might attempt to do this automagically, but there
3026are some issues to be worked-out. 
3027
3028@node Implications of Concurrent Tests vs Burst Request/Response,  , Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer
3029@section Implications of Concurrent Tests vs Burst Request/Response
3030
3031There are perhaps subtle but important differences between using
3032concurrent unidirectional tests vs a burst-mode request to measure
3033bidirectional performance.
3034
3035Broadly speaking, a single ``connection'' or ``flow'' of traffic
3036cannot make use of the services of more than one or two CPUs at either
3037end.  Whether one or two CPUs will be used processing a flow will
3038depend on the specifics of the stack(s) involved and whether or not
3039the global @option{-T} option has been used to bind netperf/netserver
3040to specific CPUs.
3041
3042When using concurrent tests there will be two concurrent connections
3043or flows, which means that upwards of four CPUs will be employed
3044processing the packets (global @option{-T} used, no more than two if
3045not), however, with just a single, bidirectional request/response test
3046no more than two CPUs will be employed (only one if the global
3047@option{-T} is not used).
3048
3049If there is a CPU bottleneck on either system this may result in
3050rather different results between the two methods.
3051
3052Also, with a bidirectional request/response test there is something of
3053a natural balance or synchronization between inbound and outbound - a
3054response will not be sent until a request is received, and (once the
3055burst level is reached) a subsequent request will not be sent until a
3056response is received.  This may mask favoritism in the NIC between
3057inbound and outbound processing.
3058
3059With two concurrent unidirectional tests there is no such
3060synchronization or balance and any favoritism in the NIC may be exposed.
3061
3062@node The Omni Tests, Other Netperf Tests, Using Netperf to Measure Bidirectional Transfer, Top
3063@chapter The Omni Tests
3064
3065Beginning with version 2.5.0, netperf begins a migration to the
3066@samp{omni} tests or ``Two routines to measure them all.''  The code for
3067the omni tests can be found in @file{src/nettest_omni.c} and the goal
3068is to make it easier for netperf to support multiple protocols and
3069report a great many additional things about the systems under test.
3070Additionally, a flexible output selection mechanism is present which
3071allows the user to chose specifically what values she wishes to have
3072reported and in what format.
3073
3074The omni tests are included by default in version 2.5.0.  To disable
3075them, one must:
3076@example
3077./configure --enable-omni=no ...
3078@end example
3079
3080and remake netperf.  Remaking netserver is optional because even in
30812.5.0 it has ``unmigrated'' netserver side routines for the classic
3082(eg @file{src/nettest_bsd.c}) tests.
3083
3084@menu
3085* Native Omni Tests::           
3086* Migrated Tests::              
3087* Omni Output Selection::       
3088@end menu
3089
3090@node Native Omni Tests, Migrated Tests, The Omni Tests, The Omni Tests
3091@section Native Omni Tests
3092
3093One access the omni tests ``natively'' by using a value of ``OMNI''
3094with the global @option{-t} test-selection option.  This will then
3095cause netperf to use the code in @file{src/nettest_omni.c} and in
3096particular the test-specific options parser for the omni tests.  The
3097test-specific options for the omni tests are a superset of those for
3098``classic'' tests.  The options added by the omni tests are:
3099
3100@table @code
3101@vindex -c, Test-specific
3102@item -c
3103This explicitly declares that the test is to include connection
3104establishment and tear-down as in either a TCP_CRR or TCP_CC test.
3105
3106@vindex -d, Test-specific
3107@item -d <direction>
3108This option sets the direction of the test relative to the netperf
3109process.  As of version 2.5.0 one can use the following in a
3110case-insensitive manner:
3111
3112@table @code
3113@item send, stream, transmit, xmit or 2 
3114Any of which will cause netperf to send to the netserver.
3115@item recv, receive, maerts or 4
3116Any of which will cause netserver to send to netperf.
3117@item rr or 6
3118Either of which will cause a request/response test.
3119@end table
3120
3121Additionally, one can specify two directions separated by a '|'
3122character and they will be OR'ed together.  In this way one can use
3123the ''Send|Recv'' that will be emitted by the @ref{Omni Output
3124Selectors,DIRECTION} @ref{Omni Output Selection,output selector} when
3125used with a request/response test.
3126
3127@vindex -k, Test-specific
3128@item -k [@ref{Omni Output Selection,output selector}]
3129This option sets the style of output to ``keyval'' where each line of
3130output has the form:
3131@example
3132key=value
3133@end example
3134For example:
3135@example
3136$ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
3137OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3138THROUGHPUT=59092.65
3139THROUGHPUT_UNITS=Trans/s
3140@end example
3141
3142Using the @option{-k} option will override any previous, test-specific
3143@option{-o} or @option{-O} option.
3144
3145@vindex -o, Test-specific
3146@item -o [@ref{Omni Output Selection,output selector}]
3147This option sets the style of output to ``CSV'' where there will be
3148one line of comma-separated values, preceded by one line of column
3149names unless the global @option{-P} option is used with a value of 0:
3150@example
3151$ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
3152OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3153Throughput,Throughput Units
315460999.07,Trans/s
3155@end example
3156
3157Using the @option{-o} option will override any previous, test-specific
3158@option{-k} or @option{-O} option.
3159
3160@vindex -O, Test-specific
3161@item -O [@ref{Omni Output Selection,output selector}]
3162This option sets the style of output to ``human readable'' which will
3163look quite similar to classic netperf output:
3164@example
3165$ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
3166OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3167Throughput Throughput 
3168           Units      
3169                      
3170                      
317160492.57   Trans/s
3172@end example
3173
3174Using the @option{-O} option will override any previous, test-specific
3175@option{-k} or @option{-o} option.
3176
3177@vindex -t, Test-specific
3178@item -t
3179This option explicitly sets the socket type for the test's data
3180connection. As of version 2.5.0 the known socket types include
3181``stream'' and ``dgram'' for SOCK_STREAM and SOCK_DGRAM respectively.
3182
3183@vindex -T, Test-specific
3184@item -T <protocol>
3185This option is used to explicitly set the protocol used for the
3186test. It is case-insensitive. As of version 2.5.0 the protocols known
3187to netperf include:
3188@table @code
3189@item TCP
3190Select the Transmission Control Protocol
3191@item UDP
3192Select the User Datagram Protocol
3193@item SDP
3194Select the Sockets Direct Protocol
3195@item DCCP
3196Select the Datagram Congestion Control Protocol
3197@item SCTP
3198Select the Stream Control Transport Protocol
3199@item udplite
3200Select UDP Lite
3201@end table
3202
3203The default is implicit based on other settings.
3204@end table
3205
3206The omni tests also extend the interpretation of some of the classic,
3207test-specific options for the BSD Sockets tests:
3208
3209@table @code
3210@item -m <optionspec>
3211This can set the send size for either or both of the netperf and
3212netserver sides of the test:
3213@example
3214-m 32K
3215@end example
3216sets only the netperf-side send size to 32768 bytes, and or's-in
3217transmit for the direction. This is effectively the same behaviour as
3218for the classic tests.
3219@example
3220-m ,32K
3221@end example
3222sets only the netserver side send size to 32768 bytes and or's-in
3223receive for the direction.
3224@example
3225-m 16K,32K
3226sets the netperf side send size to 16284 bytes, the netserver side
3227send size to 32768 bytes and the direction will be "Send|Recv."
3228@end example
3229@item -M <optionspec>
3230This can set the receive size for either or both of the netperf and
3231netserver sides of the test:
3232@example
3233-M 32K
3234@end example
3235sets only the netserver side receive size to 32768 bytes and or's-in
3236send for the test direction.
3237@example
3238-M ,32K
3239@end example
3240sets only the netperf side receive size to 32768 bytes and or's-in
3241receive for the test direction.
3242@example
3243-M 16K,32K
3244@end example
3245sets the netserver side receive size to 16384 bytes and the netperf
3246side receive size to 32768 bytes and the direction will be "Send|Recv."
3247@end table
3248
3249@node Migrated Tests, Omni Output Selection, Native Omni Tests, The Omni Tests
3250@section Migrated Tests
3251
3252As of version 2.5.0 several tests have been migrated to use the omni
3253code in @file{src/nettest_omni.c} for the core of their testing.  A
3254migrated test retains all its previous output code and so should still
3255``look and feel'' just like a pre-2.5.0 test with one exception - the
3256first line of the test banners will include the word ``MIGRATED'' at
3257the beginning as in:
3258
3259@example
3260$ netperf
3261MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3262Recv   Send    Send                          
3263Socket Socket  Message  Elapsed              
3264Size   Size    Size     Time     Throughput  
3265bytes  bytes   bytes    secs.    10^6bits/sec  
3266
3267 87380  16384  16384    10.00    27175.27   
3268@end example
3269
3270The tests migrated in version 2.5.0 are:
3271@itemize
3272@item TCP_STREAM
3273@item TCP_MAERTS
3274@item TCP_RR
3275@item TCP_CRR
3276@item UDP_STREAM
3277@item UDP_RR
3278@end itemize
3279
3280It is expected that future releases will have additional tests
3281migrated to use the ``omni'' functionality.
3282
3283If one uses ``omni-specific'' test-specific options in conjunction
3284with a migrated test, instead of using the classic output code, the
3285new omni output code will be used. For example if one uses the
3286@option{-k} test-specific option with a value of
3287``MIN_LATENCY,MAX_LATENCY'' with a migrated TCP_RR test one will see:
3288
3289@example
3290$ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
3291MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3292THROUGHPUT=60074.74
3293THROUGHPUT_UNITS=Trans/s
3294@end example
3295rather than:
3296@example
3297$ netperf -t tcp_rr
3298MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3299Local /Remote
3300Socket Size   Request  Resp.   Elapsed  Trans.
3301Send   Recv   Size     Size    Time     Rate         
3302bytes  Bytes  bytes    bytes   secs.    per sec   
3303
330416384  87380  1        1       10.00    59421.52   
330516384  87380 
3306@end example
3307
3308@node Omni Output Selection,  , Migrated Tests, The Omni Tests
3309@section Omni Output Selection
3310
3311The omni test-specific @option{-k}, @option{-o} and @option{-O}
3312options take an optional @code{output selector} by which the user can
3313configure what values are reported.  The output selector can take
3314several forms:
3315
3316@table @code
3317@item @file{filename}
3318The output selections will be read from the named file. Within the
3319file there can be up to four lines of comma-separated output
3320selectors. This controls how many multi-line blocks of output are emitted
3321when the @option{-O} option is used.  This output, while not identical to
3322``classic'' netperf output, is inspired by it.  Multiple lines have no
3323effect for @option{-k} and @option{-o} options.  Putting output
3324selections in a file can be useful when the list of selections is long.
3325@item comma and/or semi-colon-separated list
3326The output selections will be parsed from a comma and/or
3327semi-colon-separated list of output selectors. When the list is given
3328to a @option{-O} option a semi-colon specifies a new output block
3329should be started.  Semi-colons have the same meaning as commas when
3330used with the @option{-k} or @option{-o} options.  Depending on the
3331command interpreter being used, the semi-colon may have to be escaped
3332somehow to keep it from being interpreted by the command interpreter.
3333This can often be done by enclosing the entire list in quotes.
3334@item all
3335If the keyword @b{all} is specified it means that all known output
3336values should be displayed at the end of the test.  This can be a
3337great deal of output.  As of version 2.5.0 there are 157 different
3338output selectors.
3339@item ?
3340If a ``?'' is given as the output selection, the list of all known
3341output selectors will be displayed and no test actually run.  When
3342passed to the @option{-O} option they will be listed one per
3343line. Otherwise they will be listed as a comma-separated list.  It may
3344be necessary to protect the ``?'' from the command interpreter by
3345escaping it or enclosing it in quotes.
3346@item no selector
3347If nothing is given to the @option{-k}, @option{-o} or @option{-O}
3348option then the code selects a default set of output selectors
3349inspired by classic netperf output. The format will be the @samp{human
3350readable} format emitted by the test-specific @option{-O} option.
3351@end table
3352
3353The order of evaluation will first check for an output selection.  If
3354none is specified with the @option{-k}, @option{-o} or @option{-O}
3355option netperf will select a default based on the characteristics of the
3356test.  If there is an output selection, the code will first check for
3357@samp{?}, then check to see if it is the magic @samp{all} keyword.
3358After that it will check for either @samp{,} or @samp{;} in the
3359selection and take that to mean it is a comma and/or
3360semi-colon-separated list. If none of those checks match, netperf will then
3361assume the output specification is a filename and attempt to open and
3362parse the file.
3363
3364@menu
3365* Omni Output Selectors::       
3366@end menu
3367
3368@node Omni Output Selectors,  , Omni Output Selection, Omni Output Selection
3369@subsection Omni Output Selectors
3370
3371As of version 2.5.0 the output selectors are:
3372
3373@table @code
3374@item OUTPUT_NONE
3375This is essentially a null output.  For @option{-k} output it will
3376simply add a line that reads ``OUTPUT_NONE='' to the output. For
3377@option{-o} it will cause an empty ``column'' to be included. For
3378@option{-O} output it will cause extra spaces to separate ``real'' output.
3379@item SOCKET_TYPE
3380This will cause the socket type (eg SOCK_STREAM, SOCK_DGRAM) for the
3381data connection to be output.
3382@item PROTOCOL
3383This will cause the protocol used for the data connection to be displayed.
3384@item DIRECTION
3385This will display the data flow direction relative to the netperf
3386process. Units: Send or Recv for a unidirectional bulk-transfer test,
3387or Send|Recv for a request/response test.
3388@item ELAPSED_TIME
3389This will display the elapsed time in seconds for the test.
3390@item THROUGHPUT
3391This will display the throughput for the test. Units: As requested via
3392the global @option{-f} option and displayed by the THROUGHPUT_UNITS
3393output selector.
3394@item THROUGHPUT_UNITS
3395This will display the units for what is displayed by the
3396@code{THROUGHPUT} output selector.
3397@item LSS_SIZE_REQ
3398This will display the local (netperf) send socket buffer size (aka
3399SO_SNDBUF) requested via the command line. Units: Bytes.
3400@item LSS_SIZE
3401This will display the local (netperf) send socket buffer size
3402(SO_SNDBUF) immediately after the data connection socket was created.
3403Peculiarities of different networking stacks may lead to this
3404differing from the size requested via the command line. Units: Bytes.
3405@item LSS_SIZE_END
3406This will display the local (netperf) send socket buffer size
3407(SO_SNDBUF) immediately before the data connection socket is closed.
3408Peculiarities of different networking stacks may lead this to differ
3409from the size requested via the command line and/or the size
3410immediately after the data connection socket was created. Units: Bytes.
3411@item LSR_SIZE_REQ
3412This will display the local (netperf) receive socket buffer size (aka
3413SO_RCVBUF) requested via the command line. Units: Bytes.
3414@item LSR_SIZE
3415This will display the local (netperf) receive socket buffer size
3416(SO_RCVBUF) immediately after the data connection socket was created.
3417Peculiarities of different networking stacks may lead to this
3418differing from the size requested via the command line. Units: Bytes.
3419@item LSR_SIZE_END
3420This will display the local (netperf) receive socket buffer size
3421(SO_RCVBUF) immediately before the data connection socket is closed.
3422Peculiarities of different networking stacks may lead this to differ
3423from the size requested via the command line and/or the size
3424immediately after the data connection socket was created. Units: Bytes.
3425@item RSS_SIZE_REQ
3426This will display the remote (netserver) send socket buffer size (aka
3427SO_SNDBUF) requested via the command line. Units: Bytes.
3428@item RSS_SIZE
3429This will display the remote (netserver) send socket buffer size
3430(SO_SNDBUF) immediately after the data connection socket was created.
3431Peculiarities of different networking stacks may lead to this
3432differing from the size requested via the command line. Units: Bytes.
3433@item RSS_SIZE_END
3434This will display the remote (netserver) send socket buffer size
3435(SO_SNDBUF) immediately before the data connection socket is closed.
3436Peculiarities of different networking stacks may lead this to differ
3437from the size requested via the command line and/or the size
3438immediately after the data connection socket was created. Units: Bytes.
3439@item RSR_SIZE_REQ
3440This will display the remote (netserver) receive socket buffer size (aka
3441SO_RCVBUF) requested via the command line. Units: Bytes.
3442@item RSR_SIZE
3443This will display the remote (netserver) receive socket buffer size
3444(SO_RCVBUF) immediately after the data connection socket was created.
3445Peculiarities of different networking stacks may lead to this
3446differing from the size requested via the command line. Units: Bytes.
3447@item RSR_SIZE_END
3448This will display the remote (netserver) receive socket buffer size
3449(SO_RCVBUF) immediately before the data connection socket is closed.
3450Peculiarities of different networking stacks may lead this to differ
3451from the size requested via the command line and/or the size
3452immediately after the data connection socket was created. Units: Bytes.
3453@item LOCAL_SEND_SIZE
3454This will display the size of the buffers netperf passed in any
3455``send'' calls it made on the data connection for a
3456non-request/response test. Units: Bytes.
3457@item LOCAL_RECV_SIZE
3458This will display the size of the buffers netperf passed in any
3459``receive'' calls it made on the data connection for a
3460non-request/response test. Units: Bytes.
3461@item REMOTE_SEND_SIZE
3462This will display the size of the buffers netserver passed in any
3463``send'' calls it made on the data connection for a
3464non-request/response test. Units: Bytes.
3465@item REMOTE_RECV_SIZE
3466This will display the size of the buffers netserver passed in any
3467``receive'' calls it made on the data connection for a
3468non-request/response test. Units: Bytes.
3469@item REQUEST_SIZE
3470This will display the size of the requests netperf sent in a
3471request-response test. Units: Bytes.
3472@item RESPONSE_SIZE
3473This will display the size of the responses netserver sent in a
3474request-response test. Units: Bytes.
3475@item LOCAL_CPU_UTIL
3476This will display the overall CPU utilization during the test as
3477measured by netperf. Units: 0 to 100 percent.
3478@item LOCAL_CPU_PERCENT_USER
3479This will display the CPU fraction spent in user mode during the test
3480as measured by netperf. Only supported by netcpu_procstat. Units: 0 to
3481100 percent.
3482@item LOCAL_CPU_PERCENT_SYSTEM
3483This will display the CPU fraction spent in system mode during the test
3484as measured by netperf. Only supported by netcpu_procstat. Units: 0 to
3485100 percent.
3486@item LOCAL_CPU_PERCENT_IOWAIT
3487This will display the fraction of time waiting for I/O to complete
3488during the test as measured by netperf. Only supported by
3489netcpu_procstat. Units: 0 to 100 percent.
3490@item LOCAL_CPU_PERCENT_IRQ
3491This will display the fraction of time servicing interrupts during the
3492test as measured by netperf. Only supported by netcpu_procstat. Units:
34930 to 100 percent.
3494@item LOCAL_CPU_PERCENT_SWINTR
3495This will display the fraction of time servicing softirqs during the
3496test as measured by netperf. Only supported by netcpu_procstat. Units:
34970 to 100 percent.
3498@item LOCAL_CPU_METHOD
3499This will display the method used by netperf to measure CPU
3500utilization. Units: single character denoting method.
3501@item LOCAL_SD
3502This will display the service demand, or units of CPU consumed per
3503unit of work, as measured by netperf. Units: microseconds of CPU
3504consumed per either KB (K==1024) of data transferred or request/response
3505transaction. 
3506@item REMOTE_CPU_UTIL
3507This will display the overall CPU utilization during the test as
3508measured by netserver. Units 0 to 100 percent.
3509@item REMOTE_CPU_PERCENT_USER
3510This will display the CPU fraction spent in user mode during the test
3511as measured by netserver. Only supported by netcpu_procstat. Units: 0 to
3512100 percent.
3513@item REMOTE_CPU_PERCENT_SYSTEM
3514This will display the CPU fraction spent in system mode during the test
3515as measured by netserver. Only supported by netcpu_procstat. Units: 0 to
3516100 percent.
3517@item REMOTE_CPU_PERCENT_IOWAIT
3518This will display the fraction of time waiting for I/O to complete
3519during the test as measured by netserver. Only supported by
3520netcpu_procstat. Units: 0 to 100 percent.
3521@item REMOTE_CPU_PERCENT_IRQ
3522This will display the fraction of time servicing interrupts during the
3523test as measured by netserver. Only supported by netcpu_procstat. Units:
35240 to 100 percent.
3525@item REMOTE_CPU_PERCENT_SWINTR
3526This will display the fraction of time servicing softirqs during the
3527test as measured by netserver. Only supported by netcpu_procstat. Units:
35280 to 100 percent.
3529@item REMOTE_CPU_METHOD
3530This will display the method used by netserver to measure CPU
3531utilization. Units: single character denoting method.
3532@item REMOTE_SD
3533This will display the service demand, or units of CPU consumed per
3534unit of work, as measured by netserver. Units: microseconds of CPU
3535consumed per either KB (K==1024) of data transferred or
3536request/response transaction.
3537@item SD_UNITS
3538This will display the units for LOCAL_SD and REMOTE_SD
3539@item CONFIDENCE_LEVEL
3540This will display the confidence level requested by the user either
3541explicitly via the global @option{-I} option, or implicitly via the
3542global @option{-i} option.  The value will be either 95 or 99 if
3543confidence intervals have been requested or 0 if they were not. Units:
3544Percent
3545@item CONFIDENCE_INTERVAL
3546This will display the width of the confidence interval requested
3547either explicitly via the global @option{-I} option or implicitly via
3548the global @option{-i} option.  Units: Width in percent of mean value
3549computed. A value of -1.0 means that confidence intervals were not requested.
3550@item CONFIDENCE_ITERATION
3551This will display the number of test iterations netperf undertook,
3552perhaps while attempting to achieve the requested confidence interval
3553and level. If confidence intervals were requested via the command line
3554then the value will be between 3 and 30.  If confidence intervals were
3555not requested the value will be 1.  Units: Iterations
3556@item THROUGHPUT_CONFID
3557This will display the width of the confidence interval actually
3558achieved for @code{THROUGHPUT} during the test.  Units: Width of
3559interval as percentage of reported throughput value.
3560@item LOCAL_CPU_CONFID
3561This will display the width of the confidence interval actually
3562achieved for overall CPU utilization on the system running netperf
3563(@code{LOCAL_CPU_UTIL}) during the test, if CPU utilization measurement
3564was enabled.  Units: Width of interval as percentage of reported CPU
3565utilization.
3566@item REMOTE_CPU_CONFID
3567This will display the width of the confidence interval actually
3568achieved for overall CPU utilization on the system running netserver
3569(@code{REMOTE_CPU_UTIL}) during the test, if CPU utilization
3570measurement was enabled. Units: Width of interval as percentage of
3571reported CPU utilization.
3572@item TRANSACTION_RATE
3573This will display the transaction rate in transactions per second for
3574a request/response test even if the user has requested a throughput in
3575units of bits or bytes per second via the global @option{-f}
3576option. It is undefined for a non-request/response test. Units:
3577Transactions per second.
3578@item RT_LATENCY
3579This will display the average round-trip latency for a
3580request/response test, accounting for number of transactions in flight
3581at one time. It is undefined for a non-request/response test. Units:
3582Microseconds per transaction
3583@item BURST_SIZE
3584This will display the ``burst size'' or added transactions in flight
3585in a request/response test as requested via a test-specific
3586@option{-b} option.  The number of transactions in flight at one time
3587will be one greater than this value.  It is undefined for a
3588non-request/response test. Units: added Transactions in flight.
3589@item LOCAL_TRANSPORT_RETRANS
3590This will display the number of retransmissions experienced on the
3591data connection during the test as determined by netperf.  A value of
3592-1 means the attempt to determine the number of retransmissions failed
3593or the concept was not valid for the given protocol or the mechanism
3594is not known for the platform. A value of -2 means it was not
3595attempted. As of version 2.5.0 the meaning of values are in flux and
3596subject to change.  Units: number of retransmissions.
3597@item REMOTE_TRANSPORT_RETRANS
3598This will display the number of retransmissions experienced on the
3599data connection during the test as determined by netserver.  A value
3600of -1 means the attempt to determine the number of retransmissions
3601failed or the concept was not valid for the given protocol or the
3602mechanism is not known for the platform. A value of -2 means it was
3603not attempted. As of version 2.5.0 the meaning of values are in flux
3604and subject to change.  Units: number of retransmissions.
3605@item TRANSPORT_MSS
3606This will display the Maximum Segment Size (aka MSS) or its equivalent
3607for the protocol being used during the test.  A value of -1 means
3608either the concept of an MSS did not apply to the protocol being used,
3609or there was an error in retrieving it. Units: Bytes.
3610@item LOCAL_SEND_THROUGHPUT
3611The throughput as measured by netperf for the successful ``send''
3612calls it made on the data connection. Units: as requested via the
3613global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
3614output selector.
3615@item LOCAL_RECV_THROUGHPUT
3616The throughput as measured by netperf for the successful ``receive''
3617calls it made on the data connection. Units: as requested via the
3618global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
3619output selector.
3620@item REMOTE_SEND_THROUGHPUT
3621The throughput as measured by netserver for the successful ``send''
3622calls it made on the data connection. Units: as requested via the
3623global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
3624output selector.
3625@item REMOTE_RECV_THROUGHPUT
3626The throughput as measured by netserver for the successful ``receive''
3627calls it made on the data connection. Units: as requested via the
3628global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
3629output selector.
3630@item LOCAL_CPU_BIND
3631The CPU to which netperf was bound, if at all, during the test. A
3632value of -1 means that netperf was not explicitly bound to a CPU
3633during the test. Units: CPU ID
3634@item LOCAL_CPU_COUNT
3635The number of CPUs (cores, threads) detected by netperf. Units: CPU count.
3636@item LOCAL_CPU_PEAK_UTIL
3637The utilization of the CPU most heavily utilized during the test, as
3638measured by netperf. This can be used to see if any one CPU of a
3639multi-CPU system was saturated even though the overall CPU utilization
3640as reported by @code{LOCAL_CPU_UTIL} was low. Units: 0 to 100% 
3641@item LOCAL_CPU_PEAK_ID
3642The id of the CPU most heavily utilized during the test as determined
3643by netperf. Units: CPU ID.
3644@item LOCAL_CPU_MODEL
3645Model information for the processor(s) present on the system running
3646netperf. Assumes all processors in the system (as perceived by
3647netperf) on which netperf is running are the same model. Units: Text
3648@item LOCAL_CPU_FREQUENCY
3649The frequency of the processor(s) on the system running netperf, at
3650the time netperf made the call.  Assumes that all processors present
3651in the system running netperf are running at the same
3652frequency. Units: MHz
3653@item REMOTE_CPU_BIND
3654The CPU to which netserver was bound, if at all, during the test. A
3655value of -1 means that netperf was not explicitly bound to a CPU
3656during the test. Units: CPU ID
3657@item REMOTE_CPU_COUNT
3658The number of CPUs (cores, threads) detected by netserver. Units: CPU
3659count.
3660@item REMOTE_CPU_PEAK_UTIL
3661The utilization of the CPU most heavily utilized during the test, as
3662measured by netserver. This can be used to see if any one CPU of a
3663multi-CPU system was saturated even though the overall CPU utilization
3664as reported by @code{REMOTE_CPU_UTIL} was low. Units: 0 to 100%
3665@item REMOTE_CPU_PEAK_ID
3666The id of the CPU most heavily utilized during the test as determined
3667by netserver. Units: CPU ID.
3668@item REMOTE_CPU_MODEL
3669Model information for the processor(s) present on the system running
3670netserver. Assumes all processors in the system (as perceived by
3671netserver) on which netserver is running are the same model. Units:
3672Text
3673@item REMOTE_CPU_FREQUENCY
3674The frequency of the processor(s) on the system running netserver, at
3675the time netserver made the call.  Assumes that all processors present
3676in the system running netserver are running at the same
3677frequency. Units: MHz
3678@item SOURCE_PORT
3679The port ID/service name to which the data socket created by netperf
3680was bound.  A value of 0 means the data socket was not explicitly
3681bound to a port number. Units: ASCII text.
3682@item SOURCE_ADDR
3683The name/address to which the data socket created by netperf was
3684bound. A value of 0.0.0.0 means the data socket was not explicitly
3685bound to an address. Units: ASCII text.
3686@item SOURCE_FAMILY
3687The address family to which the data socket created by netperf was
3688bound.  A value of 0 means the data socket was not explicitly bound to
3689a given address family. Units: ASCII text.
3690@item DEST_PORT
3691The port ID to which the data socket created by netserver was bound. A
3692value of 0 means the data socket was not explicitly bound to a port
3693number.  Units: ASCII text.
3694@item DEST_ADDR
3695The name/address of the data socket created by netserver.  Units:
3696ASCII text.
3697@item DEST_FAMILY
3698The address family to which the data socket created by netserver was
3699bound. A value of 0 means the data socket was not explicitly bound to
3700a given address family. Units: ASCII text.
3701@item LOCAL_SEND_CALLS
3702The number of successful ``send'' calls made by netperf against its
3703data socket. Units: Calls.
3704@item LOCAL_RECV_CALLS
3705The number of successful ``receive'' calls made by netperf against its
3706data socket. Units: Calls.
3707@item LOCAL_BYTES_PER_RECV
3708The average number of bytes per ``receive'' call made by netperf
3709against its data socket. Units: Bytes.
3710@item LOCAL_BYTES_PER_SEND
3711The average number of bytes per ``send'' call made by netperf against
3712its data socket. Units: Bytes.
3713@item LOCAL_BYTES_SENT
3714The number of bytes successfully sent by netperf through its data
3715socket. Units: Bytes.
3716@item LOCAL_BYTES_RECVD
3717The number of bytes successfully received by netperf through its data
3718socket. Units: Bytes.
3719@item LOCAL_BYTES_XFERD
3720The sum of bytes sent and received by netperf through its data
3721socket. Units: Bytes.
3722@item LOCAL_SEND_OFFSET
3723The offset from the alignment of the buffers passed by netperf in its
3724``send'' calls. Specified via the global @option{-o} option and
3725defaults to 0. Units: Bytes.
3726@item LOCAL_RECV_OFFSET
3727The offset from the alignment of the buffers passed by netperf in its
3728``receive'' calls. Specified via the global @option{-o} option and
3729defaults to 0. Units: Bytes.
3730@item LOCAL_SEND_ALIGN
3731The alignment of the buffers passed by netperf in its ``send'' calls
3732as specified via the global @option{-a} option. Defaults to 8. Units:
3733Bytes.
3734@item LOCAL_RECV_ALIGN
3735The alignment of the buffers passed by netperf in its ``receive''
3736calls as specified via the global @option{-a} option. Defaults to
37378. Units: Bytes.
3738@item LOCAL_SEND_WIDTH
3739The ``width'' of the ring of buffers through which netperf cycles as
3740it makes its ``send'' calls.  Defaults to one more than the local send
3741socket buffer size divided by the send size as determined at the time
3742the data socket is created. Can be used to make netperf more processor
3743data cache unfriendly. Units: number of buffers.
3744@item LOCAL_RECV_WIDTH
3745The ``width'' of the ring of buffers through which netperf cycles as
3746it makes its ``receive'' calls.  Defaults to one more than the local
3747receive socket buffer size divided by the receive size as determined
3748at the time the data socket is created. Can be used to make netperf
3749more processor data cache unfriendly. Units: number of buffers.
3750@item LOCAL_SEND_DIRTY_COUNT
3751The number of bytes to ``dirty'' (write to) before netperf makes a
3752``send'' call. Specified via the global @option{-k} option, which
3753requires that --enable-dirty=yes was specified with the configure
3754command prior to building netperf. Units: Bytes.
3755@item LOCAL_RECV_DIRTY_COUNT
3756The number of bytes to ``dirty'' (write to) before netperf makes a
3757``recv'' call. Specified via the global @option{-k} option which
3758requires that --enable-dirty was specified with the configure command
3759prior to building netperf. Units: Bytes.
3760@item LOCAL_RECV_CLEAN_COUNT
3761The number of bytes netperf should read ``cleanly'' before making a
3762``receive'' call. Specified via the global @option{-k} option which
3763requires that --enable-dirty was specified with configure command
3764prior to building netperf.  Clean reads start were dirty writes ended.
3765Units: Bytes.
3766@item LOCAL_NODELAY
3767Indicates whether or not setting the test protocol-specific ``no
3768delay'' (eg TCP_NODELAY) option on the data socket used by netperf was
3769requested by the test-specific @option{-D} option and
3770successful. Units: 0 means no, 1 means yes.
3771@item LOCAL_CORK
3772Indicates whether or not TCP_CORK was set on the data socket used by
3773netperf as requested via the test-specific @option{-C} option. 1 means
3774yes, 0 means no/not applicable.
3775@item REMOTE_SEND_CALLS
3776@item REMOTE_RECV_CALLS
3777@item REMOTE_BYTES_PER_RECV
3778@item REMOTE_BYTES_PER_SEND
3779@item REMOTE_BYTES_SENT
3780@item REMOTE_BYTES_RECVD
3781@item REMOTE_BYTES_XFERD
3782@item REMOTE_SEND_OFFSET
3783@item REMOTE_RECV_OFFSET
3784@item REMOTE_SEND_ALIGN
3785@item REMOTE_RECV_ALIGN
3786@item REMOTE_SEND_WIDTH
3787@item REMOTE_RECV_WIDTH
3788@item REMOTE_SEND_DIRTY_COUNT
3789@item REMOTE_RECV_DIRTY_COUNT
3790@item REMOTE_RECV_CLEAN_COUNT
3791@item REMOTE_NODELAY
3792@item REMOTE_CORK
3793These are all like their ``LOCAL_'' counterparts only for the
3794netserver rather than netperf.
3795@item LOCAL_SYSNAME
3796The name of the OS (eg ``Linux'') running on the system on which
3797netperf was running. Units: ASCII Text
3798@item LOCAL_SYSTEM_MODEL
3799The model name of the system on which netperf was running. Units:
3800ASCII Text.
3801@item LOCAL_RELEASE
3802The release name/number of the OS running on the system on which
3803netperf  was running. Units: ASCII Text
3804@item LOCAL_VERSION
3805The version number of the OS running on the system on which netperf
3806was running. Units: ASCII Text
3807@item LOCAL_MACHINE
3808The machine architecture of the machine on which netperf was
3809running. Units: ASCII Text.
3810@item REMOTE_SYSNAME
3811@item REMOTE_SYSTEM_MODEL
3812@item REMOTE_RELEASE
3813@item REMOTE_VERSION
3814@item REMOTE_MACHINE
3815These are all like their ``LOCAL_'' counterparts only for the
3816netserver rather than netperf.
3817@item LOCAL_INTERFACE_NAME
3818The name of the probable egress interface through which the data
3819connection went on the system running netperf. Example: eth0. Units:
3820ASCII Text.
3821@item LOCAL_INTERFACE_VENDOR
3822The vendor ID of the probable egress interface through which traffic
3823on the data connection went on the system running netperf. Units:
3824Hexadecimal IDs as might be found in a @file{pci.ids} file or at
3825@uref{http://pciids.sourceforge.net/,the PCI ID Repository}.
3826@item LOCAL_INTERFACE_DEVICE
3827The device ID of the probable egress interface through which traffic
3828on the data connection went on the system running netperf. Units:
3829Hexadecimal IDs as might be found in a @file{pci.ids} file or at
3830@uref{http://pciids.sourceforge.net/,the PCI ID Repository}.
3831@item LOCAL_INTERFACE_SUBVENDOR
3832The sub-vendor ID of the probable egress interface through which
3833traffic on the data connection went on the system running
3834netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids}
3835file or at @uref{http://pciids.sourceforge.net/,the PCI ID
3836Repository}.
3837@item LOCAL_INTERFACE_SUBDEVICE
3838The sub-device ID of the probable egress interface through which
3839traffic on the data connection went on the system running
3840netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids}
3841file or at @uref{http://pciids.sourceforge.net/,the PCI ID
3842Repository}.
3843@item LOCAL_DRIVER_NAME
3844The name of the driver used for the probable egress interface through
3845which traffic on the data connection went on the system running
3846netperf. Units: ASCII Text.
3847@item LOCAL_DRIVER_VERSION
3848The version string for the driver used for the probable egress
3849interface through which traffic on the data connection went on the
3850system running netperf. Units: ASCII Text.
3851@item LOCAL_DRIVER_FIRMWARE
3852The firmware version for the driver used for the probable egress
3853interface through which traffic on the data connection went on the
3854system running netperf. Units: ASCII Text.
3855@item LOCAL_DRIVER_BUS
3856The bus address of the probable egress interface through which traffic
3857on the data connection went on the system running netperf. Units:
3858ASCII Text.
3859@item LOCAL_INTERFACE_SLOT
3860The slot ID of the probable egress interface through which traffic
3861on the data connection went on the system running netperf. Units:
3862ASCII Text.
3863@item REMOTE_INTERFACE_NAME
3864@item REMOTE_INTERFACE_VENDOR
3865@item REMOTE_INTERFACE_DEVICE
3866@item REMOTE_INTERFACE_SUBVENDOR
3867@item REMOTE_INTERFACE_SUBDEVICE
3868@item REMOTE_DRIVER_NAME
3869@item REMOTE_DRIVER_VERSION
3870@item REMOTE_DRIVER_FIRMWARE
3871@item REMOTE_DRIVER_BUS
3872@item REMOTE_INTERFACE_SLOT
3873These are all like their ``LOCAL_'' counterparts only for the
3874netserver rather than netperf.
3875@item LOCAL_INTERVAL_USECS
3876The interval at which bursts of operations (sends, receives,
3877transactions) were attempted by netperf.  Specified by the
3878global @option{-w} option which requires --enable-intervals to have
3879been specified with the configure command prior to building
3880netperf. Units: Microseconds (though specified by default in
3881milliseconds on the command line)
3882@item LOCAL_INTERVAL_BURST
3883The number of operations (sends, receives, transactions depending on
3884the test) which were attempted by netperf each LOCAL_INTERVAL_USECS
3885units of time. Specified by the global @option{-b} option which
3886requires --enable-intervals to have been specified with the configure
3887command prior to building netperf.  Units: number of operations per burst.
3888@item REMOTE_INTERVAL_USECS
3889The interval at which bursts of operations (sends, receives,
3890transactions) were attempted by netserver.  Specified by the
3891global @option{-w} option which requires --enable-intervals to have
3892been specified with the configure command prior to building
3893netperf. Units: Microseconds (though specified by default in
3894milliseconds on the command line)
3895@item REMOTE_INTERVAL_BURST
3896The number of operations (sends, receives, transactions depending on
3897the test) which were attempted by netperf each LOCAL_INTERVAL_USECS
3898units of time. Specified by the global @option{-b} option which
3899requires --enable-intervals to have been specified with the configure
3900command prior to building netperf.  Units: number of operations per burst.
3901@item LOCAL_SECURITY_TYPE_ID
3902@item LOCAL_SECURITY_TYPE
3903@item LOCAL_SECURITY_ENABLED_NUM
3904@item LOCAL_SECURITY_ENABLED
3905@item LOCAL_SECURITY_SPECIFIC
3906@item REMOTE_SECURITY_TYPE_ID
3907@item REMOTE_SECURITY_TYPE
3908@item REMOTE_SECURITY_ENABLED_NUM
3909@item REMOTE_SECURITY_ENABLED
3910@item REMOTE_SECURITY_SPECIFIC
3911A bunch of stuff related to what sort of security mechanisms (eg
3912SELINUX) were enabled on the systems during the test.
3913@item RESULT_BRAND
3914The string specified by the user with the global @option{-B}
3915option. Units: ASCII Text.
3916@item UUID
3917The universally unique identifier associated with this test, either
3918generated automagically by netperf, or passed to netperf via an omni
3919test-specific @option{-u} option. Note: Future versions may make this
3920a global command-line option. Units: ASCII Text.
3921@item MIN_LATENCY
3922The minimum ``latency'' or operation time (send, receive or
3923request/response exchange depending on the test) as measured on the
3924netperf side when the global @option{-j} option was specified. Units:
3925Microseconds.
3926@item MAX_LATENCY
3927The maximum ``latency'' or operation time (send, receive or
3928request/response exchange depending on the test) as measured on the
3929netperf side when the global @option{-j} option was specified. Units:
3930Microseconds.
3931@item P50_LATENCY
3932The 50th percentile value of ``latency'' or operation time (send, receive or
3933request/response exchange depending on the test) as measured on the
3934netperf side when the global @option{-j} option was specified. Units:
3935Microseconds.
3936@item P90_LATENCY
3937The 90th percentile value of ``latency'' or operation time (send, receive or
3938request/response exchange depending on the test) as measured on the
3939netperf side when the global @option{-j} option was specified. Units:
3940Microseconds.
3941@item P99_LATENCY
3942The 99th percentile value of ``latency'' or operation time (send, receive or
3943request/response exchange depending on the test) as measured on the
3944netperf side when the global @option{-j} option was specified. Units:
3945Microseconds.
3946@item MEAN_LATENCY
3947The average ``latency'' or operation time (send, receive or
3948request/response exchange depending on the test) as measured on the
3949netperf side when the global @option{-j} option was specified. Units:
3950Microseconds.
3951@item STDDEV_LATENCY
3952The standard deviation of ``latency'' or operation time (send, receive or
3953request/response exchange depending on the test) as measured on the
3954netperf side when the global @option{-j} option was specified. Units:
3955Microseconds.
3956@item COMMAND_LINE
3957The full command line used when invoking netperf. Units: ASCII Text.
3958@item OUTPUT_END
3959While emitted with the list of output selectors, it is ignored when
3960specified as an output selector.
3961@end table
3962
3963@node Other Netperf Tests, Address Resolution, The Omni Tests, Top
3964@chapter Other Netperf Tests
3965
3966Apart from the typical performance tests, netperf contains some tests
3967which can be used to streamline measurements and reporting.  These
3968include CPU rate calibration (present) and host identification (future
3969enhancement).
3970
3971@menu
3972* CPU rate calibration::        
3973* UUID Generation::             
3974@end menu
3975
3976@node CPU rate calibration, UUID Generation, Other Netperf Tests, Other Netperf Tests
3977@section CPU rate calibration
3978
3979Some of the CPU utilization measurement mechanisms of netperf work by
3980comparing the rate at which some counter increments when the system is
3981idle with the rate at which that same counter increments when the
3982system is running a netperf test.  The ratio of those rates is used to
3983arrive at a CPU utilization percentage.
3984
3985This means that netperf must know the rate at which the counter
3986increments when the system is presumed to be ``idle.''  If it does not
3987know the rate, netperf will measure it before starting a data transfer
3988test.  This calibration step takes 40 seconds for each of the local or
3989remote systems, and if repeated for each netperf test would make taking
3990repeated measurements rather slow.
3991
3992Thus, the netperf CPU utilization options @option{-c} and and
3993@option{-C} can take an optional calibration value.  This value is
3994used as the ``idle rate'' and the calibration step is not
3995performed. To determine the idle rate, netperf can be used to run
3996special tests which only report the value of the calibration - they
3997are the LOC_CPU and REM_CPU tests.  These return the calibration value
3998for the local and remote system respectively.  A common way to use
3999these tests is to store their results into an environment variable and
4000use that in subsequent netperf commands:
4001
4002@example
4003LOC_RATE=`netperf -t LOC_CPU`
4004REM_RATE=`netperf -H <remote> -t REM_CPU`
4005netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
4006...
4007netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
4008@end example
4009
4010If you are going to use netperf to measure aggregate results, it is
4011important to use the LOC_CPU and REM_CPU tests to get the calibration
4012values first to avoid issues with some of the aggregate netperf tests
4013transferring data while others are ``idle'' and getting bogus
4014calibration values.  When running aggregate tests, it is very
4015important to remember that any one instance of netperf does not know
4016about the other instances of netperf.  It will report global CPU
4017utilization and will calculate service demand believing it was the
4018only thing causing that CPU utilization.  So, you can use the CPU
4019utilization reported by netperf in an aggregate test, but you have to
4020calculate service demands by hand.
4021
4022@node UUID Generation,  , CPU rate calibration, Other Netperf Tests
4023@section UUID Generation
4024
4025Beginning with version 2.5.0 netperf can generate Universally Unique
4026IDentifiers (UUIDs).  This can be done explicitly via the ``UUID''
4027test:
4028@example
4029$ netperf -t UUID
40302c8561ae-9ebd-11e0-a297-0f5bfa0349d0
4031@end example
4032
4033In and of itself, this is not terribly useful, but used in conjunction
4034with the test-specific @option{-u} option of an ``omni'' test to set
4035the UUID emitted by the @ref{Omni Output Selectors,UUID} output
4036selector, it can be used to tie-together the separate instances of an
4037aggregate netperf test.  Say, for instance if they were inserted into
4038a database of some sort.
4039
4040@node Address Resolution, Enhancing Netperf, Other Netperf Tests, Top
4041@comment  node-name,  next,  previous,  up
4042@chapter Address Resolution
4043
4044Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests so
4045the functionality of the tests in @file{src/nettest_ipv6.c} has been
4046subsumed into the tests in @file{src/nettest_bsd.c}  This has been
4047accomplished in part by switching from @code{gethostbyname()}to
4048@code{getaddrinfo()} exclusively.  While it was theoretically possible
4049to get multiple results for a hostname from @code{gethostbyname()} it
4050was generally unlikely and netperf's ignoring of the second and later
4051results was not much of an issue.
4052
4053Now with @code{getaddrinfo} and particularly with AF_UNSPEC it is
4054increasingly likely that a given hostname will have multiple
4055associated addresses.  The @code{establish_control()} routine of
4056@file{src/netlib.c} will indeed attempt to chose from among all the
4057matching IP addresses when establishing the control connection.
4058Netperf does not _really_ care if the control connection is IPv4 or
4059IPv6 or even mixed on either end.
4060
4061However, the individual tests still ass-u-me that the first result in
4062the address list is the one to be used.  Whether or not this will
4063turn-out to be an issue has yet to be determined.
4064
4065If you do run into problems with this, the easiest workaround is to
4066specify IP addresses for the data connection explicitly in the
4067test-specific @option{-H} and @option{-L} options.  At some point, the
4068netperf tests _may_ try to be more sophisticated in their parsing of
4069returns from @code{getaddrinfo()} - straw-man patches to
4070@email{netperf-feedback@@netperf.org} would of course be most welcome
4071:)
4072
4073Netperf has leveraged code from other open-source projects with
4074amenable licensing to provide a replacement @code{getaddrinfo()} call
4075on those platforms where the @command{configure} script believes there
4076is no native getaddrinfo call.  As of this writing, the replacement
4077@code{getaddrinfo()} as been tested on HP-UX 11.0 and then presumed to
4078run elsewhere.
4079
4080@node Enhancing Netperf, Netperf4, Address Resolution, Top
4081@comment  node-name,  next,  previous,  up
4082@chapter Enhancing Netperf
4083
4084Netperf is constantly evolving.  If you find you want to make
4085enhancements to netperf, by all means do so.  If you wish to add a new
4086``suite'' of tests to netperf the general idea is to:
4087
4088@enumerate
4089@item
4090Add files @file{src/nettest_mumble.c} and @file{src/nettest_mumble.h}
4091where mumble is replaced with something meaningful for the test-suite.
4092@item
4093Add support for an appropriate @option{--enable-mumble} option in
4094@file{configure.ac}.
4095@item
4096Edit @file{src/netperf.c}, @file{netsh.c}, and @file{netserver.c} as
4097required, using #ifdef WANT_MUMBLE.
4098@item
4099Compile and test
4100@end enumerate
4101
4102However, with the addition of the ``omni'' tests in version 2.5.0 it
4103is preferred that one attempt to make the necessary changes to
4104@file{src/nettest_omni.c} rather than adding new source files, unless
4105this would make the omni tests entirely too complicated.
4106
4107If you wish to submit your changes for possible inclusion into the
4108mainline sources, please try to base your changes on the latest
4109available sources. (@xref{Getting Netperf Bits}.) and then send email
4110describing the changes at a high level to
4111@email{netperf-feedback@@netperf.org} or perhaps
4112@email{netperf-talk@@netperf.org}.  If the consensus is positive, then
4113sending context @command{diff} results to
4114@email{netperf-feedback@@netperf.org} is the next step.  From that
4115point, it is a matter of pestering the Netperf Contributing Editor
4116until he gets the changes incorporated :)
4117
4118@node  Netperf4, Concept Index, Enhancing Netperf, Top
4119@comment  node-name,  next,  previous,  up
4120@chapter Netperf4
4121
4122Netperf4 is the shorthand name given to version 4.X.X of netperf.
4123This is really a separate benchmark more than a newer version of
4124netperf, but it is a descendant of netperf so the netperf name is
4125kept.  The facetious way to describe netperf4 is to say it is the
4126egg-laying-woolly-milk-pig version of netperf :)  The more respectful
4127way to describe it is to say it is the version of netperf with support
4128for synchronized, multiple-thread, multiple-test, multiple-system,
4129network-oriented benchmarking.
4130
4131Netperf4 is still undergoing evolution. Those wishing to work with or
4132on netperf4 are encouraged to join the
4133@uref{http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev,netperf-dev}
4134mailing list and/or peruse the
4135@uref{http://www.netperf.org/svn/netperf4/trunk,current sources}.
4136
4137@node Concept Index, Option Index, Netperf4, Top
4138@unnumbered Concept Index
4139
4140@printindex cp
4141
4142@node Option Index,  , Concept Index, Top
4143@comment  node-name,  next,  previous,  up
4144@unnumbered Option Index
4145
4146@printindex vr
4147@bye                                      
4148
4149@c  LocalWords:  texinfo setfilename settitle titlepage vskip pt filll ifnottex
4150@c  LocalWords:  insertcopying cindex dfn uref printindex cp
4151