Iperf

Iperf is a tool to measure TCP throughput and available bandwidth, allowing the tuning of various parameters and UDP characteristics. Iperf reports bandwidth, delay variation, and datagram loss.

The popular Iperf 2 releases were developed by NLANR/DAST (http://dast.nlanr.net/Projects/Iperf/) and maintained at http://sourceforge.net/projects/iperf. As of April 2014, the last released version was 2.0.5, from July 2010.

A script that automates starting and then stopping iperf servers is available here. It can be invoked from a remote machine (say, a NOC workstation) to simplify starting, and more importantly stopping, an iperf server.

Iperf 3

ESnet and the Lawrence Berkeley National Laboratory have developed a from-scratch reimplementation of Iperf called Iperf 3, which has its own GitHub repository. It is not compatible with iperf 2, and it adds interesting features such as a zero-copy TCP mode (-Z flag), JSON output (-J), and reporting of TCP retransmission counts and CPU utilization (-V). It also supports SCTP in addition to UDP and TCP. Since December 2013, various public releases have been made on http://stats.es.net/software/.
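A minimal usage sketch of these options, assuming an iperf3 server running on the hypothetical host testhost (iperf3 uses port 5201 by default):

iperf3 -s                   # server side: wait for a client
iperf3 -c testhost -Z -V    # client side: zero-copy send, verbose report (includes retransmits and CPU usage)
iperf3 -c testhost -J       # client side: same test, but with JSON output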

Usage Examples

TCP Throughput Test

The following shows a TCP throughput test, which is iperf's default action. These options are used:

  • -s - server mode. In iperf, the server will receive the test data stream.
  • -c server - client mode. The name (or IP address) of the server should be given. The client will transmit the test stream.
  • -i interval - display interval. Without this option, iperf will run the test silently, and only write a summary after the test has finished. With -i, the program will report intermediate results at given intervals (in seconds).
  • -w windowsize - select a non-default TCP window size. To achieve high rates over paths with a large bandwidth-delay product, it is often necessary to select a larger TCP window size than the (operating system) default.
  • -l buffer length - specify the length of the read/write buffer. In UDP, this sets the datagram size. In TCP, this sets the length of the send/receive buffers (otherwise system defaults are used). Using this option may be important, especially if the operating system's default send buffer is too small (e.g. in Windows XP).

NOTE: the -c or -s argument must be given first; otherwise some configuration options are ignored (see the example below).
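For example (the host name ezmp3 matches the transcripts below; the option values are only illustrative):

iperf -c ezmp3 -w 2M -l 128K -i 1    # -c comes first, followed by the remaining options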

The -i 1 option was given to obtain intermediate reports every second, in addition to the final report at the end of the ten-second test. The TCP buffer size was set to 2 Megabytes (4 Megabytes effective, see below) in order to permit close to line-rate transfers. The systems haven't been fully tuned; otherwise up to 7 Gb/s of TCP throughput should be possible. Normal background traffic on the 10 Gb/s backbone is on the order of 30-100 Mb/s. Note that in iperf, by default it is the client that transmits to the server.

Server Side:

welti@ezmp3:~$ iperf -s -w 2M -i 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 4.00 MByte (WARNING: requested 2.00 MByte)
------------------------------------------------------------
[  4] local 130.59.35.106 port 5001 connected with 130.59.35.82 port 41143
[  4]  0.0- 1.0 sec    405 MBytes  3.40 Gbits/sec
[  4]  1.0- 2.0 sec    424 MBytes  3.56 Gbits/sec
[  4]  2.0- 3.0 sec    425 MBytes  3.56 Gbits/sec
[  4]  3.0- 4.0 sec    422 MBytes  3.54 Gbits/sec
[  4]  4.0- 5.0 sec    424 MBytes  3.56 Gbits/sec
[  4]  5.0- 6.0 sec    422 MBytes  3.54 Gbits/sec
[  4]  6.0- 7.0 sec    424 MBytes  3.56 Gbits/sec
[  4]  7.0- 8.0 sec    423 MBytes  3.55 Gbits/sec
[  4]  8.0- 9.0 sec    424 MBytes  3.56 Gbits/sec
[  4]  9.0-10.0 sec    413 MBytes  3.47 Gbits/sec
[  4]  0.0-10.0 sec  4.11 GBytes  3.53 Gbits/sec

Client Side:

welti@mamp1:~$ iperf -c ezmp3 -w 2M -i 1
------------------------------------------------------------
Client connecting to ezmp3, TCP port 5001
TCP window size: 4.00 MByte (WARNING: requested 2.00 MByte)
------------------------------------------------------------
[  3] local 130.59.35.82 port 41143 connected with 130.59.35.106 port 5001
[  3]  0.0- 1.0 sec    405 MBytes  3.40 Gbits/sec
[  3]  1.0- 2.0 sec    424 MBytes  3.56 Gbits/sec
[  3]  2.0- 3.0 sec    425 MBytes  3.56 Gbits/sec
[  3]  3.0- 4.0 sec    422 MBytes  3.54 Gbits/sec
[  3]  4.0- 5.0 sec    424 MBytes  3.56 Gbits/sec
[  3]  5.0- 6.0 sec    422 MBytes  3.54 Gbits/sec
[  3]  6.0- 7.0 sec    424 MBytes  3.56 Gbits/sec
[  3]  7.0- 8.0 sec    423 MBytes  3.55 Gbits/sec
[  3]  8.0- 9.0 sec    424 MBytes  3.56 Gbits/sec
[  3]  0.0-10.0 sec  4.11 GBytes  3.53 Gbits/sec

UDP Test

In the following example, we send a 300 Mb/s UDP test stream. No packets were lost along the path, although one arrived out of order. Another interesting result is the jitter, which is displayed as 27 or 28 microseconds (apparently some rounding error or other imprecision prevents the client and server from agreeing on the value). According to the documentation, "Jitter is the smoothed mean of differences between consecutive transit times."

Server Side

: leinen@mamp1[leinen]; iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size: 64.0 KByte (default)
------------------------------------------------------------
[  3] local 130.59.35.82 port 5001 connected with 130.59.35.106 port 38750
[  3]  0.0-10.0 sec    359 MBytes    302 Mbits/sec  0.028 ms    0/256410 (0%)
[  3]  0.0-10.0 sec  1 datagrams received out-of-order

Client Side

: leinen@ezmp3[leinen]; iperf -c mamp1-eth0 -u -b 300M
------------------------------------------------------------
Client connecting to mamp1-eth0, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 64.0 KByte (default)
------------------------------------------------------------
[  3] local 130.59.35.106 port 38750 connected with 130.59.35.82 port 5001
[  3]  0.0-10.0 sec    359 MBytes    302 Mbits/sec
[  3] Sent 256411 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec    359 MBytes    302 Mbits/sec  0.027 ms    0/256410 (0%)
[  3]  0.0-10.0 sec  1 datagrams received out-of-order

As you would expect, during a UDP test traffic is only sent from the client to the server; see here for an example with tcpdump.
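That link aside, the asymmetry can be observed with a capture along the following lines, run on the server or an intermediate host (the interface name is an assumption; 5001 is iperf's default port):

tcpdump -n -i eth0 udp port 5001

During the test, only datagrams from the client towards the server should appear (apart from the server's final report back to the client).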

Problem isolation procedures using iperf

TCP Throughput measurements

Typically, end users report throughput problems as they experience them with their applications, such as unexpectedly slow file transfer times. Some users may already report TCP throughput results as measured with iperf. In any case, network administrators should validate the throughput problem, and it is recommended to do this with iperf end-to-end measurements in TCP mode, memory-to-memory between the end systems. The window size of the TCP measurement should follow the bandwidth*delay product rule, and should therefore be set to at least the measured round-trip time multiplied by the path's bottleneck speed. If the actual bottleneck is not known (because of a lack of knowledge of the end-to-end path), it should be assumed that the bottleneck is the slower of the two end systems' network interface cards.

For instance, if one system is connected via Gigabit Ethernet but the other via Fast Ethernet, and the measured round-trip time is 150 ms, then the window size should be set to at least 100 Mbit/s * 0.150 s / 8 = 1875000 bytes, so setting the TCP window to 2 MBytes would be a good choice.
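Continuing this example, such a measurement could look as follows (the host name fe-host is hypothetical; -t 30 simply runs the test for 30 seconds):

iperf -s -w 2M                       # on the receiving end system
iperf -c fe-host -w 2M -i 1 -t 30    # on the transmitting end system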

In theory, the TCP throughput can reach, but not exceed, the available bandwidth on an end-to-end path. Knowledge of that network metric is therefore important for distinguishing between issues with the end systems' TCP stacks and network-related problems.

Available bandwidth measurements

Iperf can be used in UDP mode to measure the available bandwidth. Only short measurements, in the range of 10 seconds, should be done so as not to disturb other production flows. The goal of the UDP measurements is to find the maximum UDP sending rate that results in almost no packet loss on the end-to-end path; in good practice the packet loss threshold is 1%. UDP data transfers that result in higher packet loss are likely to disturb TCP production flows and should therefore be avoided. A practical procedure for finding the available bandwidth is to start with UDP data transfers of 10 s duration, with interim result reports at one-second intervals, at a data rate slightly below the reported TCP throughput. If the measured packet loss is below the threshold, a new measurement with a slightly increased data rate can be started. This procedure of short UDP transfers with increasing data rate is repeated until the packet loss threshold is exceeded. Depending on the required accuracy, further tests can be run, starting from the highest data rate that kept packet loss below the threshold and using smaller rate increments. In the end, the maximum data rate that kept packet loss below the threshold can be taken as a good estimate of the available bandwidth on the end-to-end path.
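A minimal sketch of this stepwise procedure, assuming a server already started with 'iperf -s -u -w 4M' on the hypothetical host far-end, a starting rate of 300 Mbit/s (slightly below the previously measured TCP throughput) and 50 Mbit/s increments; the loss percentage appears in the server report printed at the client:

for rate in 300 350 400 450 500; do
    echo "=== testing ${rate} Mbit/s ==="
    iperf -c far-end -u -b ${rate}M -t 10 -i 1
    sleep 5      # give the path some time to recover between runs
done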

By comparing the reported application throughput with the measured TCP throughput and the measured available bandwidth, it is possible to distinguish between application problems, TCP stack problems, and network issues. Note, however, that the differing nature of UDP and TCP flows means that their measurements should not be compared directly. Iperf sends UDP datagrams at a constant, steady rate, whereas TCP tends to send packet trains. This means that TCP is likely to suffer from congestion effects at a lower data rate than UDP.

In the case of unexpectedly low available bandwidth measurements on the end-to-end path, network administrators are interested in locating the bandwidth bottleneck. The best way to get this value is to retrieve it from passively measured link utilisations and provided capacities on all links along the path. However, if the path crosses multiple administrative domains this is often not possible, because of restrictions on obtaining those values from other domains. Therefore, it is common practice to use measurement workstations along the end-to-end path and thus divide the end-to-end path into segments on which available bandwidth measurements are done. This way it is possible to identify the segment on which the bottleneck occurs and to concentrate on it in further troubleshooting.

Other iperf use cases

Besides the capability of measuring TCP throughput and available bandwidth, in UDP mode iperf can report on packet reordering and delay jitter.

Other use cases for iperf measurements include IPv6 bandwidth measurements and IP multicast performance measurements. More information on iperf's features, as well as source and binary code for various UNIX and Microsoft Windows operating systems, can be obtained from the Iperf Web site.

Caveats and Known Issues

Impact on other traffic

As iperf sends full real data streams, it can reduce the available bandwidth on a given path. In TCP mode, the effect on co-existing production flows should be negligible, assuming the number of production flows is much greater than the number of test data flows, which is normally a valid assumption on paths through a WAN. However, in UDP mode iperf has the potential to disturb production traffic, and in particular TCP streams, if the sender's data rate exceeds the available bandwidth on a path. Therefore, particular care should be taken whenever running iperf tests in UDP mode.

TCP buffer allocation

On Linux systems, if you request a specific TCP buffer size with the "-w" option, the kernel will always try to allocate twice as many bytes as you specified.

Example: when you request a 2 MB window size, you'll get 4 MB:

welti@mamp1:~$ iperf -c ezmp3 -w 2M -i 1
------------------------------------------------------------
Client connecting to ezmp3, TCP port 5001
TCP window size: 4.00 MByte (WARNING: requested 2.00 MByte)    <<<<<<
------------------------------------------------------------
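On Linux, whether such a request can be honoured also depends on the kernel's per-socket maximums; a sketch for checking and, if necessary, raising them (the 8 MB value is only an example):

sysctl net.core.rmem_max               # maximum receive socket buffer size
sysctl net.core.wmem_max               # maximum send socket buffer size
sysctl -w net.core.rmem_max=8388608    # raise the limits (as root) if they are
sysctl -w net.core.wmem_max=8388608    # smaller than the window you want to request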

Counter overflow

Some versions seem to suffer from a 32-bit integer overflow which will lead to wrong results.

e.g.:

[ 14]  0.0-10.0 sec  953315416 Bytes  762652333 bits/sec
[ 14] 10.0-20.0 sec  1173758936 Bytes  939007149 bits/sec
[ 14] 20.0-30.0 sec  1173783552 Bytes  939026842 bits/sec
[ 14] 30.0-40.0 sec  1173769072 Bytes  939015258 bits/sec
[ 14] 40.0-50.0 sec  1173783552 Bytes  939026842 bits/sec
[ 14] 50.0-60.0 sec  1173751696 Bytes  939001357 bits/sec
[ 14]  0.0-60.0 sec  2531115008 Bytes  337294201 bits/sec
[ 14] MSS size 1448 bytes (MTU 1500 bytes, ethernet)

As you can see, the 0.0-60.0 sec summary doesn't match the average one would expect. This is because the total number of bytes is incorrect as a result of a counter wrap.

If you're experiencing this kind of effect, upgrade to the latest version of iperf, which should have this bug fixed.

UDP buffer sizing

UDP buffer sizing (the -w parameter) is also required for high-speed UDP transmissions. Otherwise the UDP receive buffer will (typically) overflow, and this will look like packet loss, even though tcpdump captures or interface counters reveal that all the data arrived. Such drops show up in the UDP statistics (for example, on Linux under 'receive packet errors' in the Udp: section of 'netstat -s' output). See more information at: http://www.29west.com/docs/THPM/udp-buffer-sizing.html
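A quick way to check for this on the receiving host, and to re-run the receiver with a larger socket buffer (the 4 MB value is only an example):

netstat -su              # look for "packet receive errors" in the Udp: section
iperf -s -u -w 4M -i 1   # restart the UDP receiver with an enlarged receive buffer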

Control of measurements

There are two typical deployment scenarios, which differ in the kind of access the operator has to the sender and receiver instances. A measurement between well-placed measurement workstations within a single administrative domain, e.g. a campus network, gives network administrators full control over the server and client configurations (including test schedules) and allows them to retrieve full measurement results. Measurements on paths that extend beyond the administrative domain's borders require access to, or collaboration with, the administrators of the far-end systems. Iperf has two features that simplify its use in this scenario, such that the operator does not need to have an interactive login account on the far-end system:

  • The server instance may run as a daemon (option -D) listening on a configurable transport protocol port, and
  • Tests can be run bi-directionally, either one direction after the other (option -r) or simultaneously (option -d); see the sketch below.
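A minimal sketch of this setup, assuming the far-end administrator has started a daemonized server on the hypothetical host far-end (port 5002 is an arbitrary, non-default choice):

iperf -s -D -p 5002           # on the far-end system, started once by its administrator
iperf -c far-end -p 5002 -r   # from the local side: test both directions, one after the other
iperf -c far-end -p 5002 -d   # or test both directions simultaneously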

Screen

Another method of running iperf on a *NIX device is to use 'screen'. Screen is a utility that lets you keep a session running even once you have logged out. It is described more fully here in its man pages, but a simple sequence applicable to iperf would be as follows:

[user@host]$screen -d -m iperf -s -p 5002 
This starts iperf -s -p 5002 as a 'detached' session

[user@host]$screen -ls
There is a screen on:
        25278..mp1      (Detached)
1 Socket in /tmp/uscreens/S-admin.
'screen -ls' shows open sessions.

'screen -r' reconnects to a running session. When in that session, keying 'CTRL+a', then 'd', detaches the screen again. You can, if you wish, log out, log back in, and re-attach. To end the iperf session (and the screen) just hit 'CTRL+c' whilst attached.
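For completeness, a sketch of the re-attach/detach cycle (the session identifier 25278 is taken from the listing above; yours will differ):

screen -r 25278    # re-attach to the detached session by its PID
                   # ... watch the running iperf server ...
                   # press CTRL+a, then d, to detach again without stopping it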

Note that BWCTL offers additional control and resource limitation features that make it more suitable for use across administrative domains.

Related Work

Public iperf servers

There used to be some public iperf servers available, but at present none are known anymore. Similar services are provided by BWCTL (see below) and by public NDT servers.

BWCTL

BWCTL (BandWidth test ConTroLler) is a wrapper around iperf that provides scheduling and remote control of measurements.

Instrumented iperf (iperf100)

The iperf code provided by NLANR/DAST was instrumented in order to provide more information to the user. Iperf100 displays various web100 variables at the end of a transfer.

Patches are available at http://www.csm.ornl.gov/~dunigan/netperf/download/

The instrumented iperf requires a machine running a kernel.org linux-2.X.XX kernel with the latest web100 patches applied (http://www.web100.org).

Jperf

Jperf is a Java-based graphical front-end to iperf. It is now included as part of the iperf project on SourceForge. This was once available as a separate project called xjperf, but that seems to have been given up in favor of iperf/SourceForge integration.

iPerf for Android

An Android version of iperf appeared on Google Play (formerly the Android Market) in 2010.

nuttcp

Similar to iperf, but with an additional control connection that makes it somewhat more versatile.

-- FrancoisXavierAndreu & SimonMuyal - 06 Jun 2005
-- HankNussbacher - 10 Jul 2005 (Great Plains server)
-- AnnHarding & OrlaMcGann - Aug 2005 (DS3.3.2 content)
-- SimonLeinen - 08 Feb 2006 (OpenSS7 variant, BWCTL pointer)
-- BartoszBelter - 28 Mar 2006 (iperf100)
-- ChrisWelti - 11 Apr 2006 (examples, 32-bit overflows, buffer allocation)
-- SimonLeinen - 01 Jun 2006 (integrated DS3.3.2 text from Ann and Orla)
-- SimonLeinen - 17 Sep 2006 (tracked iperf100 pointer)
-- PekkaSavola - 26 Mar 2008 (added warning about -c/s having to be first, a common gotcha)
-- PekkaSavola - 05 Jun 2008 (added discussion of '-l' parameter and its significance)
-- PekkaSavola - 30 Apr 2009 (added discussion of UDP (receive) buffer sizing significance)
-- SimonLeinen - 23 Feb 2012 (removed installation instructions, obsolete public iperf servers)
-- SimonLeinen - 22 May 2012 (added notes about Iperf 3, Jperf and iPerf for Android)
-- SimonLeinen - 01 Feb 2013 (added pointer to Android app; cross-reference to Nuttcp)
-- SimonLeinen - 06 April 2014 (updated Iperf 3 section: now on Github and with releases)
-- SimonLeinen - 05 May 2014 (updated Iperf 3 section: documented more new features)
