tools/perf/Documentation/perf-bench.txt

perf-bench(1)
=============

NAME
----
perf-bench - General framework for benchmark suites

SYNOPSIS
--------
[verse]
'perf bench' [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION
-----------
This 'perf bench' command is a general framework for benchmark suites.

COMMON OPTIONS
--------------
-r::
--repeat=::
Specify the number of times to repeat the run (default 10).

-f::
--format=::
Specify format style.
Currently available format styles are:

'default'::
Default style. This is mainly for human reading.
---------------------
% perf bench sched pipe                      # with no style specified
(executing 1000000 pipe operations between two tasks)
        Total time:5.855 sec
                5.855061 usecs/op
                170792 ops/sec
---------------------

'simple'::
This simple style is friendly for automated
processing by scripts.
---------------------
% perf bench --format=simple sched pipe      # simple style specified
5.988
---------------------
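Because the 'simple' style prints a bare number, a script can read it
directly. The following Python sketch is illustrative only: parse_simple and
run_simple are hypothetical helper names, and the code assumes the output is
a single decimal value on its own line, as in the 'sched pipe' example above.
Other suites may print something different. Skipping blank lines and lines
starting with '#' is a defensive assumption, not documented behaviour.

```python
import shutil
import subprocess


def parse_simple(output: str) -> float:
    """Return the first value line of 'simple' style output as a float.

    Assumption: the suite prints one bare decimal number, as in the
    'sched pipe' example. Blank lines and '#' lines are skipped.
    """
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return float(line)
    raise ValueError("no value found in perf bench output")


def run_simple(subsystem: str, suite: str) -> float:
    """Run 'perf bench --format=simple <subsystem> <suite>' and parse it."""
    if shutil.which("perf") is None:
        raise RuntimeError("perf is not installed or not in PATH")
    result = subprocess.run(
        ["perf", "bench", "--format=simple", subsystem, suite],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_simple(result.stdout)


# The value shown in the example above parses to a plain float.
print(parse_simple("5.988\n"))
```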

SUBSYSTEM
---------

'sched'::
        Scheduler and IPC mechanisms.

'syscall'::
        System call performance (throughput).

'mem'::
        Memory access performance.

'numa'::
        NUMA scheduling and MM benchmarks.

'futex'::
        Futex stressing benchmarks.

'epoll'::
        Eventpoll (epoll) stressing benchmarks.

'internals'::
        Benchmark internal perf functionality.

'all'::
        All benchmark subsystems.

SUITES FOR 'sched'
~~~~~~~~~~~~~~~~~~
*messaging*::
Suite for evaluating performance of scheduler and IPC mechanisms.
Based on hackbench by Rusty Russell.

Options of *messaging*
^^^^^^^^^^^^^^^^^^^^^^
-p::
--pipe::
Use pipe() instead of socketpair().

-t::
--thread::
Use multiple threads instead of multiple processes.

-g::
--group=::
Specify the number of groups.

-l::
--nr_loops=::
Specify the number of loops.

Example of *messaging*
^^^^^^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched messaging                 # run with default options
(20 sender and receiver processes per group)
(10 groups == 400 processes run)

      Total time:0.308 sec

% perf bench sched messaging -t -g 20        # use threads, with 20 groups
(20 sender and receiver threads per group)
(20 groups == 800 threads run)

      Total time:0.582 sec
---------------------
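The task counts in the example output follow from the group count. As a
minimal sketch, assuming each group holds 20 senders and 20 receivers (the
figure printed in the output above), the totals can be reproduced as shown
here. messaging_tasks is a hypothetical helper and is not part of perf.

```python
def messaging_tasks(groups: int, per_group: int = 20) -> int:
    """Total tasks for 'sched messaging': senders plus receivers.

    Assumption: per_group senders and per_group receivers in every group,
    matching the "20 sender and receiver" line in the example output.
    """
    senders = groups * per_group
    receivers = groups * per_group
    return senders + receivers


print(messaging_tasks(10))  # default run in the example: 400
print(messaging_tasks(20))  # the '-g 20' run in the example: 800
```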

*pipe*::
Suite for the pipe() system call.
Based on pipe-test-1m.c by Ingo Molnar.

Options of *pipe*
^^^^^^^^^^^^^^^^^
-l::
--loop=::
Specify the number of loops.

Example of *pipe*
^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched pipe
(executing 1000000 pipe operations between two tasks)

        Total time:8.091 sec
                8.091833 usecs/op
                123581 ops/sec

% perf bench sched pipe -l 1000              # loop 1000
(executing 1000 pipe operations between two tasks)

        Total time:0.016 sec
                16.948000 usecs/op
                59004 ops/sec
---------------------
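The three figures in each run are related by simple arithmetic. The sketch
below reproduces the usecs/op and ops/sec values from the example, assuming
the elapsed time is known in whole microseconds (inferred here from the
usecs/op line, since the total is printed with only three decimals) and that
ops/sec is truncated to an integer. per_op_metrics is a hypothetical helper
name, not a perf function.

```python
from typing import Tuple

USEC_PER_SEC = 1_000_000


def per_op_metrics(total_usec: int, ops: int) -> Tuple[float, int]:
    """Return (usecs/op, ops/sec) for a run of 'ops' operations.

    Assumption: ops/sec is truncated toward zero, which is consistent
    with the example output above.
    """
    usecs_per_op = total_usec / ops
    ops_per_sec = ops * USEC_PER_SEC // total_usec
    return usecs_per_op, ops_per_sec


# First run in the example: 1000000 operations in 8.091833 seconds.
print(per_op_metrics(8_091_833, 1_000_000))  # (8.091833, 123581)
# Second run: 1000 operations in 0.016948 seconds.
print(per_op_metrics(16_948, 1_000))         # (16.948, 59004)
```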

SUITES FOR 'syscall'
~~~~~~~~~~~~~~~~~~~~
*basic*::
Suite for evaluating core system call throughput, reported as both usecs/op
and ops/sec. It uses a single thread that repeatedly calls getppid(2), a
simple syscall whose result is not cached by glibc.

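The measurement idea can be sketched in a few lines: call getppid in a tight
loop on one thread, time the loop, and derive both metrics. This Python
version is only an illustration of the technique. It is not how perf
implements the suite, the helper name time_getppid is hypothetical, and
interpreter overhead will dominate the numbers it reports.

```python
import os
import time
from typing import Tuple


def time_getppid(loops: int) -> Tuple[float, float]:
    """Call os.getppid() 'loops' times; return (usecs/op, ops/sec)."""
    start = time.perf_counter()
    for _ in range(loops):
        os.getppid()
    elapsed = time.perf_counter() - start
    usecs_per_op = elapsed * 1e6 / loops
    # Guard against a zero reading from a very coarse clock.
    ops_per_sec = loops / elapsed if elapsed > 0 else float("inf")
    return usecs_per_op, ops_per_sec


usecs, rate = time_getppid(100_000)
print(f"{usecs:.6f} usecs/op")
print(f"{rate:.0f} ops/sec")
```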

SUITES FOR 'mem'
~~~~~~~~~~~~~~~~
*memcpy*::
Suite for evaluating performance of simple memory copy in various ways.

Options of *memcpy*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to copy (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify the copy function to use (default: default).
Available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

-l::
--nr_loops::
Repeat memcpy invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of the gettimeofday() syscall.

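To make the size syntax concrete, the sketch below converts a size string
such as 1MB into a byte count. It is a hedged illustration rather than
perf's own parser: parse_size is a hypothetical helper, and the code assumes
binary multiples (1KB = 1024 bytes), which the text above does not state.

```python
import re

# Assumption: binary multiples, i.e. 1KB = 1024 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a string like '1MB' or '512kb' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    try:
        # Units are matched case-insensitively, as documented above.
        return int(number) * _UNITS[unit.upper()]
    except KeyError:
        raise ValueError(f"unknown unit in size: {text!r}") from None


print(parse_size("1MB"))    # the documented default size
print(parse_size("512kb"))  # lower-case units are accepted
```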
*memset*::
Suite for evaluating performance of simple memory set in various ways.

Options of *memset*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to set (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify the set function to use (default: default).
Available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.

-l::
--nr_loops::
Repeat memset invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of the gettimeofday() syscall.

SUITES FOR 'numa'
~~~~~~~~~~~~~~~~~
*mem*::
Suite for evaluating NUMA workloads.

SUITES FOR 'futex'
~~~~~~~~~~~~~~~~~~
*hash*::
Suite for evaluating hash tables.

*wake*::
Suite for evaluating wake calls.

*wake-parallel*::
Suite for evaluating parallel wake calls.

*requeue*::
Suite for evaluating requeue calls.

*lock-pi*::
Suite for evaluating futex lock_pi calls.

SUITES FOR 'epoll'
~~~~~~~~~~~~~~~~~~
*wait*::
Suite for evaluating concurrent epoll_wait calls.

*ctl*::
Suite for evaluating multiple epoll_ctl calls.

SUITES FOR 'internals'
~~~~~~~~~~~~~~~~~~~~~~
*synthesize*::
Suite for evaluating perf's event synthesis performance.

SEE ALSO
--------
linkperf:perf[1]