git.monstr.eu Git - linux-2.6-microblaze.git/log

perf test script: Add perl script testing support

Basic coverage of perl script support from `perf script`. This is
disabled by default and so the test will most normally skip.

Committer testing:

  $ perf test 'perf script perl'
  106: perf script perl tests                                          : Skip
  $ perf test -vv 'perf script perl'
  106: perf script perl tests:
  --- start ---
  test child forked, pid 578323
  perf script perl test [Skipped: no libperl support]
  ---- end(-2) ----
  106: perf script perl tests                                          : Skip
  $ perf check feature libperl
                 libperl: [ OFF ]  # HAVE_LIBPERL_SUPPORT ( tip: Deprecated, use LIBPERL=1 and install perl-ExtUtils-Embed/libperl-dev to build with it )
  $

Install perl-ExtUtils-Embed, build with LIBPERL=1, rebuild:

  $ perf check feature libperl
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
  $ perf test 'perf script perl'
  106: perf script perl tests                                          : Ok
  $ perf test -vv 'perf script perl'
  106: perf script perl tests:
  --- start ---
  test child forked, pid 588206
  Testing event: sched:sched_switch
  perf script perl test [Skipped: failed to record sched:sched_switch]
  Testing event: task-clock
  Generating perl script...
  generated Perl script: /tmp/__perf_test_script.RpMn5.pl
  Executing perl script...
  perf script perl test [Success: task-clock triggered $VAR1]
  ---- end(0) ----
  106: perf script perl tests                                          : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf script: Allow the generated script to be a path

Allow the script generated by "perf script -g <language>" to be a file
path and the language determined by the file extension.

This is useful in testing so that the generated script file can be
written to a test directory.

Committer testing:

  $ perf record ls a.a
  ls: cannot access 'a.a': No such file or directory
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
  $ perf script -g python
  generated Python script: perf-script.py
  $ perf script -g myscript.py
  generated Python script: myscript.py
  $ diff -u perf-script.py myscript.py
  $ tail myscript.py
  def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict):
   print(get_dict_as_string(event_fields_dict))
   print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}')

  def print_header(event_name, cpu, secs, nsecs, pid, comm):
   print("%-20s %5u %05u.%09u %8u %-20s " % \
   (event_name, cpu, secs, nsecs, pid, comm), end="")

  def get_dict_as_string(a_dict, delimiter=' '):
   return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())])
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: perf data --to-ctf testing

If babeltrace is detected check that --to-ctf functions with a data
file and in pipe mode.

Committer testing:

  $ perf test 'perf data convert --to-ctf'
  124: 'perf data convert --to-ctf' command test                       : Ok
  $ perf test -vv 'perf data convert --to-ctf'
  124: 'perf data convert --to-ctf' command test:
  --- start ---
  test child forked, pid 556008
           libbabeltrace: [ on  ]  # HAVE_LIBBABELTRACE_SUPPORT
  Testing Perf Data Conversion Command to CTF (File input)
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.021 MB /tmp/__perf_test.perf.data.9TxzZ (115 samples) ]
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ]
  [ perf data convert: Converted and wrote 0.012 MB (115 samples) ]
  Perf Data Converter Command to CTF (File input) [SUCCESS]
  Testing Perf Data Conversion Command to CTF (Pipe mode)
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.047 MB - ]
  Failed to setup all events.
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ]
  [ perf data convert: Converted and wrote 0.000 MB (0 samples) ]
  Perf Data Converter Command to CTF (Pipe mode) [SUCCESS]
  Unexpected signal in main
  ---- end(0) ----
  124: 'perf data convert --to-ctf' command test                       : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: Test pipe mode with data conversion --to-json

Add pipe mode test for json data conversion. Tidy up exit and cleanup
code.

Committer testing:

  $ perf test 'perf data convert --to-json'
  124: 'perf data convert --to-json' command test                      : Ok
  $ perf test -vv 'perf data convert --to-json'
  124: 'perf data convert --to-json' command test:
  --- start ---
  test child forked, pid 548738
  Testing Perf Data Conversion Command to JSON
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB /tmp/__perf_test.perf.data.krxvl (104 samples) ]
  [ perf data convert: Converted '/tmp/__perf_test.perf.data.krxvl' into JSON data '/tmp/__perf_test.output.json.0z60p' ]
  [ perf data convert: Converted and wrote 0.075 MB (104 samples) ]
  Perf Data Converter Command to JSON [SUCCESS]
  Validating Perf Data Converted JSON file
  The file contains valid JSON format [SUCCESS]
  Testing Perf Data Conversion Command to JSON (Pipe mode)
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.046 MB - ]
  [ perf data convert: Converted '-' into JSON data '/tmp/__perf_test.output.json.0z60p' ]
  [ perf data convert: Converted and wrote 0.081 MB (110 samples) ]
  Perf Data Converter Command to JSON (Pipe mode) [SUCCESS]
  Validating Perf Data Converted JSON file
  The file contains valid JSON format [SUCCESS]
  ---- end(0) ----
  124: 'perf data convert --to-json' command test                      : Ok
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf json: Pipe mode --to-ctf support

In pipe mode the environment may not be fully initialized so be robust
to fields being NULL.

Add default handling of attr events, use the feature events to populate
the ctf writer environment.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf json: Pipe mode --to-json support

In pipe mode the environment may not be fully initialized so be robust
to fields being NULL. Add default handling of feature and attr events.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf check: Add libbabeltrace to the listed features

This enables scripts to more easily determine if `perf data --to-ctf`
is supported.

Committer testing:

  $ perf check feature libbabeltrace
         libbabeltrace: [ on  ]  # HAVE_LIBBABELTRACE_SUPPORT
  $ perf check feature -q libbabeltrace && echo have libbabeltrace support
  have libbabeltrace support
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS

Add support for EXTRA_BPF_FLAGS in the eBPF skeleton build, allowing
users to pass additional clang options such as --sysroot or custom
include paths when cross-compiling perf.

This is primarily intended for cross-build scenarios where the default
host include paths do not match the target kernel version.

Example usage:

make perf ARCH="arm64" EXTRA_BPF_FLAGS="--sysroot=..."

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: hupu <hupu.gm@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing

Namhyung suggested skipping only the rust tests when the code_with_type
'perf test' workload is not built into perf, do it so that we can
continue to test the C based workloads:

With rust:

  root@number:/# perf test -vv "data type"
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 2645245
  Basic Rust perf annotate test
  Basic annotate test [Success]
  Pipe Rust perf annotate test
  Pipe annotate test [Success]
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:/#

Without:

  root@number:/# perf test "data type"
   83: perf data type profiling tests                                  : Ok
  root@number:/# perf test -vv "data type"
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 2634759
  Basic Rust perf annotate test
  Skip: code_with_type workload not built in 'perf test'
  Pipe Rust perf annotate test
  Skip: code_with_type workload not built in 'perf test'
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:/#

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools build: Fix feature test for rust compiler

Currently a dummy rust code is compiled to detect if the rust feature
could be enabled. It turns out that in this case rust emits a dependency
file without any external references:

    /perf/feature/test-rust.d: test-rust.rs

    /perf/feature/test-rust.bin: test-rust.rs

    test-rust.rs:

This can lead to a situation, when rustc was removed after a successful build,
but the build process still thinks it's there and the feature is enabled on
subsequent runs.

Instead simply check the compiler presence to detect the feature, as
suggested by Arnaldo.

This way no actual test-rust.bin will be created, meaning the feature
check will not be cached and always performed. That's exactly what we
want, and the overhead of doing this every time is minimal.

Tested with multiple rounds of install/remove of the rust package.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf libunwind: Fix calls to thread__e_machine()

Add the missing 'e_flags' option to fix the build.

Fixes: 4e66527f8859a661 ("perf thread: Add optional e_flags output argument to thread__e_machine")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat: Add no-affinity flag

Add flag that disables affinity behavior.

Using sched_setaffinity() to place a perf thread on a CPU can avoid
certain interprocessor interrupts but may introduce a delay due to the
scheduling, particularly on loaded machines.

Add a command line option to disable the behavior.

This behavior is less present in other tools like `perf record`, as it
uses a ring buffer and doesn't make repeated system calls.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Reduce affinity use and move into iterator, fix no affinity

The evlist__for_each_cpu iterator will call sched_setaffitinity when
moving between CPUs to avoid IPIs.

If only 1 IPI is saved then this may be unprofitable as the delay to get
scheduled may be considerable.

This may be particularly true if reading an event group in `perf stat`
in interval mode.

Move the affinity handling completely into the iterator so that a single
evlist__use_affinity can determine whether CPU affinities will be used.

For `perf record` the change is minimal as the dummy event and the real
event will always make the use of affinities the thing to do.

In `perf stat`, tool events are ignored and affinities only used if >1
event on the same CPU occur.

Determining if affinities are useful is done by evlist__use_affinity
which tests per-event whether the event's PMU benefits from affinity use
- it is assumed only perf event using PMUs do.

Fix a bug where when there are no affinities that the CPU map iterator
may reference a CPU not present in the initial evsel. Fix by making the
iterator and non-iterator code common.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Missing TPEBS close in evlist__close()

The libperf evsel close won't close TPEBS events properly.

Add a test to do this. The libperf close routine is used in
evlist__close() for affinity reasons.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf evlist: Special map propagation for tool events that read on 1 CPU

Tool events like duration_time don't need a perf_cpu_map that contains
all online CPUs.

Having such a perf_cpu_map causes overheads when iterating between
events for CPU affinity.

During parsing mark events that just read on a single CPU map index as
such, then during map propagation set up the evsel's CPUs and thereby
the evlists accordingly.

The setting cannot be done early in parsing as user CPUs are only fully
known when evlist__create_maps is called.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel

The aggr value is setup to always be non-null creating a redundant
guard for reading from it. Switch to using the perf_stat_evsel (ps)
and narrow the scope of aggr so that it is known valid when used.

Fixes: 3d65f6445fd93e3e ("perf stat-shadow: Read tool events directly")
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Revert "perf tool_pmu: More accurately set the cpus for tool events"

This reverts commit d8d8a0b3603a9a8fa207cf9e4f292e81dc5d1008.

The setting of a user CPU map can cause an empty intersection when
combined with CPU 0 and the event removed. This later triggers a segv in
the stat-shadow logic. Let's put back a full online CPU map for now by
reverting this patch.

Closes: https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools build: Emit dependencies file for test-rust.bin

Test it first by having rust installed, then removing it and building again.

Fixes: 6a32fa5ccd33da5d ("tools build: Add a feature test for rust compiler")
Signed-off-by: Dmitry Dolgov <9erthalion6@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools build: Make test-rust.bin be removed by the 'clean' target

test-rust.bin is missing from the list of FILES, and thus is not removed by the
clean target. This could lead to a false feature detection, since the binary
stays there. Fix it.

Fixes: 6a32fa5ccd33da5d ("tools build: Add a feature test for rust compiler")
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: Fix test case perftool-testsuite_report for s390

Test case perftool-testsuite_report fails on s390 for some time
now.

Root cause is a time out which is too tight for large s390 machines.
The time out value addr2line_timeout_ms is per default set to 1 second.

This is the maximum time the function read_addr2line_record() waits for
a reply from the forked off tool addr2line, which is started as a child
in interactive mode.

It reads stdin (an address in hexadecimal) and replies on stdout with
function name, file name and line number. This might take more than one
second.

However one second is not always enough and the reply from addr2line
tool is not received. Function read_addr2line_record() fails and emits
a warning, which is not expected by the test case. It fails.

Output before:

# perf test -F 133
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data file
==================
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.087 MB \
/tmp/perftool-testsuite_report.FHz/perf_report/perf.data.1 \
(207 samples) ]
==================
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file
## [ PASS ] ## perf_report :: setup SUMMARY
-- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped
Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/
6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/
vmlinux: could not read first record"
Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/
6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/
vmlinux: could not read first record"
-- [ FAIL ] -- perf_report :: test_basic :: basic execution
(output regexp parsing)
....
133: perftool-testsuite_report : FAILED!

Output after:

# ./perf test -F 133
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data file
==================
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.087 MB \
/tmp/perftool-testsuite_report.Mlp/perf_report/perf.data.1
(188 samples) ]
==================
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file
## [ PASS ] ## perf_report :: setup SUMMARY
-- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped
-- [ PASS ] -- perf_report :: test_basic :: basic execution
-- [ PASS ] -- perf_report :: test_basic :: number of samples
-- [ PASS ] -- perf_report :: test_basic :: header
-- [ PASS ] -- perf_report :: test_basic :: header timestamp
-- [ PASS ] -- perf_report :: test_basic :: show CPU utilization
-- [ PASS ] -- perf_report :: test_basic :: pid
-- [ PASS ] -- perf_report :: test_basic :: non-existing symbol
-- [ PASS ] -- perf_report :: test_basic :: symbol filter
-- [ PASS ] -- perf_report :: test_basic :: latency header
-- [ PASS ] -- perf_report :: test_basic :: default report for latency profile
-- [ PASS ] -- perf_report :: test_basic :: latency report for latency profile
-- [ PASS ] -- perf_report :: test_basic :: parallelism histogram
## [ PASS ] ## perf_report :: test_basic SUMMARY
133: perftool-testsuite_report : Ok
#

Fixes: 257046a36750a6db ("perf srcline: Fallback between addr2line implementations")
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: linux-s390@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test code_with_type.sh: Skip test if rust wasn't available at build time

  $ perf test 'perf data type profiling tests'
   83: perf data type profiling tests                         : Skip
  $ perf test -vv 'perf data type profiling tests'
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 977213
  Skip: code_with_type workload not built in 'perf test'
  ---- end(-2) ----
   83: perf data type profiling tests                         : Skip
  $

Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tests workload: Formatting for code_with_type.rs

One part of the rust code for code_with_type workload wasn't properly
formatted.

Pass it through rustfmt to fix that.

Closes: https://lore.kernel.org/oe-kbuild-all/202602091357.oyRv6hgQ-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools build: Fix rust feature detection

Features in FEATURE_TESTS_BASIC will be set as being available if
test-all.c builds, so since the rust test isn't included in test-all.c,
we can't have 'rust' in there, remove it from FEATURE_TESTS_BASIC and
use feature-check so that it tries to build test-rust.bin, doing the
actual feature detection.

On a system lacking a rust compiler:

  Makefile.config:1158: Rust is not found. Test workloads with rust are disabled.

  Auto-detecting system features:
  ...                                   libdw: [ on  ]
  ...                                   glibc: [ on  ]
  ...                                  libelf: [ on  ]
  ...                                 libnuma: [ on  ]
  ...                  numa_num_possible_cpus: [ on  ]
  ...                               libpython: [ on  ]
  ...                             libcapstone: [ on  ]
  ...                               llvm-perf: [ on  ]
  ...                                    zlib: [ on  ]
  ...                                    lzma: [ on  ]
  ...                                     bpf: [ on  ]
  ...                                  libaio: [ on  ]
  ...                                 libzstd: [ on  ]
  ...                              libopenssl: [ on  ]
  ...                                    rust: [ OFF ]

  $ cat /tmp/build/perf-tools-next/feature/test-rust.make.output
  /bin/sh: line 1: rustc: command not found
  $ file /tmp/build/perf-tools-next/feature/test-rust.bin
  /tmp/build/perf-tools-next/feature/test-rust.bin: cannot open `/tmp/build/perf-tools-next/feature/test-rust.bin' (No such file or directory)
  $
  $ perf -vv | grep RUST
                  rust: [ OFF ]  # HAVE_RUST_SUPPORT
  $

And after installing it:

  ...                                    rust: [ on  ]

  $ cat /tmp/build/perf-tools-next/feature/test-rust.make.output
  $ file /tmp/build/perf-tools-next/feature/test-rust.bin
/tmp/build/perf-tools-next/feature/test-rust.bin: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9c416edf673ee3705b97bae893a99a6fcf1ee258, for GNU/Linux 3.2.0, with debug_info, not stripped
  $
  $ perf -vv | grep RUST
                  rust: [ on  ]  # HAVE_RUST_SUPPORT
  $

Fixes: 6a32fa5ccd33da5d ("tools build: Add a feature test for rust compiler")
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tests: Test annotate with data type profiling and C

Exercise the annotate command with data type profiling feature with C.

For that extend the existing data type profiling shell test to profile
the datasym workload, then annotate the result expecting to see some
data structures from the C code.

Committer testing:

  root@number:~# perf test 'perf data type profiling tests'
   83: perf data type profiling tests                                  : Ok
  root@number:~# perf test -vv 'perf data type profiling tests'
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 125028
  Basic Rust perf annotate test
  Basic annotate test [Success]
  Pipe Rust perf annotate test
  Pipe annotate test [Success]
  Basic C perf annotate test
  Basic annotate test [Success]
  Pipe C perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                                  : Ok
  root@number:~#

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tests: Test annotate with data type profiling and rust

Exercise the annotate command with data type profiling feature on the
rust runtime. For that add a new shell test, which will profile the
code_with_type workload, then annotate the result expecting to see some
data structures from the rust code.

Committer testing:

  root@number:~# perf test 'perf data type profiling tests'
   83: perf data type profiling tests                            : Ok
  root@number:~# perf test -v 'perf data type profiling tests'
   83: perf data type profiling tests                            : Ok
  root@number:~# perf test -vv 'perf data type profiling tests'
   83: perf data type profiling tests:
  --- start ---
  test child forked, pid 111044
  Basic perf annotate test
  Basic annotate test [Success]
  Pipe perf annotate test
  Pipe annotate test [Success]
  ---- end(0) ----
   83: perf data type profiling tests                            : Ok
  root@number:~#

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test workload: Add code_with_type test workload

The purpose of the workload is to gather samples of rust runtime. To
achieve that it has a dummy rust library linked with it.

Per recommendations for such scenarios [1], the rust library is
statically linked.

An example:

$ perf record perf test -w code_with_type
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.160 MB perf.data (4074 samples) ]

$ perf report --stdio --dso perf -s srcfile,srcline
    45.16%  ub_checks.rs       ub_checks.rs:72
     6.72%  code_with_type.rs  code_with_type.rs:15
     6.64%  range.rs           range.rs:767
     4.26%  code_with_type.rs  code_with_type.rs:21
     4.23%  range.rs           range.rs:0
     3.99%  code_with_type.rs  code_with_type.rs:16
    [...]

[1]: https://doc.rust-lang.org/reference/linkage.html#mixed-rust-and-foreign-codebases

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools build: Add a feature test for rust compiler

Add a feature test to identify if the rust compiler is available, so
that perf could build rust based worloads based on that.

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test parse-metric: Ensure aggregate counts appear to have run

Commit bb5a920b90991279 ("perf stat: Ensure metrics are displayed even
with failed events") with failed events") made it so that counters which
weren't enabled in the kernel were handled as NaN in metrics.

This caused the "Parse and process metrics" test to start failing as it
wasn't putting a non-zero value in these variables.

Add arbitrary values of 1 to fix the test.

Fixes: bb5a920b90991279 ("perf stat: Ensure metrics are displayed even with failed events")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test record.sh: Fix shellcheck warning

Add quotes to avoid the following warning:
```
In tests/shell/record.sh line 264:
[ $(uname -m) = "s390x" ] && {
^---------^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
```

Fixes: c73a56ed3c97ae65 ("perf test: Fix test case Leader sampling on s390")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf build: Reduce pmu-events related copying and mkdirs

When building to an output directory the previous code would remove
files and then copy the source files over.

Each source file copy would have a rule to make its directory. All JSON
for every architecture was considered a source file.

This led to unnecessary copying as a file would be deleted and then the
same file copied again, unnecessary directory making, and copying of
files not used in the build.

A side-effect would be a lot of build messages.

This change makes it so that all computed output files are created and
then compared to all files in the OUTPUT directory.

By filtering out the files that would be copied, unnecessary files can
be determined and then deleted - note, this is a phony target which
would remake the pmu-events.c if always depended upon, and so the
dependency is conditional on there being files to remove.

This has some overhead as the $(OUTPUT)/pmu-events is "find" over rather
than just "rm -fr", but the savings from unnecessary copying, etc.
should make up for this new make overhead.

The copy target just does copying but has a dependency on the directory
it needs being built, avoiding repetitive mkdirs.

The source files for copying only consider the JEVENTS_ARCH unless the
JEVENTS_ARCH is all.

The metric JSON is only generated if appropriate, rather than always
being generated and jevents.py deciding whether or not to use the files.

The mypy and pylint targets are fixed as variable names had changed but
the rules not updated.

The line count of a build with "make -C tools/perf O=/tmp/perf clean all"
prior to this change was 2181 lines, after this change it is 1596
lines.

This is a reduction of 585 lines or about 27%.

The generated pmu-events.c for JEVENTS_ARCH "x86" and "all" were
validated as being identical after this change.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf lock contention: fix segfault in `lock contention -b/--use-bpf`

When run on a kernel without BTF info, perf crashes:

    libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
    libbpf: failed to find valid kernel BTF

    Program received signal SIGSEGV, Segmentation fault.
    0x00005555556915b7 in btf.type_cnt ()
    (gdb) bt
    #0  0x00005555556915b7 in btf.type_cnt ()
    #1  0x0000555555691fbc in btf_find_by_name_kind ()
    #2  0x00005555556920d0 in btf.find_by_name_kind ()
    #3  0x00005555558a1b7c in init_numa_data (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:125
    #4  0x00005555558a264b in lock_contention_prepare (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:313
    #5  0x0000555555620702 in __cmd_contention (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2084
    #6  0x0000555555622c8d in cmd_lock (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2755
    #7  0x0000555555651451 in run_builtin (p=0x555556104f00 <commands+576>, argc=3, argv=0x7fffffffea10)
        at perf.c:349
    #8  0x00005555556516ed in handle_internal_command (argc=3, argv=0x7fffffffea10) at perf.c:401
    #9  0x000055555565184e in run_argv (argcp=0x7fffffffe7fc, argv=0x7fffffffe7f0) at perf.c:445
    #10 0x0000555555651b9f in main (argc=3, argv=0x7fffffffea10) at perf.c:553

Check if btf loading failed, and don't do anything with it in
init_numa_data(). This leads to the following error message, instead of
just a crash:

    libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
    libbpf: failed to find valid kernel BTF
    libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
    libbpf: failed to find valid kernel BTF
    libbpf: Error loading vmlinux BTF: -ESRCH
    libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH
    Failed to load lock-contention BPF skeleton
    lock contention BPF setup failed

Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sort: Replace static cacheline size with sysconf cacheline size

Testing:
- Built perf
- Executed perf mem record and report

Committer notes:

This addresses a TODO and improves the situation where record and
report/c2c are performed on the same machine or in machines with the
same cacheline size, but the proper way is to store the cacheline size
in the perf.data header at 'record' time and then use it at post
processing time.

Signed-off-by: Ricky Ringler <ricky.ringler@proton.me>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20260129004223.26799-1-ricky.ringler@proton.me
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: Fix test case Leader sampling on s390

The subtest 'Leader sampling' some time fails on s390.
- for z/VM guest: Disable the test for z/VM guest. There is no
  CPU Measurement facility to run the test successfully.
- for LPAR: Use correct event names.

A detailed analysis follows here:
Now to the debugging and investigation:
1. With command
       perf record -e '{cycles,cycles}:S' -- ....
   the first cycles event starts sampling.
   On s390 this sets up sampling with a frequency of 4000 Hz.
   This translates to hardware sample rate of 1377000 instructions per
   micro-second to meet a frequency of 4000 HZ.

2. With first event cycles now sampling into a hardware buffer, an
   interrupt is triggered each time a sampling buffer gets full.
   The interrupt handler is then invoked and debug output shows the
   processing of samples.  The size of one hardware sample is 32 bytes.
   With an interrupt triggered when the hardware buffer page of 4KB
   gets full, the interrupt handler processes 128 samples.
   (This is taken from s390 specific fast debug data gathering)
   2025-11-07 14:35:51.977248  000003ffe013cbfa \
   perf_event_count_update event->count 0x0 count 0x1502e8
   2025-11-07 14:35:51.977248  000003ffe013cbfa \
   perf_event_count_update event->count 0x1502e8 count 0x1502e8
   2025-11-07 14:35:51.977248  000003ffe013cbfa \
   perf_event_count_update event->count 0x2a05d0 count 0x1502e8
   2025-11-07 14:35:51.977252  000003ffe013cbfa \
   perf_event_count_update event->count 0x3f08b8 count 0x1502e8
   2025-11-07 14:35:51.977252  000003ffe013cbfa \
   perf_event_count_update event->count 0x540ba0 count 0x1502e8
   2025-11-07 14:35:51.977253  000003ffe013cbfa \
   perf_event_count_update event->count 0x690e88 count 0x1502e8
   2025-11-07 14:35:51.977254  000003ffe013cbfa \
   perf_event_count_update event->count 0x7e1170 count 0x1502e8
   2025-11-07 14:35:51.977254  000003ffe013cbfa \
   perf_event_count_update event->count 0x931458 count 0x1502e8
   2025-11-07 14:35:51.977254  000003ffe013cbfa \
   perf_event_count_update event->count 0xa81740 count 0x1502e8

3. The value is constantly increasing by the number of instructions
   executed to generate a sample entry.  This is the first line of the
   pairs of lines. count 0x1502e8 --> 1377000

   # perf script | grep 1377000 | wc -l
   214
   # perf script | wc -l
   428
   #
   That is 428 lines in total, and half of the lines contain value
   1377000.

4. The second event cycles is opened against the counting PMU, which
   is an independent PMU and is not interrupt driven.  Once enabled it
   runs in the background and keeps running, incrementing silently
   about 400+ counters. The counter values are read via assembly
   instructions.

   This second counter PMU's read call back function is called when the
   interrupt handler of the sampling facility processes each sample. The
   function call sequence is:

   perf_event_overflow()
   +--> __perf_event_overflow()
        +--> __perf_event_output()
               +--> perf_output_sample()
                    +--> perf_output_read()
                         +--> perf_output_read_group()
                          for_each_sibling_event(sub, leader) {
values[n++] = perf_event_count(sub, self);
printk("%s sub %p values %#lx\n", __func__, sub, values[n-1]);
          }

   The last function perf_event_count() is invoked on the second event
   cylces *on* the counting PMU. An added printk statement shows the
   following lines in the dmesg output:

   # dmesg|grep perf_output_read_group |head -10
   [  332.368620] perf_output_read_group sub 00000000d80b7c1f values 0x3a80917 (1)
   [  332.368624] perf_output_read_group sub 00000000d80b7c1f values 0x3a86c7f (2)
   [  332.368627] perf_output_read_group sub 00000000d80b7c1f values 0x3a89c15 (3)
   [  332.368629] perf_output_read_group sub 00000000d80b7c1f values 0x3a8c895 (4)
   [  332.368631] perf_output_read_group sub 00000000d80b7c1f values 0x3a8f569 (5)
   [  332.368633] perf_output_read_group sub 00000000d80b7c1f values 0x3a9204b
   [  332.368635] perf_output_read_group sub 00000000d80b7c1f values 0x3a94790
   [  332.368637] perf_output_read_group sub 00000000d80b7c1f values 0x3a9704b
   [  332.368638] perf_output_read_group sub 00000000d80b7c1f values 0x3a99888
   #

   This correlates with the output of
   # perf report -D | grep 'id 00000000000000'|head -10
   ..... id 0000000000000006, value 00000000001502e8, lost 0
   ..... id 000000000000000e, value 0000000003a80917, lost 0 --> line (1) above
   ..... id 0000000000000006, value 00000000002a05d0, lost 0
   ..... id 000000000000000e, value 0000000003a86c7f, lost 0 --> line (2) above
   ..... id 0000000000000006, value 00000000003f08b8, lost 0
   ..... id 000000000000000e, value 0000000003a89c15, lost 0 --> line (3) above
   ..... id 0000000000000006, value 0000000000540ba0, lost 0
   ..... id 000000000000000e, value 0000000003a8c895, lost 0 --> line (4) above
   ..... id 0000000000000006, value 0000000000690e88, lost 0
   ..... id 000000000000000e, value 0000000003a8f569, lost 0 --> line (5) above

Summary:
- Above command starts the CPU sampling facility, with runs interrupt
  driven when a 4KB page is full. An interrupt processes the 128 samples
  and calls eventually perf_output_read_group() for each sample to save it
  in the event's ring buffer.

- At that time the CPU counting facility is invoked to read the value of
  the event cycles. This value is saved as the second value in the
  sample_read structure.

- The first and odd lines in the perf script output displays the period
  value between 2 samples being created by hardware. It is the number
  of instructions executes before the hardware writes a sample.

- The second and even lines in the perf script output displays the number
  of CPU cycles needed to process each sample and save it in the event's
  ring buffer.
These 2 different values can never be identical on s390.

Since event leader sampling is not possible on s390 the perf tool will
return EOPNOTSUPP soon. Perpare the test case for that.

Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Jan Polensky <japo@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf annotate: Fix register usage in data type profiling

On data type profiling, it tried to match register name with a partial
string. For example, it allowed to match with "%rbp)" or "%rdi,8)".

But with recent change in the area, it doesn't match anymore and break
the data type profiling.

Let's pass the correct register name by removing the unwanted part.

Add arch__dwarf_regnum() to handle it in a single place.

Closes: 7d3n23li6drroxrdlpxn7ixehdeszkjdftah3zyngjl2qs22ef@yelcjv53v42o
Reported-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat: Ensure metrics are displayed even with failed events

Currently, `perf stat` skips or hides metrics when the underlying
hardware events cannot be counted (e.g., due to insufficient permissions
or unsupported events).

In `--metric-only` mode, this often results in missing columns or blank
spaces, making the output difficult to parse.

Modify the logic to ensure metrics are consistently displayed by
propagating NAN (Not a Number) through the expression evaluator.
Specifically:

1. Update `prepare_metric()` in stat-shadow.c to treat uncounted events
   (where `run == 0`) as NAN. This leverages the existing math in expr.y
   to propagate NAN through metric expressions.

2. Remove the early return in the display logic's `printout()` function
   that was previously skipping metrics in `--metric-only` mode for
   failed events.
l
3. Simplify `perf_stat__skip_metric_event()` to no longer depend on
   event runtime.

Tested:

1. `perf all metrics test` did not crash while paranoid is 2.

2. Multiple combinations with `CPUs_utilized` while paranoid is 2.

  $ ./perf stat -M CPUs_utilized -a -- sleep 1

   Performance counter stats for 'system wide':

     <not supported> msec cpu-clock:u                      #      nan CPUs  CPUs_utilized
       1,006,356,120      duration_time

         1.004375550 seconds time elapsed

  $ ./perf stat -M CPUs_utilized -a -j -- sleep 1
  {"counter-value" : "<not supported>", "unit" : "msec", "event" : "cpu-clock:u", "event-runtime" : 0, "pcnt-running" : 100.00, "metric-value" : "nan", "metric-unit" : "CPUs  CPUs_utilized"}
  {"counter-value" : "1006642462.000000", "unit" : "", "event" : "duration_time", "event-runtime" : 1, "pcnt-running" : 100.00}

  $ ./perf stat -M CPUs_utilized -a --metric-only -- sleep 1

   Performance counter stats for 'system wide':

    CPUs  CPUs_utilized
                      nan

         1.004424652 seconds time elapsed

  $ ./perf stat -M CPUs_utilized -a --metric-only -j -- sleep 1
  {"CPUs  CPUs_utilized" : "none"}

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test addr2line_inlines: Ensure inline information shows on LBR leaves

Expand the addr2line inline function testing to also run for an LBR
callchain, skipping if LBR support isn't present.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf callchain lbr: Make the leaf IP that of the sample

The current IP of a leaf function when reported from a perf record with
"--call-graph lbr" is the "to" field of the LBR branch stack record.

The sample for the event being recorded may be further into the function
and there may be inlining information associated with it.

Rather than use the branch stack "to" field in this case switch to the
callchain appending the sample->ip and thereby allowing the inline
information to show.

Before this change:
```
$ perf record --call-graph lbr perf test -w inlineloop
...
$ perf script --fields +srcline
...
perf-inlineloop  467586  4649.344493:     950905 cpu_core/cycles/P:
           55dfda2829c0 parent+0x0 (perf)
inlineloop.c:31
           55dfda282a96 inlineloop+0x86 (perf)
inlineloop.c:47
           55dfda236420 run_workload+0x59 (perf)
builtin-test.c:715
           55dfda236b03 cmd_test+0x413 (perf)
builtin-test.c:825
...
```

After this change:
```
$ perf record --call-graph lbr perf test -w inlineloop
...
$ perf script --fields +srcline
...
perf-inlineloop  529703 11878.680815:     950905 cpu_core/cycles/P:
            555ce86be9e6 leaf+0x26
  inlineloop.c:20 (inlined)
            555ce86be9e6 middle+0x26
  inlineloop.c:27 (inlined)
            555ce86be9e6 parent+0x26 (perf)
  inlineloop.c:32
            555ce86bea96 inlineloop+0x86 (perf)
  inlineloop.c:47
            555ce8672420 run_workload+0x59 (perf)
  builtin-test.c:715
            555ce8672b03 cmd_test+0x413 (perf)
  builtin-test.c:825
...
```

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf kvm stat: Fix build error

Since commit ceea279f9376 ("perf kvm stat: Remove use of the arch
directory"), a native build on Arm64 machine reports:

  util/kvm-stat-arch/kvm-stat-x86.c:7:10: fatal error: asm/svm.h: No such file or directory
    7 | #include <asm/svm.h>
      |          ^~~~~~~~~~~
  compilation terminated.

The build fails to find x86's asm headers when building for Arm64.  Fix
this by including asm headers with relative path instead.

Fixes: ceea279f9376 ("perf kvm stat: Remove use of the arch directory")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20260206-perf_fix_kvm_stat_error-v1-1-ad40115876be@arm.com
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf regs: Remove __weak attributive arch_sdt_arg_parse_op() function

In line with the previous patch, the __weak arch_sdt_arg_parse_op()
function is removed.

Architectural-specific implementations in the arch/ directory are now
converted into sub-functions within the util/perf-regs-arch/ directory.

The perf_sdt_arg_parse_op() function will call these sub-functions based
on the EM_HOST.

This change enables cross-architecture calls to arch_sdt_arg_parse_op().

No functional changes are intended.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
[ Fixed up somme fuzz with powerpc and x86 Build files wrt removing perf_regs.o ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf regs: Remove __weak attributive arch__xxx_reg_mask() functions

Currently, some architecture-specific perf-regs functions, such as
arch__intr_reg_mask() and arch__user_reg_mask(), are defined with the
__weak attribute.

This approach ensures that only functions matching the architecture of
the build/run host are compiled and executed, reducing build time and
binary size.

However, this __weak attribute restricts these functions to be called
only on the same architecture, preventing cross-architecture
functionality.

For example, a perf.data file captured on x86 cannot be parsed on an ARM
platform.

To address this limitation, this patch removes the __weak attribute from
these perf-regs functions.

The architecture-specific code is moved from the arch/ directory to the
util/perf-regs-arch/ directory.

The appropriate architectural functions are then called based on the
EM_HOST.

No functional changes are intended.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
[ Fixed up somme fuzz with s390 and riscv Build files wrt removing perf_regs.o ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf arch: Update arch headers to use relative UAPI paths

The architectural specific headers perf_regs.h currently rely on the
host architecture's 'asm/perf_regs.h'.

This can lead to compilation inconsistencies or failures when including
and building perf for a target architecture that differs from the host's
architecture.

Explicitly point to the UAPI headers within the tools source tree using
relative paths.

This ensures that perf is always built against the intended
architecture.

No functional changes are intended.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf regs: Fix abort for "-I" or "--user-regs" options

Fix an issue where the `perf` tool aborts unexpectedly when running the
following command:

```
perf record -e cycles -I -- true

Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
    sample selected machine registers on interrupt, use '-I?' to list register names
```

The usage of the `-I` or `--user-regs` options without specifying any
registers should default to sampling all general-purpose registers.

However, this currently causes an abnormal termination.

The issue was introduced by commit 3d06db9bad1a ("perf regs: Refactor
use of arch__sample_reg_masks() to perf_reg_name()").

This patch resolves the problem, ensuring that the `-I` or `--user-regs`
options work as intended without causing an abort.

Fixes: 3d06db9bad1ad8e6 ("perf regs: Refactor use of arch__sample_reg_masks() to perf_reg_name()")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf metricgroup: Don't early exit if no CPUID table exists

The failure to find a table of metrics with a CPUID shouldn't early
exit as the metric code will now also consider the default table.

When searching for a metric or metric group,
pmu_metrics_table__for_each_metric() considers all tables and so the
caller doesn't need to switch the table to do this.

Fixes: c7adeb0974f18da4 ("perf jevents: Add set of common metrics based on default ones")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tests: build-test coverage for NO_JEVENTS=1

Leo reported 'perf stat' being broken and this highlighted that the
'make NO_JEVENTS=1' variant is missing from 'make -C tools/perf
build-test', add it.

Closes: https://lore.kernel.org/linux-perf-users/20260205175250.GC3529712@e132581.arm.com/
Reported-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tests: Additional 'perf stat' tests

Recently 'perf stat' regressed in per CPU mode [1].

Let's expand test coverage to catch the same breakage again as well as
to test the repeat, pid, detailed and no aggregation options.

[1] https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf record: Make logs more readable for event open failures

Since commit ee27476fa3004f83 ("perf record: Skip don't fail for events
that don't open"), if a user does not have permission to access a PMU
event, perf reports:

  perf record -e cs_etm// -C 3 -- ls
  Error:
  Failure to open event 'cs_etm//u' on PMU 'cs_etm' which will be removed.
  No fallback found for 'cs_etm//u' for error 13
  Error:
  Failure to open event 'dummy:u' on PMU 'software' which will be removed.
  No fallback found for 'dummy:u' for error 13
  Error:
  Failure to open any events for recording.

The log is not very helpful, as no clear indication of what "error 13"
means or how to address the issue.

This commit restores evsel__open_strerror() to generate a readable error
message and print it out:

  perf record -e cs_etm// -C 3 -- ls
  Error:
  Failure to open event 'cs_etm//' on PMU 'cs_etm' which will be removed.
  Access to performance monitoring and observability operations is limited.
  Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
  access to performance monitoring and observability operations for processes
  without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
  More information can be found at 'Perf events and tool security' document:
  https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
  perf_event_paranoid setting is 1:
    -1: Allow use of (almost) all events by all users
        Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
  >= 0: Disallow raw and ftrace function tracepoint access
  >= 1: Disallow CPU event access
  >= 2: Disallow kernel profiling
  To make the adjusted perf_event_paranoid setting permanent preserve it
  in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
  Error:
  Failure to open event 'dummy:u' on PMU 'software' which will be removed.
  Access to performance monitoring and observability operations is limited.
  Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
  access to performance monitoring and observability operations for processes
  without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
  More information can be found at 'Perf events and tool security' document:
  https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
  perf_event_paranoid setting is 1:
    -1: Allow use of (almost) all events by all users
        Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
  >= 0: Disallow raw and ftrace function tracepoint access
  >= 1: Disallow CPU event access
  >= 2: Disallow kernel profiling
  To make the adjusted perf_event_paranoid setting permanent preserve it
  in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
  Error:
  Failure to open any events for recording.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf thread: Don't require machine to compute the e_machine

The machine can be calculated from a thread via its maps.

Don't require the machine argument to simplify callers and also to delay
computing the machine until a little later.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Add e_machine/e_flags to the header

Add 64-bits of feature data to record the ELF machine and flags.

This allows readers to initialize based on the data.

For example, `perf kvm stat` wants to initialize based on the kind of
data to be read, but at initialization time there are no threads to base
this data upon and using the host means cross platform support won't
work.

The values in the perf_env also act as a cache for these within the
session.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Add e_flags to the e_machine helper

Allow e_flags as well as e_machine to be computed using the e_machine
helper.

This isn't currently used, the argument is always NULL, but it will be
used for a new header feature.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf kvm: Wire up e_machine

Pass the e_machine to the kvm functions so that they aren't just wired
to EM_HOST.

In the case of a session move some setup until the session
is created.

As the session isn't fully running the default EM_HOST is returned as no
e_machine can be found in a running machine.

This is, however, some marginal progress to cross platform support.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf kvm stat: Remove use of the arch directory

`perf kvm stat` supports record and report options.

By using the arch directory a report for a different machine type cannot
be supported.

Move the kvm-stat code out of the arch directory and into
util/kvm-stat-arch following the pattern of perf-regs and dwarf-regs.

Avoid duplicate symbols by renaming functions to have the architecture
name within them.

For global variables, wrap them in an architecture specific function.
Selecting the architecture to use with `perf kvm stat` is selected by
EM_HOST, ie no different than before the change.

Later the ELF machine can be determined from the session or a header
feature (ie EM_HOST at the time of the record).

The build and #define HAVE_KVM_STAT_SUPPORT is now redundant so remove
across Makefiles and in the build.

Opportunistically constify architectural structs and arrays.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

libperf build: Always place libperf includes first

When building tools/perf the CFLAGS can contain a directory for the
installed headers.

As the headers may be being installed while building libperf.a this can
cause headers to be partially installed and found in the include path
while building an object file for libperf.a.

The installed header may reference other installed headers that are
missing given the partial nature of the install and then the build fails
with a missing header file.

Avoid this by ensuring the libperf source headers are always first in
the CFLAGS.

Fixes: 3143504918105156 ("libperf: Make libperf.a part of the perf build")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test kvm: Add stat live testing

Ensure the `perf kvm stat live -p ..` has some basic functionality.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf capstone: Support for dlopen-ing libcapstone.so

If perf is built with LIBCAPSTONE_DLOPEN=1, support dlopen-ing
libcapstone.so and then calling the necessary functions by looking them
up using dlsym.

The types come from capstone.h which means the libcapstone feature check
needs to pass, and NO_CAPSTONE=1 hasn't been defined. This will cause
the definition of HAVE_LIBCAPSTONE_SUPPORT.

Earlier versions of this code tried to declare the necessary
capstone.h constants and structs, but they weren't stable and caused
breakages across libcapstone releases.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf build: Remove NO_LIBCAP that controls nothing

Using libcap was removed in commit e25ebda78e230283 ("perf cap: Tidy up
and improve capability testing") and improve capability testing"),
however, some build documentation and a use of the NO_LIBCAP=1 were
lingering.

Remove these left over bits.

Fixes: e25ebda78e230283 ("perf cap: Tidy up and improve capability testing")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Validate that all names given an Event

Validate they exist in a JSON file from one directory found from one
directory above the model's JSON directory.

This avoids broken fallback encodings being created.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add cycles breakdown metric for arm64/AMD/Intel

Breakdown cycles to user, kernel and guest. Add a common_metrics.py
file for such metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add mesh bandwidth saturation metric for Intel

Memory bandwidth saturation from CBOX/CHA events present in
broadwellde, broadwellx, cascadelakex, haswellx, icelakex, skylakex
and snowridgex.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add upi_bw metric for Intel

Break down UPI read and write bandwidth using uncore_upi counters.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add local/remote miss latency metrics for Intel

Derive from CBOX/CHA occupancy and inserts the average latency as is
provided in Intel's uncore performance monitoring reference.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add C-State metrics from the PCU PMU for Intel

Use occupancy events fixed in:

https://lore.kernel.org/lkml/20240226201517.3540187-1-irogers@google.com/

Metrics are at the socket level referring to cores, not hyperthreads.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add dir breakdown metrics for Intel

Breakdown directory hit, misses and requests. The implementation uses
the M2M and CHA PMUs present in server models broadwellde, broadwellx
cascadelakex, emeraldrapids, icelakex, sapphirerapids and skylakex.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add local/remote "mem" breakdown metrics for Intel

Breakdown local and remote memory bandwidth, read and writes.

The implementation uses the HA and CHA PMUs present in server models
broadwellde, broadwellx cascadelakex, emeraldrapids, haswellx, icelakex,
ivytown, sapphirerapids and skylakex.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add mem_bw metric for Intel

Break down memory bandwidth using uncore counters. For many models
this matches the memory_bandwidth_* metrics, but these metrics aren't
made available on all models.

Add support for free running counters. Query the event JSON when
determining which what events/counters are available.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add Miss Level Parallelism (MLP) metric for Intel

Number of outstanding load misses per cycle.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add FPU metrics for Intel

Metrics break down of floating point operations.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add context switch metrics for Intel

Metrics break down context switches for different kinds of
instruction.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add ILP metrics for Intel

Use the counter mask (cmask) to see how many cycles an instruction
takes to retire. Present as a set of ILP metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add load store breakdown metrics ldst for Intel

Give breakdown of number of instructions. Use the counter mask (cmask)
to show the number of cycles taken to retire the instructions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add L2 metrics for Intel

Give a breakdown of various L2 counters as metrics, including totals,
reads, hardware prefetcher, RFO, code and evictions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add ports metric group giving utilization on Intel

The ports metric group contains a metric for each port giving its
utilization as a ratio of cycles.

The metrics are created by looking for UOPS_DISPATCHED.PORT events.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add software prefetch (swpf) metric group for Intel

Add metrics that breakdown software prefetch instruction use.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add br metric group for branch statistics on Intel

The br metric group for branches itself comprises metric groups for
total, taken, conditional, fused and far metric groups using JSON
events.

Conditional taken and not taken metrics are specific to Icelake and
later generations, so the presence of the event is used to determine
whether the metric should exist.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add tsx metric group for Intel models

Allow duplicated metric to be dropped from JSON files. Detect when TSX
is supported by a model by using the JSON events, use sysfs events at
runtime as hypervisors, etc. may disable TSX.

Add CheckPmu to metric to determine if which PMUs have been associated
with the loaded events.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Mark metrics with experimental events as experimental

When metrics are made with experimental events it is desirable the
metric description also carries this information in case of metric
inaccuracies.

Suggested-by: Perry Taylor <perry.taylor@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add smi metric group for Intel models

Allow duplicated metric to be dropped from JSON files.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add CheckPmu to see if a PMU is in loaded JSON events

CheckPmu can be used to determine if hybrid events are present,
allowing for hybrid conditional metrics/events/pmus to be premised on
the JSON files rather than hard coded tables.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add idle metric for Intel models

Compute using the msr PMU the percentage of wallclock cycles where the
CPUs are in a low power state.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add RAPL metrics for all Intel models

Add a 'cpu_power' metric group that computes the power consumption
from RAPL events if they are present.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add context switch metrics for AMD

Metrics break down context switches for different kinds of
instruction.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add load store breakdown metrics ldst for AMD

Give breakdown of number of instructions. Use the counter mask (cmask)
to show the number of cycles taken to retire the instructions.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add uncore l3 metric group for AMD

Metrics use the amd_l3 PMU for access/miss/hit information.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add dtlb metric group for AMD

Add metrics that give an overview and details of the dtlb (zen1, zen2,
zen3).

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add itlb metric group for AMD

Add metrics that give an overview and details of the l1 itlb (zen1,
zen2, zen3) and l2 itlb (all zens).

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add br metric group for branch statistics on AMD

The br metric group for branches itself comprises metric groups for
total, taken, conditional, fused and far metric groups using JSON
events.

The lack of conditional events on anything but zen2 means this category
is lacking on zen1, zen3 and zen4.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add upc metric for uops per cycle for AMD

The metric adjusts for whether or not SMT is on.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add idle metric for AMD zen models

Compute using the MSR PMU the percentage of wallclock cycles where the
CPUs are in a low power state.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add RAPL event metric for AMD zen models

Add power per second metrics based on RAPL.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Add load event JSON to verify and allow fallbacks

Add a LoadEvents function that loads all event JSON files in a
directory.

In the Event constructor ensure all events are defined in the event JSON
except for legacy events like "cycles".

If the initial event isn't found then legacy_event1 is used, and if that
isn't found legacy_event2 is used.

This allows a single Event to have multiple event names as models will
often rename the same event over time. If the event doesn't exist an
exception is raised.

So that references to metrics can be added, add the MetricRef
class. This doesn't validate as an event name and so provides an
escape hatch for metrics to refer to each other.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Build support for generating metrics from python

Generate extra-metrics.json and extra-metricgroups.json from python
architecture specific scripts. The metrics themselves will be added in
later patches.

If a build takes place in tools/perf/ then extra-metrics.json and
extra-metricgroups.json are generated in that directory and so added
to .gitignore.

If there is an OUTPUT directory then the tools/perf/pmu-events/arch
files are copied to it so the generated extra-metrics.json and
extra-metricgroups.json can be added/generated there.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf: Remove redundant kernel.h include

Now that the bitfield dependency is resolved, the explicit inclusion of
kernel.h is no longer needed.

Remove the redundant include.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools: Fix bitfield dependency failure

A perf build failure was reported by Thomas Voegtle on stable kernel
v6.6.120:

    CC      tests/sample-parsing.o
    CC      util/intel-pt-decoder/intel-pt-pkt-decoder.o
    CC      util/perf-regs-arch/perf_regs_csky.o
    CC      util/arm-spe-decoder/arm-spe-pkt-decoder.o
    CC      util/perf-regs-arch/perf_regs_loongarch.o
  In file included from util/arm-spe-decoder/arm-spe-pkt-decoder.h:10,
                   from util/arm-spe-decoder/arm-spe-pkt-decoder.c:14:
  /local/git/linux-stable-rc/tools/include/linux/bitfield.h: In function ‘le16_encode_bits’:
  /local/git/linux-stable-rc/tools/include/linux/bitfield.h:166:31: error: implicit declaration of
  function ‘cpu_to_le16’; did you mean ‘htole16’? [-Werror=implicit-function-declaration]
    ____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu) \
                                 ^~~~~~~~~
  /local/git/linux-stable-rc/tools/include/linux/bitfield.h:149:9: note: in definition of macro
  ‘____MAKE_OP’
    return to((v & field_mask(field)) * field_multiplier(field)); \
           ^~
  /local/git/linux-stable-rc/tools/include/linux/bitfield.h:170:1: note: in expansion of macro
  ‘__MAKE_OP’
   __MAKE_OP(16)

Fix this by including linux/kernel.h, which provides the required
definitions.

The issue was not found on the mainline due to the relevant C files have
included kernel.h.  It'd be good to merge this change on mainline
as well for robustness.

Closes: https://lore.kernel.org/stable/3a44500b-d7c8-179f-61f6-e51cb50d3512@lio96.de/
Fixes: 64d86c03e1441742 ("perf arm-spe: Extend branch operations")
Reported-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
To: Sasha Levin <sashal@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched stats: Fixes in man page

Fix the incorrect description of the schedstats report. Also fix the
spelling errors in man page.

Fixes: 800af362d68945e5 ("perf sched stats: Add details in man page")
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched stats: Define macro for SEP_LEN

Define a macro for separator length of the line in perf sched stats
report.

Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched stats: correct spelling of function name

Replace store_schedtstat_cpu_diff() with store_schedstat_cpu_diff()

Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched stats: Add NULL check for cd_map

In perf_sched__schedstat_live(), build_cpu_domain_map() returns the
pointer to cpu_domain_map which can also be NULL.

Add NULL check for the same to avoid NULL pointer dereference.

Fixes: 00093b3133984ffe ("perf sched stats: Add support for live mode")
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf util: Fix NULL check in cpumask_to_cpulist()

The function cpumask_to_cpulist() allocates memory with calloc() and
stores the result in 'bm', but then incorrectly checks 'cpumask' for
NULL instead of 'bm'.

This means that if the allocation fails, the function will dereference a
NULL pointer when trying to access 'bm'.

Fix the check to test the correct variable 'bm'.

Fixes: d40c68a49f69c9bd ("perf header: Support CPU DOMAIN relation info")
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Replace hardcoded max cpus by MAX_NR_CPUS

cpumask and cpulist from cpu-domain header have hardcoded max_cpus value
of 1024.

Current systems have more cpus than this value. Replace it with
MAX_NR_CPUS.

Also define a macro to represent domain name length.

Fixes: d40c68a49f69c9bd ("perf header: Support CPU DOMAIN relation info")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf jevents: Handle deleted JSONS in out of source builds

Make the source folders a dependency for the generated folder root so
that whenever a file is deleted from the source it will force a new
fresh copy of all the JSON files and avoid stale deleted files.

JSON_DIRS_OUTPUT_ROOT needs to be a dependency of LEGACY_CACHE_JSON so
that the root folder doesn't get cleaned after the legacy JSON is
generated. But this is a no-op with in-source builds as
JSON_DIRS_OUTPUT_ROOT is unset.

JSON_DIRS is added as a dependency of PMU_EVENTS_C which also forces a
re-build for in source builds when JSON files are deleted. This could
have also resulted in stale builds, but never a broken one.

Closes: https://lore.kernel.org/linux-next/aW5XSAo88_LBPSYI@sirena.org.uk/
Fixes: 4bb55de4ff03db3e ("perf jevents: Support copying the source json files to OUTPUT")
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>