linux-2.6-microblaze.git
8 weeks agoperf docs: Document cross compilation
Leo Yan [Wed, 17 Jul 2024 08:22:11 +0000 (09:22 +0100)]
perf docs: Document cross compilation

Records the commands for cross compilation with two methods.

The first method relies on Multiarch. The second approach is to explicitly
specify the PKG_CONFIG variables, which is widely used in build system
(like Buildroot, Yocto, etc).

Co-developed-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-7-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 weeks agoperf: build: Link lib 'zstd' for static build
Leo Yan [Wed, 17 Jul 2024 08:22:10 +0000 (09:22 +0100)]
perf: build: Link lib 'zstd' for static build

When build static perf, Makefile reports the error:

  Makefile.config:480: No libdw DWARF unwind found, Please install
  elfutils-devel/libdw-dev >= 0.158 and/or set LIBDW_DIR

The libdw has been installed on the system, but the build system fails
to build the feature detecting binary 'test-libdw-dwarf-unwind'. The
failure is caused by missing to link the lib 'zstd'.

Link lib 'zstd' for the static build, in the end, the dwarf feature can
be enabled in the static perf.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-6-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 weeks agoperf: build: Link lib 'lzma' for static build
Leo Yan [Wed, 17 Jul 2024 08:22:09 +0000 (09:22 +0100)]
perf: build: Link lib 'lzma' for static build

The libunwind feature test failed with the static linkage. This is due
to the 'lzma' lib is missed, so link it to dismiss building failure.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-5-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 weeks agoperf: build: Only link libebl.a for old libdw
Leo Yan [Wed, 17 Jul 2024 08:22:08 +0000 (09:22 +0100)]
perf: build: Only link libebl.a for old libdw

Since libdw version 0.177, elfutils has merged libebl.a into libdw (see
the commit "libebl: Don't install libebl.a, libebl.h and remove backends
from spec." in the elfutils repository).

As a result, libebl.a does not exist on Debian Bullseye and newer
releases, causing static perf builds to fail on these distributions.

This commit checks the libdw version and only links libebl.a if it
detects that the libdw version is older than 0.177.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-4-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 weeks agoperf: build: Set Python configuration for cross compilation
Leo Yan [Wed, 17 Jul 2024 08:22:07 +0000 (09:22 +0100)]
perf: build: Set Python configuration for cross compilation

Python configuration has dedicated folders for different architectures.
For example, Python 3.11 has two folders as shown below, one for Arm64
and another for x86_64:

  /usr/lib/python3.11/config-3.11-aarch64-linux-gnu/
  /usr/lib/python3.11/config-3.11-x86_64-linux-gnu/

This commit updates the Python configuration path based on the
compiler's machine type, guiding the compiler to find the correct path
for Python libraries. It also renames the generated .so file name to
match the machine name.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-3-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 weeks agoperf: build: Setup PKG_CONFIG_LIBDIR for cross compilation
Leo Yan [Wed, 17 Jul 2024 08:22:06 +0000 (09:22 +0100)]
perf: build: Setup PKG_CONFIG_LIBDIR for cross compilation

On recent Linux distros like Ubuntu Noble and Debian Bookworm, the
'pkg-config-aarch64-linux-gnu' package is missing. As a result, the
aarch64-linux-gnu-pkg-config command is not available, which causes
build failures.

When a build passes the environment variables PKG_CONFIG_LIBDIR or
PKG_CONFIG_PATH, like a user uses make command or a build system
(like Yocto, Buildroot, etc) prepares the variables and passes to the
Perf's Makefile, the commit keeps these variables for package
configuration. Otherwise, this commit sets the PKG_CONFIG_LIBDIR
variable to use the Multiarch libs for the cross compilation.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 weeks agoperf tool: fix dereferencing NULL al->maps
Casey Chen [Mon, 22 Jul 2024 21:15:48 +0000 (15:15 -0600)]
perf tool: fix dereferencing NULL al->maps

With 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions"),
when cpumode is 3 (macro PERF_RECORD_MISC_HYPERVISOR),
thread__find_map() could return with al->maps being NULL.

The path below could add a callchain_cursor_node with NULL ms.maps.

add_callchain_ip()
  thread__find_symbol(.., &al)
    thread__find_map(.., &al)   // al->maps becomes NULL
  ms.maps = maps__get(al.maps)
  callchain_cursor_append(..., &ms, ...)
    node->ms.maps = maps__get(ms->maps)

Then the path below would dereference NULL maps and get segfault.

fill_callchain_info()
  maps__machine(node->ms.maps);

Fix it by checking if maps is NULL in fill_callchain_info().

Fixes: 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions")
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: yzhong@purestorage.com
Link: https://lore.kernel.org/r/20240722211548.61455-1-cachen@purestorage.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf dso: Fix build when libunwind is enabled
James Clark [Mon, 15 Jul 2024 09:47:13 +0000 (10:47 +0100)]
perf dso: Fix build when libunwind is enabled

Now that symsrc_filename is always accessed through an accessor, we also
need a free() function for it to avoid the following compilation error:

  util/unwind-libunwind-local.c:416:12: error: lvalue required as unary
    ‘&’ operand
  416 |      zfree(&dso__symsrc_filename(dso));

Fixes: 1553419c3c10 ("perf dso: Fix address sanitizer build")
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240715094715.3914813-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agotools/latency: Use pkg-config in lib_setup of Makefile.config
Guilherme Amadio [Wed, 17 Jul 2024 17:47:39 +0000 (19:47 +0200)]
tools/latency: Use pkg-config in lib_setup of Makefile.config

This allows to build against libtraceevent and libtracefs installed
in non-standard locations.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20240717174739.186988-6-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agotools/rtla: Use pkg-config in lib_setup of Makefile.config
Guilherme Amadio [Fri, 12 Jul 2024 19:40:49 +0000 (21:40 +0200)]
tools/rtla: Use pkg-config in lib_setup of Makefile.config

This allows to build against libtraceevent and libtracefs installed
in non-standard locations.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240712194511.3973899-5-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agotools/verification: Use pkg-config in lib_setup of Makefile.config
Guilherme Amadio [Fri, 12 Jul 2024 19:40:48 +0000 (21:40 +0200)]
tools/verification: Use pkg-config in lib_setup of Makefile.config

This allows to build against libtraceevent and libtracefs installed
in non-standard locations.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240712194511.3973899-4-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agotools: Make pkg-config dependency checks usable by other tools
Guilherme Amadio [Wed, 17 Jul 2024 17:47:36 +0000 (19:47 +0200)]
tools: Make pkg-config dependency checks usable by other tools

Other tools, in tools/verification and tools/tracing, make use of
libtraceevent and libtracefs as dependencies. This allows setting
up the feature check flags for them as well.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20240717174739.186988-3-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf build: Warn if libtracefs is not found
Guilherme Amadio [Wed, 17 Jul 2024 17:47:35 +0000 (19:47 +0200)]
perf build: Warn if libtracefs is not found

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20240717174739.186988-2-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf trace: Fix iteration of syscall ids in syscalltbl->entries
Howard Chu [Fri, 5 Jul 2024 13:20:51 +0000 (21:20 +0800)]
perf trace: Fix iteration of syscall ids in syscalltbl->entries

This is a bug found when implementing pretty-printing for the
landlock_add_rule system call, I decided to send this patch separately
because this is a serious bug that should be fixed fast.

I wrote a test program to do landlock_add_rule syscall in a loop,
yet perf trace -e landlock_add_rule freezes, giving no output.

This bug is introduced by the false understanding of the variable "key"
below:
```
for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
struct syscall *sc = trace__syscall_info(trace, NULL, key);
...
}
```
The code above seems right at the beginning, but when looking at
syscalltbl.c, I found these lines:

```
for (i = 0; i <= syscalltbl_native_max_id; ++i)
if (syscalltbl_native[i])
++nr_entries;

entries = tbl->syscalls.entries = malloc(sizeof(struct syscall) * nr_entries);
...

for (i = 0, j = 0; i <= syscalltbl_native_max_id; ++i) {
if (syscalltbl_native[i]) {
entries[j].name = syscalltbl_native[i];
entries[j].id = i;
++j;
}
}
```

meaning the key is merely an index to traverse the syscall table,
instead of the actual syscall id for this particular syscall.

So if one uses key to do trace__syscall_info(trace, NULL, key), because
key only goes up to trace->sctbl->syscalls.nr_entries, for example, on
my X86_64 machine, this number is 373, it will end up neglecting all
the rest of the syscall, in my case, everything after `rseq`, because
the traversal will stop at 373, and `rseq` is the last syscall whose id
is lower than 373

in tools/perf/arch/x86/include/generated/asm/syscalls_64.c:
```
...
[334] = "rseq",
[424] = "pidfd_send_signal",
...
```

The reason why the key is scrambled but perf trace works well is that
key is used in trace__syscall_info(trace, NULL, key) to do
trace->syscalls.table[id], this makes sure that the struct syscall returned
actually has an id the same value as key, making the later bpf_prog
matching all correct.

After fixing this bug, I can do perf trace on 38 more syscalls, and
because more syscalls are visible, we get 8 more syscalls that can be
augmented.

before:

perf $ perf trace -vv --max-events=1 |& grep Reusing
Reusing "open" BPF sys_enter augmenter for "stat"
Reusing "open" BPF sys_enter augmenter for "lstat"
Reusing "open" BPF sys_enter augmenter for "access"
Reusing "connect" BPF sys_enter augmenter for "accept"
Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
Reusing "connect" BPF sys_enter augmenter for "bind"
Reusing "connect" BPF sys_enter augmenter for "getsockname"
Reusing "connect" BPF sys_enter augmenter for "getpeername"
Reusing "open" BPF sys_enter augmenter for "execve"
Reusing "open" BPF sys_enter augmenter for "truncate"
Reusing "open" BPF sys_enter augmenter for "chdir"
Reusing "open" BPF sys_enter augmenter for "mkdir"
Reusing "open" BPF sys_enter augmenter for "rmdir"
Reusing "open" BPF sys_enter augmenter for "creat"
Reusing "open" BPF sys_enter augmenter for "link"
Reusing "open" BPF sys_enter augmenter for "unlink"
Reusing "open" BPF sys_enter augmenter for "symlink"
Reusing "open" BPF sys_enter augmenter for "readlink"
Reusing "open" BPF sys_enter augmenter for "chmod"
Reusing "open" BPF sys_enter augmenter for "chown"
Reusing "open" BPF sys_enter augmenter for "lchown"
Reusing "open" BPF sys_enter augmenter for "mknod"
Reusing "open" BPF sys_enter augmenter for "statfs"
Reusing "open" BPF sys_enter augmenter for "pivot_root"
Reusing "open" BPF sys_enter augmenter for "chroot"
Reusing "open" BPF sys_enter augmenter for "acct"
Reusing "open" BPF sys_enter augmenter for "swapon"
Reusing "open" BPF sys_enter augmenter for "swapoff"
Reusing "open" BPF sys_enter augmenter for "delete_module"
Reusing "open" BPF sys_enter augmenter for "setxattr"
Reusing "open" BPF sys_enter augmenter for "lsetxattr"
Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
Reusing "open" BPF sys_enter augmenter for "getxattr"
Reusing "open" BPF sys_enter augmenter for "lgetxattr"
Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
Reusing "open" BPF sys_enter augmenter for "listxattr"
Reusing "open" BPF sys_enter augmenter for "llistxattr"
Reusing "open" BPF sys_enter augmenter for "removexattr"
Reusing "open" BPF sys_enter augmenter for "lremovexattr"
Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
Reusing "open" BPF sys_enter augmenter for "mq_open"
Reusing "open" BPF sys_enter augmenter for "mq_unlink"
Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
Reusing "open" BPF sys_enter augmenter for "symlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
Reusing "connect" BPF sys_enter augmenter for "accept4"
Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
Reusing "open" BPF sys_enter augmenter for "memfd_create"
Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
Reusing "fremovexattr" BPF sys_enter augmenter for "statx"

after

perf $ perf trace -vv --max-events=1 |& grep Reusing
Reusing "open" BPF sys_enter augmenter for "stat"
Reusing "open" BPF sys_enter augmenter for "lstat"
Reusing "open" BPF sys_enter augmenter for "access"
Reusing "connect" BPF sys_enter augmenter for "accept"
Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
Reusing "connect" BPF sys_enter augmenter for "bind"
Reusing "connect" BPF sys_enter augmenter for "getsockname"
Reusing "connect" BPF sys_enter augmenter for "getpeername"
Reusing "open" BPF sys_enter augmenter for "execve"
Reusing "open" BPF sys_enter augmenter for "truncate"
Reusing "open" BPF sys_enter augmenter for "chdir"
Reusing "open" BPF sys_enter augmenter for "mkdir"
Reusing "open" BPF sys_enter augmenter for "rmdir"
Reusing "open" BPF sys_enter augmenter for "creat"
Reusing "open" BPF sys_enter augmenter for "link"
Reusing "open" BPF sys_enter augmenter for "unlink"
Reusing "open" BPF sys_enter augmenter for "symlink"
Reusing "open" BPF sys_enter augmenter for "readlink"
Reusing "open" BPF sys_enter augmenter for "chmod"
Reusing "open" BPF sys_enter augmenter for "chown"
Reusing "open" BPF sys_enter augmenter for "lchown"
Reusing "open" BPF sys_enter augmenter for "mknod"
Reusing "open" BPF sys_enter augmenter for "statfs"
Reusing "open" BPF sys_enter augmenter for "pivot_root"
Reusing "open" BPF sys_enter augmenter for "chroot"
Reusing "open" BPF sys_enter augmenter for "acct"
Reusing "open" BPF sys_enter augmenter for "swapon"
Reusing "open" BPF sys_enter augmenter for "swapoff"
Reusing "open" BPF sys_enter augmenter for "delete_module"
Reusing "open" BPF sys_enter augmenter for "setxattr"
Reusing "open" BPF sys_enter augmenter for "lsetxattr"
Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
Reusing "open" BPF sys_enter augmenter for "getxattr"
Reusing "open" BPF sys_enter augmenter for "lgetxattr"
Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
Reusing "open" BPF sys_enter augmenter for "listxattr"
Reusing "open" BPF sys_enter augmenter for "llistxattr"
Reusing "open" BPF sys_enter augmenter for "removexattr"
Reusing "open" BPF sys_enter augmenter for "lremovexattr"
Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
Reusing "open" BPF sys_enter augmenter for "mq_open"
Reusing "open" BPF sys_enter augmenter for "mq_unlink"
Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
Reusing "open" BPF sys_enter augmenter for "symlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
Reusing "connect" BPF sys_enter augmenter for "accept4"
Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
Reusing "open" BPF sys_enter augmenter for "memfd_create"
Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
Reusing "fremovexattr" BPF sys_enter augmenter for "statx"

TL;DR:

These are the new syscalls that can be augmented
Reusing "openat" BPF sys_enter augmenter for "open_tree"
Reusing "openat" BPF sys_enter augmenter for "openat2"
Reusing "openat" BPF sys_enter augmenter for "mount_setattr"
Reusing "openat" BPF sys_enter augmenter for "move_mount"
Reusing "open" BPF sys_enter augmenter for "fsopen"
Reusing "openat" BPF sys_enter augmenter for "fspick"
Reusing "openat" BPF sys_enter augmenter for "faccessat2"
Reusing "openat" BPF sys_enter augmenter for "fchmodat2"

as for the perf trace output:

before

perf $ perf trace -e faccessat2 --max-events=1
[no output]

after

perf $ ./perf trace -e faccessat2 --max-events=1
     0.000 ( 0.037 ms): waybar/958 faccessat2(dfd: 40, filename: "uevent")                               = 0

P.S. The reason why this bug was not found in the past five years is
probably because it only happens to the newer syscalls whose id is
greater, for instance, faccessat2 of id 439, which not a lot of people
care about when using perf trace.

[Arnaldo]: notes

That and the fact that the BPF code was hidden before having to use -e,
that got changed kinda recently when we switched to using BPF skels for
augmenting syscalls in 'perf trace':

⬢[acme@toolbox perf-tools-next]$ git log --oneline tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
a9f4c6c999008c92 perf trace: Collect sys_nanosleep first argument
29d16de26df17e94 perf augmented_raw_syscalls.bpf: Move 'struct timespec64' to vmlinux.h
5069211e2f0b47e7 perf trace: Use the right bpf_probe_read(_str) variant for reading user data
33b725ce7b988756 perf trace: Avoid compile error wrt redefining bool
7d9642311b6d9d31 perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(augmented_arg->value) is a power of two.
262b54b6c9396823 perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(saddr) is a power of two.
1836480429d173c0 perf bpf_skel augmented_raw_syscalls: Cap the socklen parameter using &= sizeof(saddr)
cd2cece61ac5f900 perf trace: Tidy comments related to BPF + syscall augmentation
5e6da6be3082f77b perf trace: Migrate BPF augmentation to use a skeleton
⬢[acme@toolbox perf-tools-next]$

⬢[acme@toolbox perf-tools-next]$ git show --oneline --pretty=reference 5e6da6be3082f77b | head -1
5e6da6be3082f77b (perf trace: Migrate BPF augmentation to use a skeleton, 2023-08-10)
⬢[acme@toolbox perf-tools-next]$

I.e. from August, 2023.

One had as well to ask for BUILD_BPF_SKEL=1, which now is default if all
it needs is available on the system.

I simplified the code to not expose the 'struct syscall' outside of
tools/perf/util/syscalltbl.c, instead providing a function to go from
the index to the syscall id:

  int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx);

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/lkml/ZmhlAxbVcAKoPTg8@x1
Link: https://lore.kernel.org/r/20240705132059.853205-2-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf dso: Fix address sanitizer build
Ian Rogers [Thu, 4 Jul 2024 01:17:45 +0000 (18:17 -0700)]
perf dso: Fix address sanitizer build

Various files had been missed from having accessor functions added for
the sake of dso reference count checking. Add the function calls and
missing dso accessor functions.

Fixes: ee756ef7491e ("perf dso: Add reference count checking and accessor functions")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf mem: Warn if memory events are not supported on all CPUs
Leo Yan [Sat, 6 Jul 2024 15:20:35 +0000 (16:20 +0100)]
perf mem: Warn if memory events are not supported on all CPUs

It is possible that memory events are not supported on all CPUs.

Prints a warning by dumping the enabled CPU maps in this case.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20240706152035.86983-3-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf arm-spe: Support multiple Arm SPE PMUs
Leo Yan [Sat, 6 Jul 2024 15:20:34 +0000 (16:20 +0100)]
perf arm-spe: Support multiple Arm SPE PMUs

A platform can have more than one Arm SPE PMU. For example, a system
with multiple clusters may have each cluster enabled with its own Arm
SPE instance. In such case, the PMU devices will be named 'arm_spe_0',
'arm_spe_1', and so on.

Currently, the tool only supports 'arm_spe_0'. This commit extends
support to multiple Arm SPE PMUs by detecting the substring 'arm_spe_'.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20240706152035.86983-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf build x86: Fix SC2034 error in syscalltbl.sh
Haoze Xie [Sun, 7 Jul 2024 18:04:02 +0000 (02:04 +0800)]
perf build x86: Fix SC2034 error in syscalltbl.sh

Change the unused var in 'arch/x86/entry/syscalls/syscalltbl.sh' to '_'
when reading from '$sorted_table'. This change allows the script to pass
tests of ShellCheck before and after version 0.7.2 at the same time.

When building in arch x86, syscalltbl.sh got a ShellCheck warning, which
makes compilation error:

    In arch/x86/entry/syscalls/syscalltbl.sh line 27:
    while read nr _abi name entry _compat; do
                  ^-^ SC2034: abi appears unused.
                  Verify use (or export if used externally).
                                  ^----^ SC2034: compat appears unused.
                               Verify use (or export if used externally).

The script reads unused param abi and compat. It uses format '_xxx' to
indicate dummy vars, which won't work properly when ShellCheck <= 0.7.2.

According to SC2034, the more general way of writing is to use directly
'_' to indicate discarding vars. 'entry' is also replaced by '_' because
it just happens to be defined in emit function, otherwise it will lead
to some misunderstandings.

Link: https://www.shellcheck.net/wiki/SC2034
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Link: https://lore.kernel.org/r/2143cab4cd8468c88860f4e5e382d0e6b4d89ac9.1720372178.git.royenheart@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf record: Fix memset out-of-range error
Haoze Xie [Sun, 7 Jul 2024 18:01:00 +0000 (02:01 +0800)]
perf record: Fix memset out-of-range error

Modified the object of 'memset' from '&lost.lost' to '&lost' in
record__read_lost_samples. This allows 'memset' to access memory properly
without causing out-of-bounds problems.

The problems got from builtin-record.c are:

In file included from /usr/include/string.h:495,
                 from util/parse-events.h:13,
                 from builtin-record.c:14:
In function 'memset',
    inlined from 'record__read_lost_samples' at
    builtin-record.c:1958:6,
    inlined from '__cmd_record.constprop' at builtin-record.c:2817:2:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:71:10: error:
'__builtin_memset' offset [17, 64] from the object at 'lost' is out
of the bounds of referenced subobject 'lost' with type
'struct perf_record_lost_samples' at offset 0 [-Werror=array-bounds]
71|return __builtin___memset_chk (__dest,__ch,__len,__bos0 (__dest));
  |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The error arised when performing a memset operation on the 'lost' variable,
the bytes of 'sizeof(lost)' exceeds that of '&lost.lost', which are 64
and 16.

Fixes: 6c1785cd75ef ("perf record: Ensure space for lost samples")
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Link: https://lore.kernel.org/r/11e12f171b846577cac698cd3999db3d7f6c4d03.1720372317.git.royenheart@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf sched map: Add --fuzzy-name option for fuzzy matching in task names
Madadi Vineeth Reddy [Sun, 7 Jul 2024 18:27:16 +0000 (23:57 +0530)]
perf sched map: Add --fuzzy-name option for fuzzy matching in task names

The --fuzzy-name option can be used if fuzzy name matching is required.
For example, "taskname" can be matched to any string that contains
"taskname" as its substring.

Sample output for --task-name wdav --fuzzy-name
=============
 .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
 .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
 .  *-   B0  .   .   .   -   .   131040.641379 secs
*C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
 C0  .   B0  .  *D0  .   .   .   131040.641572 secs D0 => wdavdaemon:62277
 C0  .   B0  .   D0  .  *E0  .   131040.641578 secs E0 => wdavdaemon:62270
*-   .   B0  .   D0  .   E0  .   131040.641581 secs

Suggested-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Link: https://lore.kernel.org/r/20240707182716.22054-4-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf sched map: Add support for multiple task names using CSV
Madadi Vineeth Reddy [Sun, 7 Jul 2024 18:27:15 +0000 (23:57 +0530)]
perf sched map: Add support for multiple task names using CSV

To track the scheduling patterns of multiple tasks simultaneously,
multiple task names can be specified using a comma separator
without any whitespace.

Sample output for --task-name perf,wdavdaemon
=============
 .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
 .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
 .  *-   B0  .   .   .   -   .   131040.641379 secs
*C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283

...

 .  *-   .   .   .   .   .   .   131041.395649 secs
 .   .   .   .   .   .   .  *X2  131041.403969 secs X2 => perf:70211
 .   .   .   .   .   .   .  *-   131041.404006 secs

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20240707182716.22054-3-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf sched map: Add task-name option to filter the output map
Madadi Vineeth Reddy [Sun, 7 Jul 2024 18:27:14 +0000 (23:57 +0530)]
perf sched map: Add task-name option to filter the output map

By default, perf sched map prints sched-in events for all the tasks
which may not be required all the time as it prints lot of symbols
and rows to the terminal.

With --task-name option, one could specify the specific task name
for which the map has to be shown. This would help in analyzing the
CPU usage patterns easier for that specific task. Since multiple
PID's might have the same task name, using task-name filter
would be more useful for debugging.

For other tasks, instead of printing the symbol, '-' is printed and
the same '.' is used to represent idle. '-' is used instead of symbol
for other tasks because it helps in clear visualization of task
of interest and secondly the symbol itself doesn't mean anything
because the sched-in of that symbol will not be printed(first sched-in
contains pid and the corresponding symbol).

When using the --task-name option, the sched-out time is represented
by a '*-'. Since not all task sched-in events are printed, the sched-out
time of the relevant task might be lost. This representation ensures
that the sched-out time of the interested task is not overlooked.

6.10.0-rc1
==========
*A0                              131040.639793 secs A0 => migration/0:19
*.                               131040.639801 secs .  => swapper:0
 .  *B0                          131040.639830 secs B0 => migration/1:24
 .  *.                           131040.639836 secs
 .   .  *C0                      131040.640108 secs C0 => migration/2:30
 .   .  *.                       131040.640163 secs
 .   .   .  *D0                  131040.640386 secs D0 => migration/3:36
 .   .   .  *.                   131040.640395 secs

6.10.0-rc1 + patch (--task-name wdavdaemon)
=============
 .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
 .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
 -  *-   B0  .   .   .   -   .   131040.641379 secs
*C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
 C0  .   B0  .  *D0  .   .   .   131040.641572 secs D0 => wdavdaemon:62277
 C0  .   B0  .   D0  .  *E0  .   131040.641578 secs E0 => wdavdaemon:62270
*-   .   B0  .   D0  .   E0  .   131040.641581 secs
 .   .   B0  .   D0  .  *-   .   131040.641583 secs

Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20240707182716.22054-2-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf build: Conditionally add feature check flags for libtrace{event,fs}
Guilherme Amadio [Fri, 28 Jun 2024 20:34:27 +0000 (22:34 +0200)]
perf build: Conditionally add feature check flags for libtrace{event,fs}

This avoids reported warnings when the packages are not installed.

[namhyung]: Removed the dummy assignment and unnecessary ifeq checks.

Fixes: 0f0e1f445690 ("perf build: Use pkg-config for feature check for libtrace{event,fs}")
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/20240628203432.3273625-1-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf install: Don't propagate subdir to Documentation submake
Nicolas Schier [Thu, 23 May 2024 08:06:40 +0000 (10:06 +0200)]
perf install: Don't propagate subdir to Documentation submake

Explicitly reset 'subdir' variable when descending to
tools/perf/Documentation.  Similar to commit f89fb55714b62 ("perf build:
Don't propagate subdir to submakes for install_headers", 2023-01-02),
calling the 'tools/perf_install' target via top-levels Makefile results
in repeated subdir components when attempting to call the perf
documentation installation rules:

    $ make tools/perf_install NO_LIBTRACEEVENT=1 JOBS=1
    [...]
    /bin/sh: 1: cd: can't cd to /data/linux/kbuild/tools/perf/tools/perf/
    ../../scripts/Makefile.include:17: *** output directory "/data/linux/kbuild/tools/perf/tools/perf/" does not exist.  Stop.
    make[5]: *** [Makefile.perf:1096: try-install-man] Error 2
    make[4]: *** [Makefile.perf:264: sub-make] Error 2
    make[3]: *** [Makefile:113: install] Error 2
    make[2]: *** [Makefile:131: perf_install] Error 2

Resetting 'subdir' fixes the call from top-level Makefile.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Nicolas Schier <n.schier@avm.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Tested-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20240523-make-tools-perf-install-v1-1-3903499e637f@avm.de
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf vendor events arm64:: Add i.MX95 DDR Performance Monitor metrics
Xu Yang [Wed, 29 May 2024 08:03:58 +0000 (16:03 +0800)]
perf vendor events arm64:: Add i.MX95 DDR Performance Monitor metrics

Add JSON metrics for i.MX95 DDR Performance Monitor.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Cc: festevam@gmail.com
Cc: conor+dt@kernel.org
Cc: robh+dt@kernel.org
Cc: shawnguo@kernel.org
Cc: will@kernel.org
Cc: krzysztof.kozlowski+dt@linaro.org
Cc: mike.leach@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: imx@lists.linux.dev
Cc: kernel@pengutronix.de
Cc: s.hauer@pengutronix.de
Cc: devicetree@vger.kernel.org
Link: https://lore.kernel.org/r/20240529080358.703784-8-xu.yang_2@nxp.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf vendor events arm64:: Add i.MX93 DDR Performance Monitor metrics
Xu Yang [Wed, 29 May 2024 08:03:57 +0000 (16:03 +0800)]
perf vendor events arm64:: Add i.MX93 DDR Performance Monitor metrics

Add JSON metrics for i.MX93 DDR Performance Monitor.

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Cc: festevam@gmail.com
Cc: conor+dt@kernel.org
Cc: robh+dt@kernel.org
Cc: shawnguo@kernel.org
Cc: will@kernel.org
Cc: krzysztof.kozlowski+dt@linaro.org
Cc: mike.leach@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: imx@lists.linux.dev
Cc: john.g.garry@oracle.com
Cc: kernel@pengutronix.de
Cc: s.hauer@pengutronix.de
Cc: devicetree@vger.kernel.org
Link: https://lore.kernel.org/r/20240529080358.703784-7-xu.yang_2@nxp.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf dsos: When adding a dso into sorted dsos maintain the sort order
Ian Rogers [Wed, 3 Jul 2024 17:21:17 +0000 (10:21 -0700)]
perf dsos: When adding a dso into sorted dsos maintain the sort order

dsos__add would add at the end of the dso array possibly requiring a
later find to re-sort the array. Patterns of find then add were
becoming O(n*log n) due to the sorts. Change the add routine to be
O(n) rather than O(1) but to maintain the sorted-ness of the dsos
array so that later finds don't need the O(n*log n) sort.

Fixes: 3f4ac23a9908 ("perf dsos: Switch backing storage to array from rbtree/list")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Matt Fleming <matt@readmodwrite.com>
Link: https://lore.kernel.org/r/20240703172117.810918-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf comm str: Avoid sort during insert
Ian Rogers [Wed, 3 Jul 2024 17:21:16 +0000 (10:21 -0700)]
perf comm str: Avoid sort during insert

The array is sorted, so just move the elements and insert in order.

Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Matt Fleming <matt@readmodwrite.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Matt Fleming <matt@readmodwrite.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240703172117.810918-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf report: Calling available function for stats printing
Abhishek Dubey [Fri, 28 Jun 2024 18:32:24 +0000 (14:32 -0400)]
perf report: Calling available function for stats printing

For printing dump_trace, just use existing stats_print()
function.

Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
Link: https://lore.kernel.org/r/20240628183224.452055-1-adubey@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf intel-pt: Fix exclude_guest setting
Adrian Hunter [Tue, 25 Jun 2024 10:45:32 +0000 (13:45 +0300)]
perf intel-pt: Fix exclude_guest setting

In the past, the exclude_guest setting has had no effect on Intel PT
tracing, but that may not be the case in the future.

Set the flag correctly based upon whether KVM is using Intel PT
"Host/Guest" mode, which is determined by the kvm_intel module
parameter pt_mode:

 pt_mode=0 System-wide mode : host and guest output to host buffer
 pt_mode=1 Host/Guest mode : host/guest output to host/guest
                buffers respectively

Fixes: 6e86bfdc4a60 ("perf intel-pt: Support decoding of guest kernel")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240625104532.11990-3-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf intel-pt: Fix aux_watermark calculation for 64-bit size
Adrian Hunter [Tue, 25 Jun 2024 10:45:31 +0000 (13:45 +0300)]
perf intel-pt: Fix aux_watermark calculation for 64-bit size

aux_watermark is a u32. For a 64-bit size, cap the aux_watermark
calculation at UINT_MAX instead of truncating it to 32-bits.

Fixes: 874fc35cdd55 ("perf intel-pt: Use aux_watermark")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240625104532.11990-2-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoMerge remote-tracking branch 'perf-tools' into perf-tools-next
Namhyung Kim [Tue, 2 Jul 2024 18:51:32 +0000 (11:51 -0700)]
Merge remote-tracking branch 'perf-tools' into perf-tools-next

Merge fixes and updates in v6.10 into perf-tools-next to resolve changes
in synthesizing the LOST_SAMPLES records and build fixes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf sched replay: Fix -r/--repeat command line option for infinity
Madadi Vineeth Reddy [Fri, 28 Jun 2024 07:18:21 +0000 (12:48 +0530)]
perf sched replay: Fix -r/--repeat command line option for infinity

Currently, the -r/--repeat option accepts values from 0 and complains
for -1. The help section specifies:
-r, --repeat <n>      repeat the workload replay N times (-1: infinite)

The -r -1 option raises an error because replay_repeat is defined as
an unsigned int.

In the current implementation, the workload is repeated n times when
-r <n> is used, except when n is 0.

When -r is set to 0, the workload is also repeated once. This happens
because when -r=0, the run_one_test function is not called. (Note that
mutex unlocking, which is essential for child threads spawned to emulate
the workload, happens in run_one_test.) However, mutex unlocking is
still performed in the destroy_tasks function. Thus, -r=0 results in the
workload running once coincidentally.

To clarify and maintain the existing logic for -r >= 1 (which runs the
workload the specified number of times) and to fix the issue with infinite
runs, make -r=0 perform an infinite run.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240628071821.15264-1-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf: pmus: Remove unneeded semicolon
Yang Li [Fri, 28 Jun 2024 05:30:49 +0000 (13:30 +0800)]
perf: pmus: Remove unneeded semicolon

./tools/perf/util/pmu.c:1776:49-50: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9443
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20240628053049.44521-1-yang.lee@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf stat: Use field separator in the metric header
Namhyung Kim [Fri, 28 Jun 2024 00:06:04 +0000 (17:06 -0700)]
perf stat: Use field separator in the metric header

It didn't use the passed field separator (using -x option) when it
prints the metric headers and always put "," between the fields.

Before:
  $ sudo ./perf stat -a -x : --per-core -M tma_core_bound --metric-only true
  core,cpus,%  tma_core_bound:     <<<--- here: "core,cpus," but ":" expected
  S0-D0-C0:2:10.5:
  S0-D0-C1:2:14.8:
  S0-D0-C2:2:9.9:
  S0-D0-C3:2:13.2:

After:
  $ sudo ./perf stat -a -x : --per-core -M tma_core_bound --metric-only true
  core:cpus:%  tma_core_bound:
  S0-D0-C0:2:10.5:
  S0-D0-C1:2:15.0:
  S0-D0-C2:2:16.5:
  S0-D0-C3:2:12.5:

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20240628000604.1296808-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf stat: Fix a segfault with --per-cluster --metric-only
Namhyung Kim [Fri, 28 Jun 2024 00:06:03 +0000 (17:06 -0700)]
perf stat: Fix a segfault with --per-cluster --metric-only

The new --per-cluster option was added recently but it forgot to update
the aggr_header fields which are used for --metric-only option.  And it
resulted in a segfault due to NULL string in fputs().

Fixes: cbc917a1b03b ("perf stat: Support per-cluster aggregation")
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20240628000604.1296808-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 months agoperf pmu: Don't de-duplicate core PMUs
James Clark [Wed, 26 Jun 2024 14:54:46 +0000 (15:54 +0100)]
perf pmu: Don't de-duplicate core PMUs

Arm PMUs have a suffix, either a single decimal (armv8_pmuv3_0) or 3 hex
digits which (armv8_cortex_a53) which Perf assumes are both strippable
suffixes for the purposes of deduplication. S390 "cpum_cf" is a
similarly suffixed core PMU but is only two characters so is not treated
as strippable because the rules are a minimum of 3 hex characters or 1
decimal character.

There are two paths involved in listing PMU events:

 * HW/cache event printing assumes core PMUs don't have suffixes so
   doesn't try to strip.
 * Sysfs PMU events share the printing function with uncore PMUs which
   strips.

This results in slightly inconsistent Perf list behavior if a core PMU
has a suffix:

  # perf list
  ...
  armv8_pmuv3_0/branch-load-misses/
  armv8_pmuv3/l3d_cache_wb/          [Kernel PMU event]
  ...

Fix it by partially reverting back to the old list behavior where
stripping was only done for uncore PMUs. For example commit 8d9f5146f5da
("perf pmus: Sort pmus by name then suffix") mentions that only PMUs
starting 'uncore_' are considered to have a potential suffix. This
change doesn't go back that far, but does only strip PMUs that are
!is_core. This keeps the desirable behavior where the many possibly
duplicated uncore PMUs aren't repeated, but it doesn't break listing for
core PMUs.

Searching for a PMU continues to use the new stripped comparison
functions, meaning that it's still possible to request an event by
specifying the common part of a PMU name, or even open events on
multiple similarly named PMUs. For example:

  # perf stat -e armv8_cortex/inst_retired/

  5777173628      armv8_cortex_a53/inst_retired/          (99.93%)
  7469626951      armv8_cortex_a57/inst_retired/          (49.88%)

Fixes: 3241d46f5f54 ("perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu")
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: robin.murphy@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240626145448.896746-3-james.clark@arm.com
2 months agoperf pmu: Restore full PMU name wildcard support
James Clark [Wed, 26 Jun 2024 14:54:45 +0000 (15:54 +0100)]
perf pmu: Restore full PMU name wildcard support

Commit b2b9d3a3f021 ("perf pmu: Support wildcards on pmu name in dynamic
pmu events") gives the following example for wildcarding a subset of
PMUs:

  E.g., in a system with the following dynamic pmus:

        mypmu_0
        mypmu_1
        mypmu_2
        mypmu_4

  perf stat -e mypmu_[01]/<config>/

Since commit f91fa2ae6360 ("perf pmu: Refactor perf_pmu__match()"), only
"*" has been supported, removing the ability to subset PMUs, even though
parse-events.l still supports ? and [] characters.

Fix it by using fnmatch() when any glob character is detected and add a
test which covers that and other scenarios of
perf_pmu__match_ignoring_suffix().

Fixes: f91fa2ae6360 ("perf pmu: Refactor perf_pmu__match()")
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: robin.murphy@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240626145448.896746-2-james.clark@arm.com
2 months agoperf report: Display pregress bar on redirected pipe data
Namhyung Kim [Thu, 27 Jun 2024 18:19:16 +0000 (11:19 -0700)]
perf report: Display pregress bar on redirected pipe data

It's possible to save pipe output of perf record into a file.

  $ perf record -o- ... > pipe.data

And you can use the data same as the normal perf data.

  $ perf report -i pipe.data

In that case, perf tools will treat the input as a pipe, but it can get
the total size of the input.  This means it can show the progress bar
unlike the normal pipe input (which doesn't know the total size in
advance).

While at it, fix the string in __perf_session__process_dir_events().

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240627181916.1202110-1-namhyung@kernel.org
2 months agoperf test stat_bpf_counter.sh: Stabilize the test results
Veronika Molnarova [Tue, 25 Jun 2024 09:20:01 +0000 (11:20 +0200)]
perf test stat_bpf_counter.sh: Stabilize the test results

The test has been failing for some time when two separate runs of
perf benchmarks are recorded for cycles events and their counts are
compared, while once the recording was done with option --bpf-counters
and once without it. It is expected that the count of the samples
should be within a certain range, firstly the difference was set to be
within 10%, which was then later raised to 20%. However, the test case
keeps failing on certain architectures as recording the provided
benchmark can produce completely different counts based on the
current load of the system.

Sampling two separate runs on intel-eaglestream-spr-13 of "perf stat
--no-big-num -e cycles -- perf bench sched messaging -g 1 -l 100 -t":

 Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':

         396782898      cycles

       0.010051983 seconds time elapsed

       0.008664000 seconds user
       0.097058000 seconds sys

 Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':

        1431133032      cycles

       0.021803714 seconds time elapsed

       0.023377000 seconds user
       0.349918000 seconds sys

, which is ranging from 400mil to 1400mil samples.

Instead of recording the cycles use instructions event, which provides
more stable values. At the same time change the tested workload to one
of the provided testing workloads by perf that is not based on a
scheduler, which can provide another dependency on the current load.

Sampling instructions event with the new workload provide much more
stable results on intel-eaglestream-spr-13 of "perf stat --no-big-num
-e instructions -- perf test -w brstack":

 Performance counter stats for 'perf test -w brstack':

          64584494      instructions

       0.009173945 seconds time elapsed

       0.007262000 seconds user
       0.002071000 seconds sys

 Performance counter stats for 'perf test -w brstack':

          64672669      instructions

       0.008888135 seconds time elapsed

       0.005018000 seconds user
       0.004018000 seconds sys

Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: mpetlan@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625092001.10909-1-vmolnaro@redhat.com
2 months agoperf python: Clean up build dependencies
Ian Rogers [Tue, 25 Jun 2024 21:41:17 +0000 (14:41 -0700)]
perf python: Clean up build dependencies

The python build now depends on libraries and doesn't use
python-ext-sources except for the util/python.c dependency. Switch to
just directly depending on that file and util/setup.py. This allows
the removal of python-ext-sources.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-9-irogers@google.com
2 months agoperf python: Switch module to linking libraries from building source
Ian Rogers [Tue, 25 Jun 2024 21:41:16 +0000 (14:41 -0700)]
perf python: Switch module to linking libraries from building source

setup.py was building most perf sources causing setup.py to mimic the
Makefile logic as well as flex/bison code to be stubbed out, due to
complexity building. By using libraries fewer functions are stubbed
out, the build is faster and the Makefile logic is reused which should
simplify updating. The libraries are passed through LDFLAGS to avoid
complexity in python.

Force the -fPIC flag for libbpf.a to ensure it is suitable for linking
into the perf python module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-8-irogers@google.com
2 months agoperf util: Make util its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:15 +0000 (14:41 -0700)]
perf util: Make util its own library

Make the util directory into its own library. This is done to avoid
compiling code twice, once for the perf tool and once for the perf
python module. For convenience:
  arch/common.c
  scripts/perl/Perf-Trace-Util/Context.c
  scripts/python/Perf-Trace-Util/Context.c
are made part of this library.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
2 months agoperf bench: Make bench its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:14 +0000 (14:41 -0700)]
perf bench: Make bench its own library

Make the benchmark code into a library so it may be linked against
things like the python module to avoid compiling code twice.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-6-irogers@google.com
2 months agoperf test: Make tests its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:13 +0000 (14:41 -0700)]
perf test: Make tests its own library

Make the tests code its own library. This is done to avoid compiling
code twice, once for the perf tool and once for the perf python
module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-5-irogers@google.com
2 months agoperf pmu-events: Make pmu-events a library
Ian Rogers [Tue, 25 Jun 2024 21:41:12 +0000 (14:41 -0700)]
perf pmu-events: Make pmu-events a library

Make pmu-events into a library so it may be linked against things like
the python module and not built from source.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-4-irogers@google.com
2 months agoperf ui: Make ui its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:11 +0000 (14:41 -0700)]
perf ui: Make ui its own library

Make the ui code its own library. This is done to avoid compiling code
twice, once for the perf tool and once for the perf python module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-3-irogers@google.com
2 months agoperf build: Add '*.a' to clean targets
Ian Rogers [Tue, 25 Jun 2024 21:41:10 +0000 (14:41 -0700)]
perf build: Add '*.a' to clean targets

Fix some excessively long lines by deploying '\'.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-2-irogers@google.com
3 months agoperf mem: Fix a segfault with NULL event->name
Namhyung Kim [Fri, 21 Jun 2024 17:05:28 +0000 (10:05 -0700)]
perf mem: Fix a segfault with NULL event->name

Guilherme reported a crash in perf mem record.  It's because the
perf_mem_event->name was NULL on his machine.  It should just return
a NULL string when it has no format string in the name.

The backtrace at the crash is below:

  Program received signal SIGSEGV, Segmentation fault.
  __strchrnul_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:67
  67              vmovdqu (%rdi), %ymm2
  (gdb) bt
  #0  __strchrnul_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:67
  #1  0x00007ffff6c982de in __find_specmb (format=0x0) at printf-parse.h:82
  #2  __printf_buffer (buf=buf@entry=0x7fffffffc760, format=format@entry=0x0, ap=ap@entry=0x7fffffffc880,
      mode_flags=mode_flags@entry=0) at vfprintf-internal.c:649
  #3  0x00007ffff6cb7840 in __vsnprintf_internal (string=<optimized out>, maxlen=<optimized out>, format=0x0,
      args=0x7fffffffc880, mode_flags=mode_flags@entry=0) at vsnprintf.c:96
  #4  0x00007ffff6cb787f in ___vsnprintf (string=<optimized out>, maxlen=<optimized out>, format=<optimized out>,
      args=<optimized out>) at vsnprintf.c:103
  #5  0x00005555557b9391 in scnprintf (buf=0x555555fe9320 <mem_loads_name> "", size=100, fmt=0x0)
      at ../lib/vsprintf.c:21
  #6  0x00005555557b74c3 in perf_pmu__mem_events_name (i=0, pmu=0x555556832180) at util/mem-events.c:106
  #7  0x00005555557b7ab9 in perf_mem_events__record_args (rec_argv=0x55555684c000, argv_nr=0x7fffffffca20)
      at util/mem-events.c:252
  #8  0x00005555555e370d in __cmd_record (argc=3, argv=0x7fffffffd760, mem=0x7fffffffcd80) at builtin-mem.c:156
  #9  0x00005555555e49c4 in cmd_mem (argc=4, argv=0x7fffffffd760) at builtin-mem.c:514
  #10 0x000055555569716c in run_builtin (p=0x555555fcde80 <commands+672>, argc=8, argv=0x7fffffffd760) at perf.c:349
  #11 0x0000555555697402 in handle_internal_command (argc=8, argv=0x7fffffffd760) at perf.c:402
  #12 0x0000555555697560 in run_argv (argcp=0x7fffffffd59c, argv=0x7fffffffd590) at perf.c:446
  #13 0x00005555556978a6 in main (argc=8, argv=0x7fffffffd760) at perf.c:562

Reported-by: Guilherme Amadio <amadio@cern.ch>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Closes: https://lore.kernel.org/linux-perf-users/Zlns_o_IE5L28168@cern.ch
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-5-namhyung@kernel.org
3 months agoperf tools: Fix a compiler warning of NULL pointer
Namhyung Kim [Fri, 21 Jun 2024 17:05:27 +0000 (10:05 -0700)]
perf tools: Fix a compiler warning of NULL pointer

A compiler warning on the second argument of bsearch() should not be
NULL, but there's a case we might pass it.  Let's return early if we
don't have any DSOs to search in __dsos__find_by_longname_id().

  util/dsos.c:184:8: runtime error: null pointer passed as argument 2, which is declared to never be null

Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Closes: https://lore.kernel.org/oe-lkp/202406180932.84be448c-oliver.sang@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-4-namhyung@kernel.org
3 months agoperf symbol: Simplify kernel module checking
Namhyung Kim [Fri, 21 Jun 2024 17:05:26 +0000 (10:05 -0700)]
perf symbol: Simplify kernel module checking

In dso__load(), it checks if the dso is a kernel module by looking the
symtab type.  Actually dso has 'is_kmod' field to check that easily and
dso__set_module_info() set the symtab type and the is_kmod bit.  So it
should have the same result to check the is_kmod bit.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-3-namhyung@kernel.org
3 months agoperf report: Fix condition in sort__sym_cmp()
Namhyung Kim [Fri, 21 Jun 2024 17:05:25 +0000 (10:05 -0700)]
perf report: Fix condition in sort__sym_cmp()

It's expected that both hist entries are in the same hists when
comparing two.  But the current code in the function checks one without
dso sort key and other with the key.  This would make the condition true
in any case.

I guess the intention of the original commit was to add '!' for the
right side too.  But as it should be the same, let's just remove it.

Fixes: 69849fc5d2119 ("perf hists: Move sort__has_dso into struct perf_hpp_list")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-2-namhyung@kernel.org
3 months agoperf pmus: Fixes always false when compare duplicates aliases
Junhao He [Fri, 14 Jun 2024 09:43:18 +0000 (17:43 +0800)]
perf pmus: Fixes always false when compare duplicates aliases

In the previous loop, all the members in the aliases[j-1] have been freed
and set to NULL. But in this loop, the function pmu_alias_is_duplicate()
compares the aliases[j] with the aliases[j-1] that has already been
disposed, so the function will always return false and duplicate aliases
will never be discarded.

If we find duplicate aliases, it skips the zfree aliases[j], which is
accompanied by a memory leak.

We can use the next aliases[j+1] to theck for duplicate aliases to
fixes the aliases NULL pointer dereference, then goto zfree code snippet
to release it.

After patch testing:
 $ perf list --unit=hisi_sicl,cpa pmu

 uncore cpa:
   cpa_p0_rd_dat_32b
        [Number of read ops transmitted by the P0 port which size is 32 bytes.
         Unit: hisi_sicl,cpa]
   cpa_p0_rd_dat_64b
        [Number of read ops transmitted by the P0 port which size is 64 bytes.
         Unit: hisi_sicl,cpa]

Fixes: c3245d2093c1 ("perf pmu: Abstract alias/event struct")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: ravi.bangoria@amd.com
Cc: james.clark@arm.com
Cc: prime.zeng@hisilicon.com
Cc: cuigaosheng1@huawei.com
Cc: jonathan.cameron@huawei.com
Cc: linuxarm@huawei.com
Cc: yangyicong@huawei.com
Cc: robh@kernel.org
Cc: renyu.zj@linux.alibaba.com
Cc: kjain@linux.ibm.com
Cc: john.g.garry@oracle.com
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240614094318.11607-1-hejunhao3@huawei.com
3 months agoperf unwind-libunwind: Add malloc() failure handling
Yunseong Kim [Wed, 19 Jun 2024 20:42:12 +0000 (05:42 +0900)]
perf unwind-libunwind: Add malloc() failure handling

Add malloc() failure handling in unread_unwind_spec_debug_frame().
This make caller find_proc_info() works well when the allocation failure.

Signed-off-by: Yunseong Kim <yskelg@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Austin Kim <austindh.kim@gmail.com>
Cc: shjy180909@gmail.com
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240619204211.6438-2-yskelg@gmail.com
3 months agoutil: constant -1 with expression of type char
Yunseong Kim [Wed, 19 Jun 2024 20:34:29 +0000 (05:34 +0900)]
util: constant -1 with expression of type char

This patch resolve following warning.

  tools/perf/util/evsel.c:1620:9: error: result of comparison of constant
   -1 with expression of type 'char' is always false
   -Werror,-Wtautological-constant-out-of-range-compare
   1620 |                 if (c == -1)
        |                     ~ ^  ~~

Signed-off-by: Yunseong Kim <yskelg@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Austin Kim <austindh.kim@gmail.com>
Cc: shjy180909@gmail.com
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240619203428.6330-2-yskelg@gmail.com
3 months agoperf: Timehist account sch delay for scheduled out running
Fernand Sieber [Tue, 18 Jun 2024 09:03:39 +0000 (11:03 +0200)]
perf: Timehist account sch delay for scheduled out running

When using perf timehist, sch delay is only computed for a waking task,
not for a pre empted task. This patches changes sch delay to account for
both. This makes sense as testing scheduling policy need to consider the
effect of scheduling delay globally, not only for waking tasks.

Example of `perf timehist` report before the patch for `stress` task
competing with each other.

First column is wait time, second column sch delay, third column
runtime.

1.492060 [0000]  s    stress[81]                          1.999      0.000      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      0.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      0.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      0.000      1.999      R  next: stress[81]

After the patch, it looks like this (note that all wait time is not zero
anymore):

1.492060 [0000]  s    stress[81]                          1.999      1.999      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      2.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      2.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      2.000      1.999      R  next: stress[81]

Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240618090339.87482-1-sieberf@amazon.com
3 months agoperf tests: Add APX and other new instructions to x86 instruction decoder test
Adrian Hunter [Thu, 2 May 2024 10:58:53 +0000 (13:58 +0300)]
perf tests: Add APX and other new instructions to x86 instruction decoder test

Add samples of APX and other new instructions to the 'x86 instruction
decoder - new instructions' test.

Note the test is only available if the perf tool has been built with
EXTRA_TESTS=1.

Example:

  $ make EXTRA_TESTS=1 -C tools/perf
  $ tools/perf/perf test -F -v 'new ins' |& grep -i 'jmpabs\|popp\|pushp'
  Decoded ok: d5 00 a1 ef cd ab 90 78 56 34 12    jmpabs $0x1234567890abcdef
  Decoded ok: d5 08 53                    pushp  %rbx
  Decoded ok: d5 18 50                    pushp  %r16
  Decoded ok: d5 19 57                    pushp  %r31
  Decoded ok: d5 19 5f                    popp   %r31
  Decoded ok: d5 18 58                    popp   %r16
  Decoded ok: d5 08 5b                    popp   %rbx

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-11-adrian.hunter@intel.com
3 months agoperf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder
Adrian Hunter [Thu, 2 May 2024 10:58:52 +0000 (13:58 +0300)]
perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder

JMPABS is 64-bit absolute direct jump instruction, encoded with a mandatory
REX2 prefix. JMPABS is designed to be used in the procedure linkage table
(PLT) to replace indirect jumps, because it has better performance. In that
case the jump target will be amended at run time. To enable Intel PT to
follow the code, a TIP packet is always emitted when JMPABS is traced under
Intel PT.

Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.

Decode JMPABS as an indirect jump, because it has an associated TIP packet
the same as an indirect jump and the control flow should follow the TIP
packet payload, and not assume it is the same as the on-file object code
JMPABS target address.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-10-adrian.hunter@intel.com
3 months agoperf test: Check output of the probe ... --funcs command
Chaitanya S Prakash [Sat, 1 Jun 2024 12:59:46 +0000 (18:29 +0530)]
perf test: Check output of the probe ... --funcs command

Test "perf probe of function from different CU" only checks if the perf
command has failed and doesn't test the --funcs output. In the issue
reported in the previous commit, the garbage output of the --funcs
command was being ignored by the test when it could have been caught.

The script first makes use of --funcs option with the perf probe command
to check if the function "foo" exists in the testfile before adding a
probe to it in the next command. The output of probe...--funcs command
is redirected to stdout, therefore, add '| grep "foo"' to validate the
result.

Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: anshuman.khandual@arm.com
Cc: james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240601125946.1741414-11-ChaitanyaS.Prakash@arm.com
3 months agotools/perf: Fix parallel-perf python script to replace new python syntax ":=" usage
Athira Rajeev [Sun, 23 Jun 2024 06:48:50 +0000 (12:18 +0530)]
tools/perf: Fix parallel-perf python script to replace new python syntax ":=" usage

perf test "perf script tests" fails as below in systems
with python 3.6

File "/home/athira/linux/tools/perf/tests/shell/../../scripts/python/parallel-perf.py", line 442
if line := p.stdout.readline():
             ^
SyntaxError: invalid syntax
--- Cleaning up ---
---- end(-1) ----
92: perf script tests: FAILED!

This happens because ":=" is a new syntax that assigns values
to variables as part of a larger expression. This is introduced
from python 3.8 and hence fails in setup with python 3.6
Address this by splitting the large expression and check the
value in two steps:
Previous line: if line := p.stdout.readline():
Current change:
line = p.stdout.readline()
if line:

With patch

./perf test "perf script tests"
 93: perf script tests:  Ok

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-3-atrajeev@linux.vnet.ibm.com
3 months agotools/perf: Use is_perf_pid_map_name helper function to check dso's of pattern /tmp...
Athira Rajeev [Sun, 23 Jun 2024 06:48:49 +0000 (12:18 +0530)]
tools/perf: Use is_perf_pid_map_name helper function to check dso's of pattern /tmp/perf-%d.map

commit 80d496be89ed ("perf report: Add support for profiling JIT
generated code") added support for profiling JIT generated code.
This patch handles dso's of form "/tmp/perf-$PID.map".

Some of the references doesn't check exactly for same pattern.
some uses "if (!strncmp(dso_name, "/tmp/perf-", 10))". Fix
this by using helper function perf_pid_map_tid and
is_perf_pid_map_name which looks for proper pattern of
form: "/tmp/perf-$PID.map" for these checks.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-2-atrajeev@linux.vnet.ibm.com
3 months agotools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__load
Athira Rajeev [Sun, 23 Jun 2024 06:48:48 +0000 (12:18 +0530)]
tools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__load

Perf test for perf probe of function from different CU fails
as below:

./perf test -vv "test perf probe of function from different CU"
116: test perf probe of function from different CU:
--- start ---
test child forked, pid 2679
Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.Msa7iy89bx/testfile
  Error: Failed to add events.
--- Cleaning up ---
"foo" does not hit any event.
  Error: Failed to delete events.
---- end(-1) ----
116: test perf probe of function from different CU                   : FAILED!

The test does below to probe function "foo" :

# gcc -g -Og -flto -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.c
-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
# gcc -g -Og -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.c
-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o
# gcc -g -Og -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o

# ./perf probe -x /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile foo
Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
   Error: Failed to add events.

Perf probe fails to find symbol foo in the executable placed in
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7

Simple reproduce:

 # mktemp -d /tmp/perf-checkXXXXXXXXXX
   /tmp/perf-checkcWpuLRQI8j

 # gcc -g -o test test.c
 # cp test /tmp/perf-checkcWpuLRQI8j/
 # nm /tmp/perf-checkcWpuLRQI8j/test | grep foo
   00000000100006bc T foo

 # ./perf probe -x /tmp/perf-checkcWpuLRQI8j/test foo
   Failed to find symbol foo in /tmp/perf-checkcWpuLRQI8j/test
      Error: Failed to add events.

But it works with any files like /tmp/perf/test. Only for
patterns with "/tmp/perf-", this fails.

Further debugging, commit 80d496be89ed ("perf report: Add support
for profiling JIT generated code") added support for profiling JIT
generated code. This patch handles dso's of form
"/tmp/perf-$PID.map" .

The check used "if (strncmp(self->name, "/tmp/perf-", 10) == 0)"
to match "/tmp/perf-$PID.map". With this commit, any dso in
/tmp/perf- folder will be considered separately for processing
(not only JIT created map files ). Fix this by changing the
string pattern to check for "/tmp/perf-%d.map". Add a helper
function is_perf_pid_map_name to do this check. In "struct dso",
dso->long_name holds the long name of the dso file. Since the
/tmp/perf-$PID.map check uses the complete name, use dso___long_name for
the string name.

With the fix,
# ./perf test "test perf probe of function from different CU"
117: test perf probe of function from different CU                   : Ok

Fixes: 56cbeacf1435 ("perf probe: Add test for regression introduced by switch to die_get_decl_file()")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-1-atrajeev@linux.vnet.ibm.com
3 months agoperf test: Make test_arm_callgraph_fp.sh more robust
James Clark [Wed, 12 Jun 2024 14:03:14 +0000 (15:03 +0100)]
perf test: Make test_arm_callgraph_fp.sh more robust

The 2 second sleep can cause the test to fail on very slow network file
systems because Perf ends up being killed before it finishes starting
up.

Fix it by making the leafloop workload end after a fixed time like the
other workloads so there is no need to kill it after 2 seconds.

Also remove the 1 second start sampling delay because it is similarly
fragile. Instead, search through all samples for a matching one, rather
than just checking the first sample and hoping it's in the right place.

Fixes: cd6382d82752 ("perf test arm64: Test unwinding using fame-pointer (fp) mode")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240612140316.3006660-1-james.clark@arm.com
3 months agoperf build: Ensure libtraceevent and libtracefs versions have 3 components
Guilherme Amadio [Thu, 6 Jun 2024 15:33:02 +0000 (17:33 +0200)]
perf build: Ensure libtraceevent and libtracefs versions have 3 components

When either of these have a shorter version, like 1.8, the expression
that computes the version has a syntax error that can be seen in the
output of make:

expr: syntax error: missing argument after +

Link: https://bugs.gentoo.org/917559
Reported-by: Peter Volkov <peter.volkov@gmail.com>
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606153625.2255470-3-amadio@gentoo.org
3 months agoperf build: Use pkg-config for feature check for libtrace{event,fs}
Guilherme Amadio [Thu, 6 Jun 2024 15:33:01 +0000 (17:33 +0200)]
perf build: Use pkg-config for feature check for libtrace{event,fs}

Needed to add required include directories for the feature detection
to succeed. The header tracefs.h is installed either into the include
directory /usr/include/tracefs/tracefs.h when using the Makefile, or
into /usr/include/libtracefs/tracefs.h when using meson to build
libtracefs. The header tracefs.h uses #include <event-parse.h> from
libtraceevent, so pkg-config needs to pick the correct include directory
for libtracefs and add the one for libtraceevent to succeed.

Note that in baa2ca59ec1e31ccbe3f24ff0368152b36f68720 the variable
LIBTRACEEVENT_DIR was introduced, and now the method to compile against
non-standard locations requires PKG_CONFIG_PATH to be set instead, which
works for both libtraceevent and libtracefs.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606153625.2255470-2-amadio@gentoo.org
3 months agoperf arm: Workaround ARM PMUs cpu maps having offline cpus
Ian Rogers [Fri, 7 Jun 2024 06:53:43 +0000 (23:53 -0700)]
perf arm: Workaround ARM PMUs cpu maps having offline cpus

When PMUs have a cpu map in the 'cpus' or 'cpumask' file, perf will
try to open events on those CPUs. ARM doesn't remove offline CPUs
meaning taking a CPU offline will cause perf commands to fail unless a
CPU map is passed on the command line.

More context in:
https://lore.kernel.org/lkml/20240603092812.46616-1-yangyicong@huawei.com/

Reported-by: Yicong Yang <yangyicong@huawei.com>
Closes: https://lore.kernel.org/lkml/20240603092812.46616-2-yangyicong@huawei.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607065343.695369-1-irogers@google.com
3 months agoperf stat: Fix the hard-coded metrics calculation on the hybrid
Kan Liang [Thu, 6 Jun 2024 18:03:16 +0000 (11:03 -0700)]
perf stat: Fix the hard-coded metrics calculation on the hybrid

The hard-coded metrics is wrongly calculated on the hybrid machine.

$ perf stat -e cycles,instructions -a sleep 1

 Performance counter stats for 'system wide':

        18,205,487      cpu_atom/cycles/
         9,733,603      cpu_core/cycles/
         9,423,111      cpu_atom/instructions/     #  0.52  insn per cycle
         4,268,965      cpu_core/instructions/     #  0.23  insn per cycle

The insn per cycle for cpu_core should be 4,268,965 / 9,733,603 = 0.44.

When finding the metric events, the find_stat() doesn't take the PMU
type into account. The cpu_atom/cycles/ is wrongly used to calculate
the IPC of the cpu_core.

In the hard-coded metrics, the events from a different PMU are only
SW_CPU_CLOCK and SW_TASK_CLOCK. They both have the stat type,
STAT_NSECS. Except the SW CLOCK events, check the PMU type as well.

Fixes: 0a57b910807a ("perf stat: Use counts rather than saved_value")
Reported-by: Khalil, Amiri <amiri.khalil@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606180316.4122904-1-kan.liang@linux.intel.com
3 months agoperf vendor events: Add westmereex counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:51 +0000 (11:17 -0700)]
perf vendor events: Add westmereex counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-38-irogers@google.com
3 months agoperf vendor events: Add westmereep-sp counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:50 +0000 (11:17 -0700)]
perf vendor events: Add westmereep-sp counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-37-irogers@google.com
3 months agoperf vendor events: Add westmereep-dp counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:49 +0000 (11:17 -0700)]
perf vendor events: Add westmereep-dp counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-36-irogers@google.com
3 months agoperf vendor events: Add/update tigerlake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:48 +0000 (11:17 -0700)]
perf vendor events: Add/update tigerlake events/metrics

Update events from v1.15 to v1.16.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.16:
https://github.com/intel/perfmon/commit/43f3b8d6f82f3174bd3bffe8587e2179f086d2ce

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-35-irogers@google.com
3 months agoperf vendor events: Add snowridgex counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:47 +0000 (11:17 -0700)]
perf vendor events: Add snowridgex counter information

Update/remove events as per v1.23:
https://github.com/intel/perfmon/commit/9debd874e1b2b0cca42b9ba2342cacaaace2f0ce

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-34-irogers@google.com
3 months agoperf vendor events: Add/update skylakex events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:46 +0000 (11:17 -0700)]
perf vendor events: Add/update skylakex events/metrics

Update events from v1.33 to v1.35.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.35:
https://github.com/intel/perfmon/commit/c99b60c147b96f40f96dd961abfae54909f47e5f

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-33-irogers@google.com
3 months agoperf vendor events: Add/update skylake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:45 +0000 (11:17 -0700)]
perf vendor events: Add/update skylake events/metrics

Update events from v58 to v59.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v59:
https://github.com/intel/perfmon/commit/5d36f1835b02f056031a06e777e4bf54a5964930

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-32-irogers@google.com
3 months agoperf vendor events: Add silvermont counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:44 +0000 (11:17 -0700)]
perf vendor events: Add silvermont counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-31-irogers@google.com
3 months agoperf vendor events: Add/update sierraforest events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:43 +0000 (11:17 -0700)]
perf vendor events: Add/update sierraforest events/metrics

Update events from v1.02 to v1.04.
Add TMA metrics v4.8.

Bring in the event updates v1.04:
https://github.com/intel/perfmon/commit/0a9546cdf63c8b07f5c33ebf6fe49e6ebec89f86
v1.03:
https://github.com/intel/perfmon/commit/c7dd26ce67ca4477d40fb4b55b6baa0584b3e5d6

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

New events are:
FP_INST_RETIRED.128B_DP,
FP_INST_RETIRED.128B_SP,
FP_INST_RETIRED.256B_DP,
FP_INST_RETIRED.32B_SP,
FP_INST_RETIRED.64B_DP,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD,
OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM,
OCR.STREAMING_WR.ANY_RESPONSE,
UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_LOCAL,
UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_REMOTE,
UNC_CHA_TOR_INSERTS.IO_ITOM_LOCAL,
UNC_CHA_TOR_INSERTS.IO_ITOM_REMOTE,
UNC_CHA_TOR_INSERTS.IO_MISS,
UNC_CHA_TOR_INSERTS.IO_MISS_ITOM,
UNC_CHA_TOR_INSERTS.IO_MISS_ITOMCACHENEAR,
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL,
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE,
UNC_CXLCM_RxC_PACK_BUF_INSERTS.MEM_DATA,
UNC_CXLDP_TxC_AGF_INSERTS.M2S_DATA.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-30-irogers@google.com
3 months agoperf vendor events: Add/update sapphirerapids events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:42 +0000 (11:17 -0700)]
perf vendor events: Add/update sapphirerapids events/metrics

Update events from v1.20 to v1.23.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.23:
https://github.com/intel/perfmon/commit/6ace93281c0f573b90d3f8f624486ad59dde1c93
v1.22:
https://github.com/intel/perfmon/commit/356eba05c07c4d54ed5b92c1164ce00fab545636

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SW_PREFETCH_ACCESS.ANY,
UOPS_ISSUED.CYCLES.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-29-irogers@google.com
3 months agoperf vendor events: Update sandybridge metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:41 +0000 (11:17 -0700)]
perf vendor events: Update sandybridge metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-28-irogers@google.com
3 months agoperf vendor events: Add/update rocketlake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:40 +0000 (11:17 -0700)]
perf vendor events: Add/update rocketlake events/metrics

Update events from v1.02 to v1.03.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.03:
https://github.com/intel/perfmon/commit/a7c75ffd56c7056494cd3acc2749336cd6363b90

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-27-irogers@google.com
3 months agoperf vendor events: Add nehalemex counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:39 +0000 (11:17 -0700)]
perf vendor events: Add nehalemex counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-26-irogers@google.com
3 months agoperf vendor events: Add nehalemep counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:38 +0000 (11:17 -0700)]
perf vendor events: Add nehalemep counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-25-irogers@google.com
3 months agoperf vendor events: Update meteorlake events and add counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:37 +0000 (11:17 -0700)]
perf vendor events: Update meteorlake events and add counter information

Update events from v1.08 to v1.10.

Bring in the event updates v1.10:
https://github.com/intel/perfmon/commit/3bee3dc150164df0bec5980ca5586930730e5778
v1.09:
https://github.com/intel/perfmon/commit/01c8c99f17a72460b2eaf7efe3495913f36c9d42

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
FP_INST_RETIRED.128B_DP,
FP_INST_RETIRED.128B_SP,
FP_INST_RETIRED.256B_DP,
FP_INST_RETIRED.32B_SP,
FP_INST_RETIRED.64B_DP,
FP_VINT_UOPS_EXECUTED.STD,
L2_LINES_OUT.USELESS_HWPF,
L2_RQSTS.SWPF_HIT,
L2_RQSTS.SWPF_MISS,
LOAD_HIT_PREFETCH.SWPF,
MACHINE_CLEARS.ANY,
MACHINE_CLEARS.MRN_NUKE,
MISC_RETIRED.LBR_INSERTS,
SW_PREFETCH_ACCESS.ANY.

The metrics aren't updated as they require retirement latency support
that is added in this series:
https://lore.kernel.org/lkml/20240613033631.199800-1-weilin.wang@intel.com/

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-24-irogers@google.com
3 months agoperf vendor events: Add lunarlake counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:36 +0000 (11:17 -0700)]
perf vendor events: Add lunarlake counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-23-irogers@google.com
3 months agoperf vendor events: Add knightslanding counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:35 +0000 (11:17 -0700)]
perf vendor events: Add knightslanding counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-22-irogers@google.com
3 months agoperf vendor events: Update jaketown metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:34 +0000 (11:17 -0700)]
perf vendor events: Update jaketown metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-21-irogers@google.com
3 months agoperf vendor events: Update ivytown metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:33 +0000 (11:17 -0700)]
perf vendor events: Update ivytown metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-20-irogers@google.com
3 months agoperf vendor events: Update ivybridge metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:32 +0000 (11:17 -0700)]
perf vendor events: Update ivybridge metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-19-irogers@google.com
3 months agoperf vendor events: Add/update icelakex events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:31 +0000 (11:17 -0700)]
perf vendor events: Add/update icelakex events/metrics

Update events from v1.24 to v1.26.
Add TMA metrics v4.8.

Bring in the event updates v1.26:
https://github.com/intel/perfmon/commit/c607c739e05f2569f95998cc98e1283f042b4fd1
v1.25:
https://github.com/intel/perfmon/commit/42d996769069921ec06f6fbb600b0c663b9ec5a9

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-18-irogers@google.com
3 months agoperf vendor events: Add/update icelake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:30 +0000 (11:17 -0700)]
perf vendor events: Add/update icelake events/metrics

Update events from v1.21 to v1.22.
Add TMA metrics v4.8.

Bring in the event updates v1.22:
https://github.com/intel/perfmon/commit/e5640646e96d59e3c1c1e0d0100a475220ff1dfe

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-17-irogers@google.com
3 months agoperf vendor events: Update haswellx metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:29 +0000 (11:17 -0700)]
perf vendor events: Update haswellx metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-16-irogers@google.com
3 months agoperf vendor events: Add haswell counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:28 +0000 (11:17 -0700)]
perf vendor events: Add haswell counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-15-irogers@google.com
3 months agoperf vendor events: Update graniterapids events and add counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:27 +0000 (11:17 -0700)]
perf vendor events: Update graniterapids events and add counter information

Update events from v1.01 to v1.02.

Bring in the event updates v1.02:
https://github.com/intel/perfmon/commit/0ff9f681bd07d0e84026c52f4941d21b1cd4c171

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

There are over 1000 new events.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-14-irogers@google.com
3 months agoperf vendor events: Update/add grandridge events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:26 +0000 (11:17 -0700)]
perf vendor events: Update/add grandridge events/metrics

Update events from v1.02 to v1.03.
Add TMA metrics v4.8.

Bring in the event updates v1.03:
https://github.com/intel/perfmon/commit/5ec7a252d0f6ec461f80cc397c9ac25abcd9184f

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

New events are:
FP_INST_RETIRED.128B_DP,
FP_INST_RETIRED.128B_SP,
FP_INST_RETIRED.256B_DP,
FP_INST_RETIRED.32B_SP,
FP_INST_RETIRED.64B_DP,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD,
OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM,
OCR.STREAMING_WR.ANY_RESPONSE.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-13-irogers@google.com
3 months agoperf vendor events: Add goldmontplus counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:25 +0000 (11:17 -0700)]
perf vendor events: Add goldmontplus counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-12-irogers@google.com
3 months agoperf vendor events: Add goldmont counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:24 +0000 (11:17 -0700)]
perf vendor events: Add goldmont counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-11-irogers@google.com
3 months agoperf vendor events: Add/update emeraldrapids events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:23 +0000 (11:17 -0700)]
perf vendor events: Add/update emeraldrapids events/metrics

Update events from v1.06 to v1.09.
Add TMA metrics v4.8.

Bring in the event updates v1.09:
https://github.com/intel/perfmon/commit/3fd5892bb4aece9c1e5c17630570d0462838e85d
v1.08:
https://github.com/intel/perfmon/commit/54525c4508f4a1ce4a8b854aa808a4ee2fb5930b

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SW_PREFETCH_ACCESS.ANY,
UNC_IIO_BANDWIDTH_OUT.PART[0-7]_FREERUN,
UOPS_ISSUED.CYCLES.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-10-irogers@google.com
3 months agoperf vendor events: Update elkhartlake events
Ian Rogers [Thu, 20 Jun 2024 18:17:22 +0000 (11:17 -0700)]
perf vendor events: Update elkhartlake events

Update events from v1.04 to v1.05. Bring in event updates from:
https://github.com/intel/perfmon/commit/fb91e1851ca40a5b443e2c3cd79bc7fc34c8237e

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-9-irogers@google.com
3 months agoperf vendor events: Update cascadelakex events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:21 +0000 (11:17 -0700)]
perf vendor events: Update cascadelakex events/metrics

Update events from v1.21 to v1.22.

Bring in the event updates v1.22
https://github.com/intel/perfmon/commit/013877729c4ed96427932ca48722bc3bfd2a0075

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

New events are:
SW_PREFETCH_ACCESS.ANY

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-8-irogers@google.com
3 months agoperf vendor events: Update broadwellx metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:20 +0000 (11:17 -0700)]
perf vendor events: Update broadwellx metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-7-irogers@google.com
3 months agoperf vendor events: Update broadwellde metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:19 +0000 (11:17 -0700)]
perf vendor events: Update broadwellde metrics add event counter information

Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-6-irogers@google.com