linux-2.6-microblaze.git
Kalesh Singh [Mon, 25 Oct 2021 20:08:35 +0000 (13:08 -0700)]
tracing: Fix operator precedence for hist triggers expression

The current histogram expression evaluation logic evaluates the
expression from right to left. This can lead to incorrect results
if the operations are not associative (as is the case for subtraction
and the newly added division operator).
e.g. 16-8-4-2 should be 2 not 10 --> 16-8-4-2 = ((16-8)-4)-2
     64/8/4/2 should be 1 not 16 --> 64/8/4/2 = ((64/8)/4)/2

Division and multiplication are currently limited to single-operation
expressions because operator precedence is not yet supported.

Rework the expression parsing to support the correct evaluation of
expressions containing operators of different precedences; and fix
the associativity error by evaluating expressions with operators of
the same precedence from left to right.

Examples:
        (1) echo 'hist:keys=common_pid:a=8,b=4,c=2,d=1,w=$a-$b-$c-$d' \
                  >> event/trigger
        (2) echo 'hist:keys=common_pid:x=$a/$b/3/2' >> event/trigger
        (3) echo 'hist:keys=common_pid:y=$a+10/$c*1024' >> event/trigger
        (4) echo 'hist:keys=common_pid:z=$a/$b+$c*$d' >> event/trigger
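
The difference between the two evaluation orders can be sketched outside the kernel. The following is a minimal illustration in Python, not the kernel parser: the function names are hypothetical, only non-negative integer literals are handled, and division is modeled as integer division.

```python
import re

# Binary operators; "/" is modeled as integer division (an assumption).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,
}


def tokenize(expr):
    """Split an expression such as '16-8-4-2' into numbers and operators."""
    return re.findall(r"\d+|[-+*/]", expr)


def eval_right_to_left(expr):
    """Old behaviour: a op (b op (c op d)), ignoring precedence."""
    tokens = tokenize(expr)
    value = int(tokens[-1])
    for i in range(len(tokens) - 2, 0, -2):
        value = OPS[tokens[i]](int(tokens[i - 1]), value)
    return value


def eval_with_precedence(expr):
    """New behaviour: '*' and '/' bind tighter than '+' and '-',
    and operators of equal precedence are applied left to right."""
    tokens = tokenize(expr)
    pos = 0

    def term():
        nonlocal pos
        value = int(tokens[pos])
        pos += 1
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            value = OPS[tokens[pos]](value, int(tokens[pos + 1]))
            pos += 2
        return value

    value = term()
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        value = OPS[op](value, term())
    return value


print(eval_right_to_left("16-8-4-2"), eval_with_precedence("16-8-4-2"))
print(eval_right_to_left("64/8/4/2"), eval_with_precedence("64/8/4/2"))
```

With a=8 and c=2 substituted by hand, example (3) corresponds to `eval_with_precedence("8+10/2*1024")`, which this sketch evaluates as 8 + ((10/2)*1024).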

Link: https://lkml.kernel.org/r/20211025200852.3002369-4-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Kalesh Singh [Mon, 25 Oct 2021 20:08:34 +0000 (13:08 -0700)]
tracing: Add division and multiplication support for hist triggers

Adds basic support for division and multiplication operations for
hist trigger variable expressions.

For simplicity, this patch only supports division and multiplication
in single-operation expressions (e.g. x=$a/$b), because expressions
are currently always evaluated right to left. This can lead to some
incorrect results:

e.g. echo 'hist:keys=common_pid:x=8-4-2' >> event/trigger

     8-4-2 should evaluate to 2 i.e. (8-4)-2
     but currently x evaluates to 6 i.e. 8-(4-2)

Multiplication and division in sub-expressions will work correctly, once
correct operator precedence support is added (See next patch in this
series).

For the undefined case of division by 0, the histogram expression
evaluates to (u64)(-1). Since this cannot be detected when the
expression is created, it is the responsibility of the user to be
aware and account for this possibility.
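
As a rough illustration of that convention, here is a minimal Python sketch. The helper name `hist_div` is hypothetical and is not a kernel function; it only models the behaviour described above, where a zero divisor yields (u64)(-1), i.e. all 64 bits set.

```python
U64_MAX = (1 << 64) - 1  # the value of (u64)(-1)


def hist_div(numerator, denominator):
    """Model of the described division rule (hypothetical helper).

    Returns U64_MAX when the divisor is zero, otherwise the
    integer quotient truncated to 64 bits.
    """
    if denominator == 0:
        return U64_MAX
    return (numerator // denominator) & U64_MAX


print(hist_div(8, 4))
print(hist_div(8, 0))
```

A consumer of the histogram would therefore have to treat 18446744073709551615 as a possible "division by zero" marker rather than as a real quotient.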

Examples:
echo 'hist:keys=common_pid:a=8,b=4,x=$a/$b' \
                   >> event/trigger

echo 'hist:keys=common_pid:y=5*$b' \
                   >> event/trigger

Link: https://lkml.kernel.org/r/20211025200852.3002369-3-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Kalesh Singh [Mon, 25 Oct 2021 20:08:33 +0000 (13:08 -0700)]
tracing: Add support for creating hist trigger variables from literal

Currently hist trigger expressions don't support the use of numeric
literals:
e.g. echo 'hist:keys=common_pid:x=$y-1234'
--> is not valid expression syntax

Having the ability to use numeric constants in hist triggers supports
a wider range of expressions for creating variables.

Add support for creating trace event histogram variables from numeric
literals.

e.g. echo 'hist:keys=common_pid:x=1234,y=size-1024' >> event/trigger

A negative numeric constant is created using the unary minus operator
(parentheses are required).

e.g. echo 'hist:keys=common_pid:z=-(2)' >> event/trigger

Constants can be used with division/multiplication (added in the
next patch in this series) to implement granularity filters for frequent
trace events. For instance we can limit emitting the rss_stat
trace event to when there is a 512KB cross over in the rss size:

  # Create a synthetic event to monitor instead of the high frequency
  # rss_stat event
  echo 'rss_stat_throttled unsigned int mm_id; unsigned int curr; int member; long size' >> tracing/synthetic_events

  # Create a hist trigger that emits the synthetic rss_stat_throttled
  # event only when the rss size crosses a 512KB boundary.
  echo 'hist:keys=mm_id,member:bucket=size/0x80000:onchange($bucket).rss_stat_throttled(mm_id,curr,member,size)' >> events/kmem/rss_stat/trigger

A use case for using constants with addition/subtraction is not yet
known, but for completeness the use of constants is supported for all
operators.
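
The granularity filter above can be modeled in a few lines of Python. This is only a sketch of the idea (bucket = size/0x80000, emit when the bucket changes); the function name is hypothetical, a single key is assumed instead of per-(mm_id, member) tracking, and the first observed sample is assumed to count as a change, which may differ from the kernel's onchange handling.

```python
BUCKET_SIZE = 0x80000  # 512KB, as in the trigger above


def throttle(sizes):
    """Return only the samples whose 512KB bucket differs from the
    bucket of the previous sample (hypothetical helper)."""
    emitted = []
    last_bucket = None
    for size in sizes:
        bucket = size // BUCKET_SIZE
        if bucket != last_bucket:
            emitted.append(size)
            last_bucket = bucket
    return emitted


samples = [0x10000, 0x20000, 0x80000, 0x90000, 0x100000, 0x7FFFF]
print([hex(s) for s in throttle(samples)])
```

Of the six samples, only those that land in a different bucket than their predecessor are kept, which is what reduces the event rate.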

Link: https://lkml.kernel.org/r/20211025200852.3002369-2-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Tue, 26 Oct 2021 23:22:11 +0000 (08:22 +0900)]
selftests/ftrace: Stop tracing while reading the trace file by default

Stop tracing while reading the trace file by default, to prevent
the test results from changing while they are being checked and to
avoid taking a long time to check the result.
If a testcase needs to test tracing while reading the trace file,
it should override this setting inside the test case.

This also restores the pause-on-trace setting during cleanup.

Link: https://lkml.kernel.org/r/163529053143.690749.15365238954175942026.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tiezhu Yang [Tue, 26 Oct 2021 01:51:31 +0000 (09:51 +0800)]
MAINTAINERS: Update KPROBES and TRACING entries

There is no git tree for KPROBES in MAINTAINERS, which makes it
inconvenient to rebase. lib/test_kprobes.c and samples/kprobes also
belong to kprobes. Add the git tree and the missing files for KPROBES,
and use linux-trace.git for TRACING to avoid confusion.

Link: https://lkml.kernel.org/r/1635213091-24387-5-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tiezhu Yang [Tue, 26 Oct 2021 01:51:30 +0000 (09:51 +0800)]
test_kprobes: Move it from kernel/ to lib/

Since config KPROBES_SANITY_TEST is in lib/Kconfig.debug, it is better to
place test_kprobes.c in lib/, just like other similar tests found in lib/.

Link: https://lkml.kernel.org/r/1635213091-24387-4-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tiezhu Yang [Tue, 26 Oct 2021 01:51:29 +0000 (09:51 +0800)]
docs, kprobes: Remove invalid URL and add new reference

The following reference is invalid, remove it.
https://www.ibm.com/developerworks/library/l-kprobes/index.html

Add the following new reference "An introduction to KProbes":
https://lwn.net/Articles/132196/

Link: https://lkml.kernel.org/r/1635213091-24387-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tiezhu Yang [Tue, 26 Oct 2021 01:51:28 +0000 (09:51 +0800)]
samples/kretprobes: Fix return value if register_kretprobe() failed

Use the actual return value instead of always returning -1 if
register_kretprobe() fails.

E.g. without this patch:

 # insmod samples/kprobes/kretprobe_example.ko func=no_such_func
 insmod: ERROR: could not insert module samples/kprobes/kretprobe_example.ko: Operation not permitted

With this patch:

 # insmod samples/kprobes/kretprobe_example.ko func=no_such_func
 insmod: ERROR: could not insert module samples/kprobes/kretprobe_example.ko: Unknown symbol in module

Link: https://lkml.kernel.org/r/1635213091-24387-2-git-send-email-yangtiezhu@loongson.cn
Fixes: 804defea1c02 ("Kprobes: move kprobe examples to samples/")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Tue, 26 Oct 2021 12:21:07 +0000 (21:21 +0900)]
lib/bootconfig: Fix the xbc_get_info kerneldoc

Fix the kernel doc of xbc_get_info() to add '@' to the parameters.

Link: https://lkml.kernel.org/r/163525086738.676803.15352231787913236933.stgit@devnote2
Fixes: e306220cb7b7 ("bootconfig: Add xbc_get_info() for the node information")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 25 Oct 2021 11:41:52 +0000 (20:41 +0900)]
kprobes: Add a test case for stacktrace from kretprobe handler

Add a test case for stacktrace from kretprobe handler and
nested kretprobe handlers.

This test checks both the stack trace taken inside the kretprobe
handler and the stack trace taken from pt_regs. Those stack traces
must include the actual function return address instead of the
kretprobe trampoline.
The nested kretprobe stacktrace test checks whether the unwinder
can correctly unwind a call frame on the stack that has been
modified by the kretprobe.

Since the stacktrace on kretprobe is correctly fixed only on x86,
this introduces a meta kconfig ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
which tells the user whether the stacktrace on kretprobe is correct.

The test results will be shown as below:

 TAP version 14
 1..1
     # Subtest: kprobes_test
     1..6
     ok 1 - test_kprobe
     ok 2 - test_kprobes
     ok 3 - test_kretprobe
     ok 4 - test_kretprobes
     ok 5 - test_stacktrace_on_kretprobe
     ok 6 - test_stacktrace_on_nested_kretprobe
 # kprobes_test: pass:6 fail:0 skip:0 total:6
 # Totals: pass:6 fail:0 skip:0 total:6
 ok 1 - kprobes_test

Link: https://lkml.kernel.org/r/163516211244.604541.18350507860972214415.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 25 Oct 2021 08:32:37 +0000 (17:32 +0900)]
lib/bootconfig: Make xbc_alloc_mem() and xbc_free_mem() as __init function

Since xbc_alloc_mem() and xbc_free_mem() are used from
__init functions and memblock_alloc() is an __init function,
make them __init functions too.

Link: https://lkml.kernel.org/r/163515075747.547467.5746167540626712819.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 4ee1b4cac236 ("bootconfig: Cleanup dummy headers in tools/bootconfig")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Tue, 26 Oct 2021 21:23:36 +0000 (17:23 -0400)]
ftrace/sh: Add arch_ftrace_ops_list_func stub to have compressed image still link

The linker script is used to fix the following issue. Some archs call
the function tracer with just the ip (instruction pointer) and pip
(parent instruction pointer), whereas more up-to-date archs also pass
in the associated ftrace_ops and the ftrace_regs pointer, so the
generic code is called with either two parameters or four. To avoid
the C undefined behavior of calling a two-parameter function with four
arguments, or a four-parameter function with two, two functions were
created, and a preprocessor macro selects the one that matches the
architecture. As the function pointers for them may be different, a
typecast was used. But this triggers issues with newer compilers,
which fail the build due to -Werror.

A linker trick is now used to map the generic function to the function
that is used (note the generic function is only used to set the default
function callback). The linker trick defines ftrace_ops_list_func (the
generic function) to arch_ftrace_ops_list_func (the arch defined one).

Link: https://lore.kernel.org/all/20200617165616.52241bde@oasis.local.home/
But this fails on the sh arch because its linker script is also
included in its compressed image, which does not define
arch_ftrace_ops_list_func at all:
  sh4-linux-ld:arch/sh/boot/compressed/../../kernel/vmlinux.lds:32: undefined symbol `arch_ftrace_ops_list_func' referenced in expression

Include a stub by that name in misc.c to allow the code to
compile and link, even though it's not used.

This is similar to what was done for ftrace_stub:

  b83b43ffc6e4b ("fgraph: Fix function type mismatches of
  ftrace_graph_return using ftrace_stub")

Link: https://lkml.kernel.org/r/20211021221627.5d7270de@rorschach.local.home
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Wang ShaoBo [Thu, 21 Oct 2021 03:52:25 +0000 (11:52 +0800)]
tracing/hwlat: Make some internal symbols static

The sparse tool complains as follows:

kernel/trace/trace_hwlat.c:82:27: warning: symbol 'hwlat_single_cpu_data' was not declared. Should it be static?
kernel/trace/trace_hwlat.c:83:1: warning: symbol '__pcpu_scope_hwlat_per_cpu_data' was not declared. Should it be static?

These symbols are not used outside of trace_hwlat.c, so this commit
marks them static.

Link: https://lkml.kernel.org/r/20211021035225.1050685-1-bobo.shaobowang@huawei.com
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Mathieu Desnoyers [Fri, 15 Oct 2021 19:55:50 +0000 (15:55 -0400)]
tracing: Fix missing trace_boot_init_histograms kstrdup NULL checks

trace_boot_init_histograms misses NULL pointer checks for kstrdup
failure.

Link: https://lkml.kernel.org/r/20211015195550.22742-1-mathieu.desnoyers@efficios.com
Fixes: 64dc7f6958ef5 ("tracing/boot: Show correct histogram error command")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 15 Oct 2021 15:07:51 +0000 (17:07 +0200)]
trace/timerlat: Add migrate-disabled field to the timerlat header

Since "54357f0c9149 tracing: Add migrate-disabled counter to tracing
output," the migrate disabled field is also printed in the !PREEMPT_RT
kernel config. While this information was added to the vast majority of
tracers, osnoise and timerlat were not updated (because they are new
tracers).

Fix timerlat header by adding the information about migrate disabled.

Link: https://lkml.kernel.org/r/bc0c234ab49946cdd63effa6584e1d5e8662cb44.1634308385.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: 54357f0c9149 ("tracing: Add migrate-disabled counter to tracing output.")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 15 Oct 2021 15:07:50 +0000 (17:07 +0200)]
trace/osnoise: Add migrate-disabled field to the osnoise header

Since "54357f0c9149 tracing: Add migrate-disabled counter to tracing
output," the migrate disabled field is also printed in the !PREEMPT_RT
kernel config. While this information was added to the vast majority of
tracers, osnoise and timerlat were not updated (because they are new
tracers).

Fix osnoise header by adding the information about migrate disabled.

Link: https://lkml.kernel.org/r/9cb3d54e29e0588dbba12e81486bd8a09adcd8ca.1634308385.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: 54357f0c9149 ("tracing: Add migrate-disabled counter to tracing output.")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 15 Oct 2021 15:07:49 +0000 (17:07 +0200)]
tracing/doc: Fix typos on the timerlat tracer documentation

Fixes a series of typos in the timerlat doc.

Link: https://lkml.kernel.org/r/d3763eb376603890baab908141de6660ba18fff8.1634308385.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: a955d7eac177 ("trace: Add timerlat tracer")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 15 Oct 2021 15:07:48 +0000 (17:07 +0200)]
trace/osnoise: Fix an ifdef comment

s/CONFIG_OSNOISE_TRAECR/CONFIG_OSNOISE_TRACER/

No functional changes.

Link: https://lkml.kernel.org/r/33924a16f6e5559ce24952ca7d62561604bfd94a.1634308385.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Song Liu [Wed, 6 Oct 2021 21:07:32 +0000 (14:07 -0700)]
perf/core: allow ftrace for functions in kernel/event/core.c

It is useful to trace functions in kernel/event/core.c. Allow ftrace for
them by removing $(CC_FLAGS_FTRACE) from Makefile.

Link: https://lkml.kernel.org/r/20211006210732.2826289-1-songliubraving@fb.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Viktor Rosendahl [Tue, 19 Oct 2021 16:07:01 +0000 (18:07 +0200)]
tools/latency-collector: Use correct size when writing queue_full_warning

queue_full_warning is a pointer, so it is wrong to use sizeof to calculate
the number of characters of the string it points to. The effect is that we
only print out the first few characters of the warning string.

The correct way is to use strlen(). We don't need to add 1 to the strlen()
because we don't want to write the terminating null character to stdout.
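
The pitfall can be reproduced from Python with ctypes, which exposes C object sizes. This is only an illustration of the sizeof-versus-strlen distinction, not the latency-collector code, and the warning text below is a made-up placeholder.

```python
import ctypes

# Hypothetical warning text; the real string in latency-collector differs.
warning = b"Could not queue trace for printing, the queue is full\n"

# A C 'char *' pointing at the string.
ptr = ctypes.c_char_p(warning)

# sizeof on the pointer gives the size of the pointer itself
# (typically 4 or 8 bytes), not the length of the string.
pointer_size = ctypes.sizeof(ptr)

# The equivalent of strlen(): bytes up to, but not including, the NUL.
string_length = len(ptr.value)

print(pointer_size, string_length)
```

Writing `pointer_size` bytes would therefore emit only the first few characters, which matches the symptom described above.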

Link: https://lkml.kernel.org/r/20211019160701.15587-1-Viktor.Rosendahl@bmw.de
Link: https://lore.kernel.org/r/8fd4bb65ef3da67feac9ce3258cdbe9824752cf1.1629198502.git.jing.yangyang@zte.com.cn
Link: https://lore.kernel.org/r/20211012025424.180781-1-davidcomponentone@gmail.com
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
chongjiapeng [Tue, 19 Oct 2021 10:48:54 +0000 (18:48 +0800)]
ftrace: Make ftrace_profile_pages_init static

This symbol is not used outside of ftrace.c, so mark it static.

Fixes the following sparse warning:

kernel/trace/ftrace.c:579:5: warning: symbol 'ftrace_profile_pages_init'
was not declared. Should it be static?

Link: https://lkml.kernel.org/r/1634640534-18280-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: cafb168a1c92 ("tracing: make the function profiler per cpu")
Signed-off-by: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:55:35 +0000 (09:55 +0900)]
ARM: Recover kretprobe modified return address in stacktrace

Since the kretprobe replaces the function return address with
the kretprobe_trampoline on the stack, arm unwinder shows it
instead of the correct return address.

This finds the correct return address from the per-task
kretprobe_instances list and verifies that it is between the
caller fp and callee fp.

Note that this supports both GCC and clang if CONFIG_FRAME_POINTER=y
and CONFIG_ARM_UNWIND=n. For the ARM unwinder, this is still
not working correctly.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:55:26 +0000 (09:55 +0900)]
ARM: kprobes: Make a frame pointer on __kretprobe_trampoline

Currently kretprobe on ARM just fills r0-r11 of pt_regs, but
that is not enough for the stacktrace. Moreover, from the user
kretprobe handler, stacktrace needs a frame pointer on the
__kretprobe_trampoline.

This adds a frame pointer on __kretprobe_trampoline for both the gcc
and clang cases. The two compilers use different frame pointer
layouts, so each needs a different but similar stack on pt_regs.

Gcc makes the frame pointer (fp) point to the 'pc' address of
{fp, ip (=sp), lr, pc}, which means {r11, r13, r14, r15}.
Thus if we save r11 (fp) in pt_regs->r12, we can form this
set at the end of pt_regs.

On the other hand, Clang makes the frame pointer point to the
'fp' address of {fp, lr} on the stack. Since pt_regs->sp sits next
to pt_regs->lr, I reused the pair of pt_regs->fp
and pt_regs->ip.
So this stores 'lr' in pt_regs->ip and makes the fp point to
pt_regs->fp.

In both cases, this saves the __kretprobe_trampoline address to
pt_regs->lr, so that the stack tracer can identify that this frame
pointer has been made by the __kretprobe_trampoline.

Note that if the CONFIG_FRAME_POINTER is not set, this keeps
fp as is.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:55:17 +0000 (09:55 +0900)]
ARM: clang: Do not rely on lr register for stacktrace

Currently the stacktrace on a clang-compiled arm kernel uses the 'lr'
register to find the first frame address from pt_regs. However, that
is wrong after calling another function, because the 'lr' register
is used by the 'bl' instruction and is never recovered.

As with the gcc-compiled arm kernel, directly use the frame pointer
(r11) of the pt_regs to find the first frame address.

Note that this fixes kretprobe stacktrace issue only with
CONFIG_UNWINDER_FRAME_POINTER=y. For the CONFIG_UNWINDER_ARM,
we need another fix.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:55:09 +0000 (09:55 +0900)]
arm64: Recover kretprobe modified return address in stacktrace

Since the kretprobe replaces the function return address with
the kretprobe_trampoline on the stack, stack unwinder shows it
instead of the correct return address.

This checks whether the next return address is the
__kretprobe_trampoline(), and if so, tries to find the correct
return address from the kretprobe instance list. For this purpose
this adds a 'kr_cur' loop cursor to remember the current kretprobe
instance.
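
The idea of walking the stack with a cursor into the kretprobe instance list can be sketched as follows. This is a simplified user-space model in Python, not the arm64 unwinder: the trampoline address and function name are hypothetical, and the real code also checks frame pointers when matching an instance.

```python
# Hypothetical address standing in for __kretprobe_trampoline.
TRAMPOLINE = 0xFFFF0000


def recover_return_addresses(stack_return_addrs, kretprobe_instances):
    """Replace trampoline entries with the saved real return addresses.

    stack_return_addrs:  return addresses as found while unwinding,
                         innermost frame first.
    kretprobe_instances: real return addresses saved by kretprobe,
                         most recent instance first.
    """
    cursor = 0  # plays the role of the 'kr_cur' loop cursor
    fixed = []
    for addr in stack_return_addrs:
        if addr == TRAMPOLINE and cursor < len(kretprobe_instances):
            addr = kretprobe_instances[cursor]
            cursor += 1  # the next trampoline hit uses the next instance
        fixed.append(addr)
    return fixed


trace = [0x1000, TRAMPOLINE, 0x2000, TRAMPOLINE]
print([hex(a) for a in recover_return_addresses(trace, [0xA000, 0xB000])])
```

Keeping the cursor across iterations is what lets nested kretprobes resolve to distinct return addresses instead of all mapping to the first saved one.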

With this fix, now arm64 can enable
CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE, and pass the
kprobe self tests.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:55:00 +0000 (09:55 +0900)]
arm64: kprobes: Make a frame pointer on __kretprobe_trampoline

Make a frame pointer (make the x29 register point to the
address of pt_regs->regs[29]) on __kretprobe_trampoline.

This frame pointer will be used by the stacktracer when it is
called from the kretprobe handlers. In this case, the stack
tracer will unwind stack to trampoline_probe_handler() and
find the next frame pointer in the stack frame of the
__kretprobe_trampoline().

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:54:51 +0000 (09:54 +0900)]
arm64: kprobes: Record frame pointer with kretprobe instance

Record the frame pointer instead of the stack address with the
kretprobe instance as the identifier on the instance list.
Since arm64 always enables CONFIG_FRAME_POINTER, we can use the
actual frame pointer (x29).

This will allow the stacktrace code to find the original return
address from the FP alone.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Thu, 21 Oct 2021 00:54:42 +0000 (09:54 +0900)]
x86/unwind: Compile kretprobe fixup code only if CONFIG_KRETPROBES=y

Compile kretprobe related stacktrace entry recovery code and
unwind_state::kr_cur field only when CONFIG_KRETPROBES=y.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Sven Schnelle [Thu, 21 Oct 2021 00:54:24 +0000 (09:54 +0900)]
kprobes: convert tests to kunit

This converts the kprobes testcases to use the kunit framework.
It adds a dependency on CONFIG_KUNIT, and the output will change
to TAP:

TAP version 14
1..1
    # Subtest: kprobes_test
    1..4
random: crng init done
    ok 1 - test_kprobe
    ok 2 - test_kprobes
    ok 3 - test_kretprobe
    ok 4 - test_kretprobes
ok 1 - kprobes_test

Note that the kprobes testcases are no longer run immediately after
kprobes initialization, but as a late initcall when kunit is
initialized. kprobes itself is initialized with an early initcall,
so the order is still correct.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Arnd Bergmann [Tue, 19 Oct 2021 15:33:13 +0000 (17:33 +0200)]
tracing: use %ps format string to print symbols

clang started warning about excessive stack usage in
hist_trigger_print_key()

kernel/trace/trace_events_hist.c:4723:13: error: stack frame size (1336) exceeds limit (1024) in function 'hist_trigger_print_key' [-Werror,-Wframe-larger-than]

The problem is that there are two 512-byte arrays on the stack if
hist_trigger_stacktrace_print() gets inlined. I don't think this has
changed in the past five years, but something probably changed the
inlining decisions made by the compiler, so the problem is now made
more obvious.

Rather than printing the symbol names into separate buffers, it
seems we can simply use the special %ps format string modifier
to print the pointers symbolically and get rid of both buffers.

Marking hist_trigger_stacktrace_print() as noinline would be a
simpler way of avoiding the warning, but that would not address the
excessive stack usage.

Link: https://lkml.kernel.org/r/20211019153337.294790-1-arnd@kernel.org
Fixes: 69a0200c2e25 ("tracing: Add hist trigger support for stacktraces as keys")
Link: https://lore.kernel.org/all/20211015095704.49a99859@gandalf.local.home/
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Tue, 19 Oct 2021 13:25:20 +0000 (09:25 -0400)]
tracing: Explain the trace recursion transition bit better

The current text of the explanation of the transition bit in the trace
recursion protection is not very clear. Improve the text, so that it is
clear the bit may be removed once all the archs no longer have the
issue of tracing between the start of a new (interrupt) context and the
update of preempt_count to reflect the new context.

Link: https://lore.kernel.org/all/20211018220203.064a42ed@gandalf.local.home/
Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Thu, 14 Oct 2021 20:11:14 +0000 (16:11 -0400)]
ftrace/direct: Do not disable when switching direct callers

Currently to switch a set of "multi" direct trampolines from one
trampoline to another, a full shutdown of the current set needs to be
done, followed by an update to what trampoline the direct callers would
call, and then re-enabling the callers. This leaves a time when the
functions will not be calling anything, and events may be missed.

Instead, use a trick so that all the functions with direct trampolines
attached will always call either the new or old trampoline while the
switch is happening. To do this, first attach a "dummy" callback via
ftrace to all the functions that the current direct trampoline is attached
to. This will cause the functions to call the "list func" instead of the
direct trampoline. The list function will call the direct trampoline
"helper" that will set the function it should call as it returns back to
the ftrace trampoline.

At this moment, the direct caller descriptor can safely update the direct
call trampoline. The list function will pick either the new or old
function (depending on the memory coherency model of the architecture).

Removing the dummy function from each of the locations of the direct
trampoline caller then puts back the direct call, but now to the new
trampoline.

A better visual is:

[ Changing direct call from my_direct_1 to my_direct_2 ]

  <traced_func>:
     call my_direct_1

 ||||||||||||||||||||
 vvvvvvvvvvvvvvvvvvvv

  <traced_func>:
     call ftrace_caller

  <ftrace_caller>:
    [..]
    call ftrace_ops_list_func

ftrace_ops_list_func()
{
ops->func() -> direct_helper -> set rax to my_direct_1 or my_direct_2
}

   call rax (to either my_direct_1 or my_direct_2)

 ||||||||||||||||||||
 vvvvvvvvvvvvvvvvvvvv

  <traced_func>:
     call my_direct_2
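
The three steps in the diagram can be modelled in user space with plain
function pointers. This is only a sketch: call_site, direct_addr,
list_func() and switch_direct() are hypothetical names standing in for the
patched call instruction, the direct caller descriptor and
ftrace_ops_list_func() plus the direct helper. It ignores concurrency and
code patching entirely.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*tramp_fn)(void);

/* Hypothetical stand-ins for the two direct trampolines. */
int my_direct_1(void) { return 1; }
int my_direct_2(void) { return 2; }

/* Models what the direct caller descriptor currently points at. */
static tramp_fn direct_addr = my_direct_1;

/*
 * Models ftrace_ops_list_func() plus the direct helper: look up the
 * current direct trampoline and call it.
 */
int list_func(void) { return direct_addr(); }

/* Models the patched call instruction inside <traced_func>. */
static tramp_fn call_site = my_direct_1;

int traced_func(void) { return call_site(); }

/*
 * Switch trampolines in three steps. traced_func() is invoked after each
 * step and the result recorded in seen[], to show that a trampoline is
 * reached at every point of the switch.
 */
void switch_direct(tramp_fn new_tramp, int seen[3])
{
	assert(new_tramp != NULL);

	call_site = list_func;		/* 1: attach the "dummy" callback */
	seen[0] = traced_func();

	direct_addr = new_tramp;	/* 2: update the descriptor */
	seen[1] = traced_func();

	call_site = direct_addr;	/* 3: remove the dummy, direct call again */
	seen[2] = traced_func();
}
```

Calling switch_direct(my_direct_2, seen) in this model reaches my_direct_1
after step 1 and my_direct_2 after steps 2 and 3; at no point does
traced_func() call nothing.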

Link: https://lore.kernel.org/all/20211014162819.5c85618b@gandalf.local.home/
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace/samples: Add multi direct interface test module
Jiri Olsa [Fri, 8 Oct 2021 09:13:36 +0000 (11:13 +0200)]
ftrace/samples: Add multi direct interface test module

Add a simple module that uses the multi direct interface:

  register_ftrace_direct_multi
  unregister_ftrace_direct_multi

The init function registers a trampoline for 2 functions,
and the exit function unregisters them.

Link: https://lkml.kernel.org/r/20211008091336.33616-9-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Add multi direct modify interface
Jiri Olsa [Fri, 8 Oct 2021 09:13:35 +0000 (11:13 +0200)]
ftrace: Add multi direct modify interface

Add an interface to modify the registered direct function
for ftrace_ops. Add the following function:

   modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)

The function changes the currently registered direct
function for all attached functions.

Link: https://lkml.kernel.org/r/20211008091336.33616-8-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Add multi direct register/unregister interface
Jiri Olsa [Fri, 8 Oct 2021 09:13:34 +0000 (11:13 +0200)]
ftrace: Add multi direct register/unregister interface

Add an interface to register multiple direct functions
within a single call. Add the following functions:

  register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
  unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)

The register_ftrace_direct_multi() function registers the direct function
(addr) with all functions in the ops filter. The ops filter can be updated
beforehand with ftrace_set_filter_ip() calls.

None of the requested functions may have a direct function currently
registered, otherwise register_ftrace_direct_multi() will fail.

The unregister_ftrace_direct_multi() function unregisters the direct
functions related to ops.

Link: https://lkml.kernel.org/r/20211008091336.33616-7-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Add ftrace_add_rec_direct function
Jiri Olsa [Fri, 8 Oct 2021 09:13:33 +0000 (11:13 +0200)]
ftrace: Add ftrace_add_rec_direct function

Factor out the code that adds an (ip, addr) tuple to the direct_functions
hash into a new ftrace_add_rec_direct() function. It will be used in the
following patches.

Link: https://lkml.kernel.org/r/20211008091336.33616-6-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Fix selftest config check for function graph start up test
Steven Rostedt (VMware) [Thu, 21 Oct 2021 17:43:57 +0000 (13:43 -0400)]
tracing: Fix selftest config check for function graph start up test

There's a new test in trace_selftest_startup_function_graph() that
requires ftrace args to be supported and also does some tricks
with dynamic tracing. Although this code checks HAVE_DYNAMIC_FTRACE_WITH_ARGS
it fails to check DYNAMIC_FTRACE, and the kernel fails to build due to
that dependency.

Also only define the prototype of trace_direct_tramp() if it is used.

Link: https://lkml.kernel.org/r/20211021134357.7f48e173@gandalf.local.home
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Add trampoline/graph selftest
Jiri Olsa [Fri, 8 Oct 2021 09:13:32 +0000 (11:13 +0200)]
tracing: Add trampoline/graph selftest

Add a selftest to check that a direct trampoline can
co-exist with the graph tracer on the same function.

This is supported for the CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
config option, which is defined only for x86_64 for now.

Link: https://lkml.kernel.org/r/20211008091336.33616-5-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/ftrace: Make function graph use ftrace directly
Steven Rostedt (VMware) [Fri, 8 Oct 2021 09:13:31 +0000 (11:13 +0200)]
x86/ftrace: Make function graph use ftrace directly

We don't need a special hook for the graph tracer entry point;
instead we can use the graph_ops::func function to install
the return_hooker.

This moves the graph tracing setup _before_ the direct
trampoline prepares the stack, so the return_hooker will
be called when the direct trampoline is finished.

This simplifies the code, because we don't need to take into
account the direct trampoline setup when preparing the graph
tracer hooker and we can allow function graph tracer on entries
registered with direct trampoline.

Link: https://lkml.kernel.org/r/20211008091336.33616-4-jolsa@kernel.org
[fixed compile error reported by kernel test robot <lkp@intel.com>]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace/x86_64: Have function graph tracer depend on DYNAMIC_FTRACE
Steven Rostedt (VMware) [Thu, 21 Oct 2021 03:35:55 +0000 (23:35 -0400)]
ftrace/x86_64: Have function graph tracer depend on DYNAMIC_FTRACE

The function graph tracer is going to now depend on
ARCH_SUPPORTS_FTRACE_OPS, as that also means that it can support ftrace
args. Since ARCH_SUPPORTS_FTRACE_OPS depends on DYNAMIC_FTRACE, this
means that the function graph tracer for x86_64 will need to depend on
DYNAMIC_FTRACE.

Link: https://lkml.kernel.org/r/20211020233555.16b0dbf2@rorschach.local.home
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/ftrace: Remove fault protection code in prepare_ftrace_return
Steven Rostedt (VMware) [Fri, 8 Oct 2021 09:13:30 +0000 (11:13 +0200)]
x86/ftrace: Remove fault protection code in prepare_ftrace_return

Remove the fault protection code when writing return_hooker
to the stack. As Steven noted:

> That protection was there from the beginning due to being "paranoid",
> considering ftrace was bricking network cards. But that protection
> would not have even protected against that.

Link: https://lkml.kernel.org/r/20211008091336.33616-3-jolsa@kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/ftrace: Remove extra orig rax move
Jiri Olsa [Fri, 8 Oct 2021 09:13:29 +0000 (11:13 +0200)]
x86/ftrace: Remove extra orig rax move

There's an identical move 2 lines earlier.

Link: https://lkml.kernel.org/r/20211008091336.33616-2-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing/perf: Add interrupt_context_level() helper
Steven Rostedt (VMware) [Fri, 15 Oct 2021 19:01:19 +0000 (15:01 -0400)]
tracing/perf: Add interrupt_context_level() helper

Now that there are three different instances of doing the addition trick
to the preempt_count() and NMI_MASK, HARDIRQ_MASK and SOFTIRQ_OFFSET
macros, it deserves a helper function defined in the preempt.h header.

Add the interrupt_context_level() helper and replace the three instances
that do that logic with it.
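
As a rough user-space illustration of the addition trick, the sketch below
takes the preempt count as a parameter instead of reading preempt_count().
The mask values assume a layout of 8 preempt bits, 8 softirq bits, 4 hardirq
bits and 4 NMI bits; context_level() is a hypothetical name for this model,
not the kernel helper itself.

```c
#include <assert.h>

/* Assumed preempt_count layout: bits 0-7 preempt, 8-15 softirq,
 * 16-19 hardirq, 20-23 NMI. */
#define SOFTIRQ_OFFSET	(1UL << 8)
#define HARDIRQ_MASK	(0xfUL << 16)
#define NMI_MASK	(0xfUL << 20)

/*
 * Branch-free context level: each line adds 1 if the count is in that
 * context or any "higher" one, so the sum is
 *   0 = normal (task), 1 = softirq, 2 = hardirq, 3 = NMI.
 */
unsigned char context_level(unsigned long pc)
{
	unsigned char level = 0;

	level += !!(pc & NMI_MASK);
	level += !!(pc & (NMI_MASK | HARDIRQ_MASK));
	level += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

	return level;
}
```

Because the masks are cumulative, a hardirq that interrupts a softirq (both
bit groups set) still yields level 2, the outermost active context.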

Link: https://lore.kernel.org/all/20211015142541.4badd8a9@gandalf.local.home/
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Reuse logic from perf's get_recursion_context()
Steven Rostedt (VMware) [Fri, 15 Oct 2021 17:42:40 +0000 (13:42 -0400)]
tracing: Reuse logic from perf's get_recursion_context()

Instead of having branches that add noise to the branch prediction, use
the addition logic to set the bit for the level of interrupt context that
the state is currently in. This copies the logic from perf's
get_recursion_context() function.

Link: https://lore.kernel.org/all/20211015161702.GF174703@worktop.programming.kicks-ass.net/
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing/cfi: Fix cmp_entries_* functions signature mismatch
Kalesh Singh [Thu, 14 Oct 2021 04:52:17 +0000 (21:52 -0700)]
tracing/cfi: Fix cmp_entries_* functions signature mismatch

If CONFIG_CFI_CLANG=y, attempting to read an event histogram will cause
the kernel to panic due to a failed CFI check.

    1. echo 'hist:keys=common_pid' >> events/sched/sched_switch/trigger
    2. cat events/sched/sched_switch/hist
    3. kernel panics on attempting to read hist

This happens because the sort() function expects a generic
int (*)(const void *, const void *) pointer for the compare function.
To prevent this CFI failure, change tracing map cmp_entries_* function
signatures to match this.
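
A minimal user-space sketch of the same idea, using qsort() from the C
library in place of the kernel's sort(). The struct entry type and the
function names are made up for illustration. The point is that the compare
function is declared with the generic signature and converts its arguments
inside the body, rather than being declared with typed parameters and cast
at the call site.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type, only for this illustration. */
struct entry {
	unsigned long key;
};

/*
 * Declared with exactly the type the sort routine expects,
 * int (*)(const void *, const void *), so no function pointer cast is
 * needed. The typed pointers are recovered inside the function.
 */
int cmp_entries(const void *A, const void *B)
{
	const struct entry *a = A;
	const struct entry *b = B;

	return (a->key > b->key) - (a->key < b->key);
}

/* Sort n entries in ascending key order. */
void sort_entries(struct entry *entries, size_t n)
{
	qsort(entries, n, sizeof(*entries), cmp_entries);
}
```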

Also, fix the build error reported by the kernel test robot [1].

[1] https://lore.kernel.org/r/202110141140.zzi4dRh4-lkp@intel.com/

Link: https://lkml.kernel.org/r/20211014045217.3265162-1-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Use linker magic instead of recasting ftrace_ops_list_func()
Steven Rostedt (VMware) [Wed, 17 Jun 2020 20:56:16 +0000 (16:56 -0400)]
tracing: Use linker magic instead of recasting ftrace_ops_list_func()

In an effort to enable -Wcast-function-type in the top-level Makefile to
support Control Flow Integrity builds, all function casts need to be
removed.

This means that ftrace_ops_list_func() can no longer be defined as
ftrace_ops_no_ops(). The reason for ftrace_ops_no_ops() is to use that when
an architecture calls ftrace_ops_list_func() with only two parameters
(called from assembly). And to make sure there's no C side-effects, those
archs call ftrace_ops_no_ops() which only has two parameters, as
ftrace_ops_list_func() has four parameters.

Instead of a typecast, use vmlinux.lds.h to define ftrace_ops_list_func() to
arch_ftrace_ops_list_func() that will define the proper set of parameters.

Link: https://lore.kernel.org/r/20200614070154.6039-1-oscar.carter@gmx.com
Link: https://lkml.kernel.org/r/20200617165616.52241bde@oasis.local.home
Link: https://lore.kernel.org/all/20211005053922.GA702049@embeddedor/
Requested-by: Oscar Carter <oscar.carter@gmx.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: in_irq() cleanup
Changbin Du [Thu, 30 Sep 2021 00:03:42 +0000 (08:03 +0800)]
tracing: in_irq() cleanup

Replace the obsolete and ambiguous macro in_irq() with the new
macro in_hardirq().

Link: https://lkml.kernel.org/r/20210930000342.6016-1-changbin.du@gmail.com
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Add unit test for removing trace function
Carles Pey [Sat, 18 Sep 2021 15:30:43 +0000 (19:30 +0400)]
ftrace: Add unit test for removing trace function

A self test is provided for the trace function removal functionality.

Link: https://lkml.kernel.org/r/20210918153043.318016-2-carles.pey@gmail.com
Signed-off-by: Carles Pey <carles.pey@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Cleanup dummy headers in tools/bootconfig
Masami Hiramatsu [Fri, 17 Sep 2021 10:03:16 +0000 (19:03 +0900)]
bootconfig: Cleanup dummy headers in tools/bootconfig

Cleanup dummy headers in tools/bootconfig/include except
for tools/bootconfig/include/linux/bootconfig.h.
For this change, I use __KERNEL__ macro to split kernel
header #include and introduce xbc_alloc_mem() and
xbc_free_mem().

Link: https://lkml.kernel.org/r/163187299574.2366983.18371329724128746091.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Replace u16 and u32 with uint16_t and uint32_t
Masami Hiramatsu [Fri, 17 Sep 2021 10:03:08 +0000 (19:03 +0900)]
bootconfig: Replace u16 and u32 with uint16_t and uint32_t

Replace u16 and u32 with uint16_t and uint32_t so
that the tools/bootconfig only needs <stdint.h>.

Link: https://lkml.kernel.org/r/163187298835.2366983.9838262576854319669.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotools/bootconfig: Print all error message in stderr
Masami Hiramatsu [Fri, 17 Sep 2021 10:03:01 +0000 (19:03 +0900)]
tools/bootconfig: Print all error message in stderr

Print all error messages to stderr. This also removes the
unneeded tools/bootconfig/include/linux/printk.h.

Link: https://lkml.kernel.org/r/163187298106.2366983.15210300267326257397.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Remove unused debug function
Masami Hiramatsu [Fri, 17 Sep 2021 10:02:53 +0000 (19:02 +0900)]
bootconfig: Remove unused debug function

Remove the unused xbc_debug_dump() from bootconfig to clean up
the code.

Link: https://lkml.kernel.org/r/163187297371.2366983.12943349701785875450.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Split parse-tree part from xbc_init
Masami Hiramatsu [Fri, 17 Sep 2021 10:02:46 +0000 (19:02 +0900)]
bootconfig: Split parse-tree part from xbc_init

Split the bootconfig data parser that builds the tree out of
xbc_init(). This is an internal cosmetic change.

Link: https://lkml.kernel.org/r/163187296647.2366983.15590065167920474865.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Rename xbc_destroy_all() to xbc_exit()
Masami Hiramatsu [Fri, 17 Sep 2021 10:02:39 +0000 (19:02 +0900)]
bootconfig: Rename xbc_destroy_all() to xbc_exit()

Avoid using this noisy name and use a calmer one.
This is just a name change. No functional change.

Link: https://lkml.kernel.org/r/163187295918.2366983.5231840238429996027.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotools/bootconfig: Run test script when build all
Masami Hiramatsu [Fri, 17 Sep 2021 10:02:32 +0000 (19:02 +0900)]
tools/bootconfig: Run test script when build all

Run the bootconfig test script when building the 'all' target
so that users can notice any issue at build time.

Link: https://lkml.kernel.org/r/163187295173.2366983.18295281097397499118.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Add xbc_get_info() for the node information
Masami Hiramatsu [Thu, 16 Sep 2021 06:23:29 +0000 (15:23 +0900)]
bootconfig: Add xbc_get_info() for the node information

Add the xbc_get_info() API, which allows the user to get the
number of used xbc_nodes and the size of the bootconfig
data. This is also useful for checking whether the bootconfig
is initialized or not.

Link: https://lkml.kernel.org/r/163177340877.682366.4360676589783197627.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobootconfig: Allocate xbc_data inside xbc_init()
Masami Hiramatsu [Thu, 16 Sep 2021 06:23:20 +0000 (15:23 +0900)]
bootconfig: Allocate xbc_data inside xbc_init()

Allocate 'xbc_data' in xbc_init() so that the caller does
not need to care about the ownership of the copied
data.

Link: https://lkml.kernel.org/r/163177339986.682366.898762699429769117.stgit@devnote2
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Cleanup ftrace_dyn_arch_init()
Weizhao Ouyang [Thu, 9 Sep 2021 09:02:16 +0000 (17:02 +0800)]
ftrace: Cleanup ftrace_dyn_arch_init()

Most architectures use an empty ftrace_dyn_arch_init(). Introduce a weak
common ftrace_dyn_arch_init() to clean them up.
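
For readers unfamiliar with weak symbols, a user-space sketch of the pattern
follows. It assumes a GCC or Clang toolchain, spells the attribute out
instead of using the kernel's __weak and __init annotations, and uses a
hypothetical caller named ftrace_init_model().

```c
#include <assert.h>

/*
 * Weak default definition. If another object file in the link provides
 * a normal (strong) ftrace_dyn_arch_init(), the linker picks that one
 * and this body is discarded; otherwise this empty version is used.
 */
__attribute__((weak)) int ftrace_dyn_arch_init(void)
{
	return 0;
}

/* Hypothetical caller: it does not care which definition was linked. */
int ftrace_init_model(void)
{
	return ftrace_dyn_arch_init();
}
```

With this in place, only the architectures that need real work in
ftrace_dyn_arch_init() have to define it.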

Link: https://lkml.kernel.org/r/20210909090216.1955240-1-o451686892@gmail.com
Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390)
Acked-by: Helge Deller <deller@gmx.de> (parisc)
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Disable "other" permission bits in the tracefs files
Steven Rostedt (VMware) [Wed, 18 Aug 2021 15:24:51 +0000 (11:24 -0400)]
tracing: Disable "other" permission bits in the tracefs files

When building the files in the tracefs file system, do not by default set
any permissions for OTH (other). This will make it easier for admins who
want to define a group for accessing tracefs, as they will not have to
first disable all the permission bits for "other" in the file system.

As tracing can leak sensitive information, it should never by default
allow all users access. An admin can still set the permission bits for
others to have access, which may be useful for creating a honeypot and
seeing who takes advantage of it and roots the machine.

Link: https://lkml.kernel.org/r/20210818153038.864149276@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracefs: Have tracefs directories not set OTH permission bits by default
Steven Rostedt (VMware) [Wed, 18 Aug 2021 15:24:50 +0000 (11:24 -0400)]
tracefs: Have tracefs directories not set OTH permission bits by default

The tracefs file system is by default mounted such that only root user can
access it. But there are legitimate reasons to create a group and allow
those added to the group to have access to tracing. By changing the
permissions of the tracefs mount point to allow access, it will allow
group access to the tracefs directory.

There should not be any real reason to allow all access to the tracefs
directory as it contains sensitive information. Have the default
permission of directories being created not have any OTH (other) bits set,
so that an admin who wants to give permission to a group does not have to
first disable all OTH bits in the file system.

Link: https://lkml.kernel.org/r/20210818153038.664127804@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Initialize upper and lower vars in pid_list_refill_irq()
Steven Rostedt (VMware) [Thu, 7 Oct 2021 13:53:53 +0000 (09:53 -0400)]
tracing: Initialize upper and lower vars in pid_list_refill_irq()

The upper and lower variables are set as linked lists to add into the sparse
array. If they are NULL, after the needed allocations are done, then there
is nothing to add. But they need to be initialized to NULL for this to
work.

Link: https://lore.kernel.org/all/221bc7ba-a475-1cb9-1bbe-730bb9c2d448@canonical.com/
Fixes: 8d6e90983ade ("tracing: Create a sparse bitmask for pid filtering")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Create a sparse bitmask for pid filtering
Steven Rostedt (VMware) [Fri, 24 Sep 2021 02:20:57 +0000 (22:20 -0400)]
tracing: Create a sparse bitmask for pid filtering

When the trace_pid_list was created, the default pid max was 32768.
Creating a bitmask that can hold one bit for all 32768 took up 4096 bytes (one
page). Having a one page bitmask was not much of a problem, and that was
used for mapping pids. But today, systems are bigger and can run more
tasks, and now the default pid_max is usually set to 4194304. Which means
to handle that many pids requires 524288 bytes. Worse yet, the pid_max can
be set to 2^30 (1073741824 or 1G) which would take 134217728 (128M) of
memory to store this array.

Since the pid_list array is very sparsely populated, it is a huge waste of
memory to store all possible bits for each pid when most will not be set.

Instead, use a page table scheme to store the array, and allow this to
handle up to 30 bit pids.

The pid_mask will start out with 256 entries for the first 8 MSB bits.
This will cost 1K for 32 bit architectures and 2K for 64 bit. Each of
these will have a 256-entry array to store the next 8 bits of the pid
(another 1 or 2K). These will hold a 2K byte bitmask (which will cover the
LSB 14 bits or 16384 pids).
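
To make the 8 + 8 + 14 bit split concrete, here is a small user-space
sketch. The struct pid_index type and the function names are illustrative
assumptions. flat_bitmask_bytes() just reproduces the flat-bitmask sizes
quoted above (one bit per pid).

```c
#include <assert.h>

#define UPPER_BITS	8
#define LOWER_BITS	14
#define UPPER_MAX	(1u << UPPER_BITS)	/* 256 entries per level */
#define LOWER_MAX	(1u << LOWER_BITS)	/* 16384 bits = 2K bytes */
#define PID_BITS	(2 * UPPER_BITS + LOWER_BITS)	/* 30 */

/* Hypothetical container for the three indexes of one pid. */
struct pid_index {
	unsigned int upper1;	/* top 8 bits: index into first level */
	unsigned int upper2;	/* next 8 bits: index into second level */
	unsigned int lower;	/* low 14 bits: bit number in the bitmask */
};

/* Split a pid into its three indexes; returns -1 if it needs > 30 bits. */
int pid_split(unsigned int pid, struct pid_index *idx)
{
	if (pid >= (1u << PID_BITS))
		return -1;

	idx->upper1 = pid >> (UPPER_BITS + LOWER_BITS);
	idx->upper2 = (pid >> LOWER_BITS) & (UPPER_MAX - 1);
	idx->lower = pid & (LOWER_MAX - 1);
	return 0;
}

/* Size in bytes of a flat bitmask holding one bit per possible pid. */
unsigned long flat_bitmask_bytes(unsigned long pid_max)
{
	return pid_max / 8;
}
```

With these definitions, flat_bitmask_bytes() gives 4096 for a pid_max of
32768, 524288 for 4194304, and 134217728 for 2^30, matching the numbers
above, whereas the sparse scheme only allocates the chunks that a set pid
actually touches.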

When the trace_pid_list is allocated, it will have the 1/2K upper bits
allocated, and then it will allocate a cache for the next upper chunks and
the lower chunks (default 6 of each). Then when a bit is "set", these
chunks will be pulled from the free list and added to the array. If the
free list gets down to a lever (default 2), it will trigger an irqwork
that will refill the cache back up.

On clearing a bit, if the clear causes the bitmask to be zero, that chunk
will then be placed back into the free cache for later use, keeping the
need to allocate more down to a minimum.
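
The free-list behavior described above can be sketched in user space as
follows. Everything here is a simplified model under stated assumptions:
the names are hypothetical, a refill_pending flag stands in for queuing the
irqwork, calloc() stands in for the kernel allocator, and treating "gets
down to" as "less than or equal to" the threshold is a guess at the exact
comparison.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_ALLOC	6	/* chunks kept in the cache when full */
#define CHUNK_REALLOC	2	/* refill once the cache drops to this */

struct chunk {
	struct chunk *next;
};

struct chunk_cache {
	struct chunk *free_list;
	int free_count;
	int refill_pending;	/* models "irqwork has been queued" */
};

/* Return a chunk to the cache, e.g. when its bitmask became all zero. */
void cache_put(struct chunk_cache *c, struct chunk *ch)
{
	ch->next = c->free_list;
	c->free_list = ch;
	c->free_count++;
}

/* Take a chunk from the cache; request a refill when running low. */
struct chunk *cache_get(struct chunk_cache *c)
{
	struct chunk *ch = c->free_list;

	if (!ch)
		return NULL;

	c->free_list = ch->next;
	ch->next = NULL;
	c->free_count--;

	if (c->free_count <= CHUNK_REALLOC)
		c->refill_pending = 1;

	return ch;
}

/* Models the deferred work: top the cache back up to CHUNK_ALLOC. */
int cache_refill(struct chunk_cache *c)
{
	while (c->free_count < CHUNK_ALLOC) {
		struct chunk *ch = calloc(1, sizeof(*ch));

		if (!ch)
			return -1;
		cache_put(c, ch);
	}
	c->refill_pending = 0;
	return 0;
}
```

The design point is that cache_get() never allocates, so setting a bit
stays cheap and allocation-free in the caller's context, while the
allocation work is pushed to the deferred refill.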

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Place trace_pid_list logic into abstract functions
Steven Rostedt (VMware) [Fri, 24 Sep 2021 01:03:49 +0000 (21:03 -0400)]
tracing: Place trace_pid_list logic into abstract functions

Instead of having the logic that does trace_pid_list open coded, wrap it in
abstract functions. This will allow a rewrite of the logic that implements
the trace_pid_list without affecting the users.

Note, this causes a change in behavior. Every time a pid is written into
the set_*_pid file, it creates a new list and uses RCU to update it. If
pid_max is lowered, but there was a pid currently in the list that was
higher than pid_max, those pids will now be removed on updating the list.
The old behavior kept that from happening.

The rewrite of the pid_list logic will no longer depend on pid_max,
and will restore the old behavior.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/kprobes: Fixup return address in generic trampoline handler
Masami Hiramatsu [Tue, 14 Sep 2021 14:42:51 +0000 (23:42 +0900)]
x86/kprobes: Fixup return address in generic trampoline handler

In x86, the fake return address on the stack saved by
__kretprobe_trampoline() will be replaced with the real return
address after returning from trampoline_handler(). Before fixing
the return address, the real return address can be found in the
'current->kretprobe_instances'.

However, since there is a window between updating the
'current->kretprobe_instances' and fixing the address on the stack,
if an interrupt happens in that window and the interrupt handler
takes a stack trace, it may fail to unwind because it can not get
the correct return address from 'current->kretprobe_instances'.

This will eliminate that window by fixing the return address
right before updating 'current->kretprobe_instances'.

Link: https://lkml.kernel.org/r/163163057094.489837.9044470370440745866.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Show kretprobe unknown indicator only for kretprobe_trampoline
Masami Hiramatsu [Tue, 14 Sep 2021 14:42:40 +0000 (23:42 +0900)]
tracing: Show kretprobe unknown indicator only for kretprobe_trampoline

ftrace shows the "[unknown/kretprobe'd]" indicator for all addresses in
kretprobe_trampoline, but the only address modified by kretprobe should
be kretprobe_trampoline+0.

Link: https://lkml.kernel.org/r/163163056044.489837.794883849706638013.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/unwind: Recover kretprobe trampoline entry
Masami Hiramatsu [Tue, 14 Sep 2021 14:42:31 +0000 (23:42 +0900)]
x86/unwind: Recover kretprobe trampoline entry

Since the kretprobe replaces the function return address with
the kretprobe_trampoline on the stack, x86 unwinders can not
continue the stack unwinding at that point, or they record
kretprobe_trampoline instead of the correct return address.

To fix this issue, find the correct return address from the task's
kretprobe_instances, as the function-graph tracer does.

With this fix, the unwinder can correctly unwind the stack
from kretprobe event on x86, as below.

           <...>-135     [003] ...1     6.722338: r_full_proxy_read_0: (vfs_read+0xab/0x1a0 <- full_proxy_read)
           <...>-135     [003] ...1     6.722377: <stack trace>
 => kretprobe_trace_func+0x209/0x2f0
 => kretprobe_dispatcher+0x4a/0x70
 => __kretprobe_trampoline_handler+0xca/0x150
 => trampoline_handler+0x44/0x70
 => kretprobe_trampoline+0x2a/0x50
 => vfs_read+0xab/0x1a0
 => ksys_read+0x5f/0xe0
 => do_syscall_64+0x33/0x40
 => entry_SYSCALL_64_after_hwframe+0x44/0xae

Link: https://lkml.kernel.org/r/163163055130.489837.5161749078833497255.stgit@devnote2
Reported-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/kprobes: Push a fake return address at kretprobe_trampoline
Masami Hiramatsu [Tue, 14 Sep 2021 14:42:22 +0000 (23:42 +0900)]
x86/kprobes: Push a fake return address at kretprobe_trampoline

Change __kretprobe_trampoline() to push the address of the
__kretprobe_trampoline() as a fake return address at the bottom
of the stack frame. This fake return address will be replaced
with the correct return address in the trampoline_handler().

With this change, the ORC unwinder can check whether the return
address is modified by kretprobes or not.

Link: https://lkml.kernel.org/r/163163054185.489837.14338744048957727386.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Enable stacktrace from pt_regs in kretprobe handler
Masami Hiramatsu [Tue, 14 Sep 2021 14:42:12 +0000 (23:42 +0900)]
kprobes: Enable stacktrace from pt_regs in kretprobe handler

Since the ORC unwinder from pt_regs requires setting up regs->ip
correctly, set the correct return address to the regs->ip before
calling user kretprobe handler.

This allows the kretprobe handler to trace stack from the
kretprobe's pt_regs by stack_trace_save_regs() (eBPF will do
this), instead of stack tracing from the handler context by
stack_trace_save() (ftrace will do this).

Link: https://lkml.kernel.org/r/163163053237.489837.4272653874525136832.stgit@devnote2
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoarm: kprobes: Make space for instruction pointer on stack
Masami Hiramatsu [Tue, 14 Sep 2021 14:42:02 +0000 (23:42 +0900)]
arm: kprobes: Make space for instruction pointer on stack

Since arm's __kretprobe_trampoline() saves partial 'pt_regs' on the
stack, 'regs->ARM_pc' (instruction pointer) is not accessible from
the kretprobe handler. This means if instruction_pointer_set() is
used from kretprobe handler, it will break the data on the stack.

Make space for instruction pointer (ARM_pc) on the stack in the
__kretprobe_trampoline() for fixing this problem.

Link: https://lkml.kernel.org/r/163163052262.489837.10327621053231461255.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoia64: Add instruction_pointer_set() API
Masami Hiramatsu [Tue, 14 Sep 2021 14:41:52 +0000 (23:41 +0900)]
ia64: Add instruction_pointer_set() API

Add instruction_pointer_set() API for ia64.

Link: https://lkml.kernel.org/r/163163051195.489837.1039597819838213481.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoARC: Add instruction_pointer_set() API
Masami Hiramatsu [Tue, 14 Sep 2021 14:41:41 +0000 (23:41 +0900)]
ARC: Add instruction_pointer_set() API

Add instruction_pointer_set() API for arc.

Link: https://lkml.kernel.org/r/163163050148.489837.15187799269793560256.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agox86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline()
Josh Poimboeuf [Tue, 14 Sep 2021 14:41:32 +0000 (23:41 +0900)]
x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline()

Add UNWIND_HINT_FUNC on __kretprobe_trampoline() code so that ORC
information is generated on the __kretprobe_trampoline() correctly.
Also, this uses STACK_FRAME_NON_STANDARD_FP(), the
CONFIG_FRAME_POINTER-specific version of STACK_FRAME_NON_STANDARD().

Link: https://lkml.kernel.org/r/163163049242.489837.11970969750993364293.stgit@devnote2
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoobjtool: Ignore unwind hints for ignored functions
Josh Poimboeuf [Tue, 14 Sep 2021 14:41:23 +0000 (23:41 +0900)]
objtool: Ignore unwind hints for ignored functions

If a function is ignored, also ignore its hints.  This is useful for the
case where the function ignore is conditional on frame pointers, e.g.
STACK_FRAME_NON_STANDARD_FP().

Link: https://lkml.kernel.org/r/163163048317.489837.10988954983369863209.stgit@devnote2
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoobjtool: Add frame-pointer-specific function ignore
Josh Poimboeuf [Tue, 14 Sep 2021 14:41:13 +0000 (23:41 +0900)]
objtool: Add frame-pointer-specific function ignore

Add a CONFIG_FRAME_POINTER-specific version of
STACK_FRAME_NON_STANDARD() for the case where a function is
intentionally missing frame pointer setup, but otherwise needs
objtool/ORC coverage when frame pointers are disabled.

Link: https://lkml.kernel.org/r/163163047364.489837.17377799909553689661.stgit@devnote2
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Add kretprobe_find_ret_addr() for searching return address
Masami Hiramatsu [Tue, 14 Sep 2021 14:41:04 +0000 (23:41 +0900)]
kprobes: Add kretprobe_find_ret_addr() for searching return address

Introduce kretprobe_find_ret_addr() and is_kretprobe_trampoline().
These APIs will be used by the ORC stack unwinder and ftrace, so that
they can check whether the given address points to the kretprobe trampoline
code and query the correct return address in that case.

Link: https://lkml.kernel.org/r/163163046461.489837.1044778356430293962.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: treewide: Make it harder to refer kretprobe_trampoline directly
Masami Hiramatsu [Tue, 14 Sep 2021 14:40:54 +0000 (23:40 +0900)]
kprobes: treewide: Make it harder to refer kretprobe_trampoline directly

Now that kretprobe_trampoline_addr() exists for obtaining the
address of the kretprobe trampoline code, we don't need to access
kretprobe_trampoline directly.

Make it harder to refer to by renaming it to __kretprobe_trampoline().

Link: https://lkml.kernel.org/r/163163045446.489837.14510577516938803097.stgit@devnote2
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()
Masami Hiramatsu [Tue, 14 Sep 2021 14:40:45 +0000 (23:40 +0900)]
kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()

The __kretprobe_trampoline_handler() callback, called from low level
arch kprobes methods, has the 'trampoline_address' parameter, which is
entirely superfluous as it basically just replicates:

  dereference_kernel_function_descriptor(kretprobe_trampoline)

In fact we had bugs in arch code where it wasn't replicated correctly.

So remove this superfluous parameter and use kretprobe_trampoline_addr()
instead.

Link: https://lkml.kernel.org/r/163163044546.489837.13505751885476015002.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor()
Masami Hiramatsu [Tue, 14 Sep 2021 14:40:36 +0000 (23:40 +0900)]
kprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor()

~15 years ago kprobes grew the 'arch_deref_entry_point()' __weak function:

  3d7e33825d87: ("jprobes: make jprobes a little safer for users")

But this is just open-coded dereference_symbol_descriptor() in essence, and
its obscure nature was causing bugs.

Just use the real thing and remove arch_deref_entry_point().

Link: https://lkml.kernel.org/r/163163043630.489837.7924988885652708696.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoia64: kprobes: Fix to pass correct trampoline address to the handler
Masami Hiramatsu [Tue, 14 Sep 2021 14:40:27 +0000 (23:40 +0900)]
ia64: kprobes: Fix to pass correct trampoline address to the handler

The following commit:

   Commit e792ff804f49 ("ia64: kprobes: Use generic kretprobe trampoline handler")

Passed the wrong trampoline address to __kretprobe_trampoline_handler(): it
passes the descriptor address instead of the function entry address.

Pass the right parameter.

Also use the correct symbol dereference function to get the function
address from 'kretprobe_trampoline' - an IA64 special.

Link: https://lkml.kernel.org/r/163163042696.489837.12551102356265354730.stgit@devnote2
Fixes: e792ff804f49 ("ia64: kprobes: Use generic kretprobe trampoline handler")
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: X86 ML <x86@kernel.org>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Abhishek Sagar <sagar.abhishek@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Use bool type for functions which returns boolean value
Masami Hiramatsu [Tue, 14 Sep 2021 14:40:16 +0000 (23:40 +0900)]
kprobes: Use bool type for functions which returns boolean value

Use the 'bool' type instead of 'int' for functions which
return a boolean value, because this makes it clear that those
functions don't return any error code.

Link: https://lkml.kernel.org/r/163163041649.489837.17311187321419747536.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: treewide: Use 'kprobe_opcode_t *' for the code address in get_optimized_kprobe()
Masami Hiramatsu [Tue, 14 Sep 2021 14:40:07 +0000 (23:40 +0900)]
kprobes: treewide: Use 'kprobe_opcode_t *' for the code address in get_optimized_kprobe()

Since get_optimized_kprobe() is only used inside kprobes,
it doesn't need to use the 'unsigned long' type for the 'addr' parameter.
Make it use 'kprobe_opcode_t *' for the 'addr' parameter; the
subsequent call of arch_within_optimized_kprobe() should also use
'kprobe_opcode_t *'.

Note that MAX_OPTIMIZED_LENGTH and RELATIVEJUMP_SIZE are defined
in bytes, but the size of 'kprobe_opcode_t' depends on the
architecture. Therefore, we must be careful when calculating
addresses using those macros.
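
The pitfall described in the note above can be illustrated with a small
user-space sketch. The names opcode_t and REGION_BYTES are hypothetical
stand-ins for 'kprobe_opcode_t' and a byte-sized macro such as
MAX_OPTIMIZED_LENGTH; this is not the kernel code itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for kprobe_opcode_t on an architecture
 * where one opcode slot is 4 bytes wide. */
typedef uint32_t opcode_t;

/* Hypothetical length expressed in bytes, like MAX_OPTIMIZED_LENGTH. */
#define REGION_BYTES 8

/* Pitfall: pointer arithmetic scales by sizeof(opcode_t), so this
 * advances REGION_BYTES * 4 bytes instead of REGION_BYTES bytes. */
static opcode_t *end_scaled(opcode_t *addr)
{
	return addr + REGION_BYTES;
}

/* Do the arithmetic on a byte pointer, then convert back. */
static opcode_t *end_in_bytes(opcode_t *addr)
{
	return (opcode_t *)((char *)addr + REGION_BYTES);
}

/* Distance in bytes between two opcode pointers. */
static ptrdiff_t byte_distance(opcode_t *from, opcode_t *to)
{
	return (char *)to - (char *)from;
}
```

With a 4-byte opcode_t, end_scaled() lands 32 bytes past 'addr' while
end_in_bytes() lands 8 bytes past it, which is the difference the note
warns about.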

Link: https://lkml.kernel.org/r/163163040680.489837.12133032364499833736.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Add assertions for required lock
Masami Hiramatsu [Tue, 14 Sep 2021 14:39:55 +0000 (23:39 +0900)]
kprobes: Add assertions for required lock

Add assertions for required locks instead of only describing them in
comments, so that lockdep can check the locks automatically.

Link: https://lkml.kernel.org/r/163163039572.489837.18011973177537476885.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Use IS_ENABLED() instead of kprobes_built_in()
Masami Hiramatsu [Tue, 14 Sep 2021 14:39:46 +0000 (23:39 +0900)]
kprobes: Use IS_ENABLED() instead of kprobes_built_in()

Use IS_ENABLED(CONFIG_KPROBES) instead of kprobes_built_in().
That inline function was introduced only to avoid #ifdef, but
now that we have IS_ENABLED() it is no longer needed.
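
A simplified user-space sketch of the two styles follows. FEATURE_PROBES,
probes_built_in() and handle_event() are hypothetical names; the kernel's
IS_ENABLED() additionally copes with config symbols that are not defined at
all, which this sketch does not attempt.

```c
#include <stdbool.h>

/* Hypothetical build-time switch, always defined as 0 or 1 here.
 * In the kernel this role is played by CONFIG_KPROBES. */
#ifndef FEATURE_PROBES
#define FEATURE_PROBES 1
#endif

/* Old style: a tiny inline wrapper that exists only to hide #if. */
static inline bool probes_built_in(void)
{
#if FEATURE_PROBES
	return true;
#else
	return false;
#endif
}

/* New style: test the constant in ordinary C. Both branches are
 * parsed and type-checked; the compiler drops the dead one. */
static int handle_event(int event)
{
	if (!FEATURE_PROBES)
		return 0;
	return event + 1;
}
```

The second form needs no helper function, and the disabled branch cannot
silently rot because it is still compiled.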

Link: https://lkml.kernel.org/r/163163038581.489837.2805250706507372658.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Fix coding style issues
Masami Hiramatsu [Tue, 14 Sep 2021 14:39:34 +0000 (23:39 +0900)]
kprobes: Fix coding style issues

Fix coding style issues reported by checkpatch.pl and update
comments to quote variable names and add "()" to function
names.
One TODO comment in __disarm_kprobe() is removed because
it has been done by the following commit.

Link: https://lkml.kernel.org/r/163163037468.489837.4282347782492003960.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: treewide: Cleanup the error messages for kprobes
Masami Hiramatsu [Tue, 14 Sep 2021 14:39:25 +0000 (23:39 +0900)]
kprobes: treewide: Cleanup the error messages for kprobes

Clean up the error/notification messages in kprobes-related code.
Basically, this defines a 'pr_fmt()' macro for each file and updates
the messages so that they describe

 - what happened,
 - what is the kernel going to do or not do,
 - is the kernel fine,
 - what can the user do about it.

Also, if a message is not needed (e.g. the function returns a unique
error code, or another error message is already shown), remove it,
and replace the message with WARN_*() macros where suitable.

Link: https://lkml.kernel.org/r/163163036568.489837.14085396178727185469.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Make arch_check_ftrace_location static
Punit Agrawal [Tue, 14 Sep 2021 14:39:16 +0000 (23:39 +0900)]
kprobes: Make arch_check_ftrace_location static

arch_check_ftrace_location() was introduced as a weak function in
commit f7f242ff004499 ("kprobes: introduce weak
arch_check_ftrace_location() helper function") to allow architectures
to handle kprobes call site on their own.

Recently, the only architecture (csky) to implement
arch_check_ftrace_location() was migrated to using the common
version.

As a result, clean up the code further: drop the weak attribute and
rename the function, now that the architecture-specific
implementation is gone.

Link: https://lkml.kernel.org/r/163163035673.489837.2367816318195254104.stgit@devnote2
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agocsky: ftrace: Drop duplicate implementation of arch_check_ftrace_location()
Punit Agrawal [Tue, 14 Sep 2021 14:39:06 +0000 (23:39 +0900)]
csky: ftrace: Drop duplicate implementation of arch_check_ftrace_location()

The csky specific arch_check_ftrace_location() shadows a weak
implementation of the function in core code that offers the same
functionality but with additional error checking.

Drop the architecture specific function as a step towards further
cleanup in core code.

Link: https://lkml.kernel.org/r/163163034617.489837.7789033031868135258.stgit@devnote2
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobe: Simplify prepare_kprobe() by dropping redundant version
Punit Agrawal [Tue, 14 Sep 2021 14:38:57 +0000 (23:38 +0900)]
kprobe: Simplify prepare_kprobe() by dropping redundant version

The function prepare_kprobe() is called during kprobe registration and
is responsible for ensuring any architecture related preparation for
the kprobe is done before returning.

One of two versions of prepare_kprobe() is chosen depending on the
availability of KPROBE_ON_FTRACE in the kernel configuration.

Simplify the code by dropping the version when KPROBE_ON_FTRACE is not
selected - instead relying on kprobe_ftrace() to return false when
KPROBE_ON_FTRACE is not set.

No functional change.

Link: https://lkml.kernel.org/r/163163033696.489837.9264661820279300788.stgit@devnote2
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Use helper to parse boolean input from userspace
Punit Agrawal [Tue, 14 Sep 2021 14:38:46 +0000 (23:38 +0900)]
kprobes: Use helper to parse boolean input from userspace

The "enabled" file provides a debugfs interface to arm / disarm
kprobes in the kernel. In order to parse the buffer containing the
values written from userspace, the callback manually parses the user
input to convert it to a boolean value.

As taking a string value from userspace and converting it to boolean
is a common operation, a helper kstrtobool_from_user() is already
available in the kernel. Update the callback to use the common helper
to parse the write buffer from userspace.
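
As a rough illustration of what such a helper does, here is a user-space
approximation. parse_bool() and parse_bool_from_buf() are hypothetical
names, and the accepted spellings are an assumption rather than a statement
of the kernel's exact rules; memcpy() stands in for the copy from userspace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Interpret the leading characters of 's' as a boolean.
 * Returns 0 and stores the value in *res, or -1 if unrecognised. */
static int parse_bool(const char *s, bool *res)
{
	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		break;
	default:
		break;
	}
	return -1;
}

/* Copy at most a few bytes of the caller's buffer into a small,
 * NUL-terminated local array before parsing, so that the input
 * length is always bounded. */
static int parse_bool_from_buf(const char *src, size_t count, bool *res)
{
	char buf[4];

	if (count > sizeof(buf) - 1)
		count = sizeof(buf) - 1;
	memcpy(buf, src, count);
	buf[count] = '\0';
	return parse_bool(buf, res);
}
```

A write callback built on a helper like this only has to check the return
value and act on the resulting bool, instead of open-coding the parsing.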

Link: https://lkml.kernel.org/r/163163032637.489837.10678039554832855327.stgit@devnote2
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agokprobes: Do not use local variable when creating debugfs file
Punit Agrawal [Tue, 14 Sep 2021 14:38:37 +0000 (23:38 +0900)]
kprobes: Do not use local variable when creating debugfs file

debugfs_create_file() takes a pointer argument that can be used during
file operation callbacks (accessible via i_private in the inode
structure). An obvious requirement is for the pointer to refer to
valid memory when used.

When creating the debugfs file to dynamically enable / disable
kprobes, a pointer to a local variable is passed to
debugfs_create_file(). That variable goes out of scope when the init
function returns. The only reason this hasn't triggered random memory
corruption is that the pointer is not accessed during the debugfs
file callbacks.

Since the enabled state is managed by the kprobes_all_disabled global
variable, the local variable is not needed. Fix the incorrect (and
unnecessary) usage of the local variable in the debugfs_create_file()
call by passing NULL instead.
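
The shape of the fix can be sketched in user space as follows.
register_file(), struct file_entry and the other names are hypothetical
stand-ins for a registration API that stores a private pointer for later
callbacks; this is not the debugfs API.

```c
#include <stddef.h>

/* Hypothetical registry entry: 'data' is kept for later callbacks,
 * so it must stay valid for as long as the entry exists. */
struct file_entry {
	const char *name;
	void *data;
};

static struct file_entry enabled_file;

static void register_file(const char *name, void *data)
{
	enabled_file.name = name;
	enabled_file.data = data;	/* outlives this call */
}

/* Global state that the callbacks actually consult. */
static int all_disabled;

/* Fixed init function: no address of a local variable escapes.
 * The buggy pattern would have been:
 *     int value = 1;
 *     register_file("enabled", &value);   // dangling after return
 */
static void init_files(void)
{
	register_file("enabled", NULL);
}

/* The callback reads the global and never touches enabled_file.data. */
static int read_enabled(void)
{
	return !all_disabled;
}
```

Passing NULL makes it explicit that the callbacks carry no private data,
so nothing can be left pointing at a dead stack frame.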

Link: https://lkml.kernel.org/r/163163031686.489837.4476867635937014973.stgit@devnote2
Fixes: bf8f6e5b3e51 ("Kprobes: The ON/OFF knob thru debugfs")
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoLinux 5.15-rc3
Linus Torvalds [Sun, 26 Sep 2021 21:08:19 +0000 (14:08 -0700)]
Linux 5.15-rc3

3 years agoMerge tag '5.15-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Sun, 26 Sep 2021 19:46:45 +0000 (12:46 -0700)]
Merge tag '5.15-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd

Pull ksmbd fixes from Steve French:
 "Five fixes for the ksmbd kernel server, including three security
  fixes:

   - remove follow symlinks support

   - use LOOKUP_BENEATH to prevent out of share access

   - SMB3 compounding security fix

   - fix for returning the default streams correctly, fixing a bug when
     writing ppt or doc files from some clients

   - logging more clearly that ksmbd is experimental (at module load
     time)"

* tag '5.15-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd:
  ksmbd: use LOOKUP_BENEATH to prevent the out of share access
  ksmbd: remove follow symlinks support
  ksmbd: check protocol id in ksmbd_verify_smb_message()
  ksmbd: add default data stream name in FILE_STREAM_INFORMATION
  ksmbd: log that server is experimental at module load

3 years agoMerge tag 'edac_urgent_for_v5.15_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 26 Sep 2021 19:18:10 +0000 (12:18 -0700)]
Merge tag 'edac_urgent_for_v5.15_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fixes from Borislav Petkov:
 "Fix two EDAC drivers using the wrong value type for the DIMM mode"

* tag 'edac_urgent_for_v5.15_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/dmc520: Assign the proper type to dimm->edac_mode
  EDAC/synopsys: Fix wrong value type assignment for edac_mode

3 years agoMerge tag 'thermal-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/therma...
Linus Torvalds [Sun, 26 Sep 2021 19:11:58 +0000 (12:11 -0700)]
Merge tag 'thermal-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal fixes from Daniel Lezcano:

 - Fix thermal shutdown after a suspend/resume due to a wrong TCC value
   restored on Intel platform (Antoine Tenart)

 - Fix potential buffer overflow when building the list of policies. The
   buffer size is not updated after writing to it (Dan Carpenter)

 - Fix wrong check against IS_ERR instead of NULL (Ansuel Smith)
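
The buffer overflow item above concerns a common pattern: appending to a
buffer in a loop while passing the original size as the limit each time. A
minimal user-space sketch of the safe form, using snprintf() and the
hypothetical helper build_list(), could look like this:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Append each name followed by a space to 'buf', never writing more
 * than 'size' bytes in total. Returns the number of characters
 * stored, excluding the terminating NUL.
 */
static size_t build_list(char *buf, size_t size,
			 const char *const *names, size_t n)
{
	size_t count = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		/* Key point: the limit shrinks as 'count' grows. */
		int ret = snprintf(buf + count, size - count,
				   "%s ", names[i]);

		if (ret < 0)
			break;
		if ((size_t)ret >= size - count) {
			/* Output was truncated: the buffer is full. */
			count = size - 1;
			break;
		}
		count += (size_t)ret;
	}
	return count;
}
```

Passing 'size' instead of 'size - count' as the limit would let later
iterations write past the end of 'buf' once 'count' is non-zero.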

* tag 'thermal-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/tsens: Fix wrong check for tzd in irq handlers
  thermal/core: Potential buffer overflow in thermal_build_list_of_policies()
  thermal/drivers/int340x: Do not set a wrong tcc offset on resume

3 years agoMerge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 26 Sep 2021 17:09:20 +0000 (10:09 -0700)]
Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Prevent sending the wrong signal when protection keys are enabled
     and the kernel handles a fault in the vsyscall emulation.

   - Invoke early_reserve_memory() before invoking e820_memory_setup()
     which is required to make the Xen dom0 e820 hooks work correctly.

   - Use the correct data type for the SETZ operand in the ENQCMDS
     instruction wrapper.

   - Prevent undefined behaviour due to potential unaligned accesses in
     the instruction decoder library"
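
For the unaligned access item, a generic user-space sketch of two portable
ways to read a 32-bit value from an arbitrary byte offset is shown below.
load_u32() and load_le32() are hypothetical helpers, not the functions used
by the fix itself.

```c
#include <stdint.h>
#include <string.h>

/*
 * Dereferencing a cast pointer, as in
 *     uint32_t v = *(const uint32_t *)p;
 * is undefined behaviour when 'p' is not suitably aligned.
 */

/* Read a host-endian 32-bit value from any address via memcpy();
 * compilers typically turn this into a single load where legal. */
static uint32_t load_u32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Read a little-endian 32-bit value byte by byte, independent of
 * both alignment and host byte order. */
static uint32_t load_le32(const void *p)
{
	const unsigned char *b = p;

	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```

Either helper can be pointed at the middle of an instruction byte stream
without relying on the address being a multiple of four.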

* tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
  x86/asm: Fix SETZ size enqcmds() build failure
  x86/setup: Call early_reserve_memory() earlier
  x86/fault: Fix wrong signal when vsyscall fails with pkey

3 years agoMerge tag 'timers-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 26 Sep 2021 17:00:16 +0000 (10:00 -0700)]
Merge tag 'timers-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "A single fix for the recently introduced regression in posix CPU
  timers which failed to stop the timer when requested. That caused
  unexpected signals to be sent to the process/thread causing
  malfunction"

* tag 'timers-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Prevent spuriously armed 0-value itimer

3 years agoMerge tag 'irq-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 26 Sep 2021 16:55:22 +0000 (09:55 -0700)]
Merge tag 'irq-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of fixes for interrupt chip drivers:

   - Work around a bad GIC integration on a Renesas platform which can't
     handle byte-sized MMIO access

   - Plug a potential memory leak in the GICv4 driver

   - Fix a regression in the Armada 370-XP IPI code which was caused by
     issuing EOI instead of ACK.

   - A couple of small fixes here and there"

* tag 'irq-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic: Work around broken Renesas integration
  irqchip/renesas-rza1: Use semicolons instead of commas
  irqchip/gic-v3-its: Fix potential VPE leak on error
  irqchip/goldfish-pic: Select GENERIC_IRQ_CHIP to fix build
  irqchip/mbigen: Repair non-kernel-doc notation
  irqdomain: Change the type of 'size' in __irq_domain_add() to be consistent
  irqchip/armada-370-xp: Fix ack/eoi breakage
  Documentation: Fix irq-domain.rst build warning

3 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 25 Sep 2021 23:20:34 +0000 (16:20 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts,
  lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache,
  debug, and pagemap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix uninitialized use in overcommit_policy_handler
  mm/memory_failure: fix the missing pte_unmap() call
  kasan: always respect CONFIG_KASAN_STACK
  sh: pgtable-3level: fix cast to pointer from integer of different size
  mm/debug: sync up latest migrate_reason to migrate_reason_names
  mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
  mm: fs: invalidate bh_lrus for only cold path
  lib/zlib_inflate/inffast: check config in C to avoid unused function warning
  tools/vm/page-types: remove dependency on opt_file for idle page tracking
  scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error
  ocfs2: drop acl cache for directories too
  mm/shmem.c: fix judgment error in shmem_is_huge()
  xtensa: increase size of gcc stack frame check
  mm/damon: don't use strnlen() with known-bogus source length
  kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
  mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()

3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 25 Sep 2021 23:05:56 +0000 (16:05 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Thirty-three fixes, I'm afraid.

  Essentially the build-up from the last couple of weeks while I've been
  dealing with Linux Plumbers conference infrastructure issues. It's
  mostly the usual assortment of spelling fixes and minor corrections.

  The only core relevant changes are to the sd driver to reduce the spin
  up message spew and fix a small memory leak on the freeing path"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (33 commits)
  scsi: ses: Retry failed Send/Receive Diagnostic commands
  scsi: target: Fix spelling mistake "CONFLIFT" -> "CONFLICT"
  scsi: lpfc: Fix gcc -Wstringop-overread warning, again
  scsi: lpfc: Use correct scnprintf() limit
  scsi: lpfc: Fix sprintf() overflow in lpfc_display_fpin_wwpn()
  scsi: core: Remove 'current_tag'
  scsi: acornscsi: Remove tagged queuing vestiges
  scsi: fas216: Kill scmd->tag
  scsi: qla2xxx: Restore initiator in dual mode
  scsi: ufs: core: Unbreak the reset handler
  scsi: sd_zbc: Support disks with more than 2**32 logical blocks
  scsi: ufs: core: Revert "scsi: ufs: Synchronize SCSI and UFS error handling"
  scsi: bsg: Fix device unregistration
  scsi: sd: Make sd_spinup_disk() less noisy
  scsi: ufs: ufs-pci: Fix Intel LKF link stability
  scsi: mpt3sas: Clean up some inconsistent indenting
  scsi: megaraid: Clean up some inconsistent indenting
  scsi: sr: Fix spelling mistake "does'nt" -> "doesn't"
  scsi: Remove SCSI CDROM MAINTAINERS entry
  scsi: megaraid: Fix Coccinelle warning
  ...

3 years agoMerge tag 'io_uring-5.15-2021-09-25' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 25 Sep 2021 22:51:08 +0000 (15:51 -0700)]
Merge tag 'io_uring-5.15-2021-09-25' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "This one looks a bit bigger than it is, but that's mainly because 2/3
  of it is enabling IORING_OP_CLOSE to close direct file descriptors.

  We've had a few folks using them and finding it confusing that the way
  to close them is through using -1 for file update; this just brings
  API symmetry for direct descriptors. Hence I think we should just do
  this now and have a better API for the 5.15 release. There's some room
  for de-duplicating the close code, but we're leaving that for the next
  merge window.

  Outside of that, just small fixes:

   - Poll race fixes (Hao)

   - io-wq core dump exit fix (me)

   - Reschedule around potentially intensive tctx and buffer iterators
     on teardown (me)

   - Fix for always ending up punting files update to io-wq (me)

   - Put the provided buffer meta data under memcg accounting (me)

   - Tweak for io_write(), removing dead code that was added with the
     iterator changes in this release (Pavel)"

* tag 'io_uring-5.15-2021-09-25' of git://git.kernel.dk/linux-block:
  io_uring: make OP_CLOSE consistent with direct open
  io_uring: kill extra checks in io_write()
  io_uring: don't punt files update to io-wq unconditionally
  io_uring: put provided buffer meta data under memcg accounting
  io_uring: allow conditional reschedule for intensive iterators
  io_uring: fix potential req refcount underflow
  io_uring: fix missing set of EPOLLONESHOT for CQ ring overflow
  io_uring: fix race between poll completion and cancel_hash insertion
  io-wq: ensure we exit if thread group is exiting