ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU
author Nicholas Piggin <npiggin@gmail.com>
Thu, 25 Jun 2020 05:34:03 +0000 (15:34 +1000)
committer Steven Rostedt (VMware) <rostedt@goodmis.org>
Tue, 30 Jun 2020 21:18:56 +0000 (17:18 -0400)
commit b23d7a5f4a07af02343cdd28fe1f7488bac3afda
tree c775987b3e84915ec4c22a410f65c8c145ee70ce
parent 10464b4aa605ef93c937452f442e74cc0a4a6ceb

On a 144 thread system, `perf ftrace` takes about 20 seconds to start
up, due to calling synchronize_rcu() for each CPU.

  cat /proc/108560/stack
    0xc0003e7eb336f470
    __switch_to+0x2e0/0x480
    __wait_rcu_gp+0x20c/0x220
    synchronize_rcu+0x9c/0xc0
    ring_buffer_reset_cpu+0x88/0x2e0
    tracing_reset_online_cpus+0x84/0xe0
    tracing_open+0x1d4/0x1f0

On a system with 10x more threads, it starts to become an annoyance.

Batch these up so we disable all the per-cpu buffers first, then
synchronize_rcu() once, then reset each of the buffers. This brings
the time down to about 0.5s.
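
The batching described above can be illustrated with a small user-space
sketch. This is not the kernel patch: struct cpu_buffer,
fake_synchronize_rcu(), reset_each_cpu() and reset_all_cpus() are
hypothetical names made up for illustration, and the grace period is
only counted, not actually waited for.

```c
#include <assert.h>

#define NR_CPUS 144	/* thread count quoted in the commit message */

/* Hypothetical stand-in for one per-CPU ring buffer. */
struct cpu_buffer {
	int record_disabled;	/* non-zero while writers are shut out */
	int entries;		/* pretend payload that a reset clears */
};

/* Counts simulated grace periods; each one is expensive in the kernel. */
static int grace_periods;

/* Assumption: stands in for synchronize_rcu(); it only counts calls. */
static void fake_synchronize_rcu(void)
{
	grace_periods++;
}

/* A buffer may only be reset while recording is disabled. */
static void reset_one(struct cpu_buffer *buf)
{
	assert(buf->record_disabled > 0);
	buf->entries = 0;
}

/* Old pattern: disable, wait and reset one CPU at a time. */
static void reset_each_cpu(struct cpu_buffer *bufs, int n)
{
	for (int cpu = 0; cpu < n; cpu++) {
		bufs[cpu].record_disabled++;
		fake_synchronize_rcu();		/* one wait per CPU */
		reset_one(&bufs[cpu]);
		bufs[cpu].record_disabled--;
	}
}

/* Batched pattern: disable all, wait once, then reset all. */
static void reset_all_cpus(struct cpu_buffer *bufs, int n)
{
	for (int cpu = 0; cpu < n; cpu++)
		bufs[cpu].record_disabled++;

	fake_synchronize_rcu();			/* single wait for every CPU */

	for (int cpu = 0; cpu < n; cpu++) {
		reset_one(&bufs[cpu]);
		bufs[cpu].record_disabled--;
	}
}
```

With NR_CPUS buffers, reset_each_cpu() increments grace_periods 144
times while reset_all_cpus() increments it once, which is the source of
the saving: the number of waits no longer scales with the CPU count.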

Link: https://lkml.kernel.org/r/20200625053403.2386972-1-npiggin@gmail.com
Tested-by: Anton Blanchard <anton@ozlabs.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
include/linux/ring_buffer.h
kernel/trace/ring_buffer.c
kernel/trace/trace.c