sched/tracing: Don't re-read p->state when emitting sched_switch event
author: Valentin Schneider <valentin.schneider@arm.com>
Thu, 20 Jan 2022 16:25:19 +0000 (16:25 +0000)
committer: Peter Zijlstra <peterz@infradead.org>
Tue, 1 Mar 2022 15:18:39 +0000 (16:18 +0100)
commit: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
tree: 678cc10a62564212f526fc4a65ea345fde95794e
parent: 49bef33e4b87b743495627a529029156c6e09530

As of commit

  c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")

the following sequence becomes possible:

                      p->__state = TASK_INTERRUPTIBLE;
                      __schedule()
                        deactivate_task(p);
  ttwu()
    READ !p->on_rq
    p->__state=TASK_WAKING
                        trace_sched_switch()
                          __trace_sched_switch_state()
                            task_state_index()
                              return 0;

TASK_WAKING isn't in TASK_REPORT, so the task appears as TASK_RUNNING in
the trace event.
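
The masking effect can be reproduced in a small userspace model. This is
only a sketch, not kernel code: the SK_-prefixed constants and
sk_task_state_index() are hypothetical stand-ins, and their bit values are
assumed to resemble the definitions in include/linux/sched.h. It shows how
a state outside the report mask collapses to index 0, the same index a
running task gets.

```c
#include <assert.h>

/* Hypothetical stand-ins; bit values are assumed, not taken from this commit. */
#define SK_TASK_RUNNING         0x0000u
#define SK_TASK_INTERRUPTIBLE   0x0001u
#define SK_TASK_UNINTERRUPTIBLE 0x0002u
#define SK_TASK_WAKING          0x0200u
/* Assumed report mask: only the low state bits are visible to tracing. */
#define SK_TASK_REPORT          0x007fu

/* Find last set bit, 1-based; returns 0 when no bit is set. */
static unsigned int sk_fls(unsigned int x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Simplified model of the index lookup: mask, then take the top bit. */
unsigned int sk_task_state_index(unsigned int state)
{
	return sk_fls(state & SK_TASK_REPORT);
}

/* Self-check: the waking state is indistinguishable from running. */
void sk_selfcheck(void)
{
	assert(sk_task_state_index(SK_TASK_INTERRUPTIBLE) == 1);
	assert(sk_task_state_index(SK_TASK_WAKING) == 0);
	assert(sk_task_state_index(SK_TASK_WAKING) ==
	       sk_task_state_index(SK_TASK_RUNNING));
}
```

In this model any state whose bits all fall outside the mask yields 0,
which is why the trace consumer cannot tell it apart from a running task.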

Prevent this by passing the state value already read in __schedule() down
to the trace event instead of re-reading p->__state.
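
The idea behind the fix can be sketched the same way. The model below is
an illustration under stated assumptions, not the patch itself: struct
sk_task, sk_trace_reread() and sk_trace_snapshot() are hypothetical names,
the bit values are assumed, and the concurrent wakeup is replayed
sequentially rather than on a second CPU.

```c
#include <assert.h>

#define SK_TASK_INTERRUPTIBLE 0x0001u  /* assumed value */
#define SK_TASK_WAKING        0x0200u  /* assumed value */
#define SK_TASK_REPORT        0x007fu  /* assumed report mask */

/* Hypothetical, minimal task: only the state field matters here. */
struct sk_task {
	unsigned int state;
};

/* Map a state to a report index: position of the highest reported bit. */
static unsigned int sk_state_index(unsigned int state)
{
	unsigned int idx = 0;

	for (state &= SK_TASK_REPORT; state; state >>= 1)
		idx++;
	return idx;
}

/* Racy variant: the trace path reads the task's state again. */
unsigned int sk_trace_reread(const struct sk_task *p)
{
	return sk_state_index(p->state);
}

/* Fixed variant: the caller hands over the state it already read. */
unsigned int sk_trace_snapshot(unsigned int prev_state)
{
	return sk_state_index(prev_state);
}

/* Replay the interleaving; returns 1 if only the snapshot is correct. */
int sk_replay_race(void)
{
	struct sk_task p = { .state = SK_TASK_INTERRUPTIBLE };
	unsigned int prev_state = p.state;  /* read once by the scheduler */

	p.state = SK_TASK_WAKING;           /* remote wakeup, modelled inline */
	assert(prev_state == SK_TASK_INTERRUPTIBLE);

	return sk_trace_reread(&p) == 0 && sk_trace_snapshot(prev_state) == 1;
}
```

Passing the snapshot by value means the trace path no longer depends on
what another CPU writes to the task after the scheduler has made its
decision.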

Reported-by: Abhijeet Dharmapurikar <adharmap@quicinc.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220120162520.570782-2-valentin.schneider@arm.com
include/linux/sched.h
include/trace/events/sched.h
kernel/sched/core.c
kernel/trace/fgraph.c
kernel/trace/ftrace.c
kernel/trace/trace_events.c
kernel/trace/trace_osnoise.c
kernel/trace/trace_sched_switch.c
kernel/trace/trace_sched_wakeup.c