sched/fair: Ensure that the CFS parent is added after unthrottling
authorRik van Riel <riel@surriel.com>
Mon, 21 Jun 2021 17:43:30 +0000 (19:43 +0200)
committerIngo Molnar <mingo@kernel.org>
Tue, 22 Jun 2021 12:06:57 +0000 (14:06 +0200)
Ensure that a CFS parent will be in the list whenever one of its children is also
in the list.

A warning on rq->tmp_alone_branch != &rq->leaf_cfs_rq_list has been
reported while running LTP test cfs_bandwidth01.

Odin Ugedal found the root cause:

$ tree /sys/fs/cgroup/ltp/ -d --charset=ascii
/sys/fs/cgroup/ltp/
|-- drain
`-- test-6851
    `-- level2
|-- level3a
|   |-- worker1
|   `-- worker2
`-- level3b
    `-- worker3

Timeline (ish):
- worker3 gets throttled
- level3b is decayed, since it has no more load
- level2 get throttled
- worker3 get unthrottled
- level2 get unthrottled
  - worker3 is added to list
  - level3b is not added to list, since nr_running==0 and is decayed

 [ Vincent Guittot: Rebased and updated to fix for the reported warning. ]

Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Acked-by: Odin Ugedal <odin@uged.al>
Link: https://lore.kernel.org/r/20210621174330.11258-1-vincent.guittot@linaro.org
kernel/sched/fair.c

index bfaa6e1..2366331 100644 (file)
@@ -3298,6 +3298,31 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
 
 #ifdef CONFIG_SMP
 #ifdef CONFIG_FAIR_GROUP_SCHED
+/*
+ * Because list_add_leaf_cfs_rq always places a child cfs_rq on the list
+ * immediately before a parent cfs_rq, and cfs_rqs are removed from the list
+ * bottom-up, we only have to test whether the cfs_rq before us on the list
+ * is our child.
+ * If cfs_rq is not on the list, test whether a child needs its to be added to
+ * connect a branch to the tree  * (see list_add_leaf_cfs_rq() for details).
+ */
+static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
+{
+       struct cfs_rq *prev_cfs_rq;
+       struct list_head *prev;
+
+       if (cfs_rq->on_list) {
+               prev = cfs_rq->leaf_cfs_rq_list.prev;
+       } else {
+               struct rq *rq = rq_of(cfs_rq);
+
+               prev = rq->tmp_alone_branch;
+       }
+
+       prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
+
+       return (prev_cfs_rq->tg->parent == cfs_rq->tg);
+}
 
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
@@ -3313,6 +3338,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
        if (cfs_rq->avg.runnable_sum)
                return false;
 
+       if (child_cfs_rq_on_list(cfs_rq))
+               return false;
+
        return true;
 }