sched/fair: Improve spreading of utilization
author Vincent Guittot <vincent.guittot@linaro.org>
Thu, 12 Mar 2020 16:54:29 +0000 (17:54 +0100)
committer Peter Zijlstra <peterz@infradead.org>
Fri, 20 Mar 2020 12:06:20 +0000 (13:06 +0100)
commit c32b4308295aaaaedd5beae56cb42e205ae63e58
tree 87ed7a5e12345b7c5f13f24787aac55bf76e918b
parent 26cf52229efc87e2effa9d788f9b33c40fb3358a

During load balancing, a group with spare capacity tries to pull some
utilization from an overloaded group. In such a case, the load balancer
looks for the runqueue with the highest utilization. However, it
should also ensure that there are pending tasks to pull; otherwise
the load balance fails to pull a task and the spreading of the load is
delayed.
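
The selection rule described above can be sketched in plain C. This is a
minimal illustration, not the kernel code: struct rq_sketch and
pick_busiest_by_util() are hypothetical names, and the two fields are
simplified stand-ins for the per-runqueue state the load balancer inspects.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-CPU runqueue. */
struct rq_sketch {
	unsigned int nr_running;	/* runnable tasks on this runqueue */
	unsigned long util;		/* utilization of the CPU */
};

/*
 * Return the runqueue with the highest non-zero utilization among those
 * that have more than one runnable task, or NULL if none qualifies.
 */
static const struct rq_sketch *
pick_busiest_by_util(const struct rq_sketch *rqs, size_t n)
{
	const struct rq_sketch *busiest = NULL;
	unsigned long busiest_util = 0;

	for (size_t i = 0; i < n; i++) {
		/*
		 * With at most one runnable task there is nothing pending
		 * to pull, whatever the utilization, so skip this runqueue.
		 */
		if (rqs[i].nr_running <= 1)
			continue;

		if (busiest_util < rqs[i].util) {
			busiest_util = rqs[i].util;
			busiest = &rqs[i];
		}
	}

	return busiest;
}
```

In this sketch, a runqueue with a single task and a utilization of 900 is
passed over in favour of one with two tasks and a utilization of 600, because
only the latter has a task that can actually be pulled.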

This situation is quite transient, but its effect can be highlighted with
a short run of the sysbench test, where the time taken to spread the tasks
has a significant impact on the overall result.

Below are the average results for 15 iterations on an arm64 octa-core system:
sysbench --test=cpu --num-threads=8  --max-requests=1000 run

                           tip/sched/core  +patchset
total time:                172ms           158ms
per-request statistics:
         avg:                1.337ms         1.244ms
         max:               21.191ms        10.753ms

The average max doesn't fully reflect the wide spread of the values, which
range from 1.350ms to more than 41ms for tip/sched/core and from
1.350ms to 21ms with the patch.
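
To put the averaged figures from the table in relative terms, a small helper
can compute the percentage reduction. This is only an arithmetic sketch;
pct_reduction() is a hypothetical name and not part of the patch or of
sysbench.

```c
/*
 * Percentage reduction when going from `before` to `after`.
 * `before` must be non-zero.
 */
static double pct_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

Applied to the averages above, this gives roughly 8.1% for the total time
(172ms to 158ms), roughly 7.0% for the average request time (1.337ms to
1.244ms) and roughly 49.3% for the averaged max (21.191ms to 10.753ms).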

Other factors, such as waiting for an idle load balance or cache hotness,
can also delay the spreading of the tasks, which explains why the max can
still reach 21ms with the patch.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200312165429.990-1-vincent.guittot@linaro.org
kernel/sched/fair.c