smp: Run functions concurrently in smp_call_function_many_cond()
author Nadav Amit <namit@vmware.com>
Sat, 20 Feb 2021 23:17:04 +0000 (15:17 -0800)
committer Ingo Molnar <mingo@kernel.org>
Sat, 6 Mar 2021 11:59:09 +0000 (12:59 +0100)
commit a32a4d8a815c4eb6dc64b8962dc13a9dfae70868
tree 48c6cc9e7a5ad613eaae0b9b4632182f5f63c979
parent a38fd8748464831584a19438cbb3082b5a2dab15

Currently, on_each_cpu() and similar functions do not exploit the
potential of concurrency: the function is first executed remotely and
only then is it executed locally. Functions such as TLB flush can take
considerable time, so overlapping the remote and local execution is an
opportunity for performance optimization.

To do so, modify smp_call_function_many_cond() to allow callers to
provide a function that should be executed (remotely/locally), and run
the remote and local executions concurrently. Keep the other semantics
of smp_call_function_many() as they are today for backward
compatibility: in this case the called function is not executed locally.
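
The ordering described above can be illustrated with a small userspace
sketch. This is not the kernel implementation: it assumes POSIX threads as
a stand-in for remote CPUs and IPIs, and the names do_work(), remote_cpu(),
run_on_all_concurrently() and MAX_REMOTE are hypothetical, chosen only for
this illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_REMOTE 8	/* hypothetical upper bound on "remote CPUs" */

/* Counts completed calls; stands in for the real per-CPU work. */
static atomic_int calls_done;

/* Hypothetical callback, e.g. what a TLB flush handler would be. */
static void do_work(void *info)
{
	(void)info;
	atomic_fetch_add(&calls_done, 1);
}

/* Thread body: a "remote CPU" handling the request. */
static void *remote_cpu(void *info)
{
	do_work(info);
	return NULL;
}

/*
 * Kick n_remote remote targets, run the same function locally while
 * they are still working, then wait for all of them to finish.
 * Returns the number of completed calls, or -1 on error.
 */
static int run_on_all_concurrently(int n_remote)
{
	pthread_t tid[MAX_REMOTE];
	int started = 0;

	if (n_remote < 0 || n_remote > MAX_REMOTE)
		return -1;

	atomic_store(&calls_done, 0);

	/* Step 1: start the remote executions (the kernel sends IPIs). */
	for (int i = 0; i < n_remote; i++) {
		if (pthread_create(&tid[i], NULL, remote_cpu, NULL) != 0)
			break;
		started++;
	}

	/* Step 2: run locally now, overlapping with the remote work. */
	do_work(NULL);

	/* Step 3: only then wait for the remote executions to complete. */
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	return started == n_remote ? atomic_load(&calls_done) : -1;
}
```

Calling run_on_all_concurrently(3) runs the callback four times in total:
three times on the threads and once on the caller. The old ordering would
correspond to moving step 2 after step 3, so that the local call starts
only once every remote call has finished.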

smp_call_function_many_cond() does not use the optimized version for a
single remote target that smp_call_function_single() implements. For
a synchronous function call, smp_call_function_single() keeps a
call_single_data (which is used for synchronization) on the stack.
Interestingly, it seems that not using this optimization provides
greater performance improvements (greater speedup with a single remote
target than with multiple ones). Presumably, holding data structures
that are intended for synchronization on the stack can introduce
overheads due to TLB misses and false sharing when the stack is used for
other purposes.
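
As a rough illustration of the false-sharing concern, the sketch below
keeps each synchronization flag in its own cache line instead of next to
unrelated stack data. It is an assumption-laden userspace example, not the
kernel's call_single_data layout: the 64-byte line size, struct sync_slot,
NR_SLOTS and the slot_*() helpers are all hypothetical.

```c
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed cache-line size in bytes */
#define NR_SLOTS   4	/* hypothetical number of CPUs */

/*
 * Hypothetical stand-in for a synchronization object: one completion
 * flag, aligned so that no two slots share a cache line.
 */
struct sync_slot {
	alignas(CACHE_LINE) atomic_int pending;
};

/* Statically allocated, one slot per "CPU", instead of on the stack. */
static struct sync_slot slots[NR_SLOTS];

/* Sender side: mark the request for this CPU as outstanding. */
static void slot_post(int cpu)
{
	atomic_store(&slots[cpu].pending, 1);
}

/* Target side: signal that the function has finished running. */
static void slot_complete(int cpu)
{
	atomic_store(&slots[cpu].pending, 0);
}

/* Sender side: poll whether the target has finished. */
static int slot_is_pending(int cpu)
{
	return atomic_load(&slots[cpu].pending);
}
```

With this layout a target that writes its own flag does not invalidate a
cache line holding another slot. A flag placed on the sender's stack has no
such guarantee, because neighbouring stack variables can land in the same
line.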

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20210220231712.2475218-2-namit@vmware.com
kernel/smp.c