bpf: optimize bpf_map_update_elem() for map-in-map types
author Ritesh Oedayrajsingh Varma <ritesh@superluminal.eu>
Fri, 28 Nov 2025 00:02:35 +0000 (01:02 +0100)
committer Alexei Starovoitov <ast@kernel.org>
Sat, 29 Nov 2025 17:48:41 +0000 (09:48 -0800)
commit ff34657aa72a4dab9c2fd38e1b31a506951f4b1c
tree 7f87a4ec066537cd1033ef96f537e7c48e35011f
parent c1af4465b9b983d9e7cefa01ec869e91c3dea11c

Updating a BPF_MAP_TYPE_HASH_OF_MAPS or BPF_MAP_TYPE_ARRAY_OF_MAPS via
bpf_map_update_elem() is very expensive.

In one of our workloads, we're inserting ~1400 maps of type
BPF_MAP_TYPE_ARRAY into a BPF_MAP_TYPE_ARRAY_OF_MAPS. This takes ~21
seconds on a single thread, with an average of ~15ms per call:

Function Name:    map_update_elem
Number of calls:  1369
Total time:       21s 182ms 966µs
Maximum:          47ms 937µs
Average:          15ms 473µs
Minimum:          7µs

Profiling shows that nearly all of this time is spent in synchronize_rcu(),
called via maybe_wait_bpf_programs() in map_update_elem().

The call to synchronize_rcu() is done to ensure that after
bpf_map_update_elem() returns, no BPF programs are still looking at the old
value of the map, per commit 1ae80cf31938 ("bpf: wait for running BPF
programs when updating map-in-map").
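
To make the cost model concrete, here is a minimal userspace sketch of the
gating described above: only updates to map-in-map types pay for a grace-period
wait. It is not the kernel source. The enum, the counter and the stub are
hypothetical stand-ins introduced for illustration, and the stub only counts
calls where the kernel would block in synchronize_rcu().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's map types; not the real enum. */
enum model_map_type {
	MODEL_MAP_ARRAY,
	MODEL_MAP_HASH_OF_MAPS,
	MODEL_MAP_ARRAY_OF_MAPS,
};

/* Counts how many grace-period waits the model has issued. */
static unsigned int grace_period_waits;

/* Stub: in the kernel this would block until an RCU grace period elapses. */
static void stub_synchronize_rcu(void)
{
	grace_period_waits++;
}

/* True for the two outer map types whose values are themselves maps. */
static bool is_map_in_map(enum model_map_type type)
{
	return type == MODEL_MAP_HASH_OF_MAPS ||
	       type == MODEL_MAP_ARRAY_OF_MAPS;
}

/* Model of the post-update step: only map-in-map updates pay for a wait. */
static void model_maybe_wait(enum model_map_type type)
{
	if (is_map_in_map(type))
		stub_synchronize_rcu();
}
```

In this model every map-in-map update costs exactly one wait, so the total
update time scales with the number of inner maps inserted times the latency of
a single grace period.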

As discussed on the bpf mailing list, replace synchronize_rcu() with
synchronize_rcu_expedited(). This is 175x faster: it now takes an average
of 88 microseconds per call, for a total of 127 milliseconds in the same
benchmark:

Function Name:    map_update_elem
Number of calls:  1439
Total time:       127ms 626µs
Maximum:          445µs
Average:          88µs
Minimum:          10µs

Link: https://lore.kernel.org/bpf/CAH6OuBR=w2kybK6u7aH_35B=Bo1PCukeMZefR=7V4Z2tJNK--Q@mail.gmail.com/
Signed-off-by: Ritesh Oedayrajsingh Varma <ritesh@superluminal.eu>
Link: https://lore.kernel.org/r/20251128000422.20462-1-ritesh@superluminal.eu
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
kernel/bpf/syscall.c