gro_cells: avoid using synchronize_rcu() in gro_cells_destroy()
author Eric Dumazet <edumazet@google.com>
Sun, 20 Feb 2022 04:11:55 +0000 (20:11 -0800)
committer Jakub Kicinski <kuba@kernel.org>
Tue, 22 Feb 2022 19:25:40 +0000 (11:25 -0800)
commit ee8f97efa7a59e7f390ed2de627ddd139beb6243
tree a3ef1388f4984ab201c7cd463e7050af6680cf1f
parent d4276e570a0cff6ad28b3b5cb7d3268c846de3a5

Another thing that can make netns dismantle very slow is
gro_cells_destroy(), which runs whenever cleanup_net() has to remove
a device using the gro_cells framework.

RTNL is not held at this stage, so synchronize_net() cannot use the
expedited variant and calls synchronize_rcu() instead:

netdev_run_todo()
 ip_tunnel_dev_free()
  gro_cells_destroy()
   synchronize_net()
    synchronize_rcu() // Ouch.

This patch uses call_rcu(), and gave me a 25x performance improvement
in my tests.

cleanup_net() is no longer blocked ~10 ms per synchronize_rcu()
call.

If the memory needed to queue the deferred free cannot be allocated,
fall back to synchronize_rcu_expedited().
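
The deferred-free pattern described above can be sketched in user space
as follows. This is only an illustration, not the code from the patch:
struct rcu_head, mock_call_rcu(), mock_grace_period_elapsed(),
cells_destroy() and freed_count are hypothetical stand-ins, malloc()/free()
replace the kernel allocators, and the "grace period" is simulated by
running the queued callbacks by hand. Only the name
percpu_free_defer_callback() comes from this commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-in for the kernel's struct rcu_head (assumption). */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct rcu_head *pending;	/* callbacks waiting for a grace period */
static int freed_count;			/* number of frees performed so far */

/* Mock call_rcu(): queue the callback and return at once, never block. */
static void mock_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Mock end of a grace period: run every queued callback, return the count. */
static int mock_grace_period_elapsed(void)
{
	int ran = 0;

	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		assert(head->func);
		head->func(head);
		ran++;
	}
	return ran;
}

/* Small carrier object: remembers what to free once readers are done. */
struct percpu_free_defer {
	struct rcu_head rcu;
	void *ptr;
};

static void percpu_free_defer_callback(struct rcu_head *head)
{
	struct percpu_free_defer *defer =
		container_of(head, struct percpu_free_defer, rcu);

	free(defer->ptr);
	free(defer);
	freed_count++;
}

/* Hypothetical destroy helper: returns 1 if the free was deferred,
 * 0 if the synchronous fallback ran (or there was nothing to free).
 */
static int cells_destroy(void **cells, int simulate_alloc_failure)
{
	struct percpu_free_defer *defer;
	int deferred = 0;

	if (!*cells)
		return 0;

	defer = simulate_alloc_failure ? NULL : malloc(sizeof(*defer));
	if (defer) {
		/* Fast path: hand the pointer to the callback, do not wait. */
		defer->ptr = *cells;
		mock_call_rcu(&defer->rcu, percpu_free_defer_callback);
		deferred = 1;
	} else {
		/* Fallback: wait synchronously, then free in place. The
		 * commit message names synchronize_rcu_expedited() here.
		 */
		mock_grace_period_elapsed();
		free(*cells);
		freed_count++;
	}
	*cells = NULL;
	return deferred;
}
```

In this sketch the caller returns as soon as the callback is queued, which
is the property that keeps cleanup_net() from blocking; the blocking wait
only happens on the allocation-failure path.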

v2: made percpu_free_defer_callback() static

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220220041155.607637-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/core/gro_cells.c