fs/epoll: make nesting accounting safe for -rt kernel
authorJason Baron <jbaron@akamai.com>
Tue, 7 Apr 2020 03:11:23 +0000 (20:11 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Tue, 7 Apr 2020 17:43:44 +0000 (10:43 -0700)
commitefcdd350d1f8a98fa9fb5280e5af0a22e2059a26
tree679dda708eeafad77d182b70e38f18e5152e513e
parent282144e04b9a88b53dc16dc46e47f7d810e0b63b
fs/epoll: make nesting accounting safe for -rt kernel

Davidlohr Bueso pointed out that when CONFIG_DEBUG_LOCK_ALLOC is set
ep_poll_safewake() can take several non-raw spinlocks after disabling
interrupts.  Since a spinlock can block in the -rt kernel, we can't take a
spinlock after disabling interrupts.  So let's re-work how we determine
the nesting level such that it plays nicely with the -rt kernel.

Let's introduce a 'nests' field in struct eventpoll that records the
current nesting level during ep_poll_callback().  Then, if we nest again
we can find the previous struct eventpoll that we were called from and
increase our count by 1.  The 'nests' field is protected by
ep->poll_wait.lock.

I've also moved the visited field to reduce the size of struct eventpoll
from 184 bytes to 176 bytes on x86_64 for !CONFIG_DEBUG_LOCK_ALLOC, which
is typical for a production config.

Reported-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: http://lkml.kernel.org/r/1582739816-13167-1-git-send-email-jbaron@akamai.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/eventpoll.c