sched/wait: Add add_wait_queue_priority()
author David Woodhouse <dwmw@amazon.co.uk>
Tue, 27 Oct 2020 14:39:43 +0000 (14:39 +0000)
committer Paolo Bonzini <pbonzini@redhat.com>
Sun, 15 Nov 2020 14:49:09 +0000 (09:49 -0500)
commit c4d51a52c67a1e3a0fa3006e5ec21cdc07649cd6
tree 8197d1c61f4fe778e4cc86e1c7715bdcd9116d17
parent bf0cd88ce363a2de3684baaa48d3f194acdc516c

This allows an exclusive wait_queue_entry to be added at the head of the
queue, instead of the tail as normal. Thus, it gets to consume events
first without allowing non-exclusive waiters to be woken at all.
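The ordering effect described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel code: every toy_* name and TOY_FLAG_* value is hypothetical, locking
is omitted, and the wake loop merely imitates the rule that the walk stops
once an exclusive entry's callback has consumed the event.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this model only. */
#define TOY_FLAG_EXCLUSIVE 0x01
#define TOY_FLAG_PRIORITY  0x02

struct toy_entry {
	unsigned int flags;
	int (*func)(struct toy_entry *e); /* nonzero: event was consumed */
	int woken;                        /* how often func was invoked */
	struct toy_entry *next;
};

struct toy_queue {
	struct toy_entry *head;
};

/* Normal case: a waiter is appended at the tail of the queue. */
void toy_add_tail(struct toy_queue *q, struct toy_entry *e)
{
	struct toy_entry **pp = &q->head;

	assert(q != NULL && e != NULL);
	while (*pp != NULL)
		pp = &(*pp)->next;
	e->next = NULL;
	*pp = e;
}

/* Priority case: the waiter is marked exclusive and placed at the head. */
void toy_add_priority(struct toy_queue *q, struct toy_entry *e)
{
	assert(q != NULL && e != NULL);
	e->flags |= TOY_FLAG_EXCLUSIVE | TOY_FLAG_PRIORITY;
	e->next = q->head;
	q->head = e;
}

/*
 * Walk the queue from the head and call each waiter. Stop as soon as
 * nr_exclusive exclusive waiters have consumed the event. Returns the
 * number of waiters that were called.
 */
int toy_wake_up(struct toy_queue *q, int nr_exclusive)
{
	struct toy_entry *e;
	int visited = 0;

	for (e = q->head; e != NULL; e = e->next) {
		int ret = e->func(e);

		visited++;
		if (ret && (e->flags & TOY_FLAG_EXCLUSIVE) && !--nr_exclusive)
			break;
	}
	return visited;
}

/* Callback that records the wakeup and reports the event as consumed. */
int toy_consume(struct toy_entry *e)
{
	e->woken++;
	return 1;
}
```

In this model, a non-exclusive waiter added with toy_add_tail() before a
waiter added with toy_add_priority() is never called by toy_wake_up(&q, 1),
because the head entry consumes the event and the walk ends there.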

The (first) intended use is for KVM IRQFD, which currently has
inconsistent behaviour depending on whether posted interrupts are
available or not. If they are, KVM will bypass the eventfd completely
and deliver interrupts directly to the appropriate vCPU. If not, events
are delivered through the eventfd and userspace will receive them when
polling on the eventfd.

By using add_wait_queue_priority(), KVM will be able to consistently
consume events within the kernel without accidentally exposing them
to userspace when they're supposed to be bypassed. This, in turn, means
that userspace doesn't have to jump through hoops to avoid listening
on the erroneously noisy eventfd and injecting duplicate interrupts.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20201027143944.648769-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
include/linux/wait.h
kernel/sched/wait.c