If task_state is cleared, io_req_task_work_add() will go the slow path
adding a task_work, setting the task_state, waking up the task and so
on. Not to mention it's expensive. tctx_task_work() first clears the
state and then executes all the work items queued, so if any of them
resubmits or adds new task_work items, it would unnecessarily go through
the slow path of io_req_task_work_add().
Let's clear the ->task_state at the end. We still have to check
->task_list for emptiness afterward to synchronise with
io_req_task_work_add(), do that, and set the state back if we're going
to retry, because clearing not-ours task_state on the next iteration
would be buggy.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1ef72cdac7022adf0cd7ce4bfe3bb5c82a62eb93.1623949695.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
task_work);
- clear_bit(0, &tctx->task_state);
-
while (1) {
struct io_wq_work_node *node;
req->task_work.func(&req->task_work);
node = next;
}
- if (wq_list_empty(&tctx->task_list))
- break;
+ if (wq_list_empty(&tctx->task_list)) {
+ clear_bit(0, &tctx->task_state);
+ if (wq_list_empty(&tctx->task_list))
+ break;
+ /* another tctx_task_work() is enqueued, yield */
+ if (test_and_set_bit(0, &tctx->task_state))
+ break;
+ }
cond_resched();
}