io_uring: improve in tctx_task_work() resubmission
author: Pavel Begunkov <asml.silence@gmail.com>
Thu, 17 Jun 2021 17:14:10 +0000 (18:14 +0100)
committer: Jens Axboe <axboe@kernel.dk>
Fri, 18 Jun 2021 15:22:02 +0000 (09:22 -0600)
If task_state is cleared, io_req_task_work_add() takes the slow path:
adding a task_work, setting the task_state, waking up the task and so
on, all of which is expensive. tctx_task_work() first clears the state
and then executes all the queued work items, so if any of them
resubmits or adds new task_work items, it unnecessarily goes through
the slow path of io_req_task_work_add().

Clear ->task_state at the end instead. After clearing, we still have to
recheck ->task_list for emptiness to synchronise with
io_req_task_work_add(). If the list is no longer empty and we are going
to retry, set the state back first, because clearing a task_state that
is not ours on the next iteration would be buggy. If the bit is already
set again, another tctx_task_work() has been queued, so yield to it.
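
The effect of the reordering can be illustrated with a small userspace
model. This is only a sketch, not kernel code: every model_* name is
hypothetical, the task list is reduced to a counter, bit 0 of
->task_state is a plain bool, and everything runs on one thread, so the
recheck after clearing the state can never fire here. It only counts how
often the add path has to take the slow path when work items requeue
themselves.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model; not the kernel data structures. */
struct model_tctx {
	int pending;        /* stands in for the length of ->task_list */
	int scheduled;      /* callbacks queued via the slow path, not yet run */
	bool task_state;    /* stands in for bit 0 of ->task_state */
	int slow_path_hits; /* times the add path had to schedule a callback */
	int requeues_left;  /* how many more times items requeue themselves */
	int items_run;      /* work items executed so far */
};

/* Returns the old value and sets the bit, like test_and_set_bit(). */
bool model_test_and_set(bool *bit)
{
	bool old = *bit;

	*bit = true;
	return old;
}

/* Models the add path: queue the item, schedule only if state was clear. */
void model_work_add(struct model_tctx *t)
{
	t->pending++;
	if (model_test_and_set(&t->task_state))
		return;          /* fast path: a callback is already pending */
	t->slow_path_hits++; /* slow path: the kernel would add task_work and wake */
	t->scheduled++;
}

/* One work item that resubmits itself while the budget lasts. */
void model_run_item(struct model_tctx *t)
{
	t->items_run++;
	if (t->requeues_left > 0) {
		t->requeues_left--;
		model_work_add(t);
	}
}

/* Grab everything queued so far and run it, like one loop iteration. */
void model_run_batch(struct model_tctx *t)
{
	int batch = t->pending;

	t->pending = 0;
	for (int i = 0; i < batch; i++)
		model_run_item(t);
}

/* Old ordering: clear the state before running anything. */
void model_task_work_old(struct model_tctx *t)
{
	t->task_state = false;
	while (1) {
		model_run_batch(t);
		if (t->pending == 0)
			break;
	}
}

/* New ordering: clear the state only once the list looks empty. */
void model_task_work_new(struct model_tctx *t)
{
	while (1) {
		model_run_batch(t);
		if (t->pending == 0) {
			t->task_state = false;
			if (t->pending == 0)
				break;
			/* someone else already scheduled a callback, yield */
			if (model_test_and_set(&t->task_state))
				break;
		}
	}
}

/* Queue one item, then run callbacks until none are scheduled. */
struct model_tctx model_drive(void (*task_work)(struct model_tctx *),
			      int requeues)
{
	struct model_tctx t = { .requeues_left = requeues };

	model_work_add(&t);
	while (t.scheduled > 0) {
		t.scheduled--;
		task_work(&t);
	}
	return t;
}
```

Calling model_drive(model_task_work_old, 3) and
model_drive(model_task_work_new, 3) runs the same four items in both
cases. In this model the old ordering takes the slow path once more than
the new one, because the first requeue finds the state already cleared
and schedules a second callback that then has nothing left to do.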

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1ef72cdac7022adf0cd7ce4bfe3bb5c82a62eb93.1623949695.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
fs/io_uring.c

index 55bc348..fc8637f 100644
@@ -1894,8 +1894,6 @@ static void tctx_task_work(struct callback_head *cb)
        struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
                                                  task_work);
 
-       clear_bit(0, &tctx->task_state);
-
        while (1) {
                struct io_wq_work_node *node;
 
@@ -1917,8 +1915,14 @@ static void tctx_task_work(struct callback_head *cb)
                        req->task_work.func(&req->task_work);
                        node = next;
                }
-               if (wq_list_empty(&tctx->task_list))
-                       break;
+               if (wq_list_empty(&tctx->task_list)) {
+                       clear_bit(0, &tctx->task_state);
+                       if (wq_list_empty(&tctx->task_list))
+                               break;
+                       /* another tctx_task_work() is enqueued, yield */
+                       if (test_and_set_bit(0, &tctx->task_state))
+                               break;
+               }
                cond_resched();
        }