accel/ivpu: Fix locking order in ivpu_job_submit
author Karol Wachowski <karol.wachowski@intel.com>
Tue, 7 Jan 2025 17:32:34 +0000 (18:32 +0100)
committer Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Thu, 9 Jan 2025 08:35:45 +0000 (09:35 +0100)
commit ab680dc6c78aa035e944ecc8c48a1caab9f39924
tree fd4b311e201ed7c676eb9a2f199ddfab4b771bff
parent e52443608934952fc978234cf7d639d6aa3f1856

Fix a deadlock between job submission and abort handling.
When a thread aborts the currently executing jobs due to a fault,
it first takes the global lock protecting submitted_jobs (#1).

After the last job is destroyed, it proceeds to release the related context
and takes the file_priv lock (#2) while still holding #1. Meanwhile, the
job submission thread takes the file_priv lock (#2) first and then the
submitted_jobs lock (#1) when it adds a job to the submitted jobs list.

       CPU0                            CPU1
       ----                            ----
  (for example due to a fault)         (job submissions keep coming)

  lock(&vdev->submitted_jobs_lock) #1
  ivpu_jobs_abort_all()
  job_destroy()
                                      lock(&file_priv->lock)           #2
                                      lock(&vdev->submitted_jobs_lock) #1
  file_priv_release()
  lock(&vdev->context_list_lock)
  lock(&file_priv->lock)           #2

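The two paths take locks #1 and #2 in opposite orders, so each thread can
end up holding one lock while waiting for the other, which is a deadlock.
To resolve this issue, change the order of locking in ivpu_job_submit()
so that it matches the abort path.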
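
The ordering rule can be illustrated with a minimal userspace sketch.
It is not the driver code: it uses pthread mutexes instead of kernel
mutexes, leaves out context_list_lock, and every name in it
(submitted_jobs_lock, file_priv_lock, abort_path, submit_path,
run_lock_order_demo, ITERATIONS) is a hypothetical stand-in. Both paths
take lock #1 before lock #2, so neither thread can hold one lock while
waiting for the other.

```c
/* Userspace sketch of consistent lock ordering; not the ivpu driver code. */
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

/* Hypothetical stand-ins for the two locks named in the commit message. */
static pthread_mutex_t submitted_jobs_lock = PTHREAD_MUTEX_INITIALIZER; /* #1 */
static pthread_mutex_t file_priv_lock = PTHREAD_MUTEX_INITIALIZER;      /* #2 */

static int submitted_jobs;  /* protected by lock #1 */
static int file_priv_uses;  /* protected by lock #2 */

/* Abort path: takes #1, then #2 while still holding #1. */
static void *abort_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&submitted_jobs_lock);   /* #1 */
		submitted_jobs = 0;                         /* drop all jobs */
		pthread_mutex_lock(&file_priv_lock);        /* #2 */
		file_priv_uses++;
		pthread_mutex_unlock(&file_priv_lock);
		pthread_mutex_unlock(&submitted_jobs_lock);
	}
	return NULL;
}

/* Submit path: uses the same order as the abort path, #1 then #2. */
static void *submit_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&submitted_jobs_lock);   /* #1 */
		pthread_mutex_lock(&file_priv_lock);        /* #2 */
		file_priv_uses++;
		submitted_jobs++;                           /* add job to list */
		pthread_mutex_unlock(&file_priv_lock);
		pthread_mutex_unlock(&submitted_jobs_lock);
	}
	return NULL;
}

/* Runs both paths concurrently and returns how many times #2 was used. */
int run_lock_order_demo(void)
{
	pthread_t abort_thread, submit_thread;
	int rc;

	file_priv_uses = 0;
	rc = pthread_create(&abort_thread, NULL, abort_path, NULL);
	assert(rc == 0);
	rc = pthread_create(&submit_thread, NULL, submit_path, NULL);
	assert(rc == 0);
	pthread_join(abort_thread, NULL);
	pthread_join(submit_thread, NULL);
	return file_priv_uses;
}
```

Calling run_lock_order_demo() from a test program should return once
both threads have finished all their iterations. Swapping the two
pthread_mutex_lock() calls in submit_path() would reproduce the
opposite-order pattern described above, and the program could then hang.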

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-12-maciej.falkowski@linux.intel.com
drivers/accel/ivpu/ivpu_job.c