dma-debug: fix a possible deadlock on radix_lock
author Levi Yun <yeoreum.yun@arm.com>
Fri, 25 Oct 2024 10:06:00 +0000 (11:06 +0100)
committer Christoph Hellwig <hch@lst.de>
Tue, 29 Oct 2024 07:51:25 +0000 (08:51 +0100)
radix_lock shouldn't be taken while holding dma_entry_hash[idx].lock;
otherwise, there's a possible deadlock scenario when the
dma debug API is called while holding rq_lock():

CPU0                   CPU1                       CPU2
dma_free_attrs()
check_unmap()          add_dma_entry()            __schedule() //out
                                                  (A) rq_lock()
get_hash_bucket()
(A) dma_entry_hash
                                                  check_sync()
                       (A) radix_lock()           (W) dma_entry_hash
dma_entry_free()
(W) radix_lock()
                       // CPU2's one
                       (W) rq_lock()

The CPU1 situation can happen when it is extending the radix tree and
tries to wake up kswapd via wake_all_kswapd().

The CPU2 situation can happen during perf_event_task_sched_out()
(i.e. a dma sync operation is called while deleting a perf_event that uses
 etm and etr tmc, which are Arm Coresight hwtracing driver backends).

To remove this possible situation, call dma_entry_free() after
put_hash_bucket() in check_unmap().
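
The reordering can be illustrated with a minimal user-space sketch. This is
not the kernel code: pthread mutexes stand in for the bucket lock and
radix_lock, and every name here (model_add(), model_unmap(),
model_entry_free(), nested_frees) is hypothetical. The entry is unlinked
under the bucket lock, the bucket lock is dropped, and only then is the
second lock taken to free the entry, so the two locks are never nested.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for dma_entry_hash[idx].lock and radix_lock. */
static pthread_mutex_t bucket_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t radix_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set while bucket_lock is held; written only under bucket_lock. */
static int bucket_held;
/* Number of frees that ran with bucket_lock still held (should stay 0). */
static int nested_frees;

struct entry {
	struct entry *next;
	int key;
};

static struct entry *bucket_head;

/* Models dma_entry_free(): needs radix_lock, must not nest in bucket_lock. */
static void model_entry_free(struct entry *e)
{
	if (bucket_held)
		nested_frees++;

	pthread_mutex_lock(&radix_lock);
	/* Bookkeeping protected by radix_lock would go here. */
	pthread_mutex_unlock(&radix_lock);
	free(e);
}

/* Adds an entry to the single model bucket. Returns 0 on success. */
static int model_add(int key)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;

	pthread_mutex_lock(&bucket_lock);
	e->next = bucket_head;
	bucket_head = e;
	pthread_mutex_unlock(&bucket_lock);
	return 0;
}

/* Models the fixed check_unmap(): unlink under the lock, free after it. */
static int model_unmap(int key)
{
	struct entry **pp, *e = NULL;

	pthread_mutex_lock(&bucket_lock);
	bucket_held = 1;
	for (pp = &bucket_head; *pp; pp = &(*pp)->next) {
		if ((*pp)->key == key) {
			e = *pp;
			*pp = e->next;	/* like hash_bucket_del() */
			break;
		}
	}
	bucket_held = 0;
	pthread_mutex_unlock(&bucket_lock);	/* like put_hash_bucket() */

	if (!e)
		return 0;

	model_entry_free(e);	/* bucket_lock is no longer held here */
	return 1;
}
```

Since the entry is already unlinked when the bucket lock is dropped, no other
path in this sketch can reach it through the bucket, which is why deferring
the free is safe in the model.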

Reported-by: Denis Nikitin <denik@chromium.org>
Closes: https://lists.linaro.org/archives/list/coresight@lists.linaro.org/thread/2WMS7BBSF5OZYB63VT44U5YWLFP5HL6U/#RWM6MLQX5ANBTEQ2PRM7OXCBGCE6NPWU
Signed-off-by: Levi Yun <yeoreum.yun@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
kernel/dma/debug.c

index d570535..f6f0387 100644
@@ -1052,9 +1052,13 @@ static void check_unmap(struct dma_debug_entry *ref)
        }
 
        hash_bucket_del(entry);
-       dma_entry_free(entry);
-
        put_hash_bucket(bucket, flags);
+
+       /*
+        * Free the entry outside of bucket_lock to avoid ABBA deadlocks
+        * between that and radix_lock.
+        */
+       dma_entry_free(entry);
 }
 
 static void check_for_stack(struct device *dev,