drm/xe: Improve schedule disable response failure
author Matthew Brost <matthew.brost@intel.com>
Thu, 14 Nov 2024 02:25:20 +0000 (18:25 -0800)
committer Matthew Brost <matthew.brost@intel.com>
Thu, 14 Nov 2024 14:38:45 +0000 (06:38 -0800)
Print the GuC ID and take a devcoredump on a schedule disable response
failure. The GuC ID is useful information, and when a schedule disable
response fails it is possible that the LRC state is corrupted, so a
devcoredump is helpful for debugging.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-6-matthew.brost@intel.com
drivers/gpu/drm/xe/xe_guc_submit.c

index 08a6578..46fd462 100644
@@ -1124,7 +1124,10 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
                if (!ret || xe_guc_read_stopped(guc)) {
 trigger_reset:
                        if (!ret)
-                               xe_gt_warn(guc_to_gt(guc), "Schedule disable failed to respond");
+                               xe_gt_warn(guc_to_gt(guc),
+                                          "Schedule disable failed to respond, guc_id=%d",
+                                          q->guc->id);
+                       xe_devcoredump(q, job);
                        set_exec_queue_extra_ref(q);
                        xe_exec_queue_get(q);   /* GT reset owns this */
                        set_exec_queue_banned(q);