From: Oded Gabbay Date: Thu, 13 Jan 2022 08:05:38 +0000 (+0200) Subject: habanalabs: there is no kernel TDR in future ASICs X-Git-Tag: microblaze-v5.19~89^2~59^2~34 X-Git-Url: http://git.monstr.eu/?a=commitdiff_plain;h=e24a62cb68d117858f311d14ca366a18a44120a8;p=linux-2.6-microblaze.git habanalabs: there is no kernel TDR in future ASICs In future ASICs, there is no kernel TDR for new workloads that are submitted directly from user-space to the device. Therefore, the driver can NEVER know that a workload has timed-out. So, when the user asks us to wait for interrupt on the workload's completion, and the wait has timed-out, it doesn't mean the workload has timed-out. It only means the wait has timed-out, which is NOT an error from driver's perspective. Signed-off-by: Oded Gabbay --- diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c index 2f40b937c59f..29e0549ff31d 100644 --- a/drivers/misc/habanalabs/common/command_submission.c +++ b/drivers/misc/habanalabs/common/command_submission.c @@ -2932,11 +2932,14 @@ static int _hl_interrupt_wait_ioctl(struct hl_device *hdev, struct hl_ctx *ctx, rc = -EIO; *status = HL_WAIT_CS_STATUS_ABORTED; } else { - dev_err_ratelimited(hdev->dev, "Waiting for interrupt ID %d timedout\n", - interrupt->interrupt_id); - rc = -ETIMEDOUT; + /* The wait has timed-out. We don't know anything beyond that + * because the workload wasn't submitted through the driver. + * Therefore, from driver's perspective, the workload is still + * executing. + */ + rc = 0; + *status = HL_WAIT_CS_STATUS_BUSY; } - *status = HL_WAIT_CS_STATUS_BUSY; } } @@ -3049,6 +3052,12 @@ wait_again: interrupt->interrupt_id); rc = -EINTR; } else { + /* The wait has timed-out. We don't know anything beyond that + * because the workload wasn't submitted through the driver. + * Therefore, from driver's perspective, the workload is still + * executing. + */ + rc = 0; *status = HL_WAIT_CS_STATUS_BUSY; }