drm/i915/gt: Harden the heartbeat against a stuck driver
author Chris Wilson <chris@chris-wilson.co.uk>
Thu, 2 Jul 2020 09:52:18 +0000 (10:52 +0100)
committer Chris Wilson <chris@chris-wilson.co.uk>
Thu, 2 Jul 2020 11:30:23 +0000 (12:30 +0100)
commit aab4707fdd754d4c4f0df718f3c7546b6eb40d20
tree b01604812aeb55e7673adff00c19eba441e897cc
parent 680c45c767f63e35f063d3ea04f388a9f7ae7079

If the driver gets stuck holding the kernel timeline's mutex, we cannot
issue a heartbeat, so we fail to discover that the driver is stuck and do
not issue a GPU reset (which would hopefully unstick the driver!).
Switch to using a trylock so that we can query if the heartbeat's
timeline mutex is locked elsewhere, and then use the timer to probe if it
remains stuck at the same spot for consecutive heartbeats, indicating
that the mutex has not been released and the engine has not progressed.
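
The idea can be sketched in userspace C as follows. This is only an
illustration and not the code from this patch: struct engine_sketch, its
fields, engine_sketch_init() and heartbeat_tick() are hypothetical names,
and a pthread mutex stands in for the kernel timeline mutex. Each tick
tries the lock without blocking; if the lock is busy and the progress
counter is unchanged since the previous busy tick, a reset is requested.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the engine state; not the i915 structures. */
struct engine_sketch {
	pthread_mutex_t timeline_mutex; /* stands in for the timeline mutex */
	unsigned long serial;           /* bumped on every successful heartbeat */
	unsigned long blocked;          /* serial at the last busy tick, 0 = none */
	bool reset_requested;           /* set once the engine looks stuck */
};

static void engine_sketch_init(struct engine_sketch *e)
{
	pthread_mutex_init(&e->timeline_mutex, NULL);
	e->serial = 1; /* start at 1 so that blocked == 0 means "not blocked" */
	e->blocked = 0;
	e->reset_requested = false;
}

/* Called from a periodic timer; must never sleep waiting for the lock. */
static void heartbeat_tick(struct engine_sketch *e)
{
	if (pthread_mutex_trylock(&e->timeline_mutex) != 0) {
		/* Lock is held elsewhere: remember where we were blocked. */
		unsigned long prev = e->blocked;

		e->blocked = e->serial;
		/* Same spot on consecutive ticks: no progress, ask for a reset. */
		if (prev == e->serial)
			e->reset_requested = true;
		return;
	}

	/* Lock acquired: clear the blocked marker and record progress. */
	e->blocked = 0;
	e->serial++; /* a real heartbeat request would be submitted here */
	pthread_mutex_unlock(&e->timeline_mutex);
}
```

In this sketch a single busy tick is tolerated, since the lock holder may
simply be slow; only a second consecutive busy tick with no change in
serial sets reset_requested.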

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702095219.963-1-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
drivers/gpu/drm/i915/gt/intel_engine_types.h