drm/amdgpu: Fixed psp fence and memory issues when removing amdgpu device
author YiPeng Chai <YiPeng.Chai@amd.com>
Thu, 8 Sep 2022 01:44:36 +0000 (09:44 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Mon, 19 Sep 2022 19:17:47 +0000 (15:17 -0400)
commit 83d29a5f8a5a8ac76fdf8b8ccca65899345e6a9e
tree aaaa3c7b526854b3a0793dbedd710dfc87c7bdcf
parent f5c7e7797060255dbc8160734ccc5ad6183c5e04

V3:
Fixed psp fence and memory issues for asics
using smu v13_0_2 when removing the amdgpu device.

[Why]:
1. psp_suspend->psp_free_shared_bufs->
       psp_ta_free_shared_buf->
           amdgpu_bo_free_kernel->
             ...->amdgpu_bo_release_notify->
                    amdgpu_fill_buffer
   psp frees the vram it uses when psp_suspend is called.
   But for the asic using smu v13_0_2, psp_suspend is called
   before adev->shutdown is set to true when removing the first
   hive device, so amdgpu_fill_buffer is still called, which
   causes fence issues when all vram resources are evicted in
   amdgpu_vram_mgr_fini.
2. Since psp_hw_fini is not called after psp_suspend, and
   psp_suspend only calls psp_ring_stop, the psp ring memory
   is not released when the amdgpu device is removed.
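
The ordering problem in item 1 can be illustrated with a small
user-space model. This is only a sketch, not driver code: every
model_* name is hypothetical, and it assumes (as the call chain
above implies) that the release-notify step skips the buffer fill
once the shutdown flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not struct amdgpu_device. */
struct model_dev {
	bool shutdown;   /* plays the role of adev->shutdown */
	int fill_jobs;   /* counts buffer fills that would emit fences */
};

/* Models the release-notify step: the fill is skipped once shutdown is set. */
static void model_bo_release_notify(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->shutdown)
		return;
	dev->fill_jobs++;   /* stands in for amdgpu_fill_buffer */
}

/* Models psp_suspend freeing its shared buffers. */
static void model_psp_suspend(struct model_dev *dev)
{
	model_bo_release_notify(dev);
}

/*
 * Models device removal. With set_shutdown_first == false the flag is
 * set only after the suspend step, so a fill job is still queued.
 * Returns the number of fill jobs that were queued.
 */
static int model_remove(bool set_shutdown_first)
{
	struct model_dev dev = { .shutdown = false, .fill_jobs = 0 };

	if (set_shutdown_first)
		dev.shutdown = true;
	model_psp_suspend(&dev);
	dev.shutdown = true;
	return dev.fill_jobs;
}
```

In this model, model_remove(false) queues one fill job during
removal, while model_remove(true) queues none.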

[How]:
1. Set shutdown to true before calling amdgpu_device_gpu_recover,
   so that amdgpu_fill_buffer is not called when psp_suspend is
   called.
2. Free psp ring memory in psp_sw_fini.
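
Item 2 can be sketched the same way. The model below is a
hypothetical user-space illustration, not the psp code: the model_*
names and the use of calloc/free in place of the kernel buffer
object helpers are assumptions. It shows why a suspend step that
only stops the ring leaves the ring memory allocated, and how
releasing it in the sw_fini step closes the leak.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the psp ring; not struct psp_ring. */
struct model_ring {
	void *mem;      /* backing memory of the ring */
	bool running;   /* whether the ring is active */
};

/* Allocates the ring memory; returns 0 on success, -1 on failure. */
static int model_ring_init(struct model_ring *ring, size_t size)
{
	assert(ring != NULL);
	ring->mem = calloc(1, size);
	ring->running = ring->mem != NULL;
	return ring->mem ? 0 : -1;
}

/* Models psp_suspend: the ring is stopped but its memory is kept. */
static void model_suspend(struct model_ring *ring)
{
	ring->running = false;
}

/* Models the fix: sw_fini releases the ring memory. Safe to call twice. */
static void model_sw_fini(struct model_ring *ring)
{
	free(ring->mem);   /* free(NULL) is a no-op */
	ring->mem = NULL;
}
```

After model_suspend the ring is stopped but ring->mem is still
allocated; only model_sw_fini releases it.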

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c