drm/amdgpu: Wipe all VRAM on free when RAS is enabled
authorFelix Kuehling <Felix.Kuehling@amd.com>
Tue, 25 Jan 2022 15:51:49 +0000 (10:51 -0500)
committerAlex Deucher <alexander.deucher@amd.com>
Thu, 27 Jan 2022 20:48:41 +0000 (15:48 -0500)
On GPUs with RAS, poison can propagate between processes if VRAM is not
cleared when it is freed or allocated. The reason is, that not all write
accesses clear RAS poison. 32-byte writes by the SDMA engine do clear RAS
poison. Clearing memory in the background when it is freed should avoid
major performance impact. KFD has been doing this already for a long time.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index 5661b82..ec29365 100644 (file)
@@ -575,6 +575,9 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
        if (!amdgpu_bo_support_uswc(bo->flags))
                bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
 
+       if (adev->ras_enabled)
+               bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
+
        bo->tbo.bdev = &adev->mman.bdev;
        if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
                          AMDGPU_GEM_DOMAIN_GDS))