linux-2.6-microblaze.git
3 years agodrm/amdgpu: Fix a redundant kfree
xinhui pan [Tue, 1 Sep 2020 07:28:06 +0000 (15:28 +0800)]
drm/amdgpu: Fix a redundant kfree

drm_dev_alloc() alloc *dev* and set managed.final_kfree to dev to free
itself.
Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into
amdgpu_device (v3)") we alloc *adev* and ddev is just a member of it.
So drm_dev_release try to free a wrong pointer then.

Also driver's release trys to free adev, but drm_dev_release will
access dev after call drvier's release.

To fix it, remove driver's release and set managed.final_kfree to adev.

[   36.269348] BUG: unable to handle page fault for address: ffffa0c279940028
[   36.276841] #PF: supervisor read access in kernel mode
[   36.282434] #PF: error_code(0x0000) - not-present page
[   36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060
[   36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[   36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G           O      5.9.0-rc2+ #46
[   36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019
[   36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]
[   36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5
[   36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282
[   36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006
[   36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010
[   36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001
[   36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010
[   36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2
[   36.391845] FS:  00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000
[   36.400669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0
[   36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   36.430354] Call Trace:
[   36.433044]  drm_dev_put.part.0+0x40/0x60 [drm]
[   36.438017]  drm_dev_put+0x13/0x20 [drm]
[   36.442398]  amdgpu_pci_remove+0x56/0x60 [amdgpu]
[   36.447528]  pci_device_remove+0x3e/0xb0
[   36.451807]  device_release_driver_internal+0xff/0x1d0
[   36.457416]  device_release_driver+0x12/0x20
[   36.462094]  pci_stop_bus_device+0x70/0xa0
[   36.466588]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
[   36.472786]  remove_store+0x7b/0x90
[   36.476614]  dev_attr_store+0x17/0x30
[   36.480646]  sysfs_kf_write+0x4b/0x60
[   36.484655]  kernfs_fop_write+0xe8/0x1d0
[   36.488952]  vfs_write+0xf5/0x230
[   36.492562]  ksys_write+0x70/0xf0
[   36.496206]  __x64_sys_write+0x1a/0x20
[   36.500292]  do_syscall_64+0x38/0x90
[   36.504219]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Acked-by: Alex Deucher <alexancer.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: block ring buffer access during GPU recovery
Dennis Li [Tue, 1 Sep 2020 01:03:53 +0000 (09:03 +0800)]
drm/amdgpu: block ring buffer access during GPU recovery

When GPU is in reset, its status isn't stable and ring buffer also need
be reset when resuming. Therefore driver should protect GPU recovery
thread from ring buffer accessed by other threads. Otherwise GPU will
randomly hang during recovery.

v2: correct indent

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: handle manual fan readback on SMU11
Alex Deucher [Thu, 27 Aug 2020 04:12:38 +0000 (00:12 -0400)]
drm/amdgpu/swsmu: handle manual fan readback on SMU11

Need to read back from registers for manual mode rather than
using the metrics table.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1164
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: add smu11 helper to get manual fan speed (v2)
Alex Deucher [Tue, 25 Aug 2020 18:35:50 +0000 (14:35 -0400)]
drm/amdgpu/swsmu: add smu11 helper to get manual fan speed (v2)

Will be used to fetch the fan speeds when manual fan mode is
set.

v2: squash in a Coverity fix from Colin Ian King

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: drop set_fan_speed_percent (v2)
Alex Deucher [Thu, 27 Aug 2020 04:04:24 +0000 (00:04 -0400)]
drm/amdgpu/swsmu: drop set_fan_speed_percent (v2)

No longer needed as we can calculate it based on
the fan's max rpm.

v2: minor code rework

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: drop get_fan_speed_percent (v2)
Alex Deucher [Thu, 27 Aug 2020 03:49:37 +0000 (23:49 -0400)]
drm/amdgpu/swsmu: drop get_fan_speed_percent (v2)

No longer needed as we can calculate it based on
the fan's max rpm.

v2: rework code to avoid possible uninitialized
variable use.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: add get_fan_parameters callbacks for smu11 asics
Alex Deucher [Thu, 27 Aug 2020 03:34:57 +0000 (23:34 -0400)]
drm/amdgpu/swsmu: add get_fan_parameters callbacks for smu11 asics

grab the value from the pptable.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: add new callback for getting fan parameters
Alex Deucher [Thu, 27 Aug 2020 03:22:24 +0000 (23:22 -0400)]
drm/amdgpu/swsmu: add new callback for getting fan parameters

To fetch the max rpm from pptable.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: disable gpu-sched load balance for uvd
Nirmoy Das [Sat, 29 Aug 2020 04:51:41 +0000 (06:51 +0200)]
drm/amdgpu: disable gpu-sched load balance for uvd

On hardware with multiple uvd instances, dependent uvd jobs
may get scheduled to different uvd instances. Because uvd_enc
jobs retain hw context, dependent jobs should always run on the
same uvd instance. This patch disables GPU scheduler's load balancer
for a context that binds jobs from the same context to a uvd
instance.

v2: Squash in uvd_enc fix

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agoinclude/uapi/linux: Fix indentation in kfd_smi_event enum
Mukul Joshi [Fri, 28 Aug 2020 23:53:08 +0000 (19:53 -0400)]
include/uapi/linux: Fix indentation in kfd_smi_event enum

Replace spaces with Tabs to fix indentation in kfd_smi_event
enum.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: Add GPU reset SMI event
Mukul Joshi [Fri, 28 Aug 2020 22:50:42 +0000 (18:50 -0400)]
drm/amdkfd: Add GPU reset SMI event

Add support for reporting GPU reset events through SMI. KFD
would report both pre and post GPU reset events.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: fix compiler warnings
Nirmoy Das [Thu, 27 Aug 2020 15:50:36 +0000 (17:50 +0200)]
drm/amdgpu: fix compiler warnings

Fixes below compiler warnings:
 CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_device.o
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
  381 | void static inline amdgpu_mm_wreg_mmio(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags)
      | ^~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_fini’:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3381:6: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]
 3381 |  int r;
      |      ^

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/radeon: Prefer lower feedback dividers
Kai-Heng Feng [Tue, 25 Aug 2020 17:33:48 +0000 (01:33 +0800)]
drm/radeon: Prefer lower feedback dividers

Commit 2e26ccb119bd ("drm/radeon: prefer lower reference dividers")
fixed screen flicker for HP Compaq nx9420 but breaks other laptops like
Asus X50SL.

Turns out we also need to favor lower feedback dividers.

Users confirmed this change fixes the regression and doesn't regress the
original fix.

Fixes: 2e26ccb119bd ("drm/radeon: prefer lower reference dividers")
BugLink: https://bugs.launchpad.net/bugs/1791312
BugLink: https://bugs.launchpad.net/bugs/1861554
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Fix bug in reporting voltage for CIK
Sandeep Raghuraman [Thu, 27 Aug 2020 13:13:37 +0000 (18:43 +0530)]
drm/amdgpu: Fix bug in reporting voltage for CIK

On my R9 390, the voltage was reported as a constant 1000 mV.
This was due to a bug in smu7_hwmgr.c, in the smu7_read_sensor()
function, where some magic constants were used in a condition,
to determine whether the voltage should be read from PLANE2_VID
or PLANE1_VID. The VDDC mask was incorrectly used, instead of
the VDDGFX mask.

This patch changes the code to use the correct defined constants
(and apply the correct bitshift), thus resulting in correct voltage reporting.

Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Specify get_argument function for ci_smu_funcs
Sandeep Raghuraman [Thu, 27 Aug 2020 11:37:33 +0000 (17:07 +0530)]
drm/amdgpu: Specify get_argument function for ci_smu_funcs

Starting in Linux 5.8, the graphics and memory clock frequency were not being
reported for CIK cards. This is a regression, since they were reported correctly
in Linux 5.7.

After investigation, I discovered that the smum_send_msg_to_smc() function,
attempts to call the corresponding get_argument() function of ci_smu_funcs.
However, the get_argument() function is not defined in ci_smu_funcs.

This patch fixes the bug by specifying the correct get_argument() function.

Fixes: a0ec225633d9f6 ("drm/amd/powerplay: unified interfaces for message issuing and response checking")
Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: enable MP0 DPM for sienna_cichlid
Jiansong Chen [Thu, 27 Aug 2020 06:31:20 +0000 (14:31 +0800)]
drm/amd/pm: enable MP0 DPM for sienna_cichlid

Enable MP0 clock DPM for sienna_cichlid.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: simplify hw status clear/set logic
Jiawei [Thu, 27 Aug 2020 02:07:52 +0000 (10:07 +0800)]
drm/amdgpu: simplify hw status clear/set logic

Optimize code to iterate less loops in
amdgpu_device_ip_reinit_early_sriov()

Signed-off-by: Jiawei <Jiawei.Gu@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: suppress static checker warning
Evan Quan [Wed, 26 Aug 2020 03:28:19 +0000 (11:28 +0800)]
drm/amd/pm: suppress static checker warning

Suppress the warning below:
drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/hardwaremanager.c:274 phm_check_smc_update_required_for_display_configuration()
warn: signedness bug returning '(-22)'

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting
Evan Quan [Tue, 25 Aug 2020 05:51:29 +0000 (13:51 +0800)]
drm/amd/pm: avoid false alarm due to confusing softwareshutdowntemp setting

Normally softwareshutdowntemp should be greater than Thotspotlimit.
However, on some VEGA10 ASIC, the softwareshutdowntemp is 91C while
Thotspotlimit is 105C. This seems not right and may trigger some
false alarms.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: fix is_dpm_running() run error on 32bit system
Kevin Wang [Mon, 24 Aug 2020 08:50:12 +0000 (16:50 +0800)]
drm/amd/pm: fix is_dpm_running() run error on 32bit system

v1:
the C type "unsigned long" size is 32bit on 32bit system,
it will cause code logic error, so replace it with "uint64_t".

v2:
remove duplicate cast operation.

Signed-off-by: Kevin <kevin1.wang@amd.com>
Suggest-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: correct SE number for arcturus gfx ras
Guchun Chen [Wed, 26 Aug 2020 07:43:42 +0000 (15:43 +0800)]
drm/amdgpu: correct SE number for arcturus gfx ras

When resetting EDC related register, all CUs needs to be visited,
otherwise, garbage data from EDC register of missed SEs would present.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix memleak in amdgpu_dm_mode_config_init
Dinghao Liu [Wed, 26 Aug 2020 13:24:58 +0000 (21:24 +0800)]
drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init

When amdgpu_display_modeset_create_props() fails, state and
state->context should be freed to prevent memleak. It's the
same when amdgpu_dm_audio_init() fails.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: disable runtime pm for navy_flounder
Jiansong Chen [Wed, 26 Aug 2020 06:11:52 +0000 (14:11 +0800)]
drm/amdgpu: disable runtime pm for navy_flounder

Disable runtime pm for navy_flounder temporarily.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Retry AUX write when fail occurs
Wayne Lin [Tue, 18 Aug 2020 03:19:42 +0000 (11:19 +0800)]
drm/amd/display: Retry AUX write when fail occurs

[Why]
In dm_dp_aux_transfer() now, we forget to handle AUX_WR fail cases. We
suppose every write wil get done successfully and hence some AUX
commands might not sent out indeed.

[How]
Check if AUX_WR success. If not, retry it.

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Fix buffer overflow in INFO ioctl
Alex Deucher [Tue, 25 Aug 2020 15:43:45 +0000 (11:43 -0400)]
drm/amdgpu: Fix buffer overflow in INFO ioctl

The values for "se_num" and "sh_num" come from the user in the ioctl.
They can be in the 0-255 range but if they're more than
AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in
an out of bounds read.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: report DC not supported if virtual display is enabled (v2)
Alex Deucher [Tue, 25 Aug 2020 00:17:30 +0000 (20:17 -0400)]
drm/amdgpu: report DC not supported if virtual display is enabled (v2)

Virtual display is non-atomic so report false to avoid checking
atomic state and other atomic things at runtime.

v2: squash into the sr-iov check

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/powerplay: Fix hardmins not being sent to SMU for RV
Nicholas Kazlauskas [Fri, 14 Aug 2020 15:49:13 +0000 (11:49 -0400)]
drm/amd/powerplay: Fix hardmins not being sent to SMU for RV

[Why]
DC uses these to raise the voltage as needed for higher dispclk/dppclk
and to ensure that we have enough bandwidth to drive the displays.

There's a bug preventing these from actuially sending messages since
it's checking the actual clock (which is 0) instead of the incoming
clock (which shouldn't be 0) when deciding to send the hardmin.

[How]
Check the clocks != 0 instead of the actual clocks.

Fixes: 9ed9203c3ee7 ("drm/amd/powerplay: rv dal-pplib interface refactor powerplay part")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: use MODE1 reset for navy_flounder by default
Jiansong Chen [Tue, 25 Aug 2020 07:39:57 +0000 (15:39 +0800)]
drm/amdgpu: use MODE1 reset for navy_flounder by default

Switch default gpu reset method to MODE1 for navy_flounder.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: fix set kfd node ras properties value
Stanley.Yang [Mon, 17 Aug 2020 07:48:21 +0000 (15:48 +0800)]
drm/amdkfd: fix set kfd node ras properties value

The ctx->features are new RAS implementation which
is only available for Vega20 and onwards, it is not
available for vega10, vega10 should follow legacy
ECC implementation.

Changed from V1:
    wrap function to initialize kfd node properties

Changed from V2:
    remove wrap function and SDMA SRAM ECC check

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: correct the thermal alert temperature limit settings
Evan Quan [Tue, 25 Aug 2020 02:35:11 +0000 (10:35 +0800)]
drm/amd/pm: correct the thermal alert temperature limit settings

Do the maths in celsius degree. This can fix the issues caused
by the changes below:

drm/amd/pm: correct Vega20 swctf limit setting
drm/amd/pm: correct Vega12 swctf limit setting
drm/amd/pm: correct Vega10 swctf limit setting

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add asd fw check before loading asd
Tao Zhou [Tue, 28 Jul 2020 04:44:59 +0000 (12:44 +0800)]
drm/amdgpu: add asd fw check before loading asd

asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: use kmemdup() rather than kmalloc+memcpy
Alex Dewar [Mon, 24 Aug 2020 21:15:25 +0000 (22:15 +0100)]
drm/amd/pm: use kmemdup() rather than kmalloc+memcpy

Issue identified with Coccinelle.

Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add a wrapper for atom asic_init
Alex Deucher [Mon, 24 Aug 2020 16:34:10 +0000 (12:34 -0400)]
drm/amdgpu: add a wrapper for atom asic_init

This allows us to add asic specific workarounds for atom
asic init while keeping the adev specifics out of the
atombios parser code.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add pre_asic_init callback for navi
Alex Deucher [Wed, 19 Aug 2020 21:04:47 +0000 (17:04 -0400)]
drm/amdgpu: add pre_asic_init callback for navi

Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add pre_asic_init callback for SOC15
Alex Deucher [Wed, 19 Aug 2020 20:48:17 +0000 (16:48 -0400)]
drm/amdgpu: add pre_asic_init callback for SOC15

We need to restore some registers prior to running asic
init to work around a firmware bug.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add pre_asic_init callback for VI
Alex Deucher [Wed, 19 Aug 2020 21:04:31 +0000 (17:04 -0400)]
drm/amdgpu: add pre_asic_init callback for VI

Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add pre_asic_init callback for CIK
Alex Deucher [Wed, 19 Aug 2020 21:03:39 +0000 (17:03 -0400)]
drm/amdgpu: add pre_asic_init callback for CIK

Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add pre_asic_init callback for SI
Alex Deucher [Wed, 19 Aug 2020 21:02:41 +0000 (17:02 -0400)]
drm/amdgpu: add pre_asic_init callback for SI

Nothing to do for this family.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: add an asic callback for pre asic init
Alex Deucher [Wed, 19 Aug 2020 17:40:56 +0000 (13:40 -0400)]
drm/amdgpu: add an asic callback for pre asic init

This callback can be used by asics that need to
do something special prior to calling atom asic init.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: fix up DCHUBBUB_SDPIF_MMIO_CNTRL_0 handling
Alex Deucher [Tue, 18 Aug 2020 23:24:03 +0000 (19:24 -0400)]
drm/amdgpu: fix up DCHUBBUB_SDPIF_MMIO_CNTRL_0 handling

Properly define this register using a relative offset rather
than an absolute offset and use the proper SOC15 macros to
access it.  It's also DCN, not DCE, so remove it from the
DCE12 header.

No functional change.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add DPCS regs for dcn3 link encoder
Bhawanpreet Lakha [Fri, 21 Aug 2020 15:57:15 +0000 (11:57 -0400)]
drm/amd/display: Add DPCS regs for dcn3 link encoder

dpcs reg are missing for dcn3 link encoder regs list, so add them.

Also remove
DPCSTX_DEBUG_CONFIG and RDPCSTX_DEBUG_CONFIG as they are unused and
cause compile errors for dcn3

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: call amdgpu_amdkfd_get_hive_id directly
Felix Kuehling [Mon, 24 Aug 2020 14:18:37 +0000 (10:18 -0400)]
drm/amdkfd: call amdgpu_amdkfd_get_hive_id directly

No need to use a function pointer because the implementation is not
ASIC-specific.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: call amdgpu_amdkfd_get_unique_id directly
Felix Kuehling [Mon, 24 Aug 2020 13:55:16 +0000 (09:55 -0400)]
drm/amdkfd: call amdgpu_amdkfd_get_unique_id directly

No need to use a function pointer because the implementation is not
ASIC-specific. This fixes missing support due to a missing function
pointer on Arcturus.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agoamdgpu: fix Documentation builds for pm/ file movement
Randy Dunlap [Sun, 23 Aug 2020 22:35:36 +0000 (15:35 -0700)]
amdgpu: fix Documentation builds for pm/ file movement

Fix Documentation errors for amdgpu.rst due to file rename (moved
to another subdirectory).

Error: Cannot open file ../drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
WARNING: kernel-doc '../scripts/kernel-doc -rst -enable-lineno -function hwmon ../drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c' failed with return code 1

Fixes: e098bc9612c2 ("drm/amd/pm: optimize the power related source code layout")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agogpu: amd: Remove duplicate semicolons at the end of line
Youling Tang [Sat, 22 Aug 2020 08:27:23 +0000 (16:27 +0800)]
gpu: amd: Remove duplicate semicolons at the end of line

Remove duplicate semicolons at the end of line.

Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Keep current gain when ABM disable immediately
Brandon Syu [Wed, 10 Jun 2020 08:44:33 +0000 (16:44 +0800)]
drm/amd/display: Keep current gain when ABM disable immediately

[Why]
When system enters s3/s0i3, backlight PWM would set user level.

[How]
ABM disable function add keep current gain to avoid it.

Signed-off-by: Brandon Syu <Brandon.Syu@amd.com>
Reviewed-by: Josip Pavic <Josip.Pavic@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation
Samson Tam [Thu, 13 Aug 2020 14:50:21 +0000 (10:50 -0400)]
drm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation

[Why]
dongle_type is set during dongle connection but for passive dongles,
dongle_type is not set. If user starts with an active dongle and
then switches to a passive dongle, it will still report as an active
dongle. Trying to emulate the wrong connecter type results in display
not lighting up.

[How]
Set dpcd_caps.dongle_type for passive dongles in detect_dp().

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Joshua Aberback <Joshua.Aberback@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add connector HPD trigger debugfs entry
Eryk Brol [Mon, 10 Aug 2020 18:08:11 +0000 (14:08 -0400)]
drm/amd/display: Add connector HPD trigger debugfs entry

[why]
Need a tool to retrigger a virtual hotplug for testing purposes with
force redetection in both DC and DM.

[how]
Emulate handle_hpd_irq for connector as if usermode would trigger
a hotplug. Perform DC link discovery, DM connector update, and
DM force atomic commit.

In order to trigger HPD on the connector user needs to echo 1 into
"trigger_hotplug" debugfs entry on its respective connector.

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add debugfs for connector's FEC & DSC capabilities
Eryk Brol [Mon, 10 Aug 2020 18:02:55 +0000 (14:02 -0400)]
drm/amd/display: Add debugfs for connector's FEC & DSC capabilities

[why & how]
Useful entry to understand if link has DSC or FEC capabilities,
implemented to read DPCD caps stored on the link. Better than
manually reading the registers with aux dpcd helper.

Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Revert HDCP disable sequence change
Jaehyun Chung [Mon, 10 Aug 2020 20:02:47 +0000 (16:02 -0400)]
drm/amd/display: Revert HDCP disable sequence change

[Why]
Revert HDCP disable sequence change that blanks stream before
disabling HDCP. PSP and HW teams are currently investigating the
root cause of why HDCP cannot be disabled before stream blank,
which is expected to work without issues.

Signed-off-by: Jaehyun Chung <jaehyun.chung@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Send H14b-VSIF specified in HDMI
Wayne Lin [Wed, 15 Jul 2020 08:45:09 +0000 (16:45 +0800)]
drm/amd/display: Send H14b-VSIF specified in HDMI

[Why]
Current function excludes the logic to generate H14b-VSIF. Now it
constructs HF-VSIF only and causes HDMI compliace test fail.

[How]
According to HDMI spec, source devices shall utilize the H14b-VSIF
whenever the signaling capabilities of the H14b-VSIF allow this.

Here keep the logic for HF-VSIF and add H14b-VSIF construction part.

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Call DMUB for eDP power control
Chris Park [Mon, 10 Aug 2020 18:20:16 +0000 (14:20 -0400)]
drm/amd/display: Call DMUB for eDP power control

[Why]
If DMUB is used, LVTMA VBIOS call can be used to control eDP instead
of tranditional transmitter control. Interface is agreed with VBIOS
for eDP to use this new path to program LVTMA registers.

[How]
Expose DAL interface to send DMUB command for LVTMA control that VBIOS
currently uses.

Signed-off-by: Chris Park <Chris.Park@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: 3.2.99
Aric Cyr [Mon, 10 Aug 2020 14:19:04 +0000 (10:19 -0400)]
drm/amd/display: 3.2.99

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Send DISPLAY_OFF after power down on boot
Sung Lee [Tue, 11 Aug 2020 21:23:20 +0000 (17:23 -0400)]
drm/amd/display: Send DISPLAY_OFF after power down on boot

[WHY]
update_clocks might not be called on headless adapters. This means
DISPLAY_OFF may not be sent in headless cases.

[HOW]
If hardware is powered down on boot because it is headless (mode set
does not happen on that adapter) also send DISPLAY_OFF notification.

Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/gfx10: refine mgcg setting
Jiansong Chen [Mon, 24 Aug 2020 10:44:09 +0000 (18:44 +0800)]
drm/amdgpu/gfx10: refine mgcg setting

1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang.
2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: implement the dGPU fallback path for apu (v6)
Huang Rui [Tue, 18 Aug 2020 06:54:23 +0000 (14:54 +0800)]
drm/amdkfd: implement the dGPU fallback path for apu (v6)

We still have a few iommu issues which need to address, so force raven
as "dgpu" path for the moment.

This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
or ACPI CRAT table not correct.

v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2.
v3: Align with existed thunk, don't change the way of raven, only renoir
    will use "dgpu" path by default.
v4: don't update global ignore_crat in the driver, and revise fallback
    function if CRAT is broken.
v5: refine acpi crat good but no iommu support case, and rename the
    title.
v6: fix the issue of dGPU initialized firstly, just modify the report
    value in the node_show().

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: correct Vega20 swctf limit setting
Evan Quan [Fri, 21 Aug 2020 04:21:30 +0000 (12:21 +0800)]
drm/amd/pm: correct Vega20 swctf limit setting

Correct the Vega20 thermal swctf limit.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: correct Vega12 swctf limit setting
Evan Quan [Fri, 21 Aug 2020 04:18:58 +0000 (12:18 +0800)]
drm/amd/pm: correct Vega12 swctf limit setting

Correct the Vega12 thermal swctf limit.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: correct Vega10 swctf limit setting
Evan Quan [Fri, 21 Aug 2020 04:05:03 +0000 (12:05 +0800)]
drm/amd/pm: correct Vega10 swctf limit setting

Correct the Vega10 thermal swctf limit.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1267

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Embed drm_device into amdgpu_device (v3)
Luben Tuikov [Sat, 15 Aug 2020 00:41:55 +0000 (20:41 -0400)]
drm/amdgpu: Embed drm_device into amdgpu_device (v3)

a) Embed struct drm_device into struct amdgpu_device.
b) Modify the inline-f drm_to_adev() accordingly.
c) Modify the inline-f adev_to_drm() accordingly.
d) Eliminate the use of drm_device.dev_private,
   in amdgpu.
e) Switch from using drm_dev_alloc() to
   drm_dev_init().
f) Add a DRM driver release function, which frees
   the container amdgpu_device after all krefs on
   the contained drm_device have been released.

v2: Split out adding adev_to_drm() into its own
    patch (previous commit), making this patch
    more succinct and clear. More detailed commit
    description.
v3: squash in fix to call drmm_add_final_kfree()
    to avoid a warning.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Get DRM dev from adev by inline-f
Luben Tuikov [Mon, 24 Aug 2020 16:29:45 +0000 (12:29 -0400)]
drm/amdgpu: Get DRM dev from adev by inline-f

Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: drm_device to amdgpu_device by inline-f (v2)
Luben Tuikov [Mon, 24 Aug 2020 16:27:47 +0000 (12:27 -0400)]
drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)

Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: enable HDP clock gatting
Prike.Liang [Mon, 1 Jun 2020 06:10:54 +0000 (14:10 +0800)]
drm/amdgpu: enable HDP clock gatting

Enabe HDP SD/DS clock gatting in Renoir series.

Signed-off-by: Prike.Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: enable ATHUB clock gatting
Prike.Liang [Mon, 1 Jun 2020 06:07:13 +0000 (14:07 +0800)]
drm/amdgpu: enable ATHUB clock gatting

Enable ATHUB clock gatting set in Renoir series.

Signed-off-by: Prike.Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: set VCN pg per instances
Jiansong Chen [Fri, 21 Aug 2020 08:20:47 +0000 (16:20 +0800)]
drm/amd/pm: set VCN pg per instances

When deciding whether to set pg for vcn1, instances
number is more generic than chip name.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: annotate a false positive recursive locking
Dennis Li [Thu, 6 Aug 2020 06:48:15 +0000 (14:48 +0800)]
drm/amdgpu: annotate a false positive recursive locking

Re-apply commit 72e14ebf9fc09e33b28b70f00a2ed9821c198633

[  584.110304] ============================================
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
[  584.111164] --------------------------------------------
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
               but task is already holding lock:
[  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
               other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]        CPU0
[  584.114685]        ----
[  584.115014]   lock(&adev->reset_sem);
[  584.115349]   lock(&adev->reset_sem);
[  584.115678]
                *** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.119222]
               stack backtrace:
[  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[  584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  584.121638] Call Trace:
[  584.122050]  dump_stack+0x98/0xd5
[  584.122499]  __lock_acquire+0x1139/0x16e0
[  584.122931]  ? trace_hardirqs_on+0x3b/0xf0
[  584.123358]  ? cancel_delayed_work+0xa6/0xc0
[  584.123771]  lock_acquire+0xb8/0x1c0
[  584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.124599]  down_write+0x49/0x120
[  584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[  584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[  584.126789]  process_one_work+0x29e/0x630
[  584.127208]  worker_thread+0x3c/0x3f0
[  584.127621]  ? __kthread_parkme+0x61/0x90
[  584.128014]  kthread+0x12f/0x150
[  584.128402]  ? process_one_work+0x630/0x630
[  584.128790]  ? kthread_park+0x90/0x90
[  584.129174]  ret_from_fork+0x3a/0x50

Each adev has owned lock_class_key to avoid false positive
recursive locking.

v2:
1. register adev->lock_key into lockdep, otherwise lockdep will
report the below warning

[ 1216.705820] BUG: key ffff890183b647d0 has not been registered!
[ 1216.705924] ------------[ cut here ]------------
[ 1216.705972] DEBUG_LOCKS_WARN_ON(1)
[ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210

v3:
change to use down_write_nest_lock to annotate the false dead-lock
warning.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: refine create and release logic of hive info
Dennis Li [Tue, 18 Aug 2020 10:44:17 +0000 (18:44 +0800)]
drm/amdgpu: refine create and release logic of hive info

Change to dynamically create and release hive info object,
which help driver support more hives in the future.

v2:
Change to save hive object pointer in adev, to avoid locking
xgmi_mutex every time when calling amdgpu_get_xgmi_hive.

v3:
1. Change type of hive object pointer in adev from void* to
amdgpu_hive_info*.
2. remove unnecessary variable initialization.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: refine message print for devices of hive
Dennis Li [Thu, 20 Aug 2020 02:40:53 +0000 (10:40 +0800)]
drm/amdgpu: refine message print for devices of hive

Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device
of a hive is the message for.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: fix the nullptr issue when reenter GPU recovery
Dennis Li [Thu, 20 Aug 2020 02:17:39 +0000 (10:17 +0800)]
drm/amdgpu: fix the nullptr issue when reenter GPU recovery

in single gpu system, if driver reenter gpu recovery,
amdgpu_device_lock_adev will return false, but hive is
nullptr now.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: change reset lock from mutex to rw_semaphore
Dennis Li [Thu, 20 Aug 2020 02:06:32 +0000 (10:06 +0800)]
drm/amdgpu: change reset lock from mutex to rw_semaphore

clients don't need reset-lock for synchronization when no
GPU recovery.

v2:
change to return the return value of down_read_killable.

v3:
if GPU recovery begin, VF ignore FLR notification.

Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: enable run_btc callback for sienna_cichlid
Jiansong Chen [Fri, 21 Aug 2020 03:30:19 +0000 (11:30 +0800)]
drm/amd/pm: enable run_btc callback for sienna_cichlid

DC BTC support for sienna_cichlid is added, it provides
the DC tolerance and aging measurements.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_updat...
Furquan Shaikh [Thu, 20 Aug 2020 07:52:41 +0000 (00:52 -0700)]
drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_update_backlight_caps

In `amdgpu_dm_update_backlight_caps()`, there is a local
`amdgpu_dm_backlight_caps` object that is filled in by
`amdgpu_acpi_get_backlight_caps()`. However, this object is
uninitialized before the call and hence the subsequent check for
aux_support can fail since it is not initialized by
`amdgpu_acpi_get_backlight_caps()` as well. This change initializes
this local `amdgpu_dm_backlight_caps` object to 0.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Furquan Shaikh <furquan@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: Remove unnecessary cast
Alex Dewar [Thu, 20 Aug 2020 17:58:06 +0000 (18:58 +0100)]
drm/amd/pm: Remove unnecessary cast

In init_powerplay_table_information() the value returned from kmalloc()
is cast unnecessarily. Remove cast.

Issue identified with Coccinelle.

Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/powerplay: remove duplicate include
Wang Hai [Wed, 19 Aug 2020 11:34:09 +0000 (19:34 +0800)]
drm/amd/powerplay: remove duplicate include

Remove asic_reg/nbio/nbio_6_1_offset.h which is included more than once

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: remove unintended executable mode
Lukas Bulwahn [Wed, 19 Aug 2020 08:18:08 +0000 (10:18 +0200)]
drm/amd/display: remove unintended executable mode

Besides the intended change, commit 4cc1178e166a ("drm/amdgpu: replace DRM
prefix with PCI device info for gfx/mmhub") also set the source files
mmhub_v1_0.c and gfx_v9_4.c to be executable, i.e., changed fromold mode
644 to new mode 755.

Commit 241b2ec9317e ("drm/amd/display: Add dcn30 Headers (v2)") added the
four header files {dpcs,dcn}_3_0_0_{offset,sh_mask}.h as executable, i.e.,
mode 755.

Set to the usual modes for source and headers files and clean up those
mistakes. No functional change.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: refine codes to avoid reentering GPU recovery
Dennis Li [Wed, 19 Aug 2020 09:23:03 +0000 (17:23 +0800)]
drm/amdgpu: refine codes to avoid reentering GPU recovery

if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Reject overlay plane configurations in multi-display scenarios
Nicholas Kazlauskas [Wed, 19 Aug 2020 17:37:54 +0000 (13:37 -0400)]
drm/amd/display: Reject overlay plane configurations in multi-display scenarios

[Why]
These aren't stable on some platform configurations when driving
multiple displays, especially on higher resolution.

In particular the delay in asserting p-state and validating from
x86 outweights any power or performance benefit from the hardware
composition.

Under some configurations this will manifest itself as extreme stutter
or unresponsiveness especially when combined with cursor movement.

[How]
Disable these for now. Exposing overlays to userspace doesn't guarantee
that they'll be able to use them in any and all configurations and it's
part of the DRM contract to have userspace gracefully handle validation
failures when they occur.

Valdiation occurs as part of DC and this in particular affects RV, so
disable this in dcn10_global_validation.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: sparse: Fix warning in reading SDMA counters
Mukul Joshi [Tue, 18 Aug 2020 18:51:41 +0000 (14:51 -0400)]
drm/amdkfd: sparse: Fix warning in reading SDMA counters

Add __user annotation to fix related sparse warning while reading
SDMA counters from userland.
Also, rework the read SDMA counters function by removing redundant
checks.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: use correct scale for actual_brightness
Alexander Monakov [Tue, 4 Aug 2020 20:13:13 +0000 (23:13 +0300)]
drm/amd/display: use correct scale for actual_brightness

Documentation for sysfs backlight level interface requires that
values in both 'brightness' and 'actual_brightness' files are
interpreted to be in range from 0 to the value given in the
'max_brightness' file.

With amdgpu, max_brightness gives 255, and values written by the user
into 'brightness' are internally rescaled to a wider range. However,
reading from 'actual_brightness' gives the raw register value without
inverse rescaling. This causes issues for various userspace tools such
as PowerTop and systemd that expect the value to be in the correct
range.

Introduce a helper to retrieve internal backlight range. Use it to
reimplement 'convert_brightness' as 'convert_brightness_from_user' and
introduce 'convert_brightness_to_user'.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: should check error using DC_OK
Tong Zhang [Sun, 16 Aug 2020 07:32:12 +0000 (03:32 -0400)]
drm/amd/display: should check error using DC_OK

core_link_read_dpcd returns only DC_OK(1) and DC_ERROR_UNEXPECTED(-1),
the caller should check error using DC_OK instead of checking against 0

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: fix potential integer overflow when shifting 32 bit variable bl_pwm
Colin Ian King [Tue, 18 Aug 2020 12:09:14 +0000 (13:09 +0100)]
drm/amd/display: fix potential integer overflow when shifting 32 bit variable bl_pwm

The 32 bit unsigned integer bl_pwm is being shifted using 32 bit arithmetic
and then being assigned to a 64 bit unsigned integer.  There is a potential
for a 32 bit overflow so cast bl_pwm to enforce a 64 bit shift operation
to avoid this.

Addresses-Coverity: ("unintentional integer overflow")
Fixes: 3ba01817365c ("drm/amd/display: Move panel_cntl specific register from abm to panel_cntl.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/pm: only hide average power on SI and pre-RENOIR APUs
Alex Deucher [Mon, 17 Aug 2020 19:18:01 +0000 (15:18 -0400)]
drm/amdgpu/pm: only hide average power on SI and pre-RENOIR APUs

We can get this on RENOIR and newer via the SMU metrics
table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: implement power metrics for RENOIR
Alex Deucher [Mon, 17 Aug 2020 19:47:31 +0000 (15:47 -0400)]
drm/amdgpu/swsmu: implement power metrics for RENOIR

Grab the data from the SMU metrics table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/swsmu: implement voltage metrics for RENOIR
Alex Deucher [Mon, 17 Aug 2020 19:37:54 +0000 (15:37 -0400)]
drm/amdgpu/swsmu: implement voltage metrics for RENOIR

Grab the data from the SMU metrics table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/pm: remove duplicate check
Alex Deucher [Mon, 17 Aug 2020 19:04:44 +0000 (15:04 -0400)]
drm/amdgpu/pm: remove duplicate check

FAMILY_KV is APUs and we already check for APUs.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/jpeg: remove redundant check when it returns
Leo Liu [Fri, 14 Aug 2020 15:08:59 +0000 (11:08 -0400)]
drm/amdgpu/jpeg: remove redundant check when it returns

Fix warning from kernel test robot
v2: remove the local variable as well

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Limit the error info print rate
jqdeng [Thu, 30 Jul 2020 06:48:40 +0000 (14:48 +0800)]
drm/amdgpu: Limit the error info print rate

Use function printk_ratelimit to limit the print rate.

Signed-off-by: jqdeng <Emily.Deng@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: Fix repeatly flr issue
jqdeng [Fri, 7 Aug 2020 09:31:19 +0000 (17:31 +0800)]
drm/amdgpu: Fix repeatly flr issue

Only for no job running test case need to do recover in
flr notification.
For having job in mirror list, then let guest driver to
hit job timeout, and then do recover.

Signed-off-by: jqdeng <Emily.Deng@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: add SMU11 common deep sleep control interface
Evan Quan [Mon, 17 Aug 2020 08:05:10 +0000 (16:05 +0800)]
drm/amd/pm: add SMU11 common deep sleep control interface

Considering the same logic can be applied to Arcturus, Navi1X
and Sienna Cichlid.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: disable/enable deep sleep features on UMD pstate enter/exit
Evan Quan [Mon, 17 Aug 2020 07:52:42 +0000 (15:52 +0800)]
drm/amd/pm: disable/enable deep sleep features on UMD pstate enter/exit

Add deep sleep disablement/enablement on UMD pstate entering/exiting.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: add SMU11 common gfx ulv control interface
Evan Quan [Mon, 17 Aug 2020 07:08:16 +0000 (15:08 +0800)]
drm/amd/pm: add SMU11 common gfx ulv control interface

Considering the same logic can be applied to Arcturus, Navi1X
and Sienna Cichlid.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: disable/enable gfx ulv on UMD pstate enter/exit
Evan Quan [Mon, 17 Aug 2020 06:52:06 +0000 (14:52 +0800)]
drm/amd/pm: disable/enable gfx ulv on UMD pstate enter/exit

Add gfx ulv disablement/enablement on UMD pstate entering/exiting.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: update driver if version for navy_flounder
Jiansong Chen [Tue, 18 Aug 2020 02:54:06 +0000 (10:54 +0800)]
drm/amd/pm: update driver if version for navy_flounder

It's in accordance with pmfw 65.7.0 for navy_flounder.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agoRevert "drm/amdgpu: disable gfxoff for navy_flounder"
Jiansong Chen [Mon, 17 Aug 2020 14:38:38 +0000 (22:38 +0800)]
Revert "drm/amdgpu: disable gfxoff for navy_flounder"

This reverts commit ba4e049e63b607ac2e0c070b1406826390d5047e.
Newly released sdma fw (51.52) provides a fix for the issue.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/scheduler: Remove priority macro INVALID (v2)
Luben Tuikov [Wed, 12 Aug 2020 00:56:58 +0000 (20:56 -0400)]
drm/scheduler: Remove priority macro INVALID (v2)

Remove DRM_SCHED_PRIORITY_INVALID. We no longer
carry around an invalid priority and cut it off
at the source.

Backwards compatibility behaviour of AMDGPU CTX
IOCTL passing in garbage for context priority
from user space and then mapping that to
DRM_SCHED_PRIORITY_NORMAL is preserved.

v2: Revert "res"  --> "r" and
           "prio" --> "priority".

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/scheduler: Scheduler priority fixes (v2)
Luben Tuikov [Tue, 11 Aug 2020 23:59:58 +0000 (19:59 -0400)]
drm/scheduler: Scheduler priority fixes (v2)

Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add dsc_to_stream_resource for dcn3
Bhawanpreet Lakha [Mon, 17 Aug 2020 19:26:41 +0000 (15:26 -0400)]
drm/amd/display: Add dsc_to_stream_resource for dcn3

Without this, enabling dsc will cause a nullptr

Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdkfd: Initialize SDMA activity counter to 0
Mukul Joshi [Mon, 17 Aug 2020 17:46:56 +0000 (13:46 -0400)]
drm/amdkfd: Initialize SDMA activity counter to 0

To prevent reporting erroneous SDMA usage, initialize SDMA
activity counter to 0 before using.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Add DSC_DBG_EN shift/mask for dcn3
Bhawanpreet Lakha [Thu, 13 Aug 2020 20:51:09 +0000 (16:51 -0400)]
drm/amd/display: Add DSC_DBG_EN shift/mask for dcn3

This field is not defined for DCN3

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: [FW Promotion] Release 0.0.29
Anthony Koo [Sat, 8 Aug 2020 01:06:11 +0000 (21:06 -0400)]
drm/amd/display: [FW Promotion] Release 0.0.29

[Header Changes]
 - Add command for panel power seq control

Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>