linux-2.6-microblaze.git
4 years agodrm/amd/powerplay: retrieve the enabled feature mask from cache
Evan Quan [Mon, 30 Dec 2019 09:08:29 +0000 (17:08 +0800)]
drm/amd/powerplay: retrieve the enabled feature mask from cache

This is why those feature mask members designed for. And this
can reduce the SMU workload.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: avoid deadlock on Vega20 swSMU routine
Evan Quan [Mon, 30 Dec 2019 03:27:32 +0000 (11:27 +0800)]
drm/amd/powerplay: avoid deadlock on Vega20 swSMU routine

The lock required was already hold by its parent API.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: update UMC 6.1 RAS error counter register access path
John Clements [Thu, 2 Jan 2020 03:32:15 +0000 (11:32 +0800)]
drm/amdgpu: update UMC 6.1 RAS error counter register access path

use proper method for SMN register access

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add helper function smu_get_dpm_level_range() for smu driver
Kevin Wang [Thu, 26 Dec 2019 06:41:22 +0000 (14:41 +0800)]
drm/amdgpu/smu: add helper function smu_get_dpm_level_range() for smu driver

this function can help smu driver to query dpm level clock range from
smu firmware.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: remove three set but not used variable
yu kuai [Thu, 26 Dec 2019 12:07:50 +0000 (20:07 +0800)]
drm/radeon: remove three set but not used variable

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/radeon/radeon_atombios.c: In function
‘radeon_get_atom_connector_info_from_object_table’:
drivers/gpu/drm/radeon/radeon_atombios.c:651:26: warning: variable
‘grph_obj_num’ set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/radeon/radeon_atombios.c:651:13: warning: variable
‘grph_obj_id’ set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/radeon/radeon_atombios.c:573:37: warning: variable
‘con_obj_type’ set but not used [-Wunused-but-set-variable]

They are never used, and so can be removed.

Signed-off-by: yu kuai <yukuai3@huawei.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/powerplay: fix NULL pointer issue when SMU disabled
Likun Gao [Wed, 25 Dec 2019 09:42:35 +0000 (17:42 +0800)]
drm/amdgpu/powerplay: fix NULL pointer issue when SMU disabled

Fix smu related NULL pointer issue which occurs when SMU is disabled.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: use unified variable smu->is_apu to check apu asic platform
Kevin Wang [Mon, 23 Dec 2019 10:17:36 +0000 (18:17 +0800)]
drm/amdgpu/smu: use unified variable smu->is_apu to check apu asic platform

use unified variable smu->is_apu to check apu asic in smu driver.

related patch:
drm/amd/powerplay: bypass dpm_context null pointer check guard for some
smu series

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: amalgamated PSP TA invoke functions
John Clements [Thu, 26 Dec 2019 03:27:46 +0000 (11:27 +0800)]
drm/amdgpu: amalgamated PSP TA invoke functions

reduce duplicate code

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: amalgamate PSP TA load/unload functions
John Clements [Thu, 26 Dec 2019 03:19:36 +0000 (11:19 +0800)]
drm/amdgpu: amalgamate PSP TA load/unload functions

reduce duplicate code

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: by default output PSP ret status in event of cmd failure
John Clements [Thu, 26 Dec 2019 03:13:53 +0000 (11:13 +0800)]
drm/amdgpu: by default output PSP ret status in event of cmd failure

update log level from DRM_DEBUG_DRIVER to DRM_WARN

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: correct RLC firmwares loading sequence
Evan Quan [Mon, 23 Dec 2019 08:13:48 +0000 (16:13 +0800)]
drm/amdgpu: correct RLC firmwares loading sequence

Per confirmation with RLC firmware team, the RLC should
be unhalted after all RLC related firmwares uploaded.
However, in fact the RLC is unhalted immediately after
RLCG firmware uploaded. And that may causes unexpected
PSP hang on loading the succeeding RLC save restore
list related firmwares.
So, we correct the firmware loading sequence to load
RLC save restore list related firmwares before RLCG
ucode. That will help to get around this issue.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: add check for baco support on Arcturus
Evan Quan [Tue, 24 Dec 2019 09:22:18 +0000 (17:22 +0800)]
drm/amd/powerplay: add check for baco support on Arcturus

This is used to determine whether runtime pm can be
supported or not.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in display_rq_dlg_calc_21.c
zhengbin [Tue, 24 Dec 2019 03:27:43 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in display_rq_dlg_calc_21.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:85:6-13: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:88:2-9: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:225:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:226:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:251:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:252:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:256:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:257:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:267:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:269:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:682:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c:1013:1-9: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in display_rq_dlg_calc_20v2.c
zhengbin [Tue, 24 Dec 2019 03:27:42 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in display_rq_dlg_calc_20v2.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:110:6-13: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:113:2-9: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:243:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:244:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:267:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:268:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:272:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:273:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:283:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:285:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:673:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c:962:1-9: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in display_rq_dlg_calc_20.c
zhengbin [Tue, 24 Dec 2019 03:27:41 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in display_rq_dlg_calc_20.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:110:6-13: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:113:2-9: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:243:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:244:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:267:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:268:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:272:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:273:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:283:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:285:3-11: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:673:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c:961:1-9: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in dce_calcs.c
zhengbin [Tue, 24 Dec 2019 03:27:40 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in dce_calcs.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:157:46-64: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:159:2-20: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:161:46-64: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:163:2-20: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:289:1-12: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:290:1-12: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:341:3-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c:343:4-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in display_mode_vba_21.c
zhengbin [Tue, 24 Dec 2019 03:27:39 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in display_mode_vba_21.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c:4124:3-28: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c:4128:5-30: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c:5207:3-37: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in dcn20_hwseq.c
zhengbin [Tue, 24 Dec 2019 03:27:38 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in dcn20_hwseq.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c:186:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c:189:2-10: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in dcn10_hw_sequencer.c
zhengbin [Tue, 24 Dec 2019 03:27:37 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in dcn10_hw_sequencer.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c:482:6-14: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c:485:2-10: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: use true, false for bool variable in dc_link_ddc.c
zhengbin [Tue, 24 Dec 2019 03:27:36 +0000 (11:27 +0800)]
drm/amd/display: use true, false for bool variable in dc_link_ddc.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c:593:6-9: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: use true, false for bool variable in vega20_hwmgr.c
zhengbin [Tue, 24 Dec 2019 02:08:14 +0000 (10:08 +0800)]
drm/amd/powerplay: use true, false for bool variable in vega20_hwmgr.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c:875:1-31: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: make the set_performance_level logic easier to follow
Alex Deucher [Fri, 20 Dec 2019 21:34:42 +0000 (16:34 -0500)]
drm/amdgpu/smu: make the set_performance_level logic easier to follow

Have every asic provide a callback for this rather than a mix
of generic and asic specific code.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoRevert "drm/amdgpu: simplify ATPX detection"
Alex Deucher [Fri, 20 Dec 2019 23:57:16 +0000 (18:57 -0500)]
Revert "drm/amdgpu: simplify ATPX detection"

This reverts commit f5fda6d89afe6e9cedaa1c3303903c905262f6e8.

You can't use BASE_CLASS in pci_get_class.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/995
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: simplify function return logic
Guchun Chen [Tue, 24 Dec 2019 06:28:30 +0000 (14:28 +0800)]
drm/amdgpu: simplify function return logic

Former return logic is redundant.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: support custom power profile setting
Evan Quan [Tue, 17 Dec 2019 07:25:56 +0000 (15:25 +0800)]
drm/amd/powerplay: support custom power profile setting

Support custom power profile mode settings on Arcturus.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: fix kernel_fpu_begin/_end() warnings
Xiaojie Yuan [Mon, 23 Dec 2019 03:01:10 +0000 (11:01 +0800)]
drm/amd/display: fix kernel_fpu_begin/_end() warnings

kernel_fpu_begin/_end() are already called inside dcn20_resource_construct,
and calling kernel_fpu_begin/_end() recursively triggers WARN_ON() when
CONFIG_X86_DEBUG_FPU is enabled.

[  107.060434] WARNING: CPU: 6 PID: 1370 at arch/x86/kernel/fpu/core.c:90 kernel_fpu_begin+0xbd/0xe0
<snip>
[  107.268197] Call Trace:
[  107.270751]  dcn20_patch_bounding_box+0x17/0x100 [amdgpu]
[  107.276204]  init_soc_bounding_box+0x1b3/0x5f0 [amdgpu]
[  107.281536]  ? _cond_resched+0x19/0x30
[  107.285307]  dcn20_resource_construct+0x3a9/0xa90 [amdgpu]
[  107.290957]  ? dcn20_resource_construct+0x3a9/0xa90 [amdgpu]
[  107.296621]  ? __alloc_pages_nodemask+0x16a/0x330
[  107.301476]  ? _cond_resched+0x19/0x30
[  107.305284]  ? kmem_cache_alloc_trace+0x197/0x230
[  107.310063]  ? _cond_resched+0x19/0x30
[  107.313783]  ? kmem_cache_alloc_trace+0x197/0x230
[  107.318691]  dcn20_create_resource_pool+0x42/0x70 [amdgpu]
[  107.324315]  dc_create_resource_pool+0x12d/0x170 [amdgpu]
[  107.329851]  dc_create+0x1b8/0x6a0 [amdgpu]
[  107.334013]  ? kmem_cache_alloc_trace+0x1e2/0x230
[  107.338832]  amdgpu_dm_init+0x13e/0x1c0 [amdgpu]
<snip>

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Avoid hanging hardware in stop_cpsch
Felix Kuehling [Fri, 20 Dec 2019 07:46:55 +0000 (02:46 -0500)]
drm/amdkfd: Avoid hanging hardware in stop_cpsch

Don't use the HWS if it's known to be hanging. In a reset also
don't try to destroy the HIQ because that may hang on SRIOV if the
KIQ is unresponsive.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Improve HWS hang detection and handling
Felix Kuehling [Fri, 20 Dec 2019 07:07:54 +0000 (02:07 -0500)]
drm/amdkfd: Improve HWS hang detection and handling

Move HWS hang detection into unmap_queues_cpsch to catch hangs in all
cases. If this happens during a reset, don't schedule another reset
because the reset already in progress is expected to take care of it.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Remove unused variable
Felix Kuehling [Fri, 20 Dec 2019 07:01:24 +0000 (02:01 -0500)]
drm/amdkfd: Remove unused variable

dqm->pipeline_mem wasn't used anywhere.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: Fix permissions of hang_hws
Felix Kuehling [Fri, 20 Dec 2019 03:36:55 +0000 (22:36 -0500)]
drm/amdkfd: Fix permissions of hang_hws

Reading from /sys/kernel/debug/kfd/hang_hws would cause a kernel
oops because we didn't implement a read callback. Set the permission
to write-only to prevent that.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu
Chunming Zhou [Tue, 28 May 2019 02:46:04 +0000 (10:46 +0800)]
drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu

Can expose it now that the khronos has exposed the
vlk extension.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use true, false for bool variable in amdgpu_psp.c
zhengbin [Mon, 23 Dec 2019 13:46:21 +0000 (21:46 +0800)]
drm/amdgpu: use true, false for bool variable in amdgpu_psp.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:674:2-26: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:794:1-25: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:897:2-36: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1016:1-35: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1087:2-34: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:1177:1-33: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use true, false for bool variable in amdgpu_debugfs.c
zhengbin [Mon, 23 Dec 2019 13:46:20 +0000 (21:46 +0800)]
drm/amdgpu: use true, false for bool variable in amdgpu_debugfs.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:132:2-10: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:140:2-10: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:142:13-21: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use true, false for bool variable in amdgpu_device.c
zhengbin [Mon, 23 Dec 2019 13:46:19 +0000 (21:46 +0800)]
drm/amdgpu: use true, false for bool variable in amdgpu_device.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3961:1-19: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3981:1-19: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use true, false for bool variable in mxgpu_nv.c
zhengbin [Mon, 23 Dec 2019 13:46:18 +0000 (21:46 +0800)]
drm/amdgpu: use true, false for bool variable in mxgpu_nv.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:255:2-20: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:267:2-20: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: use true, false for bool variable in mxgpu_ai.c
zhengbin [Mon, 23 Dec 2019 13:46:17 +0000 (21:46 +0800)]
drm/amdgpu: use true, false for bool variable in mxgpu_ai.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:253:2-20: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:265:2-20: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true,false for bool variable in ni.c
zhengbin [Mon, 23 Dec 2019 09:25:52 +0000 (17:25 +0800)]
drm/radeon: use true,false for bool variable in ni.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/ni.c:2020:2-15: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/radeon/ni.c:2088:2-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true,false for bool variable in cik.c
zhengbin [Mon, 23 Dec 2019 09:25:51 +0000 (17:25 +0800)]
drm/radeon: use true,false for bool variable in cik.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/cik.c:8140:2-15: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/radeon/cik.c:8212:2-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true,false for bool variable in rv770.c
zhengbin [Mon, 23 Dec 2019 09:25:50 +0000 (17:25 +0800)]
drm/radeon: use true,false for bool variable in rv770.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/rv770.c:1706:2-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true, false for bool variable in evergreen.c
zhengbin [Mon, 23 Dec 2019 09:25:49 +0000 (17:25 +0800)]
drm/radeon: use true, false for bool variable in evergreen.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/evergreen.c:4948:2-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true,false for bool variable in r600.c
zhengbin [Mon, 23 Dec 2019 09:25:48 +0000 (17:25 +0800)]
drm/radeon: use true,false for bool variable in r600.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/r600.c:3056:2-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true,false for bool variable in si.c
zhengbin [Mon, 23 Dec 2019 09:25:47 +0000 (17:25 +0800)]
drm/radeon: use true,false for bool variable in si.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/si.c:6475:2-15: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/radeon/si.c:6542:2-15: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/radeon: use true,false for bool variable in r100.c
zhengbin [Mon, 23 Dec 2019 09:25:46 +0000 (17:25 +0800)]
drm/radeon: use true,false for bool variable in r100.c

Fixes coccicheck warning:

drivers/gpu/drm/radeon/r100.c:1826:3-31: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/radeon/r100.c:1828:3-31: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/radeon/r100.c:2390:2-22: WARNING: Assignment of 0/1 to bool variable
drivers/gpu/drm/radeon/r100.c:2395:2-22: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add peak profile support for navi12
Alex Deucher [Fri, 20 Dec 2019 20:03:03 +0000 (15:03 -0500)]
drm/amdgpu/smu: add peak profile support for navi12

Add defined peak sclk for navi12 peak profile mode.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu/navi: Adjust default behavior for peak sclk profile
Alex Deucher [Fri, 20 Dec 2019 19:53:35 +0000 (14:53 -0500)]
drm/amdgpu/smu/navi: Adjust default behavior for peak sclk profile

Fetch the sclk from the pptable if there is no specified sclk for
the board.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add missed return value set for error case
Guchun Chen [Mon, 23 Dec 2019 03:40:13 +0000 (11:40 +0800)]
drm/amdgpu: add missed return value set for error case

Return value should be set when going to error handle tag
for error case, this can avoid potential invalid array
access by upper caller.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: remove FB location config for sriov
Frank.Min [Thu, 19 Dec 2019 09:29:54 +0000 (17:29 +0800)]
drm/amdgpu: remove FB location config for sriov

FB location is already programmed by HV driver
for arcutus so remove this part

Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: enable xgmi init for sriov use case
Frank.Min [Wed, 18 Dec 2019 11:01:43 +0000 (19:01 +0800)]
drm/amdgpu: enable xgmi init for sriov use case

1. enable xgmi ta initialization for sriov
2. enable xgmi initialization for sriov

Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: remove memory training p2c buffer reservation(V2)
Tianci.Yin [Tue, 17 Dec 2019 06:34:45 +0000 (14:34 +0800)]
drm/amdgpu: remove memory training p2c buffer reservation(V2)

IP discovery TMR(occupied the top VRAM with size DISCOVERY_TMR_SIZE)
has been reserved, and the p2c buffer is in the range of this TMR, so
the p2c buffer reservation is unnecessary.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: update the method to get fb_loc of memory training(V4)
Tianci.Yin [Mon, 16 Dec 2019 07:17:01 +0000 (15:17 +0800)]
drm/amdgpu: update the method to get fb_loc of memory training(V4)

The method of getting fb_loc changed from parsing VBIOS to
taking certain offset from top of VRAM

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Remove unneeded variable 'ret' in navi10_ih.c
Ma Feng [Fri, 20 Dec 2019 09:36:08 +0000 (17:36 +0800)]
drm/amdgpu: Remove unneeded variable 'ret' in navi10_ih.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/navi10_ih.c:113:5-8: Unneeded variable: "ret". Return "0" on line 182

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Remove unneeded variable 'ret' in amdgpu_device.c
Ma Feng [Mon, 23 Dec 2019 19:58:27 +0000 (14:58 -0500)]
drm/amdgpu: Remove unneeded variable 'ret' in amdgpu_device.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1036:5-8: Unneeded variable: "ret". Return "0" on line 1079

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx: Add mmSDMA2-7_EDC_COUNTER to support Arcturus
James Zhu [Mon, 16 Dec 2019 20:49:11 +0000 (15:49 -0500)]
drm/amdgpu/gfx: Add mmSDMA2-7_EDC_COUNTER to support Arcturus

Add mmSDMA2-7_EDC_COUNTER to support Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus
James Zhu [Mon, 16 Dec 2019 20:46:27 +0000 (15:46 -0500)]
drm/amdgpu/gfx: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus

Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/gfx: Replace ARRAY_SIZE with size variable
James Zhu [Mon, 16 Dec 2019 20:42:43 +0000 (15:42 -0500)]
drm/amdgpu/gfx: Replace ARRAY_SIZE with size variable

Replace ARRAY_SIZE with size variables to support
different ASICs.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus
James Zhu [Mon, 16 Dec 2019 20:31:07 +0000 (15:31 -0500)]
drm/amdgpu: Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7 to support Arcturus

Arcturus has 8 SEs. Add mmCOMPUTE_STATIC_THREAD_MGMT_SE4-7
for EDC GPR _workarounds,

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Added ASIC specific check in gmc v9.0 ECC interrupt programming sequence
John Clements [Fri, 20 Dec 2019 08:21:32 +0000 (16:21 +0800)]
drm/amdgpu: Added ASIC specific check in gmc v9.0 ECC interrupt programming sequence

Devices newer then VEGA10/12 shall have these programming sequences performed by PSP BL

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: enlarge agp_start address into 48bit
Frank.Min [Wed, 18 Dec 2019 10:37:11 +0000 (18:37 +0800)]
drm/amdgpu: enlarge agp_start address into 48bit

max range of the agp aperture is 48 bits, so
enlarge agp_start address into 48bit with all bits set

Signed-off-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: disable VCN2.5 ib test for Arcturus sriov
Jane Jian [Wed, 18 Dec 2019 10:53:46 +0000 (18:53 +0800)]
drm/amdgpu: disable VCN2.5 ib test for Arcturus sriov

currently using TMR loading VCN fw MMSCH would fail
to init after FLR, just disable ib test for temporarily
daily testing, continuing debug with mm team.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix ctx init failure for asics without gfx ring
Le Ma [Thu, 19 Dec 2019 11:26:02 +0000 (19:26 +0800)]
drm/amdgpu: fix ctx init failure for asics without gfx ring

This workaround does not affect other asics because amdgpu only need expose
one gfx sched to user for now.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: attempt xgmi perfmon re-arm on failed arm
Jonathan Kim [Mon, 16 Dec 2019 17:31:57 +0000 (12:31 -0500)]
drm/amdgpu: attempt xgmi perfmon re-arm on failed arm

The DF routines to arm xGMI performance will attempt to re-arm both on
performance monitoring start and read on initial failure to arm.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add perfmons accessible during df c-states
Jonathan Kim [Thu, 12 Dec 2019 16:46:05 +0000 (11:46 -0500)]
drm/amdgpu: add perfmons accessible during df c-states

During DF C-State, Perfmon counters outside of range 1D700-1D7FF will
encounter SLVERR affecting xGMI performance monitoring.  PerfmonCtr[7:4]
is being added to avoid SLVERR during read since it falls within this
range.  PerfmonCtl[7:4] is being added in order to arm PerfmonCtr[7:4].
Since PerfmonCtl[7:4] exists outside of range 1D700-1D7FF, DF routines
will be enabled to opportunistically re-arm PerfmonCtl[7:4] on retry
after SLVERR.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: simplify padding calculations (v2)
Luben Tuikov [Thu, 24 Oct 2019 23:30:13 +0000 (19:30 -0400)]
drm/amdgpu: simplify padding calculations (v2)

Simplify padding calculations.

v2: Comment update and spacing.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: expose num_cp_queues data field to topology node (v2)
Huang Rui [Mon, 16 Dec 2019 07:02:51 +0000 (15:02 +0800)]
drm/amdkfd: expose num_cp_queues data field to topology node (v2)

Thunk driver would like to know the num_cp_queues data, however this data relied
on different asic specific. So it's better to get it from kfd driver.

v2: don't update name size.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdkfd: expose num_sdma_queues_per_engine data field to topology node (v2)
Huang Rui [Mon, 16 Dec 2019 07:02:50 +0000 (15:02 +0800)]
drm/amdkfd: expose num_sdma_queues_per_engine data field to topology node (v2)

Thunk driver would like to know the num_sdma_queues_per_engine data, however
this data relied on different asic specific. So it's better to get it from kfd
driver.

v2: don't update the name size.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: skip disable dynamic state management
Yintian Tao [Wed, 18 Dec 2019 10:11:57 +0000 (18:11 +0800)]
drm/amd/powerplay: skip disable dynamic state management

Under sriov, the disable operation is no allowed.

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: enable VCN0 and VCN1 sriov instances support for Arcturus
Jane Jian [Mon, 16 Dec 2019 09:04:01 +0000 (17:04 +0800)]
drm/amdgpu: enable VCN0 and VCN1 sriov instances support for Arcturus

v1: compared to bare-metal: sriov support psp loading VCN firmware; only one
encoding ring would be used in each instance.
v2: keep unchange for bare-metal VCN2.5 hw_init, just add a flag with sriov
and also remove multiple lines.
v3: squash in warning fix

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: skip VCN2.5 power gating and clock gating for sriov Arcturus
Jane Jian [Mon, 16 Dec 2019 08:24:13 +0000 (16:24 +0800)]
drm/amdgpu: skip VCN2.5 power gating and clock gating for sriov Arcturus

v1: skip gating in serveral called functions by power gating and clock gating
v2: from suggestion, skip setting gate in both set function, which is where
it being called.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: update VCN1(dual instances) fw types ID and VCN ip block type
Jane Jian [Mon, 16 Dec 2019 06:56:35 +0000 (14:56 +0800)]
drm/amdgpu: update VCN1(dual instances) fw types ID and VCN ip block type

Previously there is no VCN1 type ID in psp gfx interface. Also add VCN ip
block type unless the reinit after FLR for sriov would fail.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add VCN2.5 sriov start for Arctrus
Jane Jian [Mon, 16 Dec 2019 06:23:37 +0000 (14:23 +0800)]
drm/amdgpu: add VCN2.5 sriov start for Arctrus

Use MMSCH V1 to finish Memory Controller
programming as well as start MMSCH to do
VCN2.5 initialization.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: add VCN2.5 MMSCH start for Arcturus
Jane Jian [Mon, 16 Dec 2019 06:14:49 +0000 (14:14 +0800)]
drm/amdgpu: add VCN2.5 MMSCH start for Arcturus

Use MMSCH to do the initialization since MMSCH
manages VCN2.5 instances and its world switch.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: move umc offset to one new header file for Arcturus
Guchun Chen [Tue, 17 Dec 2019 09:01:28 +0000 (17:01 +0800)]
drm/amdgpu: move umc offset to one new header file for Arcturus

Code refactor and no functional change.

Fixes: 4cf781c24c3b ("drm/amdgpu: Added RAS UMC error query support for Arcturus")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/display: include delay.h
Alex Deucher [Tue, 17 Dec 2019 20:39:04 +0000 (15:39 -0500)]
drm/amdgpu/display: include delay.h

For udelay.  This is needed for some platforms.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazluaskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add metrics table lock for vega20 (v2)
Alex Deucher [Tue, 17 Dec 2019 14:51:40 +0000 (09:51 -0500)]
drm/amdgpu/smu: add metrics table lock for vega20 (v2)

To protect access to the metrics table.

v2: unlock on error

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add metrics table lock for renoir (v2)
Alex Deucher [Tue, 17 Dec 2019 14:51:13 +0000 (09:51 -0500)]
drm/amdgpu/smu: add metrics table lock for renoir (v2)

To protect access to the metrics table.

v2: unlock on error

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add metrics table lock for navi (v2)
Alex Deucher [Tue, 17 Dec 2019 14:50:42 +0000 (09:50 -0500)]
drm/amdgpu/smu: add metrics table lock for navi (v2)

To protect access to the metrics table.

v2: unlock on error

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add metrics table lock for arcturus (v2)
Alex Deucher [Tue, 17 Dec 2019 14:49:52 +0000 (09:49 -0500)]
drm/amdgpu/smu: add metrics table lock for arcturus (v2)

To protect access to the metrics table.

v2: unlock on error

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: add metrics table lock
Alex Deucher [Tue, 17 Dec 2019 14:35:01 +0000 (09:35 -0500)]
drm/amdgpu/smu: add metrics table lock

This table is used for lots of things, add it's own lock.

Bug: https://gitlab.freedesktop.org/drm/amd/issues/900
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agogpu: drm: dead code elimination
Pan Zhang [Wed, 18 Dec 2019 03:50:08 +0000 (11:50 +0800)]
gpu: drm: dead code elimination

this set adds support for removal of gpu drm dead code.

patch3 is similar with patch 1:
`num` is a data of u8 type and ATOM_MAX_HW_I2C_READ == 255,

so there is a impossible condition '(num > 255) => (0-255 > 255)'.

Signed-off-by: Pan Zhang <zhangpan26@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: wait for all rings to drain before runtime suspending
Alex Deucher [Tue, 10 Dec 2019 21:21:44 +0000 (16:21 -0500)]
drm/amdgpu: wait for all rings to drain before runtime suspending

Add a safety check to runtime suspend to make sure all outstanding
fences have signaled before we suspend.  Doesn't fix any known issue.

We already do this via the fence driver suspend function, but we
just force completion rather than bailing.  This bails on runtime
suspend so we can try again later once the fences are signaled to
avoid missing any outstanding work.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/smu: fix spelling
Alex Deucher [Mon, 16 Dec 2019 20:05:22 +0000 (15:05 -0500)]
drm/amdgpu/smu: fix spelling

s/dispaly/display/g

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Switch from system_highpri_wq to system_unbound_wq
Andrey Grodzovsky [Wed, 11 Dec 2019 19:25:36 +0000 (14:25 -0500)]
drm/amdgpu: Switch from system_highpri_wq to system_unbound_wq

This is to avoid queueing jobs to same CPU during XGMI hive reset
because there is a strict timeline for when the reset commands
must reach all the GPUs in the hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Redo XGMI reset synchronization.
Andrey Grodzovsky [Wed, 11 Dec 2019 19:18:31 +0000 (14:18 -0500)]
drm/amdgpu: Redo XGMI reset synchronization.

Use task barrier in XGMI hive to synchronize ASIC resets
across devices in XGMI hive.

v2: Return right away with a warning if no xgmi hive, update doc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Add task barrier to XGMI hive.
Andrey Grodzovsky [Fri, 6 Dec 2019 17:43:30 +0000 (12:43 -0500)]
drm/amdgpu: Add task barrier to XGMI hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm: Add Reusable task barrier.
Andrey Grodzovsky [Fri, 6 Dec 2019 17:26:33 +0000 (12:26 -0500)]
drm: Add Reusable task barrier.

It is used to synchronize N threads at a rendevouz point before execution
of critical code that has to be started by all the threads at approximatly
the same time.

v2: Remove mention of reset use case, improve doc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: reverts commit ce316fa55ef0f1751276b846a54fb3b835bd5e64.
Andrey Grodzovsky [Fri, 6 Dec 2019 18:19:15 +0000 (13:19 -0500)]
drm/amdgpu: reverts commit ce316fa55ef0f1751276b846a54fb3b835bd5e64.

In preparation for doing XGMI reset synchronization using task barrier.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/vcn: remove unnecessary included headers
Leo Liu [Mon, 16 Dec 2019 16:01:51 +0000 (11:01 -0500)]
drm/amdgpu/vcn: remove unnecessary included headers

Esp. VCN1.0 headers should not be here

v2: add back the <linux/module.h> to keep consistent.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix KIQ ring test fail in TDR of SRIOV
Monk Liu [Tue, 17 Dec 2019 10:16:44 +0000 (18:16 +0800)]
drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

issues:
MEC is ruined by the amdkfd_pre_reset after VF FLR done

fix:
amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR,
the correct sequence is do amdkfd_pre_reset before VF FLR but there is
a limitation to block this sequence:
if we do pre_reset() before VF FLR, it would go KIQ way to do register
access and stuck there, because KIQ probably won't work by that time
(e.g. you already made GFX hang)

so the best way right now is to simply remove it.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: fix double gpu_recovery for NV of SRIOV
Monk Liu [Tue, 17 Dec 2019 10:18:31 +0000 (18:18 +0800)]
drm/amdgpu: fix double gpu_recovery for NV of SRIOV

issues:
gpu_recover() is re-entered by the mailbox interrupt
handler mxgpu_nv.c

fix:
we need to bypass the gpu_recover() invoke in mailbox
interrupt as long as the timeout is not infinite (thus the TDR
will be triggered automatically after time out, no need to invoke
gpu_recover() through mailbox interrupt.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: skip soc clk setting under pp one vf
Yintian Tao [Tue, 17 Dec 2019 03:43:40 +0000 (11:43 +0800)]
drm/amd/powerplay: skip soc clk setting under pp one vf

Under sriov pp one vf mode, there is no need to set
soc clk under pp one vf because smu firmware will depend
on the mclk to set the appropriate soc clk for it.

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by : Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/scheduler: do not keep a copy of sched list
Nirmoy Das [Mon, 9 Dec 2019 21:52:25 +0000 (22:52 +0100)]
drm/scheduler: do not keep a copy of sched list

entity should not keep copy and maintain sched list for
itself.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agoamd/amdgpu: add sched array to IPs with multiple run-queues
Nirmoy Das [Mon, 16 Dec 2019 13:43:34 +0000 (14:43 +0100)]
amd/amdgpu: add sched array to IPs with multiple run-queues

This sched array can be passed on to entity creation routine
instead of manually creating such sched array on every context creation.

v2: squash in missing break fix

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds list
Nirmoy Das [Fri, 6 Dec 2019 15:55:49 +0000 (16:55 +0100)]
drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds list

drm_sched_entity_init() takes drm gpu scheduler list instead of
drm_sched_rq list. This makes conversion of drm_sched_rq list
to drm gpu scheduler list unnecessary

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/scheduler: rework entity creation
Nirmoy Das [Thu, 5 Dec 2019 10:38:00 +0000 (11:38 +0100)]
drm/scheduler: rework entity creation

Entity currently keeps a copy of run_queue list and modify it in
drm_sched_entity_set_priority(). Entities shouldn't modify run_queue
list. Use drm_gpu_scheduler list instead of drm_sched_rq list
in drm_sched_entity struct. In this way we can select a runqueue based
on entity/ctx's priority for a  drm scheduler.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu/pm_runtime: update usage count in fence handling
Alex Deucher [Thu, 12 Dec 2019 22:43:36 +0000 (17:43 -0500)]
drm/amdgpu/pm_runtime: update usage count in fence handling

Increment the usage count in emit fence, and decrement in
process fence to make sure the GPU is always considered in
use while there are fences outstanding.  We always wait for
the engines to drain in runtime suspend, but in practice
that only covers short lived jobs for gfx.  This should
cover us for longer lived fences.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/powerplay: Add SMU WMTABLE Validity Check for Renoir
Zhan Liu [Mon, 16 Dec 2019 19:56:50 +0000 (14:56 -0500)]
drm/amd/powerplay: Add SMU WMTABLE Validity Check for Renoir

[Why]
SMU watermark table (WMTABLE) validity check is missing on Renoir.
This validity check is very useful for checking whether
WMTABLE is updated successfully.

[How]
Add SMU watermark validity check.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Remove unneeded semicolon in amdgpu_ras.c
zhengbin [Sat, 14 Dec 2019 09:02:24 +0000 (17:02 +0800)]
drm/amdgpu: Remove unneeded semicolon in amdgpu_ras.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:318:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Remove unneeded semicolon in gfx_v10_0.c
zhengbin [Sat, 14 Dec 2019 09:02:23 +0000 (17:02 +0800)]
drm/amdgpu: Remove unneeded semicolon in gfx_v10_0.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:1967:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amdgpu: Remove unneeded semicolon in amdgpu_pmu.c
zhengbin [Sat, 14 Dec 2019 09:02:22 +0000 (17:02 +0800)]
drm/amdgpu: Remove unneeded semicolon in amdgpu_pmu.c

Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:110:3-4: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:133:2-3: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:163:2-3: Unneeded semicolon
drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:191:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 years agodrm/amd/display: Remove unneeded semicolon
zhengbin [Sat, 14 Dec 2019 09:12:33 +0000 (17:12 +0800)]
drm/amd/display: Remove unneeded semicolon

Fixes coccicheck warning:

drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c:412:90-91: Unneeded semicolon

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>