linux-2.6-microblaze.git
6 years agodrm/amd/display: Add detile buffer size for DCN20
Dmytro Laktyushkin [Tue, 3 Sep 2019 18:10:28 +0000 (14:10 -0400)]
drm/amd/display: Add detile buffer size for DCN20

Detile buffer size affects dcc caps and therefore needs to be
corrected for each ip.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: fix use of uninitialized variable
Martin Leung [Tue, 3 Sep 2019 19:22:30 +0000 (15:22 -0400)]
drm/amd/display: fix use of uninitialized variable

tg_inst may be used uninitialized, so initialize it to 0.

Signed-off-by: Martin Leung <martin.leung@amd.com>
Reviewed-by: Jaehyun Chung <Jaehyun.Chung@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: 3.2.51.1
Aric Cyr [Tue, 3 Sep 2019 19:54:43 +0000 (15:54 -0400)]
drm/amd/display: 3.2.51.1

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Add missing HBM support and raise Vega20's uclk.
Zhan Liu [Thu, 22 Aug 2019 18:54:18 +0000 (14:54 -0400)]
drm/amd/display: Add missing HBM support and raise Vega20's uclk.

[Why]
When more than 2 displays are connected to the graphics card,
only the minimum memory clock is needed. However, when more
displays are connected, the minimum memory clock is not
sufficient enough to support the overwhelming bandwidth.
System will hang under this circumstance.

Also, the old code didn't address HBM cards, which has 2
pseudo channels. We need to add the HBM part here.

[How]
When graphics card connects to 2 or more displays,
switch to high memory clock. Also, choose memory
multiplier based on whether its regular DRAM or HBM.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: 3.2.51
Aric Cyr [Tue, 3 Sep 2019 14:39:16 +0000 (10:39 -0400)]
drm/amd/display: 3.2.51

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: enable single dp seamless boot
Martin Leung [Fri, 16 Aug 2019 21:26:23 +0000 (17:26 -0400)]
drm/amd/display: enable single dp seamless boot

[why]
seamless boot didn't work for non edp's before

[how]
removed edp-specific code, made dp read uefi-set link settings. Also fixed
a hubbub code line to be consistent with usage of function.

Signed-off-by: Martin Leung <martin.leung@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: update odm mode validation to be in line with policy
Dmytro Laktyushkin [Fri, 30 Aug 2019 20:32:13 +0000 (16:32 -0400)]
drm/amd/display: update odm mode validation to be in line with policy

Previously 8k30 worked with dsc and odm combine due to a workaround that ran
the formula a second time with dsc support enable should dsc validation fail.
This worked when clocks were low enough for formula to enable odm to lower
voltage, however now broke due to increased clocks.

This change updates the ODM combine policy within the formula to properly
reflect our current policy within DC, only enabling ODM when we have to, as
well as adding a check for viewport width when dsc is enabled.

As a side effect the redundant call to dml when odm is required is now
unnecessary.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Optimize clocks on clock change
Wesley Chalmers [Fri, 30 Aug 2019 18:59:00 +0000 (14:59 -0400)]
drm/amd/display: Optimize clocks on clock change

[WHY]
Presently, there is no way for clocks to be lowered, only raised.

[HOW]
Compare clock status against previous known clock status, and optimize
if different.
This requires re-ordering the layout of the dc_clocks structure, as the
current ordering allows identical clock states to appear different.

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Set number of pipes to 1 if the second pipe was disabled
Nikola Cornij [Wed, 28 Aug 2019 22:30:43 +0000 (18:30 -0400)]
drm/amd/display: Set number of pipes to 1 if the second pipe was disabled

[why]
Some ODM-related register settings are inconsistently updated by VBIOS, causing
the state in DC to be invalid, which would then end up crashing in certain
use-cases (such as disable/enable device).

[how]
Check the enabled status of the second pipe when determining the number of
OPTC sources. If the second pipe is disabled, set the number of sources to 1
regardless of other settings (that may not be updated correctly).

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: set minimum abm backlight level
Anthony Koo [Thu, 29 Aug 2019 14:49:12 +0000 (10:49 -0400)]
drm/amd/display: set minimum abm backlight level

[Why]
A lot of the time, the backlight characteristic curve maps min backlight
to a non-zero value.
But there are cases where we want the curve to intersect at 0.
In this scenario even if OS never asks to set 0% backlight, the ABM
reduction can result in backlight being lowered close to 0.
This particularly can cause problems in some LED drivers, and in
general just looks like backlight is completely off.

[How]
Add default cap to disallow backlight from dropping below 1%
even after ABM reduction is applied.

Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Revert fixup DPP programming sequence
Wesley Chalmers [Mon, 26 Aug 2019 19:02:47 +0000 (15:02 -0400)]
drm/amd/display: Revert fixup DPP programming sequence

[WHY]
This change was made because DTO programming was double-buffered, which
is itself an issue. After deactivating the DTO double buffer, this
change becomes unnecessary.

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Do not double-buffer DTO adjustments
Wesley Chalmers [Mon, 16 Sep 2019 20:42:38 +0000 (15:42 -0500)]
drm/amd/display: Do not double-buffer DTO adjustments

[WHY]
When changing DPP global ref clock, DTO adjustments must take effect
immediately, or else underflow may occur.
It appears the original decision to double-buffer DTO adjustments was made to
prevent underflows that occur when raising DPP ref clock (which is not
double-buffered), but that same decision causes similar issues when
lowering DPP global ref clock. The better solution is to order the
adjustments according to whether clocks are being raised or lowered.

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Separate hardware initialization from creation
Julian Parkin [Mon, 12 Aug 2019 22:47:50 +0000 (18:47 -0400)]
drm/amd/display: Separate hardware initialization from creation

[Why]
Separating the hardware initialization from the creation of the
dc structures gives greater flexibility to the dm to override
options for debugging.

[How]
Move the hardware initialization call to a new function,
dc_hardware_init. No functional change is intended.

Signed-off-by: Julian Parkin <julian.parkin@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: fix i2c wtire mot incorrect issue
Lewis Huang [Tue, 27 Aug 2019 09:03:41 +0000 (17:03 +0800)]
drm/amd/display: fix i2c wtire mot incorrect issue

[Why]
I2C write command always send mot = true will cause sink state incorrect.

[How]
1. Remove default i2c write mot = true.
2. Deciding mot flag by is_end_of_payload flag.

Signed-off-by: Lewis Huang <Lewis.Huang@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Handle virtual signal type in disable_link()
Martin Tsai [Thu, 22 Aug 2019 02:02:13 +0000 (10:02 +0800)]
drm/amd/display: Handle virtual signal type in disable_link()

[Why]
The new implementation changed the behavior to allow process setMode
to DAL when DAL returns empty mode query for unplugged display.
This will trigger additional disable_link().
When unplug HDMI from MST dock, driver will update stream->signal to
"Virtual". disable_link() will call disable_output() if the signal type
is not DP and induce other displays on MST dock show black screen.

[How]
Don't need to process disable_output() if the signal type is virtual.

Signed-off-by: Martin Tsai <martin.tsai@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: fix global sync param extraction indexing
Dmytro Laktyushkin [Fri, 23 Aug 2019 18:22:40 +0000 (14:22 -0400)]
drm/amd/display: fix global sync param extraction indexing

dcn20_calculate_dlg_params was incorrectly indexing pipe src and
dst structs when extracting global sync params.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Jaehyun Chung <Jaehyun.Chung@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: add vtg update after global sync update
Dmytro Laktyushkin [Mon, 26 Aug 2019 19:04:18 +0000 (15:04 -0400)]
drm/amd/display: add vtg update after global sync update

Global sync update was missing vtg update resulting in underflow if
vstartup decreased a significant amount.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Jaehyun Chung <Jaehyun.Chung@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Add debugfs entry to force YUV420 output
Stylon Wang [Tue, 20 Aug 2019 18:48:37 +0000 (14:48 -0400)]
drm/amd/display: Add debugfs entry to force YUV420 output

[Why]
Even if YUV420 is available for video mode, YUV444 is still
automatically selected. This poses a problem for compliance tests.

[How]
Add a per-connector debugfs entry "force_yuv420_output" to force
selection of YUV420 mode.

Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: add additional flag consideration for surface update
Dmytro Laktyushkin [Thu, 22 Aug 2019 18:12:57 +0000 (14:12 -0400)]
drm/amd/display: add additional flag consideration for surface update

Surface dchub/dpp update would not trigger if a stream update was the
only cause. This change now allows stream flags to trigger this update.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Replace for loop w/ function call
Wesley Chalmers [Wed, 21 Aug 2019 20:09:27 +0000 (16:09 -0400)]
drm/amd/display: Replace for loop w/ function call

[WHY]
A function to adjust DPP clocks with DTO already exists; function code
is identical to the code replaced here

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Rebuild mapped resources after pipe split
Mikita Lipski [Fri, 23 Aug 2019 17:26:24 +0000 (13:26 -0400)]
drm/amd/display: Rebuild mapped resources after pipe split

[why]
The issue is specific for linux, as on timings such as 8K@60
or 4K@144 DSC should be working in combination with ODM Combine
in order to ensure that we can run those timings. The validation
for those timings was passing, but when pipe split was happening
second pipe wasn't being programmed.

[how]
Rebuild mapped resources if we split stream for ODM.

Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: 3.2.50
Aric Cyr [Mon, 26 Aug 2019 19:54:27 +0000 (15:54 -0400)]
drm/amd/display: 3.2.50

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: define parameters for abm 2.3
Josip Pavic [Sat, 24 Aug 2019 00:54:12 +0000 (20:54 -0400)]
drm/amd/display: define parameters for abm 2.3

[Why]
Current configuration 0 is just a placeholder, and final parameters needed.
Also, configuration 1 is expected to emulate ABM 2.1 but is too aggressive.

[How]
Redefine configuration 0 with the finalized parameters, and increase the
contrast gain of configuration 1 so that it properly emulates ABM 2.1.

Signed-off-by: Josip Pavic <Josip.Pavic@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Fix HUBP secondary viewport programming
Ilya Bakoulin [Fri, 23 Aug 2019 16:45:34 +0000 (12:45 -0400)]
drm/amd/display: Fix HUBP secondary viewport programming

[Why]
Secondary viewport dimension/position registers are not programmed,
which can cause issues in some stereo configurations.

[How]
Add register definitions and register programming.

Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: cleanup creating BOs at fixed location (v2)
Christian König [Fri, 13 Sep 2019 11:43:15 +0000 (13:43 +0200)]
drm/amdgpu: cleanup creating BOs at fixed location (v2)

The placement is something TTM/BO internal and the RAS code should
avoid touching that directly.

Add a helper to create a BO at a fixed location and use that instead.

v2: squash in fixes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu:Fix EEPROM checksum calculation.
Andrey Grodzovsky [Thu, 12 Sep 2019 21:16:32 +0000 (17:16 -0400)]
drm/amdgpu:Fix EEPROM checksum calculation.

Fix checksum calculation after manually resetting the table.
Unify reset and empty EEPROM init flow.
Protect the table reset with lock.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: fix ras ctrl debugfs node leak
Guchun Chen [Mon, 16 Sep 2019 05:42:46 +0000 (13:42 +0800)]
drm/amdgpu: fix ras ctrl debugfs node leak

Use debugfs_remove_recursive to remove the whole debugfs
directory instead of removing the node one by one.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: trace if a PD/PT update is done directly
Christian König [Wed, 3 Apr 2019 11:30:56 +0000 (13:30 +0200)]
drm/amdgpu: trace if a PD/PT update is done directly

This is usfull for debugging.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: drop double HDP flush in the VM code
Christian König [Wed, 3 Apr 2019 12:11:53 +0000 (14:11 +0200)]
drm/amdgpu: drop double HDP flush in the VM code

Already done in the CPU based backend code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: cleanup coding style in the VM code a bit
Christian König [Fri, 13 Sep 2019 10:12:40 +0000 (12:12 +0200)]
drm/amdgpu: cleanup coding style in the VM code a bit

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: revert "disable bulk moves for now"
Christian König [Thu, 12 Sep 2019 10:13:50 +0000 (12:13 +0200)]
drm/amdgpu: revert "disable bulk moves for now"

This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zhou, David(ChunMing) <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu/SRIOV: Navi12 SRIOV VF gets GTT base
Jiange Zhao [Mon, 16 Sep 2019 06:56:06 +0000 (14:56 +0800)]
drm/amdgpu/SRIOV: Navi12 SRIOV VF gets GTT base

With changes in PSP and HV, SRIOV VF will handle

vram gtt location just like bare metal. There is

no need to differentiate it anymore.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: remove program of lbpw for renoir
Aaron Liu [Mon, 16 Sep 2019 01:26:28 +0000 (09:26 +0800)]
drm/amdgpu: remove program of lbpw for renoir

These is no LBPW on Renoir. So removing program of lbpw for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: Swap trap temporary registers in gfx10 trap handler
Jay Cornwall [Thu, 12 Sep 2019 19:03:41 +0000 (14:03 -0500)]
drm/amdkfd: Swap trap temporary registers in gfx10 trap handler

ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
instead, aligning implementation with gfx9 trap handler.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: shaoyun liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: remove the redundant null checks
zhong jiang [Tue, 3 Sep 2019 06:15:05 +0000 (14:15 +0800)]
drm/amdgpu: remove the redundant null checks

debugfs_remove and kfree has taken the null check in account.
hence it is unnecessary to check it. Just remove the condition.
No functional change.

This issue was detected by using the Coccinelle software.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/radeon: be quiet when no SAD block is found
Jean Delvare [Wed, 4 Sep 2019 09:13:37 +0000 (11:13 +0200)]
drm/radeon: be quiet when no SAD block is found

It is fine for displays without audio functionality to not provide
any SAD block in their EDID. Do not log an error in that case,
just return quietly.

Inspired by a similar fix to the amdgpu driver in the context of bug
fdo#107825:
https://bugs.freedesktop.org/show_bug.cgi?id=107825

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd: be quiet when no SAD block is found
Jean Delvare [Wed, 4 Sep 2019 09:12:48 +0000 (11:12 +0200)]
drm/amd: be quiet when no SAD block is found

It is fine for displays without audio functionality to not provide
any SAD block in their EDID. Do not log an error in that case,
just return quietly.

This fixes half of bug fdo#107825:
https://bugs.freedesktop.org/show_bug.cgi?id=107825

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Check for valid number of registers to read
Trek [Sat, 31 Aug 2019 19:25:36 +0000 (21:25 +0200)]
drm/amdgpu: Check for valid number of registers to read

Do not try to allocate any amount of memory requested by the user.
Instead limit it to 128 registers. Actually the longest series of
consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and
mmGB_MACROTILE_MODE0-15 (0x2644-0x2673).

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273
Signed-off-by: Trek <trek00@inbox.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/radeon: Bail earlier when radeon.cik_/si_support=0 is passed
Hans de Goede [Sat, 7 Sep 2019 20:32:38 +0000 (22:32 +0200)]
drm/radeon: Bail earlier when radeon.cik_/si_support=0 is passed

Bail from the pci_driver probe function instead of from the drm_driver
load function.

This avoid /dev/dri/card0 temporarily getting registered and then
unregistered again, sending unwanted add / remove udev events to
userspace.

Specifically this avoids triggering the (userspace) bug fixed by this
plymouth merge-request:
https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59

Note that despite that being an userspace bug, not sending unnecessary
udev events is a good idea in general.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: rename variable eanble -> enable
Colin Ian King [Fri, 13 Sep 2019 08:02:48 +0000 (09:02 +0100)]
drm/amd/display: rename variable eanble -> enable

There is a spelling mistake in the variable name eanble,
rename it to enable.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agoRevert "drm/amdgpu/nbio7.4: add hw bug workaround for vega20"
Kent Russell [Tue, 10 Sep 2019 19:48:55 +0000 (15:48 -0400)]
Revert "drm/amdgpu/nbio7.4: add hw bug workaround for vega20"

This reverts commit e01f2d41895102d824c6b8f5e011dd5e286d5e8b.

VG20 did not require this workaround, as the fix is in the VBIOS.
Leave VG10/12 workaround as some older shipped cards do not have the
VBIOS fix in place, and the kernel workaround is required in those
situations

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: add graceful VM fault handling v3
Christian König [Fri, 7 Dec 2018 14:18:43 +0000 (15:18 +0100)]
drm/amdgpu: add graceful VM fault handling v3

Next step towards HMM support. For now just silence the retry fault and
optionally redirect the request to the dummy page.

v2: make sure the VM is not destroyed while we handle the fault.
v3: fix VM destroy check, cleanup comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: reserve the root PD while freeing PASIDs
Christian König [Wed, 17 Jul 2019 07:58:47 +0000 (09:58 +0200)]
drm/amdgpu: reserve the root PD while freeing PASIDs

Free the pasid only while the root PD is reserved. This prevents use after
free in the page fault handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault
Christian König [Mon, 16 Sep 2019 15:35:53 +0000 (10:35 -0500)]
drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault

While handling a page fault we can't wait for other ongoing GPU
operations or we can potentially run into deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: allow direct submission of clears
Christian König [Thu, 28 Mar 2019 09:53:33 +0000 (10:53 +0100)]
drm/amdgpu: allow direct submission of clears

For handling PD/PT clears directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: allow direct submission of PTE updates
Christian König [Wed, 27 Mar 2019 12:59:23 +0000 (13:59 +0100)]
drm/amdgpu: allow direct submission of PTE updates

For handling PTE updates directly in the fault handler.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: allow direct submission of PDE updates v2
Christian König [Thu, 14 Mar 2019 08:10:01 +0000 (09:10 +0100)]
drm/amdgpu: allow direct submission of PDE updates v2

For handling PDE updates directly in the fault handler.

v2: fix typo in comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: allow direct submission in the VM backends v2
Christian König [Mon, 16 Sep 2019 15:33:28 +0000 (10:33 -0500)]
drm/amdgpu: allow direct submission in the VM backends v2

This allows us to update page tables directly while in a page fault.

v2: use direct/delayed entities and still wait for moves

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: split the VM entity into direct and delayed
Christian König [Fri, 19 Jul 2019 12:41:12 +0000 (14:41 +0200)]
drm/amdgpu: split the VM entity into direct and delayed

For page fault handling we need to use a direct update which can't be
blocked by ongoing user CS.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/ttm: return -EBUSY on pipelining with no_gpu_wait (v2)
Christian König [Mon, 16 Sep 2019 15:20:47 +0000 (10:20 -0500)]
drm/ttm: return -EBUSY on pipelining with no_gpu_wait (v2)

Setting the no_gpu_wait flag means that the allocate BO must be available
immediately and we can't wait for any GPU operation to finish.

v2: squash in mem leak fix, rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: grab the id mgr lock while accessing passid_mapping
Christian König [Mon, 9 Sep 2019 11:57:32 +0000 (13:57 +0200)]
drm/amdgpu: grab the id mgr lock while accessing passid_mapping

Need to make sure that we actually dropping the right fence.
Could be done with RCU as well, but to complicated for a fix.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu/SRIOV: Navi12 SRIOV VF doesn't load TOC
Jiange Zhao [Thu, 12 Sep 2019 05:18:41 +0000 (13:18 +0800)]
drm/amdgpu/SRIOV: Navi12 SRIOV VF doesn't load TOC

In SRIOV case, the autoload sequence is the same

as bare metal, except VF won't load TOC.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu/SRIOV: Navi10/12 VF doesn't support SMU
Jiange Zhao [Thu, 12 Sep 2019 05:15:35 +0000 (13:15 +0800)]
drm/amdgpu/SRIOV: Navi10/12 VF doesn't support SMU

In SRIOV case, SMU and powerplay are handled in HV.

VF shouldn't have control over SMU and powerplay.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/amdgpu: power up sdma engine when S3 resume back
Prike Liang [Wed, 11 Sep 2019 05:15:17 +0000 (13:15 +0800)]
drm/amd/amdgpu: power up sdma engine when S3 resume back

The sdma_v4 should be ungated when the IP resume back,
otherwise it will hang up and resume time out error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: For Navi12 SRIOV VF, register mailbox functions
Jiange Zhao [Wed, 11 Sep 2019 09:29:07 +0000 (17:29 +0800)]
drm/amdgpu: For Navi12 SRIOV VF, register mailbox functions

Mailbox functions and interrupts are only for Navi12 VF.

Register functions and irqs during initialization.

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu/sriov: add ring_stop before ring_create in psp v11 code
Jack Zhang [Tue, 10 Sep 2019 04:29:14 +0000 (12:29 +0800)]
drm/amdgpu/sriov: add ring_stop before ring_create in psp v11 code

psp  v11 code missed ring stop in ring create function(VMR)
while psp v3.1 code had the code. This will cause VM destroy1
fail and psp ring create fail.

For SIOV-VF, ring_stop should not be deleted in ring_create
function.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/powerplay: check SMU engine readiness before proceeding on S3 resume
Evan Quan [Wed, 11 Sep 2019 11:39:34 +0000 (19:39 +0800)]
drm/amd/powerplay: check SMU engine readiness before proceeding on S3 resume

This is especially needed for non-psp loading way.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/powerplay: properly set mp1 state for SW SMU suspend/reset routine
Evan Quan [Wed, 11 Sep 2019 11:35:45 +0000 (19:35 +0800)]
drm/amd/powerplay: properly set mp1 state for SW SMU suspend/reset routine

Set mp1 state properly for SW SMU suspend/reset routine.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Fix KFD-related kernel oops on Hawaii
Felix Kuehling [Thu, 5 Sep 2019 23:22:02 +0000 (19:22 -0400)]
drm/amdgpu: Fix KFD-related kernel oops on Hawaii

Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.

Fixes: eb3961a57424 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Fix mutex lock from atomic context.
Andrey Grodzovsky [Tue, 10 Sep 2019 19:34:16 +0000 (15:34 -0400)]
drm/amdgpu: Fix mutex lock from atomic context.

Problem:
amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
because writing to EEPROM during ASIC reset was unstable.
But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
directly from ISR context and so locking is not allowed. Also it's
irrelevant for this partilcular interrupt as this is generic RAS
interrupt and not memory errors specific.

Fix:
Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Add SRIOV mailbox backend for Navi1x
Jiange Zhao [Wed, 4 Sep 2019 06:42:01 +0000 (14:42 +0800)]
drm/amdgpu: Add SRIOV mailbox backend for Navi1x

Mimic the ones for Vega10, add mailbox backend for Navi1x

Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: implement ras query function for pcie bif
Guchun Chen [Wed, 11 Sep 2019 03:12:17 +0000 (11:12 +0800)]
drm/amdgpu: implement ras query function for pcie bif

ras error query funtionality implementation

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: add pcie bif ras related registers
Guchun Chen [Wed, 11 Sep 2019 03:10:02 +0000 (11:10 +0800)]
drm/amdgpu: add pcie bif ras related registers

These registers will be accessed for querying ras errors.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: support pcie bif ras query and inject
Guchun Chen [Wed, 11 Sep 2019 03:07:15 +0000 (11:07 +0800)]
drm/amdgpu: support pcie bif ras query and inject

Call pcie bif ras query/inject in amdgpu ras.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: add ras error query count interface for nbio
Guchun Chen [Wed, 4 Sep 2019 06:58:54 +0000 (14:58 +0800)]
drm/amdgpu: add ras error query count interface for nbio

Add the interface query_ras_error_count for nbio.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: fix CPDMA hang in PRT mode for VEGA10
Tianci.Yin [Tue, 10 Sep 2019 08:54:14 +0000 (16:54 +0800)]
drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10

add and_mask since the programming logic of golden setting changed

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: enable error injection to XGMI block via debugfs
Hawking Zhang [Sun, 8 Sep 2019 01:09:15 +0000 (09:09 +0800)]
drm/amdgpu: enable error injection to XGMI block via debugfs

allow inject error to XGMI block via debugfs node ras_ctrl

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: initialize ras structures for xgmi block (v2)
Hawking Zhang [Tue, 10 Sep 2019 03:13:39 +0000 (11:13 +0800)]
drm/amdgpu: initialize ras structures for xgmi block (v2)

init ras common interface and fs node for xgmi block

v2: remove unnecessary physical node number check before
invoking amdgpu_xgmi_ras_late_init

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: fix the missed asic name while inited renoir_device_info
Huang Rui [Tue, 10 Sep 2019 11:10:42 +0000 (19:10 +0800)]
drm/amdkfd: fix the missed asic name while inited renoir_device_info

This patch fixes null pointer issue below, I missed to init the asic renior name
while I rebase the patches.

[  106.004250] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  106.004254] #PF: supervisor read access in kernel mode
[  106.004256] #PF: error_code(0x0000) - not-present page
[  106.004257] PGD 0 P4D 0
[  106.004261] Oops: 0000 [#1] SMP NOPTI
[  106.004264] CPU: 3 PID: 1422 Comm: modprobe Not tainted 5.2.0-rc1-custom #1
[  106.004266] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS
WCD9814N_Weekly_19_08_1 08/14/2019
[  106.004272] RIP: 0010:strncpy+0x12/0x30
[  106.004274] Code: c1 c0 11 48 c1 c6 15 48 31 d0 48 c1 c2 20 31 c2 89 d0 31 f0
41 5c 5d c3 55 48 85 d2 48 89 f8 48 89 e5 74 1e 48 01 fa 48 89 f9 <44> 0f b6 06
41 80 f8 01 44 88 01 48 83 de ff 48 83 c1 01 48 39 d1
[  106.004278] RSP: 0018:ffffc092c1fd37a8 EFLAGS: 00010286
[  106.004281] RAX: ffff9e943466a28c RBX: 00000000000036ed RCX: ffff9e943466a28c
[  106.004283] RDX: ffff9e943466a2ac RSI: 0000000000000000 RDI: ffff9e943466a28c
[  106.004285] RBP: ffffc092c1fd37a8 R08: ffff9e943d100000 R09: 0000000000000228
[  106.004287] R10: ffff9e94418dc5a8 R11: ffff9e944746c0d0 R12: 0000000000000000
[  106.004289] R13: ffff9e943fa1ec00 R14: ffff9e943466a200 R15: ffff9e943466a200
[  106.004291] FS:  00007f7a022c5540(0000) GS:ffff9e9447ac0000(0000)
knlGS:0000000000000000
[  106.004294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.004296] CR2: 0000000000000000 CR3: 00000001ff0b0000 CR4: 0000000000340ee0
[  106.004298] Call Trace:
[  106.004382]  kfd_topology_add_device+0x150/0x610 [amdgpu]
[  106.004445]  kgd2kfd_device_init+0x2e0/0x4f0 [amdgpu]
[  106.004509]  amdgpu_amdkfd_device_init+0x14c/0x1b0 [amdgpu]

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: Implement voltage limitation for dali
Bhawanpreet Lakha [Fri, 6 Sep 2019 20:32:44 +0000 (16:32 -0400)]
drm/amd/display: Implement voltage limitation for dali

[Why]
we only want the lowest voltage to be available for dali.

[How]
Use the get_highest_allowed_voltage_level function
to return 0 for dali

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/display: add Asic ID for Dali
Bhawanpreet Lakha [Fri, 6 Sep 2019 20:31:21 +0000 (16:31 -0400)]
drm/amd/display: add Asic ID for Dali

Dali is a new asic revision based on raven2

Add the ID and ASICREV_IS_DALI define

Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Allow to reset to EERPOM table.
Andrey Grodzovsky [Mon, 9 Sep 2019 20:00:56 +0000 (16:00 -0400)]
drm/amdgpu: Allow to reset to EERPOM table.

The table grows quickly during debug/development effort when
multiple RAS errors are injected. Allow to avoid this by setting
table header back to empty if needed.

v2: Switch to debugfs entry instead of load time parameter.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Add amdgpu_ras_eeprom_reset_table
Andrey Grodzovsky [Mon, 9 Sep 2019 19:59:45 +0000 (15:59 -0400)]
drm/amdgpu: Add amdgpu_ras_eeprom_reset_table

This will allow to reset the table on the fly.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: rename umc ras_init to err_cnt_init
Tao Zhou [Fri, 6 Sep 2019 06:32:14 +0000 (14:32 +0800)]
drm/amdgpu: rename umc ras_init to err_cnt_init

this interface is related to specific version of umc, distinguish it
from ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: move umc ras init to umc block
Tao Zhou [Thu, 5 Sep 2019 11:25:18 +0000 (19:25 +0800)]
drm/amdgpu: move umc ras init to umc block

move umc ras init from ras module to umc block, generic ras module
should pay less attention to specific ras block.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: move umc late init from gmc to umc block
Tao Zhou [Thu, 5 Sep 2019 11:16:19 +0000 (19:16 +0800)]
drm/amdgpu: move umc late init from gmc to umc block

umc late init is umc specific, it's more suitable to be put in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: remove duplicated header file include
Guchun Chen [Tue, 10 Sep 2019 08:23:53 +0000 (16:23 +0800)]
drm/amdgpu: remove duplicated header file include

amdgpu_ras.h is already included.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: remove needless usage of #ifdef
Shirish S [Thu, 12 Sep 2019 06:33:55 +0000 (12:03 +0530)]
drm/amdgpu: remove needless usage of #ifdef

define sched_policy in case CONFIG_HSA_AMD is not
enabled, with this there is no need to check for CONFIG_HSA_AMD
else where in driver code.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: fix build error without CONFIG_HSA_AMD
Shirish S [Tue, 10 Sep 2019 07:15:01 +0000 (12:45 +0530)]
drm/amdgpu: fix build error without CONFIG_HSA_AMD

If CONFIG_HSA_AMD is not set, build fails:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function `amdgpu_device_ip_early_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to `sched_policy'

Use CONFIG_HSA_AMD to guard this.

Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling policy")

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/powerplay: update smu11_driver_if_arcturus.h
Evan Quan [Fri, 6 Sep 2019 02:08:32 +0000 (10:08 +0800)]
drm/amd/powerplay: update smu11_driver_if_arcturus.h

Also bump the SMU11_DRIVER_IF_VERSION_ARCT.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/powerplay: issue DC-BTC for arcturus on SMU init
Evan Quan [Thu, 5 Sep 2019 04:22:42 +0000 (12:22 +0800)]
drm/amd/powerplay: issue DC-BTC for arcturus on SMU init

Need to perform DC-BTC for arcturus on bootup.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Avoid RAS recovery init when no RAS support.
Andrey Grodzovsky [Fri, 6 Sep 2019 21:23:44 +0000 (17:23 -0400)]
drm/amdgpu: Avoid RAS recovery init when no RAS support.

Fixes driver load regression on APUs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: cleanup PTE flag generation v3
Christian König [Mon, 2 Sep 2019 14:39:40 +0000 (16:39 +0200)]
drm/amdgpu: cleanup PTE flag generation v3

Move the ASIC specific code into a new callback function.

v2: mask the flags for SI and CIK instead of a BUG_ON().
v3: remove last missed BUG_ON().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: cleanup mtype mapping
Christian König [Mon, 2 Sep 2019 12:52:30 +0000 (14:52 +0200)]
drm/amdgpu: cleanup mtype mapping

Unify how we map the UAPI flags to the PTE hardware flags for a mapping.

Only the MTYPE is actually ASIC dependent, all other flags should be
copied over 1 to 1 and ASIC differences are handled later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: add navi14 PCI ID for work station SKU
Tianci.Yin [Thu, 5 Sep 2019 07:28:57 +0000 (15:28 +0800)]
drm/amdgpu: add navi14 PCI ID for work station SKU

Add the navi14 PCI device id.

Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amd/powerplay: Add the interface for geting dpm current power state
Prike Liang [Fri, 6 Sep 2019 07:13:03 +0000 (15:13 +0800)]
drm/amd/powerplay: Add the interface for geting dpm current power state

implement the sysfs power_dpm_state

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: check if nbio->ras_if exist
Philip Yang [Fri, 6 Sep 2019 17:20:40 +0000 (13:20 -0400)]
drm/amdgpu: check if nbio->ras_if exist

To avoid NULL function pointer access. This happens on VG10, reboot
command hangs and have to power off/on to reboot the machine. This is
serial console log:

[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Reached target Final Step.
         Starting Reboot...
[  305.696271] systemd-shutdown[1]: Syncing filesystems and block
devices.
[  306.947328] systemd-shutdown[1]: Sending SIGTERM to remaining
processes...
[  306.963920] systemd-journald[1722]: Received SIGTERM from PID 1
(systemd-shutdow).
[  307.322717] systemd-shutdown[1]: Sending SIGKILL to remaining
processes...
[  307.336472] systemd-shutdown[1]: Unmounting file systems.
[  307.454202] EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro
[  307.480523] systemd-shutdown[1]: All filesystems unmounted.
[  307.486537] systemd-shutdown[1]: Deactivating swaps.
[  307.491962] systemd-shutdown[1]: All swaps deactivated.
[  307.497624] systemd-shutdown[1]: Detaching loop devices.
[  307.504418] systemd-shutdown[1]: All loop devices detached.
[  307.510418] systemd-shutdown[1]: Detaching DM devices.
[  307.565907] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[  307.731313] BUG: kernel NULL pointer dereference, address:
0000000000000000
[  307.738802] #PF: supervisor read access in kernel mode
[  307.744326] #PF: error_code(0x0000) - not-present page
[  307.749850] PGD 0 P4D 0
[  307.752568] Oops: 0000 [#1] SMP PTI
[  307.756314] CPU: 3 PID: 1 Comm: systemd-shutdow Not tainted
5.2.0-rc1-kfd-yangp #453
[  307.764644] Hardware name: ASUS All Series/Z97-PRO(Wi-Fi ac)/USB 3.1,
BIOS 9001 03/07/2016
[  307.773580] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu]
[  307.779760] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48
8b b3 90 7d 00 00 48 c7 c7 17 b8 530
[  307.799967] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286
[  307.805585] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX:
0000000000000006
[  307.813261] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI:
ffff9eb29e350000
[  307.820935] RBP: ffff9eb299da0000 R08: 0000000000000000 R09:
0000000000000000
[  307.828609] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9eb299dbd1f8
[  307.836284] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15:
0000000000000000
[  307.843959] FS:  00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000)
knlGS:0000000000000000
[  307.852663] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  307.858842] CR2: 0000000000000000 CR3: 000000081d798005 CR4:
00000000001606e0
[  307.866516] Call Trace:
[  307.869169]  amdgpu_device_ip_suspend_phase2+0x80/0x110 [amdgpu]
[  307.875654]  ? amdgpu_device_ip_suspend_phase1+0x4d/0xd0 [amdgpu]
[  307.882230]  amdgpu_device_ip_suspend+0x2e/0x60 [amdgpu]
[  307.887966]  amdgpu_pci_shutdown+0x2f/0x40 [amdgpu]
[  307.893211]  pci_device_shutdown+0x31/0x60
[  307.897613]  device_shutdown+0x14c/0x1f0
[  307.901829]  kernel_restart+0xe/0x50
[  307.905669]  __do_sys_reboot+0x1df/0x210
[  307.909884]  ? task_work_run+0x73/0xb0
[  307.913914]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[  307.918970]  do_syscall_64+0x4a/0x1c0
[  307.922904]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  307.928336] RIP: 0033:0x7f0671cf8373
[  307.932176] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00
00 0f 1f 44 00 00 89 fa be 69 19 128
[  307.952384] RSP: 002b:00007ffdd1723d68 EFLAGS: 00000202 ORIG_RAX:
00000000000000a9
[  307.960527] RAX: ffffffffffffffda RBX: 0000000001234567 RCX:
00007f0671cf8373
[  307.968201] RDX: 0000000001234567 RSI: 0000000028121969 RDI:
00000000fee1dead
[  307.975875] RBP: 00007ffdd1723dd0 R08: 0000000000000000 R09:
0000000000000000
[  307.983550] R10: 0000000000000002 R11: 0000000000000202 R12:
00007ffdd1723dd8
[  307.991224] R13: 0000000000000000 R14: 0000001b00000004 R15:
00007ffdd17240c8
[  307.998901] Modules linked in: xt_MASQUERADE nfnetlink iptable_nat
xt_addrtype xt_conntrack nf_nat nf_cos
[  308.026505] CR2: 0000000000000000
[  308.039998] RIP: 0010:soc15_common_hw_fini+0x33/0xc0 [amdgpu]
[  308.046180] Code: 89 fb e8 60 f5 ff ff f6 83 50 df 01 00 04 75 3d 48
8b b3 90 7d 00 00 48 c7 c7 17 b8 530
[  308.066392] RSP: 0018:ffffac9483153d40 EFLAGS: 00010286
[  308.072013] RAX: 0000000000000000 RBX: ffff9eb299da0000 RCX:
0000000000000006
[  308.079689] RDX: 0000000000000000 RSI: ffff9eb29e3508a0 RDI:
ffff9eb29e350000
[  308.087366] RBP: ffff9eb299da0000 R08: 0000000000000000 R09:
0000000000000000
[  308.095042] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9eb299dbd1f8
[  308.102717] R13: ffffffffc04f8368 R14: ffff9eb29cebd130 R15:
0000000000000000
[  308.110394] FS:  00007f06721c9940(0000) GS:ffff9eb2a18c0000(0000)
knlGS:0000000000000000
[  308.119099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  308.125280] CR2: 0000000000000000 CR3: 000000081d798005 CR4:
00000000001606e0
[  308.135304] printk: systemd-shutdow: 3 output lines suppressed due to
ratelimiting
[  308.143518] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x00000009
[  308.151798] Kernel Offset: 0x15000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xff)
[  308.171775] ---[ end Kernel panic - not syncing: Attempted to kill
init! exitcode=0x00000009 ]---

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: fix null pointer deref in firmware header printing
Xiaojie Yuan [Thu, 5 Sep 2019 08:50:22 +0000 (16:50 +0800)]
drm/amdgpu: fix null pointer deref in firmware header printing

v2: declare as (struct common_firmware_header *) type because
    struct xxx_firmware_header inherits from it

When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an
unallocated amdgpu_firmware_info instance.

This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have
optimized out such 'defined but not referenced' variable.

[ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
[ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0
[ 1120.818931] Oops: 0000 [#1] SMP
[ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid
[ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
[ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015
[ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000
[ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>]  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.990483] RSP: 0018:ffff991ee625f950  EFLAGS: 00010202
[ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000
[ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3
[ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000
[ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220
[ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30
[ 1121.032666] FS:  00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000
[ 1121.041000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0
[ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1121.068938] Call Trace:
[ 1121.071494]  [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu]
[ 1121.077886]  [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu]
[ 1121.085296]  [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu]
[ 1121.092534]  [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu]
[ 1121.099675]  [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu]
[ 1121.106888]  [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm]
[ 1121.113419]  [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140
[ 1121.120183]  [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu]
[ 1121.126919]  [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0
[ 1121.132622]  [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160
[ 1121.138607]  [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0
[ 1121.144766]  [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0
[ 1121.150507]  [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50
[ 1121.156422]  [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0
[ 1121.162213]  [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20
[ 1121.167771]  [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0
[ 1121.173590]  [<ffffffffa4eb4c94>] driver_register+0x64/0xf0
[ 1121.179345]  [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0
[ 1121.185593]  [<ffffffffc099f000>] ? 0xffffffffc099efff
[ 1121.190914]  [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu]
[ 1121.197101]  [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240
[ 1121.202901]  [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0
[ 1121.208598]  [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100
[ 1121.214894]  [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140
[ 1121.220698]  [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a
[ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7
[ 1121.247422] RIP  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1121.254432]  RSP <ffff991ee625f950>
[ 1121.258017] CR2: 000000000000000a
[ 1121.261427] ---[ end trace e98b35387ede75bd ]---

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Fixes: c5fb912653dae3f878 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)")
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: enable renoir while device probes
Huang Rui [Mon, 2 Sep 2019 14:35:37 +0000 (22:35 +0800)]
drm/amdkfd: enable renoir while device probes

This patch is to add asic flag to enable device probe during kfd init.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: disable gfxoff while use no H/W scheduling policy
Huang Rui [Mon, 2 Sep 2019 15:34:32 +0000 (23:34 +0800)]
drm/amdgpu: disable gfxoff while use no H/W scheduling policy

While gfxoff is enabled, the mmVM_XXX registers will be 0xfffffff while the GFX
is in "off" state. KFD queue creattion doesn't use ring based method, so it will
trigger a VM fault.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: add renoir kfd topology
Huang Rui [Mon, 2 Sep 2019 15:25:29 +0000 (23:25 +0800)]
drm/amdkfd: add renoir kfd topology

This patch adds renoir kfd topology which is the same with Raven.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: add package manager for renoir
Huang Rui [Mon, 2 Sep 2019 15:24:29 +0000 (23:24 +0800)]
drm/amdkfd: add package manager for renoir

Renoir use GFX v9, so adds v9 package manager.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: init kernel queue for renoir
Huang Rui [Mon, 2 Sep 2019 15:19:38 +0000 (23:19 +0800)]
drm/amdkfd: init kernel queue for renoir

Renoir is GFX v9, so init v9 kernel queue.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: init kfd apertures v9 for renoir
Huang Rui [Mon, 2 Sep 2019 15:15:46 +0000 (23:15 +0800)]
drm/amdkfd: init kfd apertures v9 for renoir

Renoir is GMC v9, so init v9 kfd apertures.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: add renoir type for the workaround of iommu v2 (v2)
Huang Rui [Mon, 2 Sep 2019 15:13:26 +0000 (23:13 +0800)]
drm/amdkfd: add renoir type for the workaround of iommu v2 (v2)

Renoir is the same with Raven, will enable iommu event in future.

v2: fix the checking (Thong)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: enable kfd device queue manager v9 for renoir
Huang Rui [Mon, 2 Sep 2019 15:10:52 +0000 (23:10 +0800)]
drm/amdkfd: enable kfd device queue manager v9 for renoir

Renoir is GFX9, so enable v9 devcie queue manager.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: add renoir kfd device info (v2)
Huang Rui [Mon, 2 Sep 2019 15:06:58 +0000 (23:06 +0800)]
drm/amdkfd: add renoir kfd device info (v2)

This patch inits renoir kfd device info, so we treat renoir as "dgpu"
(bypass iommu v2). Will enable needs_iommu_device till renoir iommu is ready.

v2: rebase and align the drm-next

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: add renoir cache info for CRAT (v2)
Huang Rui [Mon, 2 Sep 2019 14:59:01 +0000 (22:59 +0800)]
drm/amdkfd: add renoir cache info for CRAT (v2)

Renoir's cache info should be the same with raven and carrizo's.

v2: fix missed "break"

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdkfd: Support Navi14 in KFD
Yong Zhao [Tue, 13 Aug 2019 21:13:27 +0000 (17:13 -0400)]
drm/amdkfd: Support Navi14 in KFD

Initial support of Navi14 in KFD. The device IDs will be added later.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 years agodrm/amdgpu: Disable retry faults in VMID0
Felix Kuehling [Wed, 4 Sep 2019 23:26:16 +0000 (19:26 -0400)]
drm/amdgpu: Disable retry faults in VMID0

There is no point retrying page faults in VMID0. Those faults are
always fatal.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-and-Tested-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>