YiPeng Chai [Fri, 22 Mar 2024 04:06:01 +0000 (12:06 +0800)]
drm/amdgpu: add condition check for amdgpu_umc_fill_error_record
Add condition check for amdgpu_umc_fill_error_record.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Mon, 22 Apr 2024 09:38:54 +0000 (17:38 +0800)]
drm/amdgpu: Add delay work to retire bad pages
Add delay work to retire bad pages.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Mon, 18 Mar 2024 03:48:07 +0000 (11:48 +0800)]
drm/amdgpu: umc v12_0 logs ecc errors
1. umc v12_0 logs ecc errors.
2. Reserve newly detected ecc error pages.
3. Add tag for bad pages, so that they can
be retired later.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Mon, 11 Mar 2024 09:29:26 +0000 (17:29 +0800)]
drm/amdgpu: umc v12_0 converts error address
Umc v12_0 converts error address.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Tue, 19 Mar 2024 02:14:16 +0000 (10:14 +0800)]
drm/amdgpu: add interface to update umc v12_0 ecc status
Add interface to update umc v12_0 ecc status.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Mon, 11 Mar 2024 09:02:40 +0000 (17:02 +0800)]
drm/amdgpu: add poison creation handler
Add poison creation handler.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Mon, 22 Apr 2024 09:37:36 +0000 (17:37 +0800)]
drm/amdgpu: prepare for logging ecc errors
Prepare for logging ecc errors.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Tue, 19 Mar 2024 01:57:58 +0000 (09:57 +0800)]
drm/amdgpu: add message fifo to handle RAS poison events
Add message fifo to handle RAS poison events.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse Zhang [Wed, 24 Apr 2024 09:10:46 +0000 (17:10 +0800)]
drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc
Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001.
V2: To really improve the handling we would actually
need to have a separate value of 0xffffffff.(Christian)
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 9 Apr 2024 21:14:49 +0000 (15:14 -0600)]
drm/amd/display: Add TMDS DC balancer control
Add TMDS balancer control to the list of available encoder registers for
DCN 30.
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Srinivasan Shanmugam [Tue, 23 Apr 2024 03:14:40 +0000 (08:44 +0530)]
drm/amd/display: Remove unnecessary NULL check in dcn20_set_input_transfer_func
This commit removes an unnecessary NULL check in the
`dcn20_set_input_transfer_func` function in the `dcn20_hwseq.c` file.
The variable `tf` is assigned the address of
`plane_state->in_transfer_func` unconditionally, so it can never be
`NULL`. Therefore, the check `if (tf == NULL)` is unnecessary and has
been removed.
The plane_state->in_transfer_func itself cannot be NULL because it's a
structure, not a pointer. When we do tf =
&plane_state->in_transfer_func;, we're getting the address of that
structure, which will always be valid as long as plane_state itself is
not NULL.
we've already checked if plane_state is NULL with the line if (dpp_base
== NULL || plane_state == NULL) return false;. So, if the code execution
gets to the point where tf = &plane_state->in_transfer_func; is called,
plane_state is guaranteed to be not NULL, and therefore tf will also not
be NULL.
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c
1094 bool dcn20_set_input_transfer_func(struct dc *dc,
1095 struct pipe_ctx *pipe_ctx,
1096 const struct dc_plane_state *plane_state)
1097 {
1098 struct dce_hwseq *hws = dc->hwseq;
1099 struct dpp *dpp_base = pipe_ctx->plane_res.dpp;
1100 const struct dc_transfer_func *tf = NULL;
^^^^^^^^^ This assignment is not necessary now.
1101 bool result = true;
1102 bool use_degamma_ram = false;
1103
1104 if (dpp_base == NULL || plane_state == NULL)
1105 return false;
1106
1107 hws->funcs.set_shaper_3dlut(pipe_ctx, plane_state);
1108 hws->funcs.set_blend_lut(pipe_ctx, plane_state);
1109
1110 tf = &plane_state->in_transfer_func;
^^^^^
Before there was an if statement but now tf is assigned unconditionally
1111
--> 1112 if (tf == NULL) {
^^^^^^^^^^^^^^^^^
so these conditions are impossible.
1113 dpp_base->funcs->dpp_set_degamma(dpp_base,
1114 IPP_DEGAMMA_MODE_BYPASS);
1115 return true;
1116 }
1117
1118 if (tf->type == TF_TYPE_HWPWL || tf->type == TF_TYPE_DISTRIBUTED_POINTS)
1119 use_degamma_ram = true;
1120
1121 if (use_degamma_ram == true) {
1122 if (tf->type == TF_TYPE_HWPWL)
1123 dpp_base->funcs->dpp_program_degamma_pwl(dpp_base,
Fixes the below Smatch static checker warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1112 dcn20_set_input_transfer_func() warn: address of 'plane_state->in_transfer_func' is non-NULL
Fixes:
285a7054bf81 ("drm/amd/display: Remove plane and stream pointers from dc scratch")
Cc: Wenjing Liu <wenjing.liu@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 19 Apr 2024 03:30:46 +0000 (23:30 -0400)]
drm/amdgpu/mes11: Use a separate fence per transaction
We can't use a shared fence location because each transaction
should be considered independently.
Reviewed-by: Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 9 Apr 2024 21:11:20 +0000 (15:11 -0600)]
drm/amd/display: Add missing dwb registers
DCN3.0 supports some specific DWB debug registers that are not exposed
yet. This commit just adds the missing registers.
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Melissa Wen [Fri, 12 Apr 2024 16:39:09 +0000 (13:39 -0300)]
drm/amd/display: use mpcc_count to log MPC state
According to [1]:
```
DTN only logs 'pipe_count' instances of MPCC. However in some cases
there are different number of MPCC than DPP (pipe_count).
```
As DTN log still relies on pipe_count to print mpcc state, switch to
mpcc_count in all occurrences.
[1] https://lore.kernel.org/amd-gfx/
20240328195047.
2843715-39-Roman.Li@amd.com/
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 19 Apr 2024 22:03:32 +0000 (18:03 -0400)]
drm/amdgpu: add a spinlock to wb allocation
As we use wb slots more dynamically, we need to lock
access to avoid racing on allocation or free.
Reviewed-by: Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sonny Jiang [Tue, 23 Apr 2024 16:52:25 +0000 (12:52 -0400)]
drm/amdgpu: update fw_share for VCN5
kmd_fw_shared changed in VCN5
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
David Tadokoro [Thu, 22 Feb 2024 14:19:00 +0000 (11:19 -0300)]
drm/amd/display: Remove duplicated function signature from dcn3.01 DCCG
In the header file dc/dcn301/dcn301_dccg.h, the function dccg301_create
is declared twice, so remove duplication.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: David Tadokoro <davidbtadokoro@usp.br>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Tue, 23 Apr 2024 18:40:37 +0000 (14:40 -0400)]
drm/amdgpu: Fix VRAM memory accounting
Subtract the VRAM pinned memory when checking for available memory
in amdgpu_amdkfd_reserve_mem_limit function since that memory is not
available for use.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sathishkumar S [Thu, 11 Apr 2024 19:43:06 +0000 (01:13 +0530)]
drm/amdgpu: update jpeg max decode resolution
jpeg ip version v2.1 and higher supports 16kx16k resolution decode
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jose Fernandez [Mon, 22 Apr 2024 14:35:44 +0000 (08:35 -0600)]
drm/amd/display: Fix division by zero in setup_dsc_config
When slice_height is 0, the division by slice_height in the calculation
of the number of slices will cause a division by zero driver crash. This
leaves the kernel in a state that requires a reboot. This patch adds a
check to avoid the division by zero.
The stack trace below is for the 6.8.4 Kernel. I reproduced the issue on
a Z16 Gen 2 Lenovo Thinkpad with a Apple Studio Display monitor
connected via Thunderbolt. The amdgpu driver crashed with this exception
when I rebooted the system with the monitor connected.
kernel: ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
kernel: ? do_trap (arch/x86/kernel/traps.c:113 arch/x86/kernel/traps.c:154)
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: ? do_error_trap (./arch/x86/include/asm/traps.h:58 arch/x86/kernel/traps.c:175)
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: ? exc_divide_error (arch/x86/kernel/traps.c:194 (discriminator 2))
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: ? asm_exc_divide_error (./arch/x86/include/asm/idtentry.h:548)
kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu
kernel: dc_dsc_compute_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1109) amdgpu
After applying this patch, the driver no longer crashes when the monitor
is connected and the system is rebooted. I believe this is the same
issue reported for 3113.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jose Fernandez <josef@netflix.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3113
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 9 Apr 2024 20:35:19 +0000 (14:35 -0600)]
drm/amd/display: Add missing debug registers for DCN2/3/3.1
This commit add some missing debug registers for DPCS and RDPC debug.
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 16 Apr 2024 11:49:41 +0000 (17:19 +0530)]
drm/amdgpu: add ip dump for each ip in devcoredump
Add ip dump for each ip of the asic in the
devcoredump for all the ips where a callback
is registered for register dump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 16 Apr 2024 11:26:23 +0000 (16:56 +0530)]
drm/amdgpu: dump ip state before reset for each ip
Invoke the dump_ip_state function for each ip before
the asic resets and save the register values for
debugging via devcoredump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 16 Apr 2024 11:18:01 +0000 (16:48 +0530)]
drm/amdgpu: add support for gfx v10 print
Add support to print ip information to be
used to print registers in devcoredump
buffer.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 16 Apr 2024 11:00:50 +0000 (16:30 +0530)]
drm/amdgpu: add protype for print ip state
Add the protoype for print ip state to be used
to print the registers in devcoredump during
a gpu reset.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 16 Apr 2024 10:44:12 +0000 (16:14 +0530)]
drm/amdgpu: add support of gfx10 register dump
Adding gfx10 gc registers to be used for register
dump via devcoredump during a gpu reset.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Mon, 1 Apr 2024 10:28:38 +0000 (15:58 +0530)]
drm/amdgpu: add prototype for ip dump
Add the prototype to dump ip registers
for all ips of different asics and set
them to NULL for now. Based on the
requirement add a function pointer for
each of them.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
YiPeng Chai [Fri, 29 Mar 2024 06:42:59 +0000 (14:42 +0800)]
drm/amdgpu: Add interface to reserve bad page
Add interface to reserve bad page.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ma Jun [Mon, 22 Apr 2024 06:47:52 +0000 (14:47 +0800)]
drm/amdgpu: Fix uninitialized variable warnings
return 0 to avoid returning an uninitialized variable r
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jack Xiao [Mon, 22 Apr 2024 08:22:54 +0000 (16:22 +0800)]
drm/amdgpu/mes: fix use-after-free issue
Delete fence fallback timer to fix the ramdom
use-after-free issue.
v2: move to amdgpu_mes.c
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 15 Apr 2024 01:20:56 +0000 (21:20 -0400)]
drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3
This avoids a potential conflict with firmwares with the newer
HDP flush mechanism.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rajneesh Bhardwaj [Mon, 22 Apr 2024 01:06:26 +0000 (21:06 -0400)]
drm/amdgpu: Update CGCG settings for GFXIP 9.4.3
Tune coarse grain clock gating idle threshold and rlc idle timeout to
achieve better kernel launch latency.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Mon, 22 Apr 2024 13:43:55 +0000 (07:43 -0600)]
Revert "drm/amd/display: Add fallback configuration when set DRR"
This reverts commit
d76c0a23b557c6ebb3fac32548100d76a1e0ce23.
This change must be reverted since it caused soft hangs when changing
the refresh rate to 122 & 144Hz when using a 7000 series GPU.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Mark Broadworth <Mark.Broadworth@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Srinivasan Shanmugam [Fri, 19 Apr 2024 09:18:03 +0000 (14:48 +0530)]
drm/amdgpu: Fix snprintf buffer size in smu_v14_0_init_microcode
This commit addresses buffer overflow in the smu_v14_0_init_microcode
function. The issue was about the snprintf function writing more bytes
into the fw_name buffer than it can hold.
The line of code is:
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix);
Here, snprintf is used to write a formatted string into fw_name. The
format is "amdgpu/%s.bin", where %s is a placeholder for the string
ucode_prefix. The sizeof(fw_name) argument tells snprintf the maximum
number of bytes it can write into fw_name, including the
null-terminating character. In the original code, fw_name is an array of
30 characters.
The string "amdgpu/%s.bin" could be up to 41 bytes long, which exceeds
the 30 bytes allocated for fw_name. This is because %s could be replaced
by ucode_prefix, which can be up to 29 characters long. Adding the 12
characters from "amdgpu/" and ".bin", the total length could be 41
characters.
To address this, the size of ucode_prefix has been reduced to 15
characters. This ensures that the maximum length of the string written into
fw_name does not exceed its capacity.
smu_13/14 etc. don't follow legacy scheme ie., amdgpu_ucode_legacy_naming
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c: In function ‘smu_v14_0_init_microcode’:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:80:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=]
80 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix);
| ^~ ~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:80:9: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 30
80 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes:
fe6cd9152464 ("drm/amd/swsmu: add smu14 ip support")
Cc: Li Ma <li.ma@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Frank Min [Wed, 10 Apr 2024 13:13:25 +0000 (21:13 +0800)]
drm/amdgpu: replace tmz flag into buffer flag
Replace tmz flag into buffer flag to make it easier to understand
and extend
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Frank Min <Frank.Min@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Le Ma [Wed, 17 Apr 2024 09:57:52 +0000 (17:57 +0800)]
drm/amdgpu: init microcode chip name from ip versions
To adapt to different gc versions in gfx_v9_4_3.c file.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Mon, 25 Mar 2024 07:33:34 +0000 (15:33 +0800)]
drm/amdgpu: Fix the ring buffer size for queue VM flush
Here are the corrections needed for the queue ring buffer size
calculation for the following cases:
- Remove the KIQ VM flush ring usage.
- Add the invalidate TLBs packet for gfx10 and gfx11 queue.
- There's no VM flush and PFP sync, so remove the gfx9 real
ring and compute ring buffer usage.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 18 Apr 2024 19:13:58 +0000 (15:13 -0400)]
drm/amdkfd: Add VRAM accounting for SVM migration
Do VRAM accounting when doing migrations to vram to make sure
there is enough available VRAM and migrating to VRAM doesn't evict
other possible non-unified memory BOs. If migrating to VRAM fails,
driver can fall back to using system memory seamlessly.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Fri, 12 Apr 2024 07:41:14 +0000 (13:11 +0530)]
drm/amd/pm: Restore config space after reset
During mode-2 reset, pci config space registers are affected at device
side. However, certain platforms have switches which assign virtual BAR
addresses and returns the same even after device is reset. This
affects pci_restore_state() as it doesn't issue another config write, if
the value read is same as the saved value.
Add a workaround to write saved config space values from driver side.
Presently, these switches are in platforms with SMU v13.0.6 SOCs, hence
restrict the workaround only to those.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lang Yu [Fri, 19 Apr 2024 07:40:08 +0000 (15:40 +0800)]
drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend
umsch test needs full GPU functionality(e.g., VM update, TLB flush,
possibly buffer moving under memory pressure) which may be not ready
under these states. Just skip it to avoid potential issues.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Fri, 19 Apr 2024 17:25:58 +0000 (13:25 -0400)]
drm/amdkfd: Fix rescheduling of restore worker
Handle the case that the restore worker was already scheduled by another
eviction while the restore was in progress.
Fixes:
9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs")
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Yunxiang Li <Yunxiang.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Thu, 18 Apr 2024 17:56:42 +0000 (13:56 -0400)]
drm/amdgpu: Update BO eviction priorities
Make SVM BOs more likely to get evicted than other BOs. These BOs
opportunistically use available VRAM, but can fall back relatively
seamlessly to system memory. It also avoids SVM migrations evicting
other, more important BOs as they will evict other SVM allocations
first.
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Mukul Joshi <mukul.joshi@amd.com>
Tested-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jiapeng Chong [Fri, 19 Apr 2024 02:18:47 +0000 (10:18 +0800)]
drm/amd/display: Remove duplicate dcn32/dcn32_clk_mgr.h header
./drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c: dcn32/dcn32_clk_mgr.h is included more than once.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8789
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pierre-Eric Pelloux-Prayer [Thu, 18 Apr 2024 18:06:08 +0000 (20:06 +0200)]
drm/amdgpu/vcn: fix unitialized variable warnings
Avoid returning an uninitialized value if we never enter the loop.
This case should never be hit in practice, but returning 0 doesn't
hurt.
The same fix is applied to the 4 places using the same pattern.
v2: - fixed typos in commit message (Alex)
- use "return 0;" before the done label instead of initializing
r to 0
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Sat, 30 Mar 2024 13:46:49 +0000 (09:46 -0400)]
drm/amdgpu/mes11: print MES opcodes rather than numbers
Makes it easier to review the logs when there are MES
errors.
v2: use dbg for emitted, add helpers for fetching strings
v3: fix missing commas (Harish)
v4: drop command prefixes (Felix)
v5: squash in bounds fix (Jun)
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Peyton Lee [Fri, 19 Apr 2024 06:07:39 +0000 (14:07 +0800)]
drm/amdgpu/vpe: fix vpe dpm setup failed
The vpe dpm settings should be done before firmware is loaded.
Otherwise, the frequency cannot be successfully raised.
Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Wed, 10 Apr 2024 14:00:46 +0000 (19:30 +0530)]
drm/amdgpu: Assign correct bits for SDMA HDP flush
HDP Flush request bit can be kept unique per AID, and doesn't need to be
unique SOC-wide. Assign only bits 10-13 for SDMA v4.4.2.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Li Ma [Wed, 17 Apr 2024 12:42:44 +0000 (20:42 +0800)]
drm/amd/swsmu: add if condition for smu v14.0.1
smu v14.0.1 re-used smu v14.0.0
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ma Jun [Tue, 16 Apr 2024 07:35:25 +0000 (15:35 +0800)]
drm/amdgpu/pm: Print od status info
Print the od status info if it's not supported.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stanley.Yang [Fri, 12 Apr 2024 02:36:25 +0000 (10:36 +0800)]
drm/amdgpu: Support setting reset_method at runtime
In order to support more test cases, support user
change reset_method at runtime.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ma Jun [Tue, 16 Apr 2024 09:30:12 +0000 (17:30 +0800)]
drm/amdgpu/pm: Remove gpu_od if it's an empty directory
gpu_od should be removed if it's an empty directory
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reported-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 19 Apr 2024 13:57:56 +0000 (09:57 -0400)]
drm/amdkfd: demote unsupported device messages to dev_info
It's not really an error since the devices don't support
the necessary hardware functionality.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3331
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Mon, 22 Apr 2024 04:35:22 +0000 (14:35 +1000)]
Backmerge tag 'v6.9-rc5' into drm-next
Linux 6.9-rc5
I've had a persistent msm failure on clang, and the fix is in fixes
so just pull it back to fix that.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 22 Apr 2024 02:29:17 +0000 (12:29 +1000)]
Merge tag 'drm-misc-next-2024-04-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.10-rc1:
UAPI Changes:
- Add SIZE_HINTS property for cursor planes.
Cross-subsystem Changes:
Core Changes:
- Document the requirements and expectations of adding new
driver-specific properties.
- Assorted small fixes to ttm.
- More Kconfig fixes.
- Add struct drm_edid_product_id and helpers.
- Use drm device based logging in more drm functions.
- Fixes for drm-panic, and option to test it.
- Assorted small fixes and updates to edid.
- Add drm_crtc_vblank_crtc and use it in vkms, nouveau.
Driver Changes:
- Assorted small fixes and improvements to bridge/imx8mp-hdmi-tx, nouveau, ast, qaic, lima, vc4, bridge/anx7625, mipi-dsi.
- Add drm panic to simpledrm, mgag200, imx, ast.
- Use dev_err_probe in bridge/panel drivers.
- Add Innolux G121X1-L03, LG sw43408 panels.
- Use struct drm_edid in i915 bios parsing.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2dc1b7c6-1743-4ddd-ad42-36f700234fbe@linux.intel.com
Dave Airlie [Mon, 22 Apr 2024 00:51:39 +0000 (10:51 +1000)]
Merge tag 'amd-drm-next-6.10-2024-04-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.10-2024-04-19:
amdgpu:
- DC resource allocation logic updates
- DC IPS fixes
- DC YUV fixes
- DMCUB fixes
- DML2 fixes
- Devcoredump updates
- USB-C DSC fix
- Misc display code cleanups
- PSR fixes
- MES timeout fix
- RAS updates
- UAF fix in VA IOCTL
- Fix visible VRAM handling during faults
- Fix IP discovery handling during PCI rescans
- Misc code cleanups
- PSP 14 updates
- More runtime PM code rework
- SMU 14.0.2 support
- GPUVM page fault redirection to secondary IH rings for IH 6.x
- Suspend/resume fixes
- SR-IOV fixes
amdkfd:
- Fix eviction fence handling
- Fix leak in GPU memory allocation failure case
- DMABuf import handling fix
radeon:
- Silence UBSAN warnings related to flexible arrays
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419224332.2938259-1-alexander.deucher@amd.com
Linus Torvalds [Sun, 21 Apr 2024 19:35:54 +0000 (12:35 -0700)]
Linux 6.9-rc5
Linus Torvalds [Sun, 21 Apr 2024 17:32:58 +0000 (10:32 -0700)]
Merge tag 'char-misc-6.9-rc5' of git://git./linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are some small char/misc and other driver fixes for 6.9-rc5.
Included in here are the following:
- binder driver fix for reported problem
- speakup crash fix
- mei driver fixes for reported problems
- comdei driver fix
- interconnect driver fixes
- rtsx driver fix
- peci.h kernel doc fix
All of these have been in linux-next for over a week with no reported
problems"
* tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
peci: linux/peci.h: fix Excess kernel-doc description warning
binder: check offset alignment in binder_get_object()
comedi: vmk80xx: fix incomplete endpoint checking
mei: vsc: Unregister interrupt handler for system suspend
Revert "mei: vsc: Call wake_up() in the threaded IRQ handler"
misc: rtsx: Fix rts5264 driver status incorrect when card removed
mei: me: disable RPL-S on SPS and IGN firmwares
speakup: Avoid crash on very long word
interconnect: Don't access req_list while it's being manipulated
interconnect: qcom: x1e80100: Remove inexistent ACV_PERF BCM
Linus Torvalds [Sun, 21 Apr 2024 17:30:21 +0000 (10:30 -0700)]
Merge tag 'driver-core-6.9-rc5' of git://git./linux/kernel/git/gregkh/driver-core
Pull kernfs bugfix and documentation update from Greg KH:
"Here are two changes for 6.9-rc5 that deal with "driver core" stuff,
that do the following:
- sysfs reference leak fix
- embargoed-hardware-issues.rst update for Power
Both of these have been in linux-next for over a week with no reported
issues"
* tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation: embargoed-hardware-issues.rst: Add myself for Power
fs: sysfs: Fix reference leak in sysfs_break_active_protection()
Linus Torvalds [Sun, 21 Apr 2024 17:27:01 +0000 (10:27 -0700)]
Merge tag 'tty-6.9-rc5' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.9-rc5 that
resolve a bunch of reported problems. Included in here are:
- MAINTAINERS and .mailmap update for Richard Genoud
- serial core regression fixes from 6.9-rc1 changes
- pci id cleanups
- serial core crash fix
- stm32 driver fixes
- 8250 driver fixes
All of these have been in linux-next for a while with no reported
problems"
* tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: stm32: Reset .throttled state in .startup()
serial: stm32: Return IRQ_NONE in the ISR if no handling happend
serial: core: Fix missing shutdown and startup for serial base port
serial: core: Clearing the circular buffer before NULLifying it
MAINTAINERS: mailmap: update Richard Genoud's email address
serial/pmac_zilog: Remove flawed mitigation for rx irq flood
serial: 8250_pci: Remove redundant PCI IDs
serial: core: Fix regression when runtime PM is not enabled
serial: mxs-auart: add spinlock around changing cts state
serial: 8250_dw: Revert: Do not reclock if already at correct rate
serial: 8250_lpc18xx: disable clks on error in probe()
Linus Torvalds [Sun, 21 Apr 2024 17:23:27 +0000 (10:23 -0700)]
Merge tag 'usb-6.9-rc5' of git://git./linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are some small USB and Thunderbolt driver fixes for 6.9-rc5.
Included in here are:
- MAINTAINER file update for invalid email address
- usb-serial device id updates
- typec driver fixes
- thunderbolt / usb4 driver fixes
- usb core shutdown fixes
- cdc-wdm driver revert for reported problem in -rc1
- usb gadget driver fixes
- xhci driver fixes
All of these have been in linux-next for a while with no reported
problems"
* tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
USB: serial: option: add Telit FN920C04 rmnet compositions
usb: dwc3: ep0: Don't reset resource alloc flag
Revert "usb: cdc-wdm: close race between read and workqueue"
USB: serial: option: add Rolling RW101-GL and RW135-GL support
USB: serial: option: add Lonsung U8300/U9300 product
USB: serial: option: add support for Fibocom FM650/FG650
USB: serial: option: support Quectel EM060K sub-models
USB: serial: option: add Fibocom FM135-GL variants
usb: misc: onboard_usb_hub: Disable the USB hub clock on failure
thunderbolt: Avoid notify PM core about runtime PM resume
thunderbolt: Fix wake configurations after device unplug
usb: dwc2: host: Fix dereference issue in DDMA completion flow.
usb: typec: mux: it5205: Fix ChipID value typo
MAINTAINERS: Drop Li Yang as their email address stopped working
usb: gadget: fsl: Initialize udc before using it
usb: Disable USB3 LPM at shutdown
usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error
usb: typec: tcpm: Correct the PDO counting in pd_set
usb: gadget: functionfs: Wait for fences before enqueueing DMABUF
usb: gadget: functionfs: Fix inverted DMA fence direction
...
Linus Torvalds [Sun, 21 Apr 2024 16:39:36 +0000 (09:39 -0700)]
Merge tag 'sched_urgent_for_v6.9_rc5' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:
- Add a missing memory barrier in the concurrency ID mm switching
* tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Add missing memory barrier in switch_mm_cid
Linus Torvalds [Sun, 21 Apr 2024 16:36:12 +0000 (09:36 -0700)]
Merge tag 'x86_urgent_for_v6.9_rc5' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Fix CPU feature dependencies of GFNI, VAES, and VPCLMULQDQ
- Print the correct error code when FRED reports a bad event type
- Add a FRED-specific INT80 handler without the special dances that
need to happen in the current one
- Enable the using-the-default-return-thunk-but-you-should-not warning
only on configs which actually enable those special return thunks
- Check the proper feature flags when selecting BHI retpoline
mitigation
* tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ
x86/fred: Fix incorrect error code printout in fred_bad_type()
x86/fred: Fix INT80 emulation for FRED
x86/retpolines: Enable the default thunk warning only on relevant configs
x86/bugs: Fix BHI retpoline check
Linus Torvalds [Sat, 20 Apr 2024 18:28:02 +0000 (11:28 -0700)]
Merge tag 'block-6.9-
20240420' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"Just two minor fixes that should go into the 6.9 kernel release, one
fixing a regression with partition scanning errors, and one fixing a
WARN_ON() that can get triggered if we race with a timer"
* tag 'block-6.9-
20240420' of git://git.kernel.dk/linux:
blk-iocost: do not WARN if iocg was already offlined
block: propagate partition scanning errors to the BLKRRPART ioctl
Linus Torvalds [Sat, 20 Apr 2024 18:17:22 +0000 (11:17 -0700)]
Merge tag 'email' of git://git./linux/kernel/git/jejb/scsi
Pull email address update from James Bottomley:
"My IBM email has stopped working, so update to a working email
address"
* tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
MAINTAINERS: update to working email address
Linus Torvalds [Sat, 20 Apr 2024 18:10:51 +0000 (11:10 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"This is a bit on the large side, mostly due to two changes:
- Changes to disable some broken PMU virtualization (see below for
details under "x86 PMU")
- Clean up SVM's enter/exit assembly code so that it can be compiled
without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched
return thunk in use. This should not happen!" when running KVM
selftests.
Everything else is small bugfixes and selftest changes:
- Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure
where KVM would allow userspace to refresh the cache with a bogus
GPA. The bug has existed for quite some time, but was exposed by a
new sanity check added in 6.9 (to ensure a cache is either
GPA-based or HVA-based).
- Drop an unused param from gfn_to_pfn_cache_invalidate_start() that
got left behind during a 6.9 cleanup.
- Fix a math goof in x86's hugepage logic for
KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow
(detected by KASAN).
- Fix a bug where KVM incorrectly clears root_role.direct when
userspace sets guest CPUID.
- Fix a dirty logging bug in the where KVM fails to write-protect
SPTEs used by a nested guest, if KVM is using Page-Modification
Logging and the nested hypervisor is NOT using EPT.
x86 PMU:
- Drop support for virtualizing adaptive PEBS, as KVM's
implementation is architecturally broken without an obvious/easy
path forward, and because exposing adaptive PEBS can leak host LBRs
to the guest, i.e. can leak host kernel addresses to the guest.
- Set the enable bits for general purpose counters in
PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD
processors.
- Disable LBR virtualization on CPUs that don't support LBR
callstacks, as KVM unconditionally uses
PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and
would fail on such CPUs.
Tests:
- Fix a flaw in the max_guest_memory selftest that results in it
exhausting the supply of ucall structures when run with more than
256 vCPUs.
- Mark KVM_MEM_READONLY as supported for RISC-V in
set_memory_region_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start()
KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test
KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting
KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}()
KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status
KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update
KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks
perf/x86/intel: Expose existence of callback support to KVM
KVM: VMX: Snapshot LBR capabilities during module initialization
KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD
KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run()
KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area
KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area
KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted
KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run()
KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV
KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding
KVM: SVM: Remove a useless zeroing of allocated memory
...
Linus Torvalds [Sat, 20 Apr 2024 18:06:42 +0000 (11:06 -0700)]
Merge tag 'powerpc-6.9-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix wireguard loading failure on pre-Power10 due to Power10 crypto
routines
- Fix papr-vpd selftest failure due to missing variable initialization
- Avoid unnecessary get/put in spapr_tce_platform_iommu_attach_dev()
Thanks to Geetika Moolchandani, Jason Gunthorpe, Michal Suchánek, Nathan
Lynch, and Shivaprasad G Bhat.
* tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc/papr-vpd: Fix missing variable initialization
powerpc/crypto/chacha-p10: Fix failure on non Power10
powerpc/iommu: Refactor spapr_tce_platform_iommu_attach_dev()
Linus Torvalds [Sat, 20 Apr 2024 17:36:02 +0000 (10:36 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A couple clk driver fixes, a build fix, and a deadlock fix:
- Mediatek mt7988 has broken PCIe because the wrong parent is used
- Mediatek clk drivers may deadlock when registering their clks
because the clk provider device is repeatedly runtime PM resumed
and suspended during probe and clk registration.
Resuming the clk provider device deadlocks with an ABBA deadlock
due to genpd_lock and the clk prepare_lock. The fix is to keep the
device runtime resumed while registering clks.
- Another runtime PM related deadlock, this time with disabling
unused clks during late init.
We get an ABBA deadlock where a device is runtime PM resuming (or
suspending) while the disabling of unused clks is happening in
parallel. That runtime PM action calls into the clk framework and
tries to grab the clk prepare_lock while the disabling of unused
clks holds the prepare_lock and is waiting for that runtime PM
action to complete.
The fix is to runtime resume all the clk provider devices before
grabbing the clk prepare_lock during disable unused.
- A build fix to provide an empty devm_clk_rate_exclusive_get()
function when CONFIG_COMMON_CLK=n"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe port
clk: mediatek: Do a runtime PM get on controllers during probe
clk: Get runtime PM before walking tree for clk_summary
clk: Get runtime PM before walking tree during disable_unused
clk: Initialize struct clk_core kref earlier
clk: Don't hold prepare_lock when calling kref_put()
clk: Remove prepare_lock hold assertion in __clk_release()
clk: Provide !COMMON_CLK dummy for devm_clk_rate_exclusive_get()
James Bottomley [Sat, 20 Apr 2024 12:34:09 +0000 (08:34 -0400)]
MAINTAINERS: update to working email address
jejb@linux.ibm.com no longer works.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Linus Torvalds [Fri, 19 Apr 2024 23:34:10 +0000 (16:34 -0700)]
Merge tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git./linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"A random set of small bug fixes:
- Fix perf annotate TUI when used with data type profiling
- Work around BPF verifier about sighand lock checking
And a set of kernel header synchronization"
* tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools/include: Sync arm64 asm/cputype.h with the kernel sources
tools/include: Sync asm-generic/bitops/fls.h with the kernel sources
tools/include: Sync x86 asm/msr-index.h with the kernel sources
tools/include: Sync x86 asm/irq_vectors.h with the kernel sources
tools/include: Sync x86 CPU feature headers with the kernel sources
tools/include: Sync uapi/sound/asound.h with the kernel sources
tools/include: Sync uapi/linux/kvm.h and asm/kvm.h with the kernel sources
tools/include: Sync uapi/linux/fs.h with the kernel sources
tools/include: Sync uapi/drm/i915_drm.h with the kernel sources
perf lock contention: Add a missing NULL check
perf annotate: Make sure to call symbol__annotate2() in TUI
Linus Torvalds [Fri, 19 Apr 2024 21:10:11 +0000 (14:10 -0700)]
Merge tag 'hardening-v6.9-rc5' of git://git./linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- Correctly disable UBSAN configs in configs/hardening (Nathan
Chancellor)
- Add missing signed integer overflow trap types to arm64 handler
* tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
ubsan: Add awareness of signed integer overflow traps
configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP
configs/hardening: Fix disabling UBSAN configurations
Linus Torvalds [Fri, 19 Apr 2024 21:02:21 +0000 (14:02 -0700)]
Merge tag 'for-linus-iommufd' of git://git./linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
"Two fixes for the selftests:
- CONFIG_IOMMUFD_TEST needs CONFIG_IOMMUFD_DRIVER to work
- The kconfig fragment sshould include fault injection so the fault
injection test can work"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd: Add config needed for iommufd_fail_nth
iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftest
Linus Torvalds [Fri, 19 Apr 2024 20:46:44 +0000 (13:46 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
- Add a missing mutex_destroy() in rxe
- Enhance the debugging print for cm_destroy failures to help debug
these
- Fix mlx5 MAD processing in cases where multiport devices are running
in switchedev mode
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Fix port number for counter query in multi-port configuration
RDMA/cm: Print the old state when cm_destroy_id gets timeout
RDMA/rxe: Fix the problem "mutex_destroy missing"
Linus Torvalds [Fri, 19 Apr 2024 20:36:28 +0000 (13:36 -0700)]
Merge tag '9p-fixes-for-6.9-rc5' of git://git./linux/kernel/git/ericvh/v9fs
Pull fs/9p fixes from Eric Van Hensbergen:
"This contains a reversion of one of the original 6.9 patches which
seems to have been the cause of most of the instability. It also
incorporates several fixes to legacy support and cache fixes.
There are few additional changes to improve stability, but I want
another week of testing before sending them upstream"
* tag '9p-fixes-for-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
fs/9p: drop inodes immediately on non-.L too
fs/9p: Revert "fs/9p: fix dups even in uncached mode"
fs/9p: remove erroneous nlink init from legacy stat2inode
9p: explicitly deny setlease attempts
fs/9p: fix the cache always being enabled on files with qid flags
fs/9p: translate O_TRUNC into OTRUNC
fs/9p: only translate RWX permissions for plain 9P2000
Linus Torvalds [Fri, 19 Apr 2024 20:16:10 +0000 (13:16 -0700)]
Merge tag 'fuse-fixes-6.9-rc5' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
- Fix two bugs in the new passthrough mode
- Fix a statx bug introduced in v6.6
- Fix code documentation
* tag 'fuse-fixes-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
cuse: add kernel-doc comments to cuse_process_init_reply()
fuse: fix leaked ENOSYS error on first statx call
fuse: fix parallel dio write on file open in passthrough mode
fuse: fix wrong ff->iomode state changes from parallel dio write
Linus Torvalds [Fri, 19 Apr 2024 20:04:21 +0000 (13:04 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix a kernel fault during page table walking in huge_pte_alloc() with
PTABLE_LEVELS=5 due to using p4d_offset() instead of p4d_alloc()
- head.S fix and cleanup to disable the MMU before toggling the
HCR_EL2.E2H bit when entering the kernel with the MMU on from the EFI
stub. Changing this bit (currently from VHE to nVHE) causes some
system registers as well as page table descriptors to be interpreted
differently, potentially resulting in spurious MMU faults
- Fix translation fault in swsusp_save() accessing MEMBLOCK_NOMAP
memory ranges due to kernel_page_present() returning true in most
configurations other than rodata_full == true,
CONFIG_DEBUG_PAGEALLOC=y or CONFIG_KFENCE=y
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hibernate: Fix level3 translation fault in swsusp_save()
arm64/head: Disable MMU at EL2 before clearing HCR_EL2.E2H
arm64/head: Drop unnecessary pre-disable-MMU workaround
arm64/hugetlb: Fix page table walk in huge_pte_alloc()
Linus Torvalds [Fri, 19 Apr 2024 16:59:15 +0000 (09:59 -0700)]
Merge tag 's390-6.9-4' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Fix NULL pointer dereference in program check handler
- Fake IRBs are important events relevant for problem analysis. Add
traces when queueing and delivering
- Fix a race condition in ccw_device_set_online() that can cause the
online process to fail
- Deferred condition code 1 response indicates that I/O was not started
and should be retried. The current QDIO implementation handles a cc1
response as an error, resulting in a failed QDIO setup. Fix that by
retrying the setup when a cc1 response is received
* tag 's390-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: Fix NULL pointer dereference
s390/cio: log fake IRB events
s390/cio: fix race condition during online processing
s390/qdio: handle deferred cc1
Linus Torvalds [Fri, 19 Apr 2024 16:52:09 +0000 (09:52 -0700)]
Merge tag 'bootconfig-fixes-v6.9-rc4' of git://git./linux/kernel/git/trace/linux-trace
Pull bootconfig fixes from Masami Hiramatsu:
- Fix potential static_command_line buffer overrun.
Currently we allocate the memory for static_command_line based on
"boot_command_line", but it will copy "command_line" into it. So we
use the length of "command_line" instead of "boot_command_line" (as
we previously did)
- Use memblock_free_late() in xbc_exit() instead of memblock_free()
after the buddy system is initialized
- Fix a kerneldoc warning
* tag 'bootconfig-fixes-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Fix the kerneldoc of _xbc_exit()
bootconfig: use memblock_free_late to free xbc memory to buddy
init/main.c: Fix potential static_command_line memory overflow
Linus Torvalds [Fri, 19 Apr 2024 16:41:57 +0000 (09:41 -0700)]
Merge tag 'thermal-6.9-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"This prevents the thermal debug code from attempting to divide by zero
and corrects trip point statistics (Rafael Wysocki)"
* tag 'thermal-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()
Linus Torvalds [Fri, 19 Apr 2024 16:29:51 +0000 (09:29 -0700)]
Merge tag 'sound-6.9-rc5' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Things look calm and normal, we got handful HD-audio-related small
fixes and a fix for MIDI 2.0 UMP handling"
* tag 'sound-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: seq: ump: Fix conversion from MIDI2 to MIDI1 UMP messages
ALSA: hda/realtek - Enable audio jacks of Haier Boyue G42 with ALC269VC
ALSA: hda/realtek: Add quirks for Huawei Matebook D14 NBLB-WAX9N
ALSA: hda/realtek: Fix volumn control of ThinkBook 16P Gen4
ALSA: hda/realtek: Fixes for Asus GU605M and GA403U sound
ALSA: hda/tas2781: Add new vendor_id and subsystem_id to support ThinkPad ICE-1
ALSA: hda/tas2781: correct the register for pow calibrated data
ALSA: hda/realtek: Add quirk for HP SnowWhite laptops
Linus Torvalds [Fri, 19 Apr 2024 16:21:25 +0000 (09:21 -0700)]
Merge tag 'drm-fixes-2024-04-19' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Regular week of fixes, seems to be about right for this time in the
release cycle, amdgpu, and nouveau are the main one with some
scattered fixes otherwise.
ttm:
- Stop pooling cached NUMA pages
amdgpu:
- Fix invalid resource->start check
- USB-C DSC fix
- Fix a potential UAF in VA IOCTL
- Fix visible VRAM handling during faults
amdkfd:
- Fix memory leak in create_process failure
radeon:
- Silence UBSAN warnings from variable sized arrays
nouveau:
- dp: Don't probe DP ports twice
- nv04: Fix OOB access
- nv50: Disable AUX bus for disconnected DP ports
- nvkm: Fix instmem race condition
panel:
- Don't unregister DSI devices in several drivers
v3d:
- Fix enabled_ns increment
xe:
- Fix bo leak on error path during fb init
- Fix use-after-free due to order vm is put and destroyed"
* tag 'drm-fixes-2024-04-19' of https://gitlab.freedesktop.org/drm/kernel:
drm/radeon: silence UBSAN warning (v3)
drm/radeon: make -fstrict-flex-arrays=3 happy
drm/amdgpu: fix visible VRAM handling during faults
drm/amdgpu: validate the parameters of bo mapping operations more clearly
Revert "drm/amd/display: fix USB-C flag update after enc10 feature init"
drm/amdkfd: Fix memory leak in create_process failure
drm/amdgpu: remove invalid resource->start check v2
drm/xe/vm: prevent UAF with asid based lookup
drm/xe: Fix bo leak in intel_fb_bo_framebuffer_init
drm/panel: novatek-nt36682e: don't unregister DSI device
drm/panel: visionox-rm69299: don't unregister DSI device
drm/nouveau/dp: Don't probe eDP ports twice harder
drm/nouveau/kms/nv50-: Disable AUX bus for disconnected DP ports
drm/v3d: Don't increment `enabled_ns` twice
drm/vmwgfx: Sort primary plane formats by order of preference
drm/vmwgfx: Fix crtc's atomic check conditional
drm/vmwgfx: Fix prime import/export
drm/ttm: stop pooling cached NUMA pages v2
drm: nv04: Fix out of bounds access
nouveau: fix instmem race condition around ptr stores
Linus Torvalds [Fri, 19 Apr 2024 16:13:35 +0000 (09:13 -0700)]
Merge tag 'mm-hotfixes-stable-2024-04-18-14-41' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"15 hotfixes. 9 are cc:stable and the remainder address post-6.8 issues
or aren't considered suitable for backporting.
There are a significant number of fixups for this cycle's page_owner
changes (series "page_owner: print stacks and their outstanding
allocations"). Apart from that, singleton changes all over, mainly in
MM"
* tag 'mm-hotfixes-stable-2024-04-18-14-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: fix OOB in nilfs_set_de_type
MAINTAINERS: update Naoya Horiguchi's email address
fork: defer linking file vma until vma is fully initialized
mm/shmem: inline shmem_is_huge() for disabled transparent hugepages
mm,page_owner: defer enablement of static branch
Squashfs: check the inode number is not the invalid value of zero
mm,swapops: update check in is_pfn_swap_entry for hwpoison entries
mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled
mm/userfaultfd: allow hugetlb change protection upon poison entry
mm,page_owner: fix printing of stack records
mm,page_owner: fix accounting of pages when migrating
mm,page_owner: fix refcount imbalance
mm,page_owner: update metadata for tail pages
userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE
mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly
Yaxiong Tian [Wed, 17 Apr 2024 02:52:48 +0000 (10:52 +0800)]
arm64: hibernate: Fix level3 translation fault in swsusp_save()
On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:
Unable to handle kernel paging request at virtual address
ffffff8000000000
Mem abort info:
ESR = 0x0000000096000007
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x07: level 3 translation fault
Data abort info:
ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=
00000000eeb0b000
[
ffffff8000000000] pgd=
180000217fff9803, p4d=
180000217fff9803, pud=
180000217fff9803, pmd=
180000217fff8803, pte=
0000000000000000
Internal error: Oops:
0000000096000007 [#1] SMP
Internal error: Oops:
0000000096000007 [#1] SMP
Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ #76
Source Version:
4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
pstate:
600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : swsusp_save+0x280/0x538
lr : swsusp_save+0x280/0x538
sp :
ffffffa034a3fa40
x29:
ffffffa034a3fa40 x28:
ffffff8000001000 x27:
0000000000000000
x26:
ffffff8001400000 x25:
ffffffc08113e248 x24:
0000000000000000
x23:
0000000000080000 x22:
ffffffc08113e280 x21:
00000000000c69f2
x20:
ffffff8000000000 x19:
ffffffc081ae2500 x18:
0000000000000000
x17:
6666662074736420 x16:
3030303030303030 x15:
3038666666666666
x14:
0000000000000b69 x13:
ffffff9f89088530 x12:
00000000ffffffea
x11:
00000000ffff7fff x10:
00000000ffff7fff x9 :
ffffffc08193f0d0
x8 :
00000000000bffe8 x7 :
c0000000ffff7fff x6 :
0000000000000001
x5 :
ffffffa0fff09dc8 x4 :
0000000000000000 x3 :
0000000000000027
x2 :
0000000000000000 x1 :
0000000000000000 x0 :
000000000000004e
Call trace:
swsusp_save+0x280/0x538
swsusp_arch_suspend+0x148/0x190
hibernation_snapshot+0x240/0x39c
hibernate+0xc4/0x378
state_store+0xf0/0x10c
kobj_attr_store+0x14/0x24
The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.
This problem was introduced by changes to the pfn_valid() logic in
commit
a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").
Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.
Fixes:
a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <stable@vger.kernel.org> # 5.14.x
Suggested-by: Mike Rapoport <rppt@kernel.org>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Co-developed-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240417025248.386622-1-tianyaxiong@kylinos.cn
[catalin.marinas@arm.com: rework commit message]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Greg Kroah-Hartman [Fri, 19 Apr 2024 14:07:18 +0000 (16:07 +0200)]
Merge tag 'usb-serial-6.9-rc5' of https://git./linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial device ids for 6.9-rc5
Here are some new modem device ids for 6.9-rc5.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.9-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Telit FN920C04 rmnet compositions
USB: serial: option: add Rolling RW101-GL and RW135-GL support
USB: serial: option: add Lonsung U8300/U9300 product
USB: serial: option: add support for Fibocom FM650/FG650
USB: serial: option: support Quectel EM060K sub-models
USB: serial: option: add Fibocom FM135-GL variants
Li Nan [Fri, 19 Apr 2024 09:32:57 +0000 (17:32 +0800)]
blk-iocost: do not WARN if iocg was already offlined
In iocg_pay_debt(), warn is triggered if 'active_list' is empty, which
is intended to confirm iocg is active when it has debt. However, warn
can be triggered during a blkcg or disk removal, if iocg_waitq_timer_fn()
is run at that time:
WARNING: CPU: 0 PID:
2344971 at block/blk-iocost.c:1402 iocg_pay_debt+0x14c/0x190
Call trace:
iocg_pay_debt+0x14c/0x190
iocg_kick_waitq+0x438/0x4c0
iocg_waitq_timer_fn+0xd8/0x130
__run_hrtimer+0x144/0x45c
__hrtimer_run_queues+0x16c/0x244
hrtimer_interrupt+0x2cc/0x7b0
The warn in this situation is meaningless. Since this iocg is being
removed, the state of the 'active_list' is irrelevant, and 'waitq_timer'
is canceled after removing 'active_list' in ioc_pd_free(), which ensures
iocg is freed after iocg_waitq_timer_fn() returns.
Therefore, add the check if iocg was already offlined to avoid warn
when removing a blkcg or disk.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20240419093257.3004211-1-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rafael J. Wysocki [Mon, 15 Apr 2024 19:02:12 +0000 (21:02 +0200)]
thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()
The count field in struct trip_stats, representing the number of times
the zone temperature was above the trip point, needs to be incremented
in thermal_debug_tz_trip_up(), for two reasons.
First, if a trip point is crossed on the way up for the first time,
thermal_debug_update_temp() called from update_temperature() does
not see it because it has not been added to trips_crossed[] array
in the thermal zone's struct tz_debugfs object yet. Therefore, when
thermal_debug_tz_trip_up() is called after that, the trip point's
count value is 0, and the attempt to divide by it during the average
temperature computation leads to a divide error which causes the kernel
to crash. Setting the count to 1 before the division by incrementing it
fixes this problem.
Second, if a trip point is crossed on the way up, but it has been
crossed on the way up already before, its count value needs to be
incremented to make a record of the fact that the zone temperature is
above the trip now. Without doing that, if the mitigations applied
after crossing the trip cause the zone temperature to drop below its
threshold, the count will not be updated for this episode at all and
the average temperature in the trip statistics record will be somewhat
higher than it should be.
Fixes:
7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Takashi Iwai [Fri, 19 Apr 2024 10:04:39 +0000 (12:04 +0200)]
ALSA: seq: ump: Fix conversion from MIDI2 to MIDI1 UMP messages
The conversion from MIDI2 to MIDI1 UMP messages had a leftover
artifact (superfluous bit shift), and this resulted in the bogus type
check, leading to empty outputs. Let's fix it.
Fixes:
e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events")
Cc: <stable@vger.kernel.org>
Link: https://github.com/alsa-project/alsa-utils/issues/262
Message-ID: <
20240419100442.14806-1-tiwai@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Ai Chao [Fri, 19 Apr 2024 08:21:59 +0000 (16:21 +0800)]
ALSA: hda/realtek - Enable audio jacks of Haier Boyue G42 with ALC269VC
The Haier Boyue G42 with ALC269VC cannot detect the MIC of headset,
the line out and internal speaker until
ALC269VC_FIXUP_ACER_VCOPPERBOX_PINS quirk applied.
Signed-off-by: Ai Chao <aichao@kylinos.cn>
Cc: <stable@vger.kernel.org>
Message-ID: <
20240419082159.476879-1-aichao@kylinos.cn>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Dave Airlie [Fri, 19 Apr 2024 06:48:51 +0000 (16:48 +1000)]
Merge tag 'drm-intel-next-2024-04-17-1' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-next
Core Changes (DRM):
- Fix documentation of DP tunnel functions (Imre)
- DP MST read sideband messaging cap (Jani)
- Preparation patches for Adaptive Sync SDP Support for DP (Mitul)
Driver Changes:
i915 core (non-display):
- Documentation improvements (Nirmoy)
- Add includes for BUG_ON/BUILD_BUG_ON in i915_memcpy.c (Joonas)
- Do not print 'pxp init failed with 0' when it succeed (Jose)
- Clean-up, including removal of dead code for unsupported platforms (Lucas)
- Adding new DG2 PCI ID (Ravi)
{i915,xe} display:
- Spelling fix (Colin Ian)
- Document CDCLK components (Gustavo)
- Lunar Lake display enabling, including cdclk and other refactors (Gustavo, Bala)
- BIOS/VBT/opregion related refactor (Jani, Ville, RK)
- Save a few bytes of memory using {kstrdup,kfree}_const variant (Christophe)
- Digital port related refactor/clean-up (Ville)
- Fix 2s boot time regression on DP panel replay init (Animesh)
- Remove redundant drm_rect_visible() overlay use (Arthur)
- DSC HW state readout fixes (Imre)
- Remove duplication on audio enable/disable on SDVO and g4x+ DP (Ville)
- Disable AuxCCS framebuffers if built for Xe (Juha-Pekka)
- Fix DSI init order (Ville)
- DRRS related refactor and fixes (Bhanuprakash)
- Fix DSB vblank waits with VRR (Ville)
- General improvements on register name and use of REG_BIT (Ville)
- Some display power well related improvements (Ville)
- FBC changes for better w/a handling (Ville)
- Make crtc disable more atomic (Ville)
- Fix hwmon locking inversion in sysfs getter (Janusz)
- Increase DP idle pattern wait timeout to 2ms (Shekhar)
- PSR related fixes and improvents (Jouni)
- Start using container_of_const() for some extra const safety (Ville)
- Use drm_printer more on display code (Ville)
- Fix Jasper Lake boot freeze (Jonathon)
- Update Pipe src size check in skl_update_scaler (Ankit)
- Enable MST mode for 128b/132b single-stream sideband (Jani)
- Pass encoder around more for port/phy checks (Jani)
- Some initial work to make display code more independent from i915 (Jani)
- Pre-populate the cursor physical dma address (Ville)
- Do not bump min backlight brightness to max on enable (Gareth)
- Fix MTL supported DP rates - removal of UHBR13.5 (Arun)
- Fix the computation for compressed_bpp for DISPLAY < 1 (Ankit)
- Bigjoiner modeset sequence redesign and MST support (Ville)
- Enable Adaptive Sync SDP Support for DP (Mitul)
- Implemnt vblank sycnhronized mbus joining changes (Ville, Stanislav)
- HDCP related fixes (Suraj)
- Fix i915_display_info debugfs when connectors are not active (Ville)
- Clean up on Xe compat layer (Jani)
- Add jitter WAs for MST/FEC/DSC links (Imre)
- DMC wakelock implementation (Luca)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmYfzQEACgkQ+mJfZA7r
# E8qYvAf/T8KrEewHOWz7NOaKcFRCNYaF4QTdVOfgHUYBX5NPDF/xzwFdHCL8QWQu
# bwKwE2b94VEyruG3DYwTMd8GNcDxrsOrmU0IZe3PVkm+BvHLTmrOqL6BlCd85zXF
# 02IuE+LCaWREmmpLMcsDMxsaaq8yp+cw9/F0jJDrH6LiyfxFriefxyZYpGYjRCuv
# 8GP1fHXLFV2yys4rveR/+y9xIhgy82mVcg3/Kfk0+er7gALkY6Vc0N38wedET9MZ
# ZPfVidBeaTkIKcCDFKnFzGjG+9rNQ7NFrXyS7Hl97VolGt2l03qGGPNW1PouDiUx
# 7Y8CJOc+1k9wyBMKl0a/NQBRAqSZBQ==
# =JvZN
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 17 Apr 2024 23:22:09 AEST
# gpg: using RSA key
6D207068EEDD65091C2CE2A3FA625F640EEB13CA
# gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>" [unknown]
# gpg: aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C E2A3 FA62 5F64 0EEB 13CA
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zh_Q72gYKMMbge9A@intel.com
Lang Yu [Sun, 7 Apr 2024 04:36:00 +0000 (12:36 +0800)]
drm/amdkfd: make sure VM is ready for updating operations
When page table BOs were evicted but not validated before
updating page tables, VM is still in evicting state,
amdgpu_vm_update_range returns -EBUSY and
restore_process_worker runs into a dead loop.
v2: Split the BO validation and page table update into two
separate loops in amdgpu_amdkfd_restore_process_bos. (Felix)
1.Validate BOs
2.Validate VM (and DMABuf attachments)
3.Update page tables for the BOs validated above
Fixes:
50661eb1a2c8 ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs")
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Thu, 18 Apr 2024 15:32:34 +0000 (11:32 -0400)]
drm/amdgpu: Fix leak when GPU memory allocation fails
Free the sync object if the memory allocation fails for any
reason.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Thu, 18 Apr 2024 01:13:59 +0000 (21:13 -0400)]
drm/amdkfd: Fix eviction fence handling
Handle case that dma_fence_get_rcu_safe returns NULL.
If restore work is already scheduled, only update its timer. The same
work item cannot be queued twice, so undo the extra queue eviction.
Fixes:
9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs")
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Gang BA <Gang.Ba@amd.com>
Reviewed-by: Gang BA <Gang.Ba@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Zhigang Luo [Wed, 17 Apr 2024 14:59:58 +0000 (10:59 -0400)]
drm/amdgpu: remove virt_init_data_exchange from poison consumption handler
Host will initiate an FLR for all poison consumption.
Guest should wait for FLR message to re-init data exchange.
Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 29 Feb 2024 08:29:19 +0000 (13:59 +0530)]
drm/amdgpu: Change AID detection logic
On GFX 9.4.3 SOCs, only 2 SDMA instances need to be available to be
considered as a valid AID.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Ashton [Thu, 2 Nov 2023 04:21:55 +0000 (04:21 +0000)]
drm/amd/display: Set color_mgmt_changed to true on unsuspend
Otherwise we can end up with a frame on unsuspend where color management
is not applied when userspace has not committed themselves.
Fixes re-applying color management on Steam Deck/Gamescope on S3 resume.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 12 Apr 2024 09:35:00 +0000 (15:05 +0530)]
drm/amdgpu: enable redirection of irq's for IH V6.1
Enable redirection of irq for pagefaults for specific
clients to avoid overflow without dropping interrupts.
So here we redirect the interrupts to another IH ring
i.e ring1 where only these interrupts are processed.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ahmad Rehman [Tue, 16 Apr 2024 18:29:15 +0000 (13:29 -0500)]
drm/amdgpu: Skip the coredump collection on reset during driver reload
In passthrough environment, the driver triggers the mode-1 reset on
reload. The reset causes the core dump collection which is delayed task
and prevents driver from unloading until it is completed. Since we do
not need to collect data on "reset on reload" case, we can skip core
dump collection.
v2: Use the same flag to avoid calling amdgpu_reset_reg_dumps as well.
Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 12 Apr 2024 09:30:20 +0000 (15:00 +0530)]
drm/amdgpu: enable redirection of irq's for IH V6.0
Enable redirection of irq for pagefaults for specific
clients to avoid overflow without dropping interrupts.
So here we redirect the interrupts to another IH ring
i.e ring1 where only these interrupts are processed.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 12 Apr 2024 09:17:30 +0000 (14:47 +0530)]
drm/amdgpu: add IH_RING1_CFG headers for IH v6.0
Add offsets, mask and shift macros for IH v6.0
which are needed to configure ring1 client irq
redirection.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawking Zhang [Tue, 16 Apr 2024 06:25:26 +0000 (14:25 +0800)]
drm/amdgpu: Use driver mode reset for data poison
mode-2 reset is the only reliable method that can get
GC/SDMA back when poison is consumed. mmhub requires
mode-1 reset.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Fri, 12 Apr 2024 08:52:16 +0000 (14:22 +0530)]
drm:amdgpu: enable IH ring1 for IH v6.1
We need IH ring1 for handling the pagefault
interrupts which over flow in default
ring for specific usecases.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>