linux-2.6-microblaze.git
11 days agodrm/xe: drop redundant conversion to bool
Raag Jadav [Thu, 29 May 2025 16:09:37 +0000 (21:39 +0530)]
drm/xe: drop redundant conversion to bool

The result of integer comparison already evaluates to bool. No need for
explicit conversion.

No functional impact.

Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 61761a6b57f2818983466d24aab60baab471ba21)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
11 days agodrm/xe/hwmon: Move card reactive critical power under channel card
Karthik Poosa [Thu, 29 May 2025 16:34:54 +0000 (22:04 +0530)]
drm/xe/hwmon: Move card reactive critical power under channel card

Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this
represents the entire card critical power/current.

v2: Update the date of curr1_crit also in hwmon documentation.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes")
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 25e963a09e059ffdb15c09cc79cfded855b43668)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
11 days agodrm/xe/hwmon: Add support to manage power limits though mailbox
Karthik Poosa [Thu, 29 May 2025 16:34:53 +0000 (22:04 +0530)]
drm/xe/hwmon: Add support to manage power limits though mailbox

Add support to manage power limits using pcode mailbox commands
for supported platforms.

v2:
 - Address review comments. (Badal)
 - Use mailbox commands instead of registers to manage power limits
   for BMG.
 - Clamp the maximum power limit to GPU firmware default value.

v3:
 - Clamp power limit in write also for platforms with mailbox support.

v4:
 - Remove unnecessary debug prints. (Badal)

v5:
 - Update description of variable pl1_on_boot to fix kernel-doc error.

v6:
 - Improve commit message, refer to BIOS as GPU firmware.
 - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW.
 - Rectify drm_warn to drm_info.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG")
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 7596d839f6228757fe17a810da2d1c5f3305078c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
11 days agodrm/xe/vm: move xe_svm_init() earlier
Matthew Auld [Wed, 14 May 2025 15:24:26 +0000 (16:24 +0100)]
drm/xe/vm: move xe_svm_init() earlier

In xe_vm_close_and_put() we need to be able to call xe_svm_fini(),
however during vm creation we can call this on the error path, before
having actually initialised the svm state, leading to various splats
followed by a fatal NPD.

Fixes: 6fd979c2f331 ("drm/xe: Add SVM init / close / fini to faulting VMs")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com
(cherry picked from commit 4f296d77cf49fcb5f90b4674123ad7f3a0676165)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
11 days agodrm/xe/vm: move rebind_work init earlier
Matthew Auld [Wed, 14 May 2025 15:24:25 +0000 (16:24 +0100)]
drm/xe/vm: move rebind_work init earlier

In xe_vm_close_and_put() we need to be able to call
flush_work(rebind_work), however during vm creation we can call this on
the error path, before having actually set up the worker, leading to a
splat from flush_work().

It looks like we can simply move the worker init step earlier to fix
this.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com
(cherry picked from commit 96af397aa1a2d1032a6e28ff3f4bc0ab4be40e1d)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2 weeks agodrm/xe: Add missing documentation of rpa_freq
Rodrigo Vivi [Wed, 21 May 2025 16:51:48 +0000 (12:51 -0400)]
drm/xe: Add missing documentation of rpa_freq

While at it, already adjust the rpe_freq frequency, to highlight
that both are calculated by PCODE at runtime.

Fixes: c6aac2fa77a3 ("drm/xe: Introduce the RPa information")
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20250521165146.39616-4-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 39578fa40420fb11dbe4f42225a347e945d8fd0e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2 weeks agodrm/xe: Make xe_gt_freq part of the Documentation
Rodrigo Vivi [Wed, 21 May 2025 16:51:47 +0000 (12:51 -0400)]
drm/xe: Make xe_gt_freq part of the Documentation

The documentation was created with the creation of the component,
however it has never been actually shown in the actual Documentation.

While doing this, fixes the identation style, to avoid new warnings
while building htmldocs.

Fixes: bef52b5c7a19 ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250521165146.39616-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit af53f0fd99c3bbb3afd29f1612c9e88c5a92cc01)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
3 weeks agodrm/xe: Default auto_link_downgrade status to false
Aradhya Bhatia [Fri, 16 May 2025 12:43:55 +0000 (12:43 +0000)]
drm/xe: Default auto_link_downgrade status to false

xe_pcode_read() can return back successfully without updating the
variable 'val'. This can cause an arbitrary value to show up in the
sysfs file.

Allow the auto_link_downgrade_status to default to 0 to avoid any
arbitrary value from coming up.

Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Link: https://lore.kernel.org/r/20250516124355.4872-1-aradhya.bhatia@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit a7f87deac2295d11865048bcb9c2de369b52ed93)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
3 weeks agodrm/xe/guc: Make creation of SLPC debugfs files conditional
Aradhya Bhatia [Fri, 16 May 2025 14:19:02 +0000 (14:19 +0000)]
drm/xe/guc: Make creation of SLPC debugfs files conditional

Platforms that do not support SLPC are exempted from the GuC PC support.
The GuC PC does not get initialized, and neither do its BOs get created.

This causes a problem because the GuC PC debugfs file is still being
created. Whenever the file is attempted to read, it causes a NULL
pointer dereference on the supposed BO of the GuC PC.

So, make the creation of SLPC debugfs files conditional to when SLPC
features are supported.

Fixes: aaab5404b16f ("drm/xe: Introduce GuC PC debugfs")
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Link: https://lore.kernel.org/r/20250516141902.5614-1-aradhya.bhatia@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 17486cf3df5320752cc67ee8bcb2379d1b9de76c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 weeks agoMerge tag 'amd-drm-next-6.16-2025-05-09' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Sun, 11 May 2025 21:14:27 +0000 (07:14 +1000)]
Merge tag 'amd-drm-next-6.16-2025-05-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.16-2025-05-09:

amdgpu:
- IPS fixes
- DSC cleanup
- DC Scaling updates
- DC FP fixes
- Fused I2C-over-AUX updates
- SubVP fixes
- Freesync fix
- DMUB AUX fixes
- VCN fix
- Hibernation fixes
- HDP fixes
- DCN 2.1 fixes
- DPIA fixes
- DMUB updates
- Use drm_file_err in amdgpu
- Enforce isolation updates
- Use new dma_fence helpers
- USERQ fixes
- Documentation updates
- Misc code cleanups
- SR-IOV updates
- RAS updates
- PSP 12 cleanups

amdkfd:
- Update error messages for SDMA
- Userptr updates

drm:
- Add drm_file_err function

dma-buf:
- Add a helper to sort and deduplicate dma_fence arrays

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250509230951.3871914-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
5 weeks agoMerge tag 'drm-misc-next-2025-05-08' of https://gitlab.freedesktop.org/drm/misc/kerne...
Dave Airlie [Sat, 10 May 2025 06:13:46 +0000 (16:13 +1000)]
Merge tag 'drm-misc-next-2025-05-08' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.16-rc1:

Cross-subsystem Changes:
- Change vsprintf %p4cn to %p4chR, remove %p4cn.

Core Changes:
- Documentation updates (fb rendering, actual_brightness)

Driver Changes:
- Small fixes to appletbdrm, panthor, st7571-i2c, rockchip, renesas,
  panic handler, gpusvm, vkms, panel timings.
- Add AUO B140QAN08.H, BOE NE140WUM-N6S, CSW MNE007QS3-8, BOE TD4320 panels.
- Convert rk3066_hdmi to bridge driver.
- Improve HPD on anx7625.
- Speed up loading tegra firmware, and other small fixes to tegra & host1x.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/5428be12-fc08-4e28-8c5f-85d73b8a7e04@linux.intel.com
5 weeks agoMerge tag 'drm-intel-next-2025-05-08' of https://gitlab.freedesktop.org/drm/i915...
Dave Airlie [Fri, 9 May 2025 20:10:04 +0000 (06:10 +1000)]
Merge tag 'drm-intel-next-2025-05-08' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Non-display related:
- Fix undefined reference to `intel_pxp_gsccs_is_ready_for_sessions'

Display related:
- More work towards display separation (Jani)
- Stop writing VRR_CTL_IGN_MAX_SHIFT for MTL onwards (Jouni)
- DSC checks for 3 engines (Ankit)
- Add link rate and lane count to i915_display_info (Khaled)
- PSR fixes and workaround for underrun on idle (Jouni)
- LOBF enablement and ALMP fixes (Animesh)
- Clean up VGA plane handling (Ville)
- Use an intel_connector pointer everywhere (Imre)
- Fix warning for coffeelake on SunrisePoint PCH (Jiajia)
- Rework/Correction on minimum hblank calculation (Arun)
- Dmesg clean up (Jani)
- Add a couple of simple display workarounds (Ankit, Vinod)
- Refactor HDCP GSC (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aByyL3bEufPu79OM@intel.com
5 weeks agoMerge tag 'drm-xe-next-2025-05-08' of https://gitlab.freedesktop.org/drm/xe/kernel...
Dave Airlie [Fri, 9 May 2025 19:24:39 +0000 (05:24 +1000)]
Merge tag 'drm-xe-next-2025-05-08' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

UAPI Changes:
- Expose PCIe link downgrade attributes (Raag)

Cross-subsystem Changes:

Core Changes:
- gpusvm has_dma_mapping fix (Dafna)

Driver Changes:
- Forcewake hold fix (Tejas)
- Fix guc_info debugfs for VFs (Daniele)
- Fix devcoredump chunk alignment calculation (Arnd)
- Don't print timedout job message on killed exec queues (Matt Brost)
- Don't flush the GSC worker from the reset path (Daniele)
- Use copy_from_user() instead of __copy_from_user() (Harish)
- Only flush SVM garbage collector if CONFIG_DRM_XE_GPUSVM (Shuicheng)
- Fix forcewake vs runtime pm ref release ordering (Shuicheng)
- Move xe_device_sysfs_init() to xe_device_probe() (Raag)
- Append PCIe Gen5 limitations to xe_firmware document (Raag)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/aBzUwbzCzz7Qo7fA@fedora
5 weeks agoMerge tag 'drm-intel-gt-next-2025-05-08-1' of https://gitlab.freedesktop.org/drm...
Dave Airlie [Fri, 9 May 2025 01:39:27 +0000 (11:39 +1000)]
Merge tag 'drm-intel-gt-next-2025-05-08-1' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

Driver Changes:

- Fix SLPC wait boosting reference counting to avoid getting stuck on non-boost
  frequency on power saving profile on DG1/DG2 (Vinay)
- Add 20ms delay to engine reset for robustness on HSW (Nitin)

- Use proper sleeping functions for timeouts shorter than 20ms (Andi)
- Fix fence not released on early probe errors for HuC (Janusz)

- Remove const from struct i915_wa list allocation (Kees)
- Apply SPDX license format where missing and use single-line format (Andi)
- Whitespace fixes (Dan, Andi)
- Selftest improvements (Mikolaj, Badal, Sk,

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://lore.kernel.org/r/aBxNYp0IviE23zy-@jlahtine-mobl
5 weeks agoReapply: drm/amdgpu: Use generic hdp flush function
Lijo Lazar [Fri, 11 Apr 2025 11:15:46 +0000 (16:45 +0530)]
Reapply: drm/amdgpu: Use generic hdp flush function

Except HDP v5.2 all use a common logic for HDP flush. Use a generic
function. HDP v5.2 forces NO_KIQ logic, revisit it later.

Reapply after fixing up an HDP regression.

v2: merge the fix (Alex)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/hdp7: use memcfg register to post the write for HDP flush
Alex Deucher [Wed, 30 Apr 2025 16:50:02 +0000 (12:50 -0400)]
drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush

Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: 689275140cb8 ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/hdp6: use memcfg register to post the write for HDP flush
Alex Deucher [Wed, 30 Apr 2025 16:48:51 +0000 (12:48 -0400)]
drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush

Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: abe1cbaec6cf ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: cleanup sriov function for psp v12
Huang Rui [Thu, 24 Apr 2025 11:08:46 +0000 (19:08 +0800)]
drm/amdgpu: cleanup sriov function for psp v12

PSP v12 won't have SRIOV function.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
Alex Deucher [Wed, 30 Apr 2025 16:47:37 +0000 (12:47 -0400)]
drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush

Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: f756dbac1ce1 ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
Alex Deucher [Wed, 30 Apr 2025 16:46:56 +0000 (12:46 -0400)]
drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush

Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: cf424020e040 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: remove re-route ih in psp v12
Huang Rui [Thu, 24 Apr 2025 11:00:46 +0000 (19:00 +0800)]
drm/amdgpu: remove re-route ih in psp v12

APU doesn't have second IH ring, so re-routing action here is a no-op.
It will take a lot of time to wait timeout from PSP during the
initialization. So remove the function in psp v12.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd: Add per-ring reset for vcn v5.0.0 use
Mario Limonciello [Tue, 6 May 2025 20:49:48 +0000 (15:49 -0500)]
drm/amd: Add per-ring reset for vcn v5.0.0 use

If there is a problem requiring a reset of the VCN engine, it is better to
reset the VCN engine rather than the entire GPU.

Add a reset callback for the ring which will stop and start VCN if an
issue happens.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250506204948.12048-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd: Add per-ring reset for vcn v4.0.0 use
Mario Limonciello [Tue, 6 May 2025 20:49:47 +0000 (15:49 -0500)]
drm/amd: Add per-ring reset for vcn v4.0.0 use

If there is a problem requiring a reset of the VCN engine, it is better to
reset the VCN engine rather than the entire GPU.

Add a reset callback for the ring which will stop and start VCN if an
issue happens.

Link: https://lore.kernel.org/r/20250506204948.12048-3-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd: Add per-ring reset for vcn v4.0.5 use
Mario Limonciello [Tue, 6 May 2025 20:49:46 +0000 (15:49 -0500)]
drm/amd: Add per-ring reset for vcn v4.0.5 use

There is a problem occurring on VCN 4.0.5 where in some situations a job
is timing out.  This triggers a job timeout which then causes a GPU
reset for recovery.  That has exposed a number of issues with GPU reset
that have since been fixed. But also a GPU reset isn't actually needed
for this circumstance. Just restarting the ring is enough.

Add a reset callback for the ring which will stop and start VCN if the
issue happens.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3909
Link: https://lore.kernel.org/r/20250506204948.12048-2-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
Alex Deucher [Wed, 30 Apr 2025 16:45:04 +0000 (12:45 -0400)]
drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush

Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.

Fixes: c9b8dcabb52a ("drm/amdgpu/hdp4.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agoRevert "drm/amdgpu: Use generic hdp flush function"
Alex Deucher [Wed, 30 Apr 2025 16:34:17 +0000 (12:34 -0400)]
Revert "drm/amdgpu: Use generic hdp flush function"

This reverts commit 18a878fd8aef0ec21648a3782f55a79790cd4073.

Revert this temporarily to make it easier to fix a regression
in the HDP handling.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd/pm/smu13: Remove unused smu_v3 functions
Dr. David Alan Gilbert [Wed, 7 May 2025 00:24:25 +0000 (01:24 +0100)]
drm/amd/pm/smu13: Remove unused smu_v3 functions

smu_v13_0_display_clock_voltage_request() and
smu_v13_0_set_min_deep_sleep_dcefclk() were added in 2020 by
commit c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)")
but have remained unused.

Remove them.

smu_v13_0_display_clock_voltage_request() was the only user
of smu_v13_0_set_hard_freq_limited_range().  Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_range
Dr. David Alan Gilbert [Wed, 7 May 2025 00:24:24 +0000 (01:24 +0100)]
drm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_range

The last use of smu_v11_0_get_dpm_level_range() was removed in 2020 by
commit 46a301e14e8a ("drm/amd/powerplay: drop unnecessary Navi1x specific
APIs")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd/pm/smu7: Remove unused smu7_copy_bytes_from_smc
Dr. David Alan Gilbert [Wed, 7 May 2025 00:24:23 +0000 (01:24 +0100)]
drm/amd/pm/smu7: Remove unused smu7_copy_bytes_from_smc

smu7_copy_bytes_from_smc() was added in 2016 by
commit 1ff55f465103 ("drm/amd/powerplay: implement smu7_smumgr for asics
with smu ip version 7.")

but never used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: fix the indentation
Sunil Khatri [Wed, 7 May 2025 09:18:48 +0000 (14:48 +0530)]
drm/amdgpu: fix the indentation

fix the indentation
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:6992 gfx_v11_ip_dump

compiler: gcc-11 (Debian 11.3.0-12) 11.3.0

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505071619.7sHTLpNg-lkp@intel.com/
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: remove mdelay in psp v12
Huang Rui [Mon, 21 Apr 2025 05:51:30 +0000 (13:51 +0800)]
drm/amdgpu: remove mdelay in psp v12

Since secure firmware is more stable than bring up phase, I believe we
don't need such mdelays any more before wait PSP response on PSP v12.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agoamd/amdkfd: Trigger segfault for early userptr unmmapping
Shane Xiao [Wed, 23 Apr 2025 09:28:50 +0000 (17:28 +0800)]
amd/amdkfd: Trigger segfault for early userptr unmmapping

If applications unmap the memory before destroying the userptr, it needs
trigger a segfault to notify user space to correct the free sequence in
VM debug mode.

v2: Send gpu access fault to user space
v3: Report gpu address to user space, remove unnecessary params
v4: update pr_err into one line, remove userptr log info

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: Add debug bit for userptr usage
Shane Xiao [Thu, 24 Apr 2025 05:15:01 +0000 (13:15 +0800)]
drm/amdgpu: Add debug bit for userptr usage

In VM debug mode, it is desirable to notify the application
to correct the freeing sequence by unmapping the memory before
destroying the userptr in the old userptr path. Add a bitmask
to decide whether to send gpu vm fault to the applition.

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: unreserve the gem BO before returning from attach error
Prike Liang [Tue, 6 May 2025 08:32:23 +0000 (16:32 +0800)]
drm/amdgpu: unreserve the gem BO before returning from attach error

It requires unlocking the reserved gem BO before returning from
attaching the eviction fence error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: promote the implicit sync to the dependent read fences
Prike Liang [Wed, 23 Apr 2025 12:26:48 +0000 (20:26 +0800)]
drm/amdgpu: promote the implicit sync to the dependent read fences

The driver doesn't want to implicitly sync on the DMA_RESV_USAGE_BOOKKEEP
usage fences, and the BOOKEEP fences should be synced explicitly. So, as
the VM implicit syncing only need to return and sync the dependent read
fences.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/psp: mark securedisplay TA as optional
Alex Deucher [Tue, 29 Apr 2025 13:45:03 +0000 (09:45 -0400)]
drm/amdgpu/psp: mark securedisplay TA as optional

This is an optional TA which is only available on
certain embedded systems.  Mark it as optional to avoid
user confusion.  This mirrors what we already do for
other optional TAs.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4181
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: fix pm notifier handling
Alex Deucher [Thu, 1 May 2025 17:46:46 +0000 (13:46 -0400)]
drm/amdgpu: fix pm notifier handling

Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
the resource evictions properly in pm prepare based on whether
we are suspending or hibernating.  Drop the eviction as processes
are not frozen at this time, we we can end up getting stuck trying
to evict VRAM while applications continue to submit work which
causes the buffers to get pulled back into VRAM.

v2: Move suspend flags out of pm notifier (Mario)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes: 2965e6355dcd ("drm/amd: Add Suspend/Hibernate notification callback support")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: Implement unrecoverable error message handling for VFs
Ellen Pan [Tue, 29 Apr 2025 21:18:44 +0000 (17:18 -0400)]
drm/amdgpu: Implement unrecoverable error message handling for VFs

This notification may arrive in VF mailbox while polling for response from
another event.

This patches covers the following scenarios:

- If VF is already in RMA state, then do not attempt to contact the host.
  Host will ignore the VF after sending the notification.

- If the notification is detected during polling, then set the RMA status,
  and return error to caller.

- If the notification arrives by interrupt, then set the RMA status and
  queue a reset.  This reset will fail and VF will stop runtime services.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: Add unrecoverable error message definitions for VFs
Ellen Pan [Tue, 29 Apr 2025 21:00:50 +0000 (17:00 -0400)]
drm/amdgpu: Add unrecoverable error message definitions for VFs

Host may stop runtime services after reaching a bad page threshold.

This notification will indicate to the VF that it no longer has
access to the GPU.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agoRevert "drm/amd: Stop evicting resources on APUs in suspend"
Alex Deucher [Thu, 1 May 2025 17:00:16 +0000 (13:00 -0400)]
Revert "drm/amd: Stop evicting resources on APUs in suspend"

This reverts commit 3a9626c816db901def438dc2513622e281186d39.

This breaks S4 because we end up setting the s3/s0ix flags
even when we are entering s4 since prepare is used by both
flows.  The causes both the S3/s0ix and s4 flags to be set
which breaks several checks in the driver which assume they
are mutually exclusive.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu/vcn: using separate VCN1_AON_SOC offset
Ruijing Dong [Fri, 2 May 2025 15:19:26 +0000 (11:19 -0400)]
drm/amdgpu/vcn: using separate VCN1_AON_SOC offset

VCN1_AON_SOC_ADDRESS_3_0 offset varies on different
VCN generations, the issue in vcn4.0.5 is caused by
a different VCN1_AON_SOC_ADDRESS_3_0 offset.

This patch does the following:

    1. use the same offset for other VCN generations.
    2. use the vcn4.0.5 special offset
    3. update vcn_4_0 and vcn_5_0

Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: fix the eviction fence dereference
Prike Liang [Tue, 15 Apr 2025 02:42:19 +0000 (10:42 +0800)]
drm/amdgpu: fix the eviction fence dereference

The dma_resv_add_fence() already refers to the added fence.
So when attaching the evciton fence to the gem bo, it needn't
refer to it anymore.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: Implement Runtime Bad Page query for VFs
Ellen Pan [Tue, 29 Apr 2025 20:48:09 +0000 (16:48 -0400)]
drm/amdgpu: Implement Runtime Bad Page query for VFs

Host will send a notification when new bad pages are available.

Uopn guest request, the first 256 bad page addresses
will be placed into the PF2VF region.
Guest should pause the PF2VF worker thread while
the copy is in progress.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: Add Runtime Bad Page message definitions for VFs
Ellen Pan [Tue, 29 Apr 2025 20:22:53 +0000 (16:22 -0400)]
drm/amdgpu: Add Runtime Bad Page message definitions for VFs

Currently VFs rely on poison consumption interrupt from HW
to kick off the bad page retirement process. Part of this process
includes a VF reset.

This patch adds the following:

1) Host Bad Pages notification message.
2) Guest request bad pages message.

When combined, VFs are able to reserve the pages early, and potentially
avoid future poison consumption that will disrupt user services
from consequent FLR.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agoDocumentation/gpu: Add new entries to amdgpu glossary
Rodrigo Siqueira [Sat, 3 May 2025 20:38:43 +0000 (14:38 -0600)]
Documentation/gpu: Add new entries to amdgpu glossary

Add some additional entries.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb
Rodrigo Siqueira [Sat, 3 May 2025 20:38:42 +0000 (14:38 -0600)]
drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb

Add some random documentation associated with the ring buffer
manipulations and writeback.

Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amdkfd: change error to warning message for SDMA queues creation
Eric Huang [Fri, 2 May 2025 18:58:02 +0000 (14:58 -0400)]
drm/amdkfd: change error to warning message for SDMA queues creation

SDMA doesn't support oversubsciption, it is the user matter to create
queues over HW limit, but not supposed to be a KFD error.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd/display: Don't check for NULL divisor in fixpt code
Harry Wentland [Mon, 28 Apr 2025 20:26:20 +0000 (16:26 -0400)]
drm/amd/display: Don't check for NULL divisor in fixpt code

[Why]
We check for a NULL divisor but don't act on it.
This check does nothing other than throw a warning.
It does confuse static checkers though:
See https://lkml.org/lkml/2025/4/26/371

[How]
Drop the ASSERTs in both DC and SPL variants.

Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Fixes: 6efc0ab3b05d ("drm/amd/display: add back quality EASF and ISHARP and dc dependency changes")
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd/display: Use true/false for boolean variables in DML2 core files
Ivan Shamliev [Thu, 24 Apr 2025 15:14:53 +0000 (18:14 +0300)]
drm/amd/display: Use true/false for boolean variables in DML2 core files

Replace 0 and 1 with false and true for boolean variables in
dml2_core_dcn4_calcs.c and dml2_core_utils.c to align with the Linux
kernel coding style guidelines, which recommend using C99 bool type
with true/false values.

Signed-off-by: Ivan Shamliev <ivan.shamliev.dev@abv.bg>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/amd/display: adds kernel-doc comment for dc_stream_remove_writeback()
James Flowers [Sat, 3 May 2025 21:18:51 +0000 (14:18 -0700)]
drm/amd/display: adds kernel-doc comment for dc_stream_remove_writeback()

Adds kernel-doc for externally linked dc_stream_remove_writeback function.

Signed-off-by: James Flowers <bold.zone2373@fastmail.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
5 weeks agodrm/xe/doc: Wire up PCIe Gen5 limitations
Raag Jadav [Tue, 6 May 2025 05:48:35 +0000 (11:18 +0530)]
drm/xe/doc: Wire up PCIe Gen5 limitations

Append PCIe Gen5 limitations to xe_firmware document.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250506054835.3395220-4-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
5 weeks agodrm/xe: Expose PCIe link downgrade attributes
Raag Jadav [Tue, 6 May 2025 05:48:34 +0000 (11:18 +0530)]
drm/xe: Expose PCIe link downgrade attributes

Expose sysfs attributes for PCIe link downgrade capability and status.

v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal)
    Rework macros and their naming (Rodrigo)
v3: Use sysfs_create_files() (Riana)
    Fix checkpatch warning (Riana)
v4: s/downspeed/downgrade (Lucas, Rodrigo, Riana)
v5: Use PCIe Gen agnostic naming (Rodrigo)
v6: s/pcie_gen/auto_link (Lucas)

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Link: https://lore.kernel.org/r/20250506054835.3395220-3-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
5 weeks agodrm/xe: Move xe_device_sysfs_init() to xe_device_probe()
Raag Jadav [Tue, 6 May 2025 05:48:33 +0000 (11:18 +0530)]
drm/xe: Move xe_device_sysfs_init() to xe_device_probe()

Since xe_device_sysfs_init() exposes device specific attributes, a better
place for it is xe_device_probe().

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Link: https://lore.kernel.org/r/20250506054835.3395220-2-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
5 weeks agodrm/xe: Release force wake first then runtime power
Shuicheng Lin [Wed, 7 May 2025 02:23:02 +0000 (02:23 +0000)]
drm/xe: Release force wake first then runtime power

xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for
the release path, xe_force_wake_put() should be called first then
xe_pm_runtime_put().
Combine the error path and normal path together with goto.

Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
5 weeks agodrm/xe: Add config control for svm flush work
Shuicheng Lin [Fri, 2 May 2025 17:00:52 +0000 (17:00 +0000)]
drm/xe: Add config control for svm flush work

Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below
warning pops. Refine the flush work code to be controlled by the config
to avoid below warning:
"
[  453.132028] ------------[ cut here ]------------
[  453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0
[  453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec
[  453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G     U  W           6.15.0-rc3+ #7 PREEMPT(full)
[  453.135405] Tainted: [U]=USER, [W]=WARN
...
[  453.136921] RIP: 0010:__flush_work+0x379/0x3a0
[  453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe
[  453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246
[  453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000
[  453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8
[  453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000
[  453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001
[  453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00
[  453.143450] FS:  00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000
[  453.144276] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0
[  453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  453.147061] Call Trace:
[  453.147336]  <TASK>
[  453.147579]  ? tick_nohz_tick_stopped+0xd/0x30
[  453.148067]  ? xas_load+0x9/0xb0
[  453.148435]  ? xa_load+0x6f/0xb0
[  453.148781]  __xe_vm_bind_ioctl+0xbd5/0x1500 [xe]
[  453.149338]  ? dev_printk_emit+0x48/0x70
[  453.149762]  ? _dev_printk+0x57/0x80
[  453.150148]  ? drm_ioctl+0x17c/0x440
[  453.150544]  ? __drm_dev_vprintk+0x36/0x90
[  453.150983]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[  453.151575]  ? drm_ioctl_kernel+0x9f/0xf0
[  453.151998]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[  453.152560]  drm_ioctl_kernel+0x9f/0xf0
[  453.152968]  drm_ioctl+0x20f/0x440
[  453.153332]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[  453.153893]  ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100
[  453.154489]  ? memory_bm_test_bit+0x5/0x60
[  453.154935]  xe_drm_ioctl+0x47/0x70 [xe]
[  453.155419]  __x64_sys_ioctl+0x8d/0xc0
[  453.155824]  do_syscall_64+0x47/0x110
[  453.156228]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
"

v2 (Matt):
    refine commit message to have more details
    add Fixes tag
    move the code to xe_svm.h which already have the config
    remove a blank line per codestyle suggestion

Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
5 weeks agodrm/xe: Use copy_from_user() instead of __copy_from_user()
Harish Chegondi [Thu, 1 May 2025 19:14:45 +0000 (12:14 -0700)]
drm/xe: Use copy_from_user() instead of __copy_from_user()

copy_from_user() has more checks and is more safer than
__copy_from_user()

Suggested-by: Kees Cook <kees@kernel.org>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://lore.kernel.org/r/acabf20aa8621c7bc8de09b1bffb8d14b5376484.1746126614.git.harish.chegondi@intel.com
5 weeks agogpu: host1x: Use for_each_available_child_of_node_scoped()
Jinjie Ruan [Fri, 30 Aug 2024 07:38:24 +0000 (15:38 +0800)]
gpu: host1x: Use for_each_available_child_of_node_scoped()

Avoids the need for manual cleanup of_node_put() in early exits
from the loop.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20240830073824.3539690-1-ruanjinjie@huawei.com
5 weeks agodrm/tegra: Fix a possible null pointer dereference
Qiu-ji Chen [Wed, 6 Nov 2024 09:59:06 +0000 (17:59 +0800)]
drm/tegra: Fix a possible null pointer dereference

In tegra_crtc_reset(), new memory is allocated with kzalloc(), but
no check is performed. Before calling __drm_atomic_helper_crtc_reset,
state should be checked to prevent possible null pointer dereference.

Fixes: b7e0b04ae450 ("drm/tegra: Convert to using __drm_atomic_helper_crtc_reset() for reset.")
Cc: stable@vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20241106095906.15247-1-chenqiuji666@gmail.com
5 weeks agodrm/tegra: rgb: Fix the unbound reference count
Biju Das [Wed, 5 Feb 2025 11:21:35 +0000 (11:21 +0000)]
drm/tegra: rgb: Fix the unbound reference count

The of_get_child_by_name() increments the refcount in tegra_dc_rgb_probe,
but the driver does not decrement the refcount during unbind. Fix the
unbound reference count using devm_add_action_or_reset() helper.

Fixes: d8f4a9eda006 ("drm: Add NVIDIA Tegra20 support")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250205112137.36055-1-biju.das.jz@bp.renesas.com
5 weeks agogpu: host1x: Remove mid-job CDMA flushes
Mikko Perttunen [Tue, 4 Feb 2025 02:45:46 +0000 (02:45 +0000)]
gpu: host1x: Remove mid-job CDMA flushes

The current code can issue CDMA flushes (DMAPUT bumps) in the middle
of a job, before all opcodes have been written into the pushbuffer.
This can happen when pushbuffer fills up. Presumably this made sense
at some point in the past, but it doesn't anymore, as it cannot lead
to more space appearing in the pushbuffer as it is only cleaned full
jobs at a time.

Mid-job flushes can also cause problems, as in an extreme situation
(seen in practice), the hardware can run through the entire pushbuffer
including the prefix of a partially written job without the driver
being able to process any CDMA updates. This can cause the engine
MLOCK to be taken and held for extended periods as the tail of the
job is not yet available to hardware.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250204024546.1168126-1-mperttunen@nvidia.com
5 weeks agodrm/tegra: falcon: Pipeline firmware copy
Mikko Perttunen [Wed, 5 Feb 2025 06:10:27 +0000 (06:10 +0000)]
drm/tegra: falcon: Pipeline firmware copy

The Falcon DMA engine allows queueing multiple operations for
improved performance. Do this to optimize firmware loading.
A performance improvement of 4x to 6x is observed.

Co-developed-by: Ivan Raul Guadarrama <iguadarrama@nvidia.com>
Signed-off-by: Ivan Raul Guadarrama <iguadarrama@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250205061027.1205748-1-mperttunen@nvidia.com
5 weeks agodrm/tegra: dpaux: Use dev_err_probe()
Zhang Enpei [Wed, 2 Apr 2025 11:37:58 +0000 (19:37 +0800)]
drm/tegra: dpaux: Use dev_err_probe()

Replace the open-code with dev_err_probe() to simplify the code.

Signed-off-by: Zhang Enpei <zhang.enpei@zte.com.cn>
Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250402193758365XauggSF2EWBYY-e_jgNch@zte.com.cn
5 weeks agodrm/tegra: Remove unneeded include
Jon Hunter [Mon, 28 Apr 2025 15:34:35 +0000 (16:34 +0100)]
drm/tegra: Remove unneeded include

The header file 'tegra_drm.h' is included in gem.c, but this file is
already include 'drm.h'. Although there is no harm in including this
file again, it is also not necessary. Furthermore, the header file is
located under 'include/uapi/drm' so ideally the full path would be
used to be explicit. For now, just remove from gem.c.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250428153435.1013101-1-jonathanh@nvidia.com
5 weeks agodrm/tegra: Assign plane type before registration
Thierry Reding [Mon, 21 Apr 2025 16:13:05 +0000 (11:13 -0500)]
drm/tegra: Assign plane type before registration

Changes to a plane's type after it has been registered aren't propagated
to userspace automatically. This could possibly be achieved by updating
the property, but since we can already determine which type this should
be before the registration, passing in the right type from the start is
a much better solution.

Suggested-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Cc: stable@vger.kernel.org
Fixes: 473079549f27 ("drm/tegra: dc: Add Tegra186 support")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250421-tegra-drm-primary-v2-1-7f740c4c2121@gmail.com
5 weeks agodrm/i915/rps: fix stale reference to i915->irq_lock
Jani Nikula [Wed, 7 May 2025 08:31:36 +0000 (11:31 +0300)]
drm/i915/rps: fix stale reference to i915->irq_lock

The RPS code has been switched from using i915->irq_lock to gt->irq_lock
years ago in commit d762043f7ab1 ("drm/i915: Extract GT powermanagement
interrupt handling"). Fix the stale comment referencing i915->irq_lock,
which has also ceased to exist.

Reported-by: Gustavo Sousa <gustavo.sousa@intel.com>
Closes: https://lore.kernel.org/r/174656703321.1401.8627403371256302933@intel.com
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20250507083136.1987023-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/gt: Remove const from struct i915_wa list allocation
Kees Cook [Sat, 26 Apr 2025 06:13:58 +0000 (23:13 -0700)]
drm/i915/gt: Remove const from struct i915_wa list allocation

In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "struct i915_wa *". The returned type, while
technically matching, will be const qualified. As there is no general
way to remove const qualifiers, adjust the allocation type to match
the assignment.

Signed-off-by: Kees Cook <kees@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20250426061357.work.749-kees@kernel.org
5 weeks agodrm/i915/irq: move i915->irq_lock to display->irq.lock
Jani Nikula [Tue, 6 May 2025 13:06:50 +0000 (16:06 +0300)]
drm/i915/irq: move i915->irq_lock to display->irq.lock

Observe that i915->irq_lock is no longer used to protect anything
outside of display. Make it a display thing.

This allows us to remove the ugly #define irq_lock irq.lock hack from xe
compat header.

Note that this is slightly more subtle than it first looks. For i915,
there's no functional change here. The lock is moved. However, for xe,
we'll now have *two* locks, xe->irq.lock and display->irq.lock. These
should protect different things, though. Indeed, nesting in the past
would've lead to a deadlock because they were the same lock.

With the i915 references gone, we can make a handful more files
independent of i915_drv.h.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/6d8d2ce0f34a9c7361a5e2fcf96bb32a34c57e76.1746536745.git.jani.nikula@intel.com
[Jani: Fixed a comment while applying.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/rps: refactor display rps support
Jani Nikula [Tue, 6 May 2025 13:06:49 +0000 (16:06 +0300)]
drm/i915/rps: refactor display rps support

Make the gt rps code and display irq code interact via
intel_display_rps.[ch], instead of direct access. Add no-op static
inline stubs for xe instead of having a separate build unit doing
nothing. All of this clarifies the interfaces between i915 core and
display.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/ef2a46dc8f30b72282494f54e98cb5fed7523b58.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/irq: make i915_enable_asle_pipestat() static
Jani Nikula [Tue, 6 May 2025 13:06:48 +0000 (16:06 +0300)]
drm/i915/irq: make i915_enable_asle_pipestat() static

With all users of i915_enable_asle_pipestat() inside
intel_display_irq.c, we can make the function static.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/9511e368c5244aaa04cc45f46e2425737acd29fb.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/irq: split out i965_display_irq_postinstall()
Jani Nikula [Tue, 6 May 2025 13:06:47 +0000 (16:06 +0300)]
drm/i915/irq: split out i965_display_irq_postinstall()

Split out i965_display_irq_postinstall() similar to other platforms.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/5d404dcd0c606d1cb11f2e09c45e151a75b5b2c6.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/irq: split out i915_display_irq_postinstall()
Jani Nikula [Tue, 6 May 2025 13:06:46 +0000 (16:06 +0300)]
drm/i915/irq: split out i915_display_irq_postinstall()

Split out i915_display_irq_postinstall() similar to other platforms.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/11de06206ff10c27104b0ac3efda085bf4c1f1a6.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/irq: move locking inside vlv_display_irq_postinstall()
Jani Nikula [Tue, 6 May 2025 13:06:45 +0000 (16:06 +0300)]
drm/i915/irq: move locking inside vlv_display_irq_postinstall()

All users of vlv_display_irq_postinstall() outside of
intel_display_irq.c have a lock/unlock pair. Move the locking inside the
function. Add an unlocked variant for internal use, similar to the
_vlv_display_irq_reset() and vlv_display_irq_reset() functions.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/93ea785d2d9bdb4e18328aa42a00a492d9d783c0.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/irq: move locking inside valleyview_{enable, disable}_display_irqs()
Jani Nikula [Tue, 6 May 2025 13:06:44 +0000 (16:06 +0300)]
drm/i915/irq: move locking inside valleyview_{enable, disable}_display_irqs()

All users of valleyview_enable_display_irqs() and
valleyview_disable_display_irqs() have a lock/unlock pair. Move the
locking inside the functions.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/bb6d941c47260aea11e4af5d52572b0e5f139929.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/irq: move locking inside vlv_display_irq_reset()
Jani Nikula [Tue, 6 May 2025 13:06:43 +0000 (16:06 +0300)]
drm/i915/irq: move locking inside vlv_display_irq_reset()

All users of vlv_display_irq_reset() have a lock/unlock pair. Move the
locking inside the function.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/0f8176b777fa24921458996f7d6f982f955a52f6.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/crtc: pass struct intel_display to DISPLAY_VER()
Jani Nikula [Tue, 6 May 2025 10:57:19 +0000 (13:57 +0300)]
drm/i915/crtc: pass struct intel_display to DISPLAY_VER()

Drop another reference to struct drm_i915_private.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/cb84073ff92a99e74ff6dfb8e395365b7cbb5332.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/bios: fix a comment referencing struct drm_i915_private
Jani Nikula [Tue, 6 May 2025 10:57:18 +0000 (13:57 +0300)]
drm/i915/bios: fix a comment referencing struct drm_i915_private

struct intel_vbt_data is within struct intel_display nowadays, not
struct drm_i915_private.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/b7a9a7c64f41cf61749a42ed4102e04b500fde83.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/display: remove struct drm_i915_private forward declaration
Jani Nikula [Tue, 6 May 2025 10:57:17 +0000 (13:57 +0300)]
drm/i915/display: remove struct drm_i915_private forward declaration

Remove unused struct drm_i915_private forward declaration from
intel_display_core.h. Sort and group forward declarations while at it.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/fbccf45339a61711b377b35fd479a67b378c5571.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agodrm/i915/dsi: remove dependency on i915_drv.h
Jani Nikula [Tue, 6 May 2025 10:57:16 +0000 (13:57 +0300)]
drm/i915/dsi: remove dependency on i915_drv.h

Remove final references to struct drm_i915_private and drop dependency
on i915_drv.h.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/2cee3cd9d7d9bec8dfe9c21fe5d172b1843b3d97.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
5 weeks agocheckpatch: remove %p4cn
Aditya Garg [Wed, 30 Apr 2025 13:49:08 +0000 (19:19 +0530)]
checkpatch: remove %p4cn

%p4cn was recently removed and replaced by %p4chR in vsprintf. So,
remove the check for %p4cn from checkpatch.pl.

Fixes: 37eed892cc5f ("vsprintf: Use %p4chR instead of %p4cn for reading data in reversed host ordering")
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/PN3PR01MB959760B89BF7E4B43852700CB8832@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 weeks agodrm/panel: simple: Update timings for AUO G101EVN010
Kevin Baker [Mon, 5 May 2025 17:02:56 +0000 (12:02 -0500)]
drm/panel: simple: Update timings for AUO G101EVN010

Switch to panel timings based on datasheet for the AUO G101EVN01.0
LVDS panel. Default timings were tested on the panel.

Previous mode-based timings resulted in horizontal display shift.

Signed-off-by: Kevin Baker <kevinb@ventureresearch.com>
Fixes: 4fb86404a977 ("drm/panel: simple: Add AUO G101EVN010 panel support")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
5 weeks agodrm/vkms: Adjust vkms_state->active_planes allocation type
Kees Cook [Sat, 26 Apr 2025 06:14:32 +0000 (23:14 -0700)]
drm/vkms: Adjust vkms_state->active_planes allocation type

In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "struct vkms_plane_state **", but the returned type
will be "struct drm_plane **". These are the same size (pointer size), but
the types don't match. Adjust the allocation type to match the assignment.

Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com>
Fixes: 8b1865873651 ("drm/vkms: totally reworked crc data tracking")
Link: https://lore.kernel.org/r/20250426061431.work.304-kees@kernel.org
Signed-off-by: Louis Chauvet <contact@louischauvet.fr>
5 weeks agoMerge drm/drm-next into drm-misc-next
Thomas Zimmermann [Tue, 6 May 2025 07:09:49 +0000 (09:09 +0200)]
Merge drm/drm-next into drm-misc-next

Backmerging drm-next to get fixes from v6.15-rc5.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
5 weeks agoBackMerge tag 'v6.15-rc5' into drm-next
Dave Airlie [Tue, 6 May 2025 06:39:25 +0000 (16:39 +1000)]
BackMerge tag 'v6.15-rc5' into drm-next

Linux 6.15-rc5, requested by tzimmerman for fixes required in drm-next.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 weeks agodrm/xe/gsc: do not flush the GSC worker from the reset path
Daniele Ceraolo Spurio [Fri, 2 May 2025 15:51:04 +0000 (08:51 -0700)]
drm/xe/gsc: do not flush the GSC worker from the reset path

The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.

The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:

1) GSC proxy init: this only happens once immediately after GSC FW load
   and does not support being interrupted. The only way to recover from
   an interruption of the proxy init is to do an FLR and re-load the GSC.

2) GSC proxy request: this can happen in response to a request that
   the driver sends to the GSC. If this is interrupted, the GSC FW will
   timeout and the driver request will be failed, but overall the GSC
   will keep working fine.

Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).

Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.

v2: fix spelling in commit msg, rename waiter function (Julia)

Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
6 weeks agodocs: backlight: Clarify `actual_brightness`
Mario Limonciello [Tue, 15 Apr 2025 19:20:59 +0000 (14:20 -0500)]
docs: backlight: Clarify `actual_brightness`

Currently userspace software systemd treats `brightness` and
`actual_brightness` identically due to a bug found in an out of tree
driver.

This however causes problems for in-tree drivers that use brightness
to report user requested `brightness` and `actual_brightness` to report
what the hardware actually has programmed.

Clarify the documentation to match the behavior described in commit
6ca017658b1f9 ("[PATCH] backlight: Backlight Class Improvements").

Cc: Lee Jones <lee@kernel.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: richard.purdie@linuxfoundation.org
Link: https://github.com/systemd/systemd/pull/36881
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250415192101.2033518-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
6 weeks agodrm/amdgpu: only keep most recent fence for each context
Arvind Yadav [Fri, 25 Apr 2025 14:19:05 +0000 (19:49 +0530)]
drm/amdgpu: only keep most recent fence for each context

Keep only the latest fences to reduce the number of values
given back to userspace

v2: - Export this code from dma-fence-unwrap.c(by Christian).
v3: - To split this in a dma_buf patch and amd userq patch(by Sunil).
    - No need to add a new function just re-use existing(by Christian).
v4: Export dma_fence_dedub_array function and used it(by Christian).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: Add Support for enforcing isolation without Cleaner Shader
Srinivasan Shanmugam [Mon, 28 Apr 2025 10:46:34 +0000 (16:16 +0530)]
drm/amdgpu: Add Support for enforcing isolation without Cleaner Shader

Adjusted the enforce isolation setting handling to include the ability
to disable the cleaner shader without affecting isolation between tasks.

v2: Updated enforce isolation documentation and parameters. (Alex)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodma-fence: Add helper to sort and deduplicate dma_fence arrays
Arvind Yadav [Fri, 25 Apr 2025 14:06:50 +0000 (19:36 +0530)]
dma-fence: Add helper to sort and deduplicate dma_fence arrays

Export a new helper function `dma_fence_dedup_array()` that sorts
an array of dma_fence pointers by context, then deduplicates the array
by retaining only the most recent fence per context.

This utility is useful when merging or optimizing sets of fences where
redundant entries from the same context can be pruned. The operation is
performed in-place and releases references to dropped fences using
dma_fence_put().

v2: - Export this code from dma-fence-unwrap.c(by Christian).
v3: - To split this in a dma_buf patch and amd userq patch(by Sunil).
    - No need to add a new function just re-use existing(by Christian).
v4: - Export dma_fence_dedub_array and use it(by Christian).

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver
Sunil Khatri [Tue, 22 Apr 2025 13:19:02 +0000 (18:49 +0530)]
drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver

update the functions in amdgpu_userqueues.c from
DRM_DBG_DRIVER to drm_dbg_driver so multi gpu instance
can be logged in.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userq.c
Sunil Khatri [Tue, 22 Apr 2025 13:10:03 +0000 (18:40 +0530)]
drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userq.c

change the DRM_ERROR and drm_err to drm_file_err
to add process name and pid to the logging.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: use drm_file_err in fence timeouts
Sunil Khatri [Tue, 22 Apr 2025 12:40:22 +0000 (18:10 +0530)]
drm/amdgpu: use drm_file_err in fence timeouts

use drm_file_err instead of DRM_ERROR which adds
process and pid information in the userqueue error
logging.

Sample log:

[   19.802315] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: ibus-x11 pid: 2055 client: Unset ... Couldn't unmap all the queues
[   19.802319] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: ibus-x11 pid: 2055 client: Unset ... Failed to evict userqueue
[   19.838432] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: systemd-logind pid: 1042 client: Unset ... Couldn't unmap all the queues
[   19.838436] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: systemd-logind pid: 1042 client: Unset ... Failed to evict userqueue

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amdgpu: add drm_file reference in userq_mgr
Sunil Khatri [Tue, 22 Apr 2025 12:25:50 +0000 (17:55 +0530)]
drm/amdgpu: add drm_file reference in userq_mgr

drm_file will be used in usermode queues code to
enable better process information in logging and hence
add drm_file part of the userq_mgr struct.

update the drm_file pointer in userq_mgr for each
amdgpu_driver_open_kms.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm: add drm_file_err function to add process info
Sunil Khatri [Wed, 9 Apr 2025 11:39:23 +0000 (17:09 +0530)]
drm: add drm_file_err function to add process info

Add a drm helper function which appends the process information for
the drm_file over drm_err formatted output.

v5: change to macro from function (Christian Koenig)
    add helper functions for lock/unlock (Christian Koenig)

v6: remove __maybe_unused and make function inline (Jani Nikula)
    remove drm_print.h

v7: Use va_format and %pV to concatenate fmt and vargs (Jani Nikula)

v8: Code formatting and typos (Ursulin tvrtko)

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Promote DC to 3.2.331
Taimur Hassan [Mon, 21 Apr 2025 18:30:58 +0000 (13:30 -0500)]
drm/amd/display: Promote DC to 3.2.331

Summary

* Remove redundant NULL check
* Fix invalid context error in dml helper
* Prepare for Fused I2C-over-AUX
* Allow DSCClock disable
* Vmax / Vmin update for Vsync
* Fix race condition in DPIA AUX transfer
* Fix wrong handling for AUX_DEFER case
* Only wait for required space in DMUB mailbox

Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Only wait for required free space in DMUB mailbox
Dillon Varone [Thu, 24 Apr 2025 15:48:46 +0000 (11:48 -0400)]
drm/amd/display: Only wait for required free space in DMUB mailbox

[WHY&HOW]
When command submission is blocked by a full mailbox, only wait for
enough space to free to submit the command, instead of waiting for idle.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Assign preferred stream encoder instance to dpia
Meenakshikumar Somasundaram [Wed, 23 Apr 2025 18:42:53 +0000 (14:42 -0400)]
drm/amd/display: Assign preferred stream encoder instance to dpia

[Why]
For dpia, preferred engine instance availability is not checked
when assigning stream encoder instance.

[How]
Check for dpia preferred engine id and assign the same stream
encoder instance for the stream if available.

Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Fix wrong handling for AUX_DEFER case
Wayne Lin [Sun, 20 Apr 2025 11:22:14 +0000 (19:22 +0800)]
drm/amd/display: Fix wrong handling for AUX_DEFER case

[Why]
We incorrectly ack all bytes get written when the reply actually is defer.
When it's defer, means sink is not ready for the request. We should
retry the request.

[How]
Only reply all data get written when receive I2C_ACK|AUX_ACK. Otherwise,
reply the number of actual written bytes received from the sink.
Add some messages to facilitate debugging as well.

Fixes: ad6756b4d773 ("drm/amd/display: Shift dc link aux to aux_payload")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Copy AUX read reply data whenever length > 0
Wayne Lin [Sun, 20 Apr 2025 09:50:03 +0000 (17:50 +0800)]
drm/amd/display: Copy AUX read reply data whenever length > 0

[Why]
amdgpu_dm_process_dmub_aux_transfer_sync() should return all exact data
reply from the sink side. Don't do the analysis job in it.

[How]
Remove unnecessary check condition AUX_TRANSACTION_REPLY_AUX_ACK.

Fixes: ead08b95fa50 ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Remove incorrect checking in dmub aux handler
Wayne Lin [Sun, 20 Apr 2025 08:56:54 +0000 (16:56 +0800)]
drm/amd/display: Remove incorrect checking in dmub aux handler

[Why & How]
"Request length != reply length" is expected behavior defined in spec.
It's not an invalid reply. Besides, replied data handling logic is not
designed to be written in amdgpu_dm_process_dmub_aux_transfer_sync().
Remove the incorrectly handling section.

Fixes: ead08b95fa50 ("drm/amd/display: Fix race condition in DPIA AUX transfer")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
6 weeks agodrm/amd/display: Fix the checking condition in dmub aux handling
Wayne Lin [Sun, 20 Apr 2025 08:29:07 +0000 (16:29 +0800)]
drm/amd/display: Fix the checking condition in dmub aux handling

[Why & How]
Fix the checking condition for detecting AUX_RET_ERROR_PROTOCOL_ERROR.
It was wrongly checking by "not equals to"

Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>