Daniele Ceraolo Spurio [Thu, 22 May 2025 22:54:03 +0000 (15:54 -0700)]
drm/xe/pxp: Use the correct define in the set_property_funcs array
The define of the extension type was accidentally used instead of the
one of the property itself. They're both zero, so no functional issue,
but we should use the correct define for code correctness.
Fixes:
41a97c4a1294 ("drm/xe/pxp/uapi: Add API to mark a BO as using PXP")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250522225401.3953243-6-daniele.ceraolospurio@intel.com
(cherry picked from commit
1d891ee820fd0fbb4101eacb0d922b5050a24933)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Matthew Auld [Wed, 28 May 2025 11:33:29 +0000 (12:33 +0100)]
drm/xe/sched: stop re-submitting signalled jobs
Customer is reporting a really subtle issue where we get random DMAR
faults, hangs and other nasties for kernel migration jobs when stressing
stuff like s2idle/s3/s4. The explosions seems to happen somewhere
after resuming the system with splats looking something like:
PM: suspend exit
rfkill: input handler disabled
xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0
xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1]
xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out
The likely cause appears to be a race between suspend cancelling the
worker that processes the free_job()'s, such that we still have pending
jobs to be freed after the cancel. Following from this, on resume the
pending_list will now contain at least one already complete job, but it
looks like we call drm_sched_resubmit_jobs(), which will then call
run_job() on everything still on the pending_list. But if the job was
already complete, then all the resources tied to the job, like the bb
itself, any memory that is being accessed, the iommu mappings etc. might
be long gone since those are usually tied to the fence signalling.
This scenario can be seen in ftrace when running a slightly modified
xe_pm IGT (kernel was only modified to inject artificial latency into
free_job to make the race easier to hit):
xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4
xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3
xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3
xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3
xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
.....
xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13
So the job_run() is clearly triggered twice for the same job, even
though the first must have already signalled to completion during
suspend. We can also see a CAT error after the re-submit.
To prevent this only resubmit jobs on the pending_list that have not yet
signalled.
v2:
- Make sure to re-arm the fence callbacks with sched_start().
v3 (Matt B):
- Stop using drm_sched_resubmit_jobs(), which appears to be deprecated
and just open-code a simple loop such that we skip calling run_job()
on anything already signalled.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856
Fixes:
dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: William Tseng <william.tseng@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://lore.kernel.org/r/20250528113328.289392-2-matthew.auld@intel.com
(cherry picked from commit
38fafa9f392f3110d2de431432d43f4eef99cd1b)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Thomas Hellström [Wed, 28 May 2025 16:41:05 +0000 (18:41 +0200)]
drm/xe: Rework eviction rejection of bound external bos
For preempt_fence mode VM's we're rejecting eviction of
shared bos during VM_BIND. However, since we do this in the
move() callback, we're getting an eviction failure warning from
TTM. The TTM callback intended for these things is
eviction_valuable().
However, the latter doesn't pass in the struct ttm_operation_ctx
needed to determine whether the caller needs this.
Instead, attach the needed information to the vm under the
vm->resv, until we've been able to update TTM to provide the
needed information. And add sufficient lockdep checks to prevent
misuse and races.
v2:
- Fix a copy-paste error in xe_vm_clear_validating()
v3:
- Fix kerneldoc errors.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes:
0af944f0e308 ("drm/xe: Reject BO eviction if BO is bound to current VM")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit
9d5558649f68e2e84a87a909631b30e15ca0f8ec)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Arnd Bergmann [Thu, 29 May 2025 17:23:56 +0000 (10:23 -0700)]
drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency
The XE driver can be built with or without VSEC support, but fails to link as
built-in if vsec is in a loadable module:
x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init':
(.text+0x1e83e16): undefined reference to `intel_vsec_register'
The normal fix for this is to add a 'depends on INTEL_VSEC || !INTEL_VSEC',
forcing XE to be a loadable module as well, but that causes a circular
dependency:
symbol DRM_XE depends on INTEL_VSEC
symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES
symbol X86_PLATFORM_DEVICES is selected by DRM_XE
The problem here is selecting a symbol from another subsystem, so change
that as well and rephrase the 'select' into the corresponding dependency.
Since X86_PLATFORM_DEVICES is 'default y', there is no change to
defconfig builds here.
Fixes:
0c45e76fcc62 ("drm/xe/vsec: Support BMG devices")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit
e4931f8be347ec5f19df4d6d33aea37145378c42)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Raag Jadav [Thu, 29 May 2025 16:09:37 +0000 (21:39 +0530)]
drm/xe: drop redundant conversion to bool
The result of integer comparison already evaluates to bool. No need for
explicit conversion.
No functional impact.
Fixes:
0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202505292205.MoljmkjQ-lkp@intel.com/
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit
61761a6b57f2818983466d24aab60baab471ba21)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Karthik Poosa [Thu, 29 May 2025 16:34:54 +0000 (22:04 +0530)]
drm/xe/hwmon: Move card reactive critical power under channel card
Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this
represents the entire card critical power/current.
v2: Update the date of curr1_crit also in hwmon documentation.
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes:
345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes")
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit
25e963a09e059ffdb15c09cc79cfded855b43668)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Karthik Poosa [Thu, 29 May 2025 16:34:53 +0000 (22:04 +0530)]
drm/xe/hwmon: Add support to manage power limits though mailbox
Add support to manage power limits using pcode mailbox commands
for supported platforms.
v2:
- Address review comments. (Badal)
- Use mailbox commands instead of registers to manage power limits
for BMG.
- Clamp the maximum power limit to GPU firmware default value.
v3:
- Clamp power limit in write also for platforms with mailbox support.
v4:
- Remove unnecessary debug prints. (Badal)
v5:
- Update description of variable pl1_on_boot to fix kernel-doc error.
v6:
- Improve commit message, refer to BIOS as GPU firmware.
- Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW.
- Rectify drm_warn to drm_info.
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes:
e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG")
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit
7596d839f6228757fe17a810da2d1c5f3305078c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Matthew Auld [Wed, 14 May 2025 15:24:26 +0000 (16:24 +0100)]
drm/xe/vm: move xe_svm_init() earlier
In xe_vm_close_and_put() we need to be able to call xe_svm_fini(),
however during vm creation we can call this on the error path, before
having actually initialised the svm state, leading to various splats
followed by a fatal NPD.
Fixes:
6fd979c2f331 ("drm/xe: Add SVM init / close / fini to faulting VMs")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com
(cherry picked from commit
4f296d77cf49fcb5f90b4674123ad7f3a0676165)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Matthew Auld [Wed, 14 May 2025 15:24:25 +0000 (16:24 +0100)]
drm/xe/vm: move rebind_work init earlier
In xe_vm_close_and_put() we need to be able to call
flush_work(rebind_work), however during vm creation we can call this on
the error path, before having actually set up the worker, leading to a
splat from flush_work().
It looks like we can simply move the worker init step earlier to fix
this.
Fixes:
dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com
(cherry picked from commit
96af397aa1a2d1032a6e28ff3f4bc0ab4be40e1d)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Rodrigo Vivi [Wed, 21 May 2025 16:51:48 +0000 (12:51 -0400)]
drm/xe: Add missing documentation of rpa_freq
While at it, already adjust the rpe_freq frequency, to highlight
that both are calculated by PCODE at runtime.
Fixes:
c6aac2fa77a3 ("drm/xe: Introduce the RPa information")
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20250521165146.39616-4-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit
39578fa40420fb11dbe4f42225a347e945d8fd0e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Rodrigo Vivi [Wed, 21 May 2025 16:51:47 +0000 (12:51 -0400)]
drm/xe: Make xe_gt_freq part of the Documentation
The documentation was created with the creation of the component,
however it has never been actually shown in the actual Documentation.
While doing this, fixes the identation style, to avoid new warnings
while building htmldocs.
Fixes:
bef52b5c7a19 ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250521165146.39616-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit
af53f0fd99c3bbb3afd29f1612c9e88c5a92cc01)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Aradhya Bhatia [Fri, 16 May 2025 12:43:55 +0000 (12:43 +0000)]
drm/xe: Default auto_link_downgrade status to false
xe_pcode_read() can return back successfully without updating the
variable 'val'. This can cause an arbitrary value to show up in the
sysfs file.
Allow the auto_link_downgrade_status to default to 0 to avoid any
arbitrary value from coming up.
Fixes:
0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Link: https://lore.kernel.org/r/20250516124355.4872-1-aradhya.bhatia@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit
a7f87deac2295d11865048bcb9c2de369b52ed93)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Aradhya Bhatia [Fri, 16 May 2025 14:19:02 +0000 (14:19 +0000)]
drm/xe/guc: Make creation of SLPC debugfs files conditional
Platforms that do not support SLPC are exempted from the GuC PC support.
The GuC PC does not get initialized, and neither do its BOs get created.
This causes a problem because the GuC PC debugfs file is still being
created. Whenever the file is attempted to read, it causes a NULL
pointer dereference on the supposed BO of the GuC PC.
So, make the creation of SLPC debugfs files conditional to when SLPC
features are supported.
Fixes:
aaab5404b16f ("drm/xe: Introduce GuC PC debugfs")
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Link: https://lore.kernel.org/r/20250516141902.5614-1-aradhya.bhatia@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit
17486cf3df5320752cc67ee8bcb2379d1b9de76c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Dave Airlie [Sun, 11 May 2025 21:14:27 +0000 (07:14 +1000)]
Merge tag 'amd-drm-next-6.16-2025-05-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.16-2025-05-09:
amdgpu:
- IPS fixes
- DSC cleanup
- DC Scaling updates
- DC FP fixes
- Fused I2C-over-AUX updates
- SubVP fixes
- Freesync fix
- DMUB AUX fixes
- VCN fix
- Hibernation fixes
- HDP fixes
- DCN 2.1 fixes
- DPIA fixes
- DMUB updates
- Use drm_file_err in amdgpu
- Enforce isolation updates
- Use new dma_fence helpers
- USERQ fixes
- Documentation updates
- Misc code cleanups
- SR-IOV updates
- RAS updates
- PSP 12 cleanups
amdkfd:
- Update error messages for SDMA
- Userptr updates
drm:
- Add drm_file_err function
dma-buf:
- Add a helper to sort and deduplicate dma_fence arrays
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250509230951.3871914-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 10 May 2025 06:13:46 +0000 (16:13 +1000)]
Merge tag 'drm-misc-next-2025-05-08' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.16-rc1:
Cross-subsystem Changes:
- Change vsprintf %p4cn to %p4chR, remove %p4cn.
Core Changes:
- Documentation updates (fb rendering, actual_brightness)
Driver Changes:
- Small fixes to appletbdrm, panthor, st7571-i2c, rockchip, renesas,
panic handler, gpusvm, vkms, panel timings.
- Add AUO B140QAN08.H, BOE NE140WUM-N6S, CSW MNE007QS3-8, BOE TD4320 panels.
- Convert rk3066_hdmi to bridge driver.
- Improve HPD on anx7625.
- Speed up loading tegra firmware, and other small fixes to tegra & host1x.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/5428be12-fc08-4e28-8c5f-85d73b8a7e04@linux.intel.com
Dave Airlie [Fri, 9 May 2025 20:10:04 +0000 (06:10 +1000)]
Merge tag 'drm-intel-next-2025-05-08' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Non-display related:
- Fix undefined reference to `intel_pxp_gsccs_is_ready_for_sessions'
Display related:
- More work towards display separation (Jani)
- Stop writing VRR_CTL_IGN_MAX_SHIFT for MTL onwards (Jouni)
- DSC checks for 3 engines (Ankit)
- Add link rate and lane count to i915_display_info (Khaled)
- PSR fixes and workaround for underrun on idle (Jouni)
- LOBF enablement and ALMP fixes (Animesh)
- Clean up VGA plane handling (Ville)
- Use an intel_connector pointer everywhere (Imre)
- Fix warning for coffeelake on SunrisePoint PCH (Jiajia)
- Rework/Correction on minimum hblank calculation (Arun)
- Dmesg clean up (Jani)
- Add a couple of simple display workarounds (Ankit, Vinod)
- Refactor HDCP GSC (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aByyL3bEufPu79OM@intel.com
Dave Airlie [Fri, 9 May 2025 19:24:39 +0000 (05:24 +1000)]
Merge tag 'drm-xe-next-2025-05-08' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- Expose PCIe link downgrade attributes (Raag)
Cross-subsystem Changes:
Core Changes:
- gpusvm has_dma_mapping fix (Dafna)
Driver Changes:
- Forcewake hold fix (Tejas)
- Fix guc_info debugfs for VFs (Daniele)
- Fix devcoredump chunk alignment calculation (Arnd)
- Don't print timedout job message on killed exec queues (Matt Brost)
- Don't flush the GSC worker from the reset path (Daniele)
- Use copy_from_user() instead of __copy_from_user() (Harish)
- Only flush SVM garbage collector if CONFIG_DRM_XE_GPUSVM (Shuicheng)
- Fix forcewake vs runtime pm ref release ordering (Shuicheng)
- Move xe_device_sysfs_init() to xe_device_probe() (Raag)
- Append PCIe Gen5 limitations to xe_firmware document (Raag)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/aBzUwbzCzz7Qo7fA@fedora
Dave Airlie [Fri, 9 May 2025 01:39:27 +0000 (11:39 +1000)]
Merge tag 'drm-intel-gt-next-2025-05-08-1' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
Driver Changes:
- Fix SLPC wait boosting reference counting to avoid getting stuck on non-boost
frequency on power saving profile on DG1/DG2 (Vinay)
- Add 20ms delay to engine reset for robustness on HSW (Nitin)
- Use proper sleeping functions for timeouts shorter than 20ms (Andi)
- Fix fence not released on early probe errors for HuC (Janusz)
- Remove const from struct i915_wa list allocation (Kees)
- Apply SPDX license format where missing and use single-line format (Andi)
- Whitespace fixes (Dan, Andi)
- Selftest improvements (Mikolaj, Badal, Sk,
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://lore.kernel.org/r/aBxNYp0IviE23zy-@jlahtine-mobl
Lijo Lazar [Fri, 11 Apr 2025 11:15:46 +0000 (16:45 +0530)]
Reapply: drm/amdgpu: Use generic hdp flush function
Except HDP v5.2 all use a common logic for HDP flush. Use a generic
function. HDP v5.2 forces NO_KIQ logic, revisit it later.
Reapply after fixing up an HDP regression.
v2: merge the fix (Alex)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Apr 2025 16:50:02 +0000 (12:50 -0400)]
drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes:
689275140cb8 ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Apr 2025 16:48:51 +0000 (12:48 -0400)]
drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes:
abe1cbaec6cf ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Thu, 24 Apr 2025 11:08:46 +0000 (19:08 +0800)]
drm/amdgpu: cleanup sriov function for psp v12
PSP v12 won't have SRIOV function.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Apr 2025 16:47:37 +0000 (12:47 -0400)]
drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes:
f756dbac1ce1 ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Apr 2025 16:46:56 +0000 (12:46 -0400)]
drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes:
cf424020e040 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Thu, 24 Apr 2025 11:00:46 +0000 (19:00 +0800)]
drm/amdgpu: remove re-route ih in psp v12
APU doesn't have second IH ring, so re-routing action here is a no-op.
It will take a lot of time to wait timeout from PSP during the
initialization. So remove the function in psp v12.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Tue, 6 May 2025 20:49:48 +0000 (15:49 -0500)]
drm/amd: Add per-ring reset for vcn v5.0.0 use
If there is a problem requiring a reset of the VCN engine, it is better to
reset the VCN engine rather than the entire GPU.
Add a reset callback for the ring which will stop and start VCN if an
issue happens.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250506204948.12048-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Tue, 6 May 2025 20:49:47 +0000 (15:49 -0500)]
drm/amd: Add per-ring reset for vcn v4.0.0 use
If there is a problem requiring a reset of the VCN engine, it is better to
reset the VCN engine rather than the entire GPU.
Add a reset callback for the ring which will stop and start VCN if an
issue happens.
Link: https://lore.kernel.org/r/20250506204948.12048-3-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Tue, 6 May 2025 20:49:46 +0000 (15:49 -0500)]
drm/amd: Add per-ring reset for vcn v4.0.5 use
There is a problem occurring on VCN 4.0.5 where in some situations a job
is timing out. This triggers a job timeout which then causes a GPU
reset for recovery. That has exposed a number of issues with GPU reset
that have since been fixed. But also a GPU reset isn't actually needed
for this circumstance. Just restarting the ring is enough.
Add a reset callback for the ring which will stop and start VCN if the
issue happens.
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3909
Link: https://lore.kernel.org/r/20250506204948.12048-2-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Apr 2025 16:45:04 +0000 (12:45 -0400)]
drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes:
c9b8dcabb52a ("drm/amdgpu/hdp4.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 30 Apr 2025 16:34:17 +0000 (12:34 -0400)]
Revert "drm/amdgpu: Use generic hdp flush function"
This reverts commit
18a878fd8aef0ec21648a3782f55a79790cd4073.
Revert this temporarily to make it easier to fix a regression
in the HDP handling.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dr. David Alan Gilbert [Wed, 7 May 2025 00:24:25 +0000 (01:24 +0100)]
drm/amd/pm/smu13: Remove unused smu_v3 functions
smu_v13_0_display_clock_voltage_request() and
smu_v13_0_set_min_deep_sleep_dcefclk() were added in 2020 by
commit
c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)")
but have remained unused.
Remove them.
smu_v13_0_display_clock_voltage_request() was the only user
of smu_v13_0_set_hard_freq_limited_range(). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dr. David Alan Gilbert [Wed, 7 May 2025 00:24:24 +0000 (01:24 +0100)]
drm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_range
The last use of smu_v11_0_get_dpm_level_range() was removed in 2020 by
commit
46a301e14e8a ("drm/amd/powerplay: drop unnecessary Navi1x specific
APIs")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dr. David Alan Gilbert [Wed, 7 May 2025 00:24:23 +0000 (01:24 +0100)]
drm/amd/pm/smu7: Remove unused smu7_copy_bytes_from_smc
smu7_copy_bytes_from_smc() was added in 2016 by
commit
1ff55f465103 ("drm/amd/powerplay: implement smu7_smumgr for asics
with smu ip version 7.")
but never used.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Wed, 7 May 2025 09:18:48 +0000 (14:48 +0530)]
drm/amdgpu: fix the indentation
fix the indentation
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:6992 gfx_v11_ip_dump
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/
202505071619.7sHTLpNg-lkp@intel.com/
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Mon, 21 Apr 2025 05:51:30 +0000 (13:51 +0800)]
drm/amdgpu: remove mdelay in psp v12
Since secure firmware is more stable than bring up phase, I believe we
don't need such mdelays any more before wait PSP response on PSP v12.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shane Xiao [Wed, 23 Apr 2025 09:28:50 +0000 (17:28 +0800)]
amd/amdkfd: Trigger segfault for early userptr unmmapping
If applications unmap the memory before destroying the userptr, it needs
trigger a segfault to notify user space to correct the free sequence in
VM debug mode.
v2: Send gpu access fault to user space
v3: Report gpu address to user space, remove unnecessary params
v4: update pr_err into one line, remove userptr log info
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shane Xiao [Thu, 24 Apr 2025 05:15:01 +0000 (13:15 +0800)]
drm/amdgpu: Add debug bit for userptr usage
In VM debug mode, it is desirable to notify the application
to correct the freeing sequence by unmapping the memory before
destroying the userptr in the old userptr path. Add a bitmask
to decide whether to send gpu vm fault to the applition.
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Tue, 6 May 2025 08:32:23 +0000 (16:32 +0800)]
drm/amdgpu: unreserve the gem BO before returning from attach error
It requires unlocking the reserved gem BO before returning from
attaching the eviction fence error.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Wed, 23 Apr 2025 12:26:48 +0000 (20:26 +0800)]
drm/amdgpu: promote the implicit sync to the dependent read fences
The driver doesn't want to implicitly sync on the DMA_RESV_USAGE_BOOKKEEP
usage fences, and the BOOKEEP fences should be synced explicitly. So, as
the VM implicit syncing only need to return and sync the dependent read
fences.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 29 Apr 2025 13:45:03 +0000 (09:45 -0400)]
drm/amdgpu/psp: mark securedisplay TA as optional
This is an optional TA which is only available on
certain embedded systems. Mark it as optional to avoid
user confusion. This mirrors what we already do for
other optional TAs.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4181
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 1 May 2025 17:46:46 +0000 (13:46 -0400)]
drm/amdgpu: fix pm notifier handling
Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
the resource evictions properly in pm prepare based on whether
we are suspending or hibernating. Drop the eviction as processes
are not frozen at this time, we we can end up getting stuck trying
to evict VRAM while applications continue to submit work which
causes the buffers to get pulled back into VRAM.
v2: Move suspend flags out of pm notifier (Mario)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes:
2965e6355dcd ("drm/amd: Add Suspend/Hibernate notification callback support")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ellen Pan [Tue, 29 Apr 2025 21:18:44 +0000 (17:18 -0400)]
drm/amdgpu: Implement unrecoverable error message handling for VFs
This notification may arrive in VF mailbox while polling for response from
another event.
This patches covers the following scenarios:
- If VF is already in RMA state, then do not attempt to contact the host.
Host will ignore the VF after sending the notification.
- If the notification is detected during polling, then set the RMA status,
and return error to caller.
- If the notification arrives by interrupt, then set the RMA status and
queue a reset. This reset will fail and VF will stop runtime services.
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ellen Pan [Tue, 29 Apr 2025 21:00:50 +0000 (17:00 -0400)]
drm/amdgpu: Add unrecoverable error message definitions for VFs
Host may stop runtime services after reaching a bad page threshold.
This notification will indicate to the VF that it no longer has
access to the GPU.
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 1 May 2025 17:00:16 +0000 (13:00 -0400)]
Revert "drm/amd: Stop evicting resources on APUs in suspend"
This reverts commit
3a9626c816db901def438dc2513622e281186d39.
This breaks S4 because we end up setting the s3/s0ix flags
even when we are entering s4 since prepare is used by both
flows. The causes both the S3/s0ix and s4 flags to be set
which breaks several checks in the driver which assume they
are mutually exclusive.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ruijing Dong [Fri, 2 May 2025 15:19:26 +0000 (11:19 -0400)]
drm/amdgpu/vcn: using separate VCN1_AON_SOC offset
VCN1_AON_SOC_ADDRESS_3_0 offset varies on different
VCN generations, the issue in vcn4.0.5 is caused by
a different VCN1_AON_SOC_ADDRESS_3_0 offset.
This patch does the following:
1. use the same offset for other VCN generations.
2. use the vcn4.0.5 special offset
3. update vcn_4_0 and vcn_5_0
Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Tue, 15 Apr 2025 02:42:19 +0000 (10:42 +0800)]
drm/amdgpu: fix the eviction fence dereference
The dma_resv_add_fence() already refers to the added fence.
So when attaching the evciton fence to the gem bo, it needn't
refer to it anymore.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ellen Pan [Tue, 29 Apr 2025 20:48:09 +0000 (16:48 -0400)]
drm/amdgpu: Implement Runtime Bad Page query for VFs
Host will send a notification when new bad pages are available.
Uopn guest request, the first 256 bad page addresses
will be placed into the PF2VF region.
Guest should pause the PF2VF worker thread while
the copy is in progress.
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ellen Pan [Tue, 29 Apr 2025 20:22:53 +0000 (16:22 -0400)]
drm/amdgpu: Add Runtime Bad Page message definitions for VFs
Currently VFs rely on poison consumption interrupt from HW
to kick off the bad page retirement process. Part of this process
includes a VF reset.
This patch adds the following:
1) Host Bad Pages notification message.
2) Guest request bad pages message.
When combined, VFs are able to reserve the pages early, and potentially
avoid future poison consumption that will disrupt user services
from consequent FLR.
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Sat, 3 May 2025 20:38:43 +0000 (14:38 -0600)]
Documentation/gpu: Add new entries to amdgpu glossary
Add some additional entries.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Sat, 3 May 2025 20:38:42 +0000 (14:38 -0600)]
drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb
Add some random documentation associated with the ring buffer
manipulations and writeback.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Huang [Fri, 2 May 2025 18:58:02 +0000 (14:58 -0400)]
drm/amdkfd: change error to warning message for SDMA queues creation
SDMA doesn't support oversubsciption, it is the user matter to create
queues over HW limit, but not supposed to be a KFD error.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Harry Wentland [Mon, 28 Apr 2025 20:26:20 +0000 (16:26 -0400)]
drm/amd/display: Don't check for NULL divisor in fixpt code
[Why]
We check for a NULL divisor but don't act on it.
This check does nothing other than throw a warning.
It does confuse static checkers though:
See https://lkml.org/lkml/2025/4/26/371
[How]
Drop the ASSERTs in both DC and SPL variants.
Fixes:
4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Fixes:
6efc0ab3b05d ("drm/amd/display: add back quality EASF and ISHARP and dc dependency changes")
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ivan Shamliev [Thu, 24 Apr 2025 15:14:53 +0000 (18:14 +0300)]
drm/amd/display: Use true/false for boolean variables in DML2 core files
Replace 0 and 1 with false and true for boolean variables in
dml2_core_dcn4_calcs.c and dml2_core_utils.c to align with the Linux
kernel coding style guidelines, which recommend using C99 bool type
with true/false values.
Signed-off-by: Ivan Shamliev <ivan.shamliev.dev@abv.bg>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
James Flowers [Sat, 3 May 2025 21:18:51 +0000 (14:18 -0700)]
drm/amd/display: adds kernel-doc comment for dc_stream_remove_writeback()
Adds kernel-doc for externally linked dc_stream_remove_writeback function.
Signed-off-by: James Flowers <bold.zone2373@fastmail.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Raag Jadav [Tue, 6 May 2025 05:48:35 +0000 (11:18 +0530)]
drm/xe/doc: Wire up PCIe Gen5 limitations
Append PCIe Gen5 limitations to xe_firmware document.
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250506054835.3395220-4-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Raag Jadav [Tue, 6 May 2025 05:48:34 +0000 (11:18 +0530)]
drm/xe: Expose PCIe link downgrade attributes
Expose sysfs attributes for PCIe link downgrade capability and status.
v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal)
Rework macros and their naming (Rodrigo)
v3: Use sysfs_create_files() (Riana)
Fix checkpatch warning (Riana)
v4: s/downspeed/downgrade (Lucas, Rodrigo, Riana)
v5: Use PCIe Gen agnostic naming (Rodrigo)
v6: s/pcie_gen/auto_link (Lucas)
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Link: https://lore.kernel.org/r/20250506054835.3395220-3-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Raag Jadav [Tue, 6 May 2025 05:48:33 +0000 (11:18 +0530)]
drm/xe: Move xe_device_sysfs_init() to xe_device_probe()
Since xe_device_sysfs_init() exposes device specific attributes, a better
place for it is xe_device_probe().
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Link: https://lore.kernel.org/r/20250506054835.3395220-2-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Shuicheng Lin [Wed, 7 May 2025 02:23:02 +0000 (02:23 +0000)]
drm/xe: Release force wake first then runtime power
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for
the release path, xe_force_wake_put() should be called first then
xe_pm_runtime_put().
Combine the error path and normal path together with goto.
Fixes:
85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Shuicheng Lin [Fri, 2 May 2025 17:00:52 +0000 (17:00 +0000)]
drm/xe: Add config control for svm flush work
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below
warning pops. Refine the flush work code to be controlled by the config
to avoid below warning:
"
[ 453.132028] ------------[ cut here ]------------
[ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0
[ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec
[ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full)
[ 453.135405] Tainted: [U]=USER, [W]=WARN
...
[ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0
[ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe
[ 453.139250] RSP: 0018:
ffffc90000c67b18 EFLAGS:
00010246
[ 453.139782] RAX:
0000000000000000 RBX:
ffff888108a24000 RCX:
0000000000002000
[ 453.140521] RDX:
0000000000000001 RSI:
0000000000000000 RDI:
ffff8881016d61c8
[ 453.141253] RBP:
ffff8881016d61c8 R08:
0000000000000000 R09:
0000000000000000
[ 453.141985] R10:
0000000000000000 R11:
0000000008a24000 R12:
0000000000000001
[ 453.142709] R13:
0000000000000002 R14:
0000000000000000 R15:
ffff888107db8c00
[ 453.143450] FS:
00007f44853d4c80(0000) GS:
ffff8882f469b000(0000) knlGS:
0000000000000000
[ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 453.144853] CR2:
00007f4487629228 CR3:
00000001016aa000 CR4:
00000000000406f0
[ 453.145594] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 453.146320] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 453.147061] Call Trace:
[ 453.147336] <TASK>
[ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30
[ 453.148067] ? xas_load+0x9/0xb0
[ 453.148435] ? xa_load+0x6f/0xb0
[ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe]
[ 453.149338] ? dev_printk_emit+0x48/0x70
[ 453.149762] ? _dev_printk+0x57/0x80
[ 453.150148] ? drm_ioctl+0x17c/0x440
[ 453.150544] ? __drm_dev_vprintk+0x36/0x90
[ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0
[ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.152560] drm_ioctl_kernel+0x9f/0xf0
[ 453.152968] drm_ioctl+0x20f/0x440
[ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100
[ 453.154489] ? memory_bm_test_bit+0x5/0x60
[ 453.154935] xe_drm_ioctl+0x47/0x70 [xe]
[ 453.155419] __x64_sys_ioctl+0x8d/0xc0
[ 453.155824] do_syscall_64+0x47/0x110
[ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e
"
v2 (Matt):
refine commit message to have more details
add Fixes tag
move the code to xe_svm.h which already have the config
remove a blank line per codestyle suggestion
Fixes:
63f6e480d115 ("drm/xe: Add SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
Harish Chegondi [Thu, 1 May 2025 19:14:45 +0000 (12:14 -0700)]
drm/xe: Use copy_from_user() instead of __copy_from_user()
copy_from_user() has more checks and is more safer than
__copy_from_user()
Suggested-by: Kees Cook <kees@kernel.org>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://lore.kernel.org/r/acabf20aa8621c7bc8de09b1bffb8d14b5376484.1746126614.git.harish.chegondi@intel.com
Jinjie Ruan [Fri, 30 Aug 2024 07:38:24 +0000 (15:38 +0800)]
gpu: host1x: Use for_each_available_child_of_node_scoped()
Avoids the need for manual cleanup of_node_put() in early exits
from the loop.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20240830073824.3539690-1-ruanjinjie@huawei.com
Qiu-ji Chen [Wed, 6 Nov 2024 09:59:06 +0000 (17:59 +0800)]
drm/tegra: Fix a possible null pointer dereference
In tegra_crtc_reset(), new memory is allocated with kzalloc(), but
no check is performed. Before calling __drm_atomic_helper_crtc_reset,
state should be checked to prevent possible null pointer dereference.
Fixes:
b7e0b04ae450 ("drm/tegra: Convert to using __drm_atomic_helper_crtc_reset() for reset.")
Cc: stable@vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20241106095906.15247-1-chenqiuji666@gmail.com
Biju Das [Wed, 5 Feb 2025 11:21:35 +0000 (11:21 +0000)]
drm/tegra: rgb: Fix the unbound reference count
The of_get_child_by_name() increments the refcount in tegra_dc_rgb_probe,
but the driver does not decrement the refcount during unbind. Fix the
unbound reference count using devm_add_action_or_reset() helper.
Fixes:
d8f4a9eda006 ("drm: Add NVIDIA Tegra20 support")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250205112137.36055-1-biju.das.jz@bp.renesas.com
Mikko Perttunen [Tue, 4 Feb 2025 02:45:46 +0000 (02:45 +0000)]
gpu: host1x: Remove mid-job CDMA flushes
The current code can issue CDMA flushes (DMAPUT bumps) in the middle
of a job, before all opcodes have been written into the pushbuffer.
This can happen when pushbuffer fills up. Presumably this made sense
at some point in the past, but it doesn't anymore, as it cannot lead
to more space appearing in the pushbuffer as it is only cleaned full
jobs at a time.
Mid-job flushes can also cause problems, as in an extreme situation
(seen in practice), the hardware can run through the entire pushbuffer
including the prefix of a partially written job without the driver
being able to process any CDMA updates. This can cause the engine
MLOCK to be taken and held for extended periods as the tail of the
job is not yet available to hardware.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250204024546.1168126-1-mperttunen@nvidia.com
Mikko Perttunen [Wed, 5 Feb 2025 06:10:27 +0000 (06:10 +0000)]
drm/tegra: falcon: Pipeline firmware copy
The Falcon DMA engine allows queueing multiple operations for
improved performance. Do this to optimize firmware loading.
A performance improvement of 4x to 6x is observed.
Co-developed-by: Ivan Raul Guadarrama <iguadarrama@nvidia.com>
Signed-off-by: Ivan Raul Guadarrama <iguadarrama@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250205061027.1205748-1-mperttunen@nvidia.com
Zhang Enpei [Wed, 2 Apr 2025 11:37:58 +0000 (19:37 +0800)]
drm/tegra: dpaux: Use dev_err_probe()
Replace the open-code with dev_err_probe() to simplify the code.
Signed-off-by: Zhang Enpei <zhang.enpei@zte.com.cn>
Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250402193758365XauggSF2EWBYY-e_jgNch@zte.com.cn
Jon Hunter [Mon, 28 Apr 2025 15:34:35 +0000 (16:34 +0100)]
drm/tegra: Remove unneeded include
The header file 'tegra_drm.h' is included in gem.c, but this file is
already include 'drm.h'. Although there is no harm in including this
file again, it is also not necessary. Furthermore, the header file is
located under 'include/uapi/drm' so ideally the full path would be
used to be explicit. For now, just remove from gem.c.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250428153435.1013101-1-jonathanh@nvidia.com
Thierry Reding [Mon, 21 Apr 2025 16:13:05 +0000 (11:13 -0500)]
drm/tegra: Assign plane type before registration
Changes to a plane's type after it has been registered aren't propagated
to userspace automatically. This could possibly be achieved by updating
the property, but since we can already determine which type this should
be before the registration, passing in the right type from the start is
a much better solution.
Suggested-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Cc: stable@vger.kernel.org
Fixes:
473079549f27 ("drm/tegra: dc: Add Tegra186 support")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250421-tegra-drm-primary-v2-1-7f740c4c2121@gmail.com
Jani Nikula [Wed, 7 May 2025 08:31:36 +0000 (11:31 +0300)]
drm/i915/rps: fix stale reference to i915->irq_lock
The RPS code has been switched from using i915->irq_lock to gt->irq_lock
years ago in commit
d762043f7ab1 ("drm/i915: Extract GT powermanagement
interrupt handling"). Fix the stale comment referencing i915->irq_lock,
which has also ceased to exist.
Reported-by: Gustavo Sousa <gustavo.sousa@intel.com>
Closes: https://lore.kernel.org/r/
174656703321.1401.
8627403371256302933@intel.com
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20250507083136.1987023-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Kees Cook [Sat, 26 Apr 2025 06:13:58 +0000 (23:13 -0700)]
drm/i915/gt: Remove const from struct i915_wa list allocation
In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)
The assigned type is "struct i915_wa *". The returned type, while
technically matching, will be const qualified. As there is no general
way to remove const qualifiers, adjust the allocation type to match
the assignment.
Signed-off-by: Kees Cook <kees@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20250426061357.work.749-kees@kernel.org
Jani Nikula [Tue, 6 May 2025 13:06:50 +0000 (16:06 +0300)]
drm/i915/irq: move i915->irq_lock to display->irq.lock
Observe that i915->irq_lock is no longer used to protect anything
outside of display. Make it a display thing.
This allows us to remove the ugly #define irq_lock irq.lock hack from xe
compat header.
Note that this is slightly more subtle than it first looks. For i915,
there's no functional change here. The lock is moved. However, for xe,
we'll now have *two* locks, xe->irq.lock and display->irq.lock. These
should protect different things, though. Indeed, nesting in the past
would've lead to a deadlock because they were the same lock.
With the i915 references gone, we can make a handful more files
independent of i915_drv.h.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/6d8d2ce0f34a9c7361a5e2fcf96bb32a34c57e76.1746536745.git.jani.nikula@intel.com
[Jani: Fixed a comment while applying.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:49 +0000 (16:06 +0300)]
drm/i915/rps: refactor display rps support
Make the gt rps code and display irq code interact via
intel_display_rps.[ch], instead of direct access. Add no-op static
inline stubs for xe instead of having a separate build unit doing
nothing. All of this clarifies the interfaces between i915 core and
display.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/ef2a46dc8f30b72282494f54e98cb5fed7523b58.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:48 +0000 (16:06 +0300)]
drm/i915/irq: make i915_enable_asle_pipestat() static
With all users of i915_enable_asle_pipestat() inside
intel_display_irq.c, we can make the function static.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/9511e368c5244aaa04cc45f46e2425737acd29fb.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:47 +0000 (16:06 +0300)]
drm/i915/irq: split out i965_display_irq_postinstall()
Split out i965_display_irq_postinstall() similar to other platforms.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/5d404dcd0c606d1cb11f2e09c45e151a75b5b2c6.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:46 +0000 (16:06 +0300)]
drm/i915/irq: split out i915_display_irq_postinstall()
Split out i915_display_irq_postinstall() similar to other platforms.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/11de06206ff10c27104b0ac3efda085bf4c1f1a6.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:45 +0000 (16:06 +0300)]
drm/i915/irq: move locking inside vlv_display_irq_postinstall()
All users of vlv_display_irq_postinstall() outside of
intel_display_irq.c have a lock/unlock pair. Move the locking inside the
function. Add an unlocked variant for internal use, similar to the
_vlv_display_irq_reset() and vlv_display_irq_reset() functions.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/93ea785d2d9bdb4e18328aa42a00a492d9d783c0.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:44 +0000 (16:06 +0300)]
drm/i915/irq: move locking inside valleyview_{enable, disable}_display_irqs()
All users of valleyview_enable_display_irqs() and
valleyview_disable_display_irqs() have a lock/unlock pair. Move the
locking inside the functions.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/bb6d941c47260aea11e4af5d52572b0e5f139929.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 13:06:43 +0000 (16:06 +0300)]
drm/i915/irq: move locking inside vlv_display_irq_reset()
All users of vlv_display_irq_reset() have a lock/unlock pair. Move the
locking inside the function.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/0f8176b777fa24921458996f7d6f982f955a52f6.1746536745.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 10:57:19 +0000 (13:57 +0300)]
drm/i915/crtc: pass struct intel_display to DISPLAY_VER()
Drop another reference to struct drm_i915_private.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/cb84073ff92a99e74ff6dfb8e395365b7cbb5332.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 10:57:18 +0000 (13:57 +0300)]
drm/i915/bios: fix a comment referencing struct drm_i915_private
struct intel_vbt_data is within struct intel_display nowadays, not
struct drm_i915_private.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/b7a9a7c64f41cf61749a42ed4102e04b500fde83.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 10:57:17 +0000 (13:57 +0300)]
drm/i915/display: remove struct drm_i915_private forward declaration
Remove unused struct drm_i915_private forward declaration from
intel_display_core.h. Sort and group forward declarations while at it.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/fbccf45339a61711b377b35fd479a67b378c5571.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Jani Nikula [Tue, 6 May 2025 10:57:16 +0000 (13:57 +0300)]
drm/i915/dsi: remove dependency on i915_drv.h
Remove final references to struct drm_i915_private and drop dependency
on i915_drv.h.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/2cee3cd9d7d9bec8dfe9c21fe5d172b1843b3d97.1746529001.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Aditya Garg [Wed, 30 Apr 2025 13:49:08 +0000 (19:19 +0530)]
checkpatch: remove %p4cn
%p4cn was recently removed and replaced by %p4chR in vsprintf. So,
remove the check for %p4cn from checkpatch.pl.
Fixes:
37eed892cc5f ("vsprintf: Use %p4chR instead of %p4cn for reading data in reversed host ordering")
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/PN3PR01MB959760B89BF7E4B43852700CB8832@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Kevin Baker [Mon, 5 May 2025 17:02:56 +0000 (12:02 -0500)]
drm/panel: simple: Update timings for AUO G101EVN010
Switch to panel timings based on datasheet for the AUO G101EVN01.0
LVDS panel. Default timings were tested on the panel.
Previous mode-based timings resulted in horizontal display shift.
Signed-off-by: Kevin Baker <kevinb@ventureresearch.com>
Fixes:
4fb86404a977 ("drm/panel: simple: Add AUO G101EVN010 panel support")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Kees Cook [Sat, 26 Apr 2025 06:14:32 +0000 (23:14 -0700)]
drm/vkms: Adjust vkms_state->active_planes allocation type
In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)
The assigned type is "struct vkms_plane_state **", but the returned type
will be "struct drm_plane **". These are the same size (pointer size), but
the types don't match. Adjust the allocation type to match the assignment.
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com>
Fixes:
8b1865873651 ("drm/vkms: totally reworked crc data tracking")
Link: https://lore.kernel.org/r/20250426061431.work.304-kees@kernel.org
Signed-off-by: Louis Chauvet <contact@louischauvet.fr>
Thomas Zimmermann [Tue, 6 May 2025 07:09:49 +0000 (09:09 +0200)]
Merge drm/drm-next into drm-misc-next
Backmerging drm-next to get fixes from v6.15-rc5.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Dave Airlie [Tue, 6 May 2025 06:39:25 +0000 (16:39 +1000)]
BackMerge tag 'v6.15-rc5' into drm-next
Linux 6.15-rc5, requested by tzimmerman for fixes required in drm-next.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Daniele Ceraolo Spurio [Fri, 2 May 2025 15:51:04 +0000 (08:51 -0700)]
drm/xe/gsc: do not flush the GSC worker from the reset path
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.
The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:
1) GSC proxy init: this only happens once immediately after GSC FW load
and does not support being interrupted. The only way to recover from
an interruption of the proxy init is to do an FLR and re-load the GSC.
2) GSC proxy request: this can happen in response to a request that
the driver sends to the GSC. If this is interrupted, the GSC FW will
timeout and the driver request will be failed, but overall the GSC
will keep working fine.
Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).
Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.
v2: fix spelling in commit msg, rename waiter function (Julia)
Fixes:
dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
Mario Limonciello [Tue, 15 Apr 2025 19:20:59 +0000 (14:20 -0500)]
docs: backlight: Clarify `actual_brightness`
Currently userspace software systemd treats `brightness` and
`actual_brightness` identically due to a bug found in an out of tree
driver.
This however causes problems for in-tree drivers that use brightness
to report user requested `brightness` and `actual_brightness` to report
what the hardware actually has programmed.
Clarify the documentation to match the behavior described in commit
6ca017658b1f9 ("[PATCH] backlight: Backlight Class Improvements").
Cc: Lee Jones <lee@kernel.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: richard.purdie@linuxfoundation.org
Link: https://github.com/systemd/systemd/pull/36881
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250415192101.2033518-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Arvind Yadav [Fri, 25 Apr 2025 14:19:05 +0000 (19:49 +0530)]
drm/amdgpu: only keep most recent fence for each context
Keep only the latest fences to reduce the number of values
given back to userspace
v2: - Export this code from dma-fence-unwrap.c(by Christian).
v3: - To split this in a dma_buf patch and amd userq patch(by Sunil).
- No need to add a new function just re-use existing(by Christian).
v4: Export dma_fence_dedub_array function and used it(by Christian).
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Srinivasan Shanmugam [Mon, 28 Apr 2025 10:46:34 +0000 (16:16 +0530)]
drm/amdgpu: Add Support for enforcing isolation without Cleaner Shader
Adjusted the enforce isolation setting handling to include the ability
to disable the cleaner shader without affecting isolation between tasks.
v2: Updated enforce isolation documentation and parameters. (Alex)
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Fri, 25 Apr 2025 14:06:50 +0000 (19:36 +0530)]
dma-fence: Add helper to sort and deduplicate dma_fence arrays
Export a new helper function `dma_fence_dedup_array()` that sorts
an array of dma_fence pointers by context, then deduplicates the array
by retaining only the most recent fence per context.
This utility is useful when merging or optimizing sets of fences where
redundant entries from the same context can be pruned. The operation is
performed in-place and releases references to dropped fences using
dma_fence_put().
v2: - Export this code from dma-fence-unwrap.c(by Christian).
v3: - To split this in a dma_buf patch and amd userq patch(by Sunil).
- No need to add a new function just re-use existing(by Christian).
v4: - Export dma_fence_dedub_array and use it(by Christian).
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 22 Apr 2025 13:19:02 +0000 (18:49 +0530)]
drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver
update the functions in amdgpu_userqueues.c from
DRM_DBG_DRIVER to drm_dbg_driver so multi gpu instance
can be logged in.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 22 Apr 2025 13:10:03 +0000 (18:40 +0530)]
drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userq.c
change the DRM_ERROR and drm_err to drm_file_err
to add process name and pid to the logging.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 22 Apr 2025 12:40:22 +0000 (18:10 +0530)]
drm/amdgpu: use drm_file_err in fence timeouts
use drm_file_err instead of DRM_ERROR which adds
process and pid information in the userqueue error
logging.
Sample log:
[ 19.802315] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: ibus-x11 pid: 2055 client: Unset ... Couldn't unmap all the queues
[ 19.802319] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: ibus-x11 pid: 2055 client: Unset ... Failed to evict userqueue
[ 19.838432] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: systemd-logind pid: 1042 client: Unset ... Couldn't unmap all the queues
[ 19.838436] amdgpu 0000:0a:00.0: [drm] *ERROR* comm: systemd-logind pid: 1042 client: Unset ... Failed to evict userqueue
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Tue, 22 Apr 2025 12:25:50 +0000 (17:55 +0530)]
drm/amdgpu: add drm_file reference in userq_mgr
drm_file will be used in usermode queues code to
enable better process information in logging and hence
add drm_file part of the userq_mgr struct.
update the drm_file pointer in userq_mgr for each
amdgpu_driver_open_kms.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Wed, 9 Apr 2025 11:39:23 +0000 (17:09 +0530)]
drm: add drm_file_err function to add process info
Add a drm helper function which appends the process information for
the drm_file over drm_err formatted output.
v5: change to macro from function (Christian Koenig)
add helper functions for lock/unlock (Christian Koenig)
v6: remove __maybe_unused and make function inline (Jani Nikula)
remove drm_print.h
v7: Use va_format and %pV to concatenate fmt and vargs (Jani Nikula)
v8: Code formatting and typos (Ursulin tvrtko)
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Taimur Hassan [Mon, 21 Apr 2025 18:30:58 +0000 (13:30 -0500)]
drm/amd/display: Promote DC to 3.2.331
Summary
* Remove redundant NULL check
* Fix invalid context error in dml helper
* Prepare for Fused I2C-over-AUX
* Allow DSCClock disable
* Vmax / Vmin update for Vsync
* Fix race condition in DPIA AUX transfer
* Fix wrong handling for AUX_DEFER case
* Only wait for required space in DMUB mailbox
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Thu, 24 Apr 2025 15:48:46 +0000 (11:48 -0400)]
drm/amd/display: Only wait for required free space in DMUB mailbox
[WHY&HOW]
When command submission is blocked by a full mailbox, only wait for
enough space to free to submit the command, instead of waiting for idle.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Meenakshikumar Somasundaram [Wed, 23 Apr 2025 18:42:53 +0000 (14:42 -0400)]
drm/amd/display: Assign preferred stream encoder instance to dpia
[Why]
For dpia, preferred engine instance availability is not checked
when assigning stream encoder instance.
[How]
Check for dpia preferred engine id and assign the same stream
encoder instance for the stream if available.
Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>