git.monstr.eu Git - linux-2.6-microblaze.git/log

drm/xe/migrate: use XE_BO_FLAG_PAGETABLE

On some HW we want to avoid the host caching PTEs, since access from GPU
side can be incoherent. However here the special migrate object is
mapping PTEs which are written from the host and potentially cached. Use
XE_BO_FLAG_PAGETABLE to ensure that non-cached mapping is used, on
platforms where this matters.

Fixes: 7a060d786cc1 ("drm/xe/mtl: Map PPGTT as CPU:WC")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241126181259.159713-4-matthew.auld@intel.com

drm/xe/migrate: fix pat index usage

XE_CACHE_WB must be converted into the per-platform pat index for that
particular caching mode, otherwise we are just encoding whatever happens
to be the value of that enum.

Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v6.12+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241126181259.159713-3-matthew.auld@intel.com

drm/xe/xe3lpg: Add Wa_16024792527

Force Sampler Tile64 Overfetch via MMIO

Signed-off-by: Apoorva Singh <apoorva.singh@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241107082158.1436637-1-apoorva.singh@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/guc_submit: fix race around suspend_pending

Currently in some testcases we can trigger:

xe 0000:03:00.0: [drm] Assertion `exec_queue_destroyed(q)` failed!
....
WARNING: CPU: 18 PID: 2640 at drivers/gpu/drm/xe/xe_guc_submit.c:1826 xe_guc_sched_done_handler+0xa54/0xef0 [xe]
xe 0000:03:00.0: [drm] *ERROR* GT1: DEREGISTER_DONE: Unexpected engine state 0x00a1, guc_id=57

Looking at a snippet of corresponding ftrace for this GuC id we can see:

162.673311: xe_sched_msg_add:     dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
162.673317: xe_sched_msg_recv:    dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
162.673319: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
162.674089: xe_exec_queue_kill:   dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
162.674108: xe_exec_queue_close:  dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
162.674488: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
162.678452: xe_exec_queue_deregister: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa1, flags=0x0

It looks like we try to suspend the queue (opcode=3), setting
suspend_pending and triggering a disable_scheduling. The user then
closes the queue. However the close will also forcefully signal the
suspend fence after killing the queue, later when the G2H response for
disable_scheduling comes back we have now cleared suspend_pending when
signalling the suspend fence, so the disable_scheduling now incorrectly
tries to also deregister the queue. This leads to warnings since the queue
has yet to even be marked for destruction. We also seem to trigger
errors later with trying to double unregister the same queue.

To fix this tweak the ordering when handling the response to ensure we
don't race with a disable_scheduling that didn't actually intend to
perform an unregister.  The destruction path should now also correctly
wait for any pending_disable before marking as destroyed.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3371
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-6-matthew.auld@intel.com

drm/xe/guc_submit: fix race around pending_disable

Currently in some testcases we can trigger:

[drm] *ERROR* GT0: SCHED_DONE: Unexpected engine state 0x02b1, guc_id=8, runnable_state=0
[drm] *ERROR* GT0: G2H action 0x1002 failed (-EPROTO) len 3 msg 02 10 00 90 08 00 00 00 00 00 00 00

Looking at a snippet of corresponding ftrace for this GuC id we can see:

498.852891: xe_sched_msg_add:     dev=0000:03:00.0, gt=0 guc_id=8, opcode=3
498.854083: xe_sched_msg_recv:    dev=0000:03:00.0, gt=0 guc_id=8, opcode=3
498.855389: xe_exec_queue_kill:   dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x3, flags=0x0
498.855436: xe_exec_queue_lr_cleanup: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x83, flags=0x0
498.856767: xe_exec_queue_close:  dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x83, flags=0x0
498.862889: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0xa9, flags=0x0
498.863032: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b9, flags=0x0
498.875596: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b9, flags=0x0
498.875604: xe_exec_queue_deregister: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b1, flags=0x0
499.074483: xe_exec_queue_deregister_done: dev=0000:03:00.0, 5:0x1, gt=0, width=1, guc_id=8, guc_state=0x2b1, flags=0x0

This looks to be the two scheduling_disable racing with each other, one
from the suspend (opcode=3) and then again during lr cleanup. While
those two operations are serialized, the G2H portion is not, therefore
when marking the queue as pending_disabled and then firing off the first
request, we proceed do the same again, however the first disable
response only fires after this which then clears the pending_disabled.
At this point the second comes back and is processed, however the
pending_disabled is no longer set, hence triggering the warning.

To fix this wait for pending_disabled when doing the lr cleanup and
calling disable_scheduling_deregister. Also do the same for all other
disable_scheduling callers.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3515
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <mattheq.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-5-matthew.auld@intel.com

drm/xe/trace: improve xe_sched_msg trace

Also include the gt_id, that way we can ignore duplicate guc_id across
different GTs when applying some filtering.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-4-matthew.auld@intel.com

drm/xe/vram: fix lpfn check

Technically we should check the lfpn value and not the place->lpfn, for
the case where the allocation itself could be as large as the entire
region and not be range based, which might result in incorrectly doing a
power-of-two roundup. The allocator itself will already ensure it's
contiguous underneath for such an allocation. This shouldn't fix any
current usecase, but never the less came up from some internal testing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241119101926.190203-2-matthew.auld@intel.com

drm/xe: Update xe2_graphics name string

Since both Xe2 and Xe3 platforms currently use the same set of graphics
IP feature flags, we associate the "graphics_xe2" structure with both IPs.
Update the name string on that IP structure to clarify this and avoid
confusion as Xe3 platforms start going into public CI.

Fixes: 800d75bf20ae ("drm/xe/xe3: Define Xe3 feature flags")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241125194838.1190599-2-matthew.d.roper@intel.com

drm/xe/guc: Add support for G2G communications

Some features require inter-GuC communication channels on multi-tile
devices. So allocate and enable such.

v2: Correct use of xe_bo_get/put (review feedback from Matthew Brost)
Add extra assert, re-order a calculation for better clarity and add
comments to slot calculation (review feedback from Daniele). Also
slightly re-work the slot calc to avoid code duplication.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241120000222.204095-3-John.C.Harrison@Intel.com

drm/xe: Allow bo mapping on multiple ggtts

Make bo->ggtt an array to support bo mapping on multiple ggtts.
Add XE_BO_FLAG_GGTTx flags to map the bo on ggtt of tile 'x'.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241120000222.204095-2-John.C.Harrison@Intel.com

drm/xe/ptl: Add another PTL PCI ID

An additional pci id has been added to bspec.

Bspec: 72574
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114234410.145863-1-matthew.s.atwood@intel.com

drm/xe/pf: Drop 2GiB limit of fair LMEM allocation

Since commit 678ccbf98796 ("drm/xe/vram: drop 2G block
restriction") we are able to provision VFs with more than 2GiB.
Drop our temporary limit of maximum fair LMEM size that was added
just to avoid hitting -EINVAL from auto-provisioning.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241121175754.302-1-michal.wajdeczko@intel.com

drm/xe: Sort again the info flags

Those flags are supposed to be kept sorted alphabetically. Unfortunately
it's a constant battle as new flags are added to the end or at random
places. Sort it again.

v2: Include the other non-has_* 1-bit flags in the sort
v3: Add comment to keep flags sorted

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> # v1
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241120195710.3447100-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Drop useless d3cold allowed message

This message just spams dmesg providing little benefit. Remove it.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241115192155.2050987-1-matthew.brost@intel.com

drm/xe/vram: drop 2G block restriction

Currently we limit the max block size for all users to ensure each block
can fit within a sg entry (uint). Drop this restriction and tweak the sg
construction to instead handle this itself and break down blocks which
are too big, if needed. Most users don't need an sg list in the first
place.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Tested-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241113172346.256165-2-matthew.auld@intel.com

drm/xe: Split xe_gt_stat.h

Follow what's done for the other headers, with the types split into a
separate header that can be included by other *_types.h headers.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114152148.572447-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Drop HAS_HECI_*

Just do the same as for other has_* flags, without a macro.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114152148.572447-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Include xe_oa_types.h

xe_device_types.h and xe_gt_types.h only need to know about the xe_oa
struct sizes. Include only the _types.h, like done for other components,
and let the full header to be included by the compilation units.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114152148.572447-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Mark preempt fence workqueue as reclaim

Preempt fences are in the path of reclaim, and we signal these fences in
the preempt workqueue. With that, we need to mark the preempt fence
workqueue with reclaim so that this workqueue can make forward progress
during reclaim.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241113171751.1677784-1-matthew.brost@intel.com

drm/xe/ufence: Wake up waiters after setting ufence->signalled

If a previous ufence is not signalled, vm_bind will return -EBUSY.
Delaying the modification of ufence->signalled can cause issues if the
UMD reuses the same ufence so update ufence->signalled before waking up
waiters.

Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3233
Fixes: 977e5b82e090 ("drm/xe: Expose user fence from xe_sync_entry")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114150537.4161573-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>

drm/xe/guc: Remove duplicate source field

xe_hw_engine_snapshot.source save the information of where data copied
from. Because the 'source' field is already populated inside
'matched_node' ptr hanging off xe_devcoredump_snapshot, which happenned
either in guc_capture_extract_reglists or xe_engine_manual_capture, we
can remove this redundant copy of 'source' from xe_hw_engine_snapshot.

Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Changes from prior revs:
v2:- Update commit message

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241107213841.436384-1-zhanjun.dong@intel.com

drm/xe: Wire devcoredump to LR TDR

LR queues can hang, cause engine reset, or cause IOMMU CAT errors.
Collect an error capture when this occurs.

v2:
- s/queue's/queues (Jonathan)
v4:
- Fix build (CI)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-8-matthew.brost@intel.com

drm/xe: Change xe_engine_snapshot_capture_for_job to be for queue

During capture time, the target job may be unavailable (e.g., if it's in
LR mode). However, the associated exec queue will be available
regardless, change xe_engine_snapshot_capture_for_job to take a queue
argument ann rename to xe_engine_snapshot_capture_for_queue.

v2:
- Reword commit message (Jonathan)
- Remove redundant queueu check (Zhanjun)
- Remove devcoredump job member (Zhanjun)

Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-7-matthew.brost@intel.com

drm/xe: Improve schedule disable response failure

Print Guc ID and take devcoredump on schedule disable response failure.
GuC ID is useful information and a schedule disable response failure is
possible the LRC state is corrupted so a devcoredump is helpful to debug.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-6-matthew.brost@intel.com

drm/xe: Add exec queue param to devcoredump

During capture time, the target job may be unavailable (e.g., if it's in
LR mode). However, the associated exec queue will be available
regardless, so add an exec queue param for such cases.

v2:
- Reword commit message (Jonathan)

Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-5-matthew.brost@intel.com

drm/xe: Add ring start to LRC snapshot

Add LRC ring start register to LRC snapshot to verify no LRC register
corruption upon hang. This could be possible if the indirect ring state
was mapped to user space or via an internal KMD memory corruption.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-4-matthew.brost@intel.com

drm/xe: Add ring address to LRC snapshot

The ring is currently in LRC BO but this may change going forward.
Include the ring address in the snapshot protecting again any future
changes.

v2:
- s/ring_desc/ring_addr (Jonathan)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-3-matthew.brost@intel.com

drm/xe: Add xe_ring_lrc_is_idle() helper

Add helper to compare ring head and tail to determine if LRC is idle.

v2:
- Fix kernel doc (CI, Zhanjun)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114022522.1951351-2-matthew.brost@intel.com

drm/xe: Ignore GGTT TLB inval errors during GT reset

During GT reset, GGTT TLB invalidations may fail. This is acceptable
as the reset will clear GGTT caches. Suppress only -ECANCELED other
return codes are still unexpected error.

v2: Add code comment(Matt).

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3389
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241112170123.1236443-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>

drm/xe: Allow fault injection in vm create and vm bind IOCTLs

Use fault injection infrastructure to allow specific functions to
be configured over debugfs for failing during the execution of
xe_vm_create_ioctl() and xe_vm_bind_ioctl(). This allows more
thorough testing from user space by going through code paths for
error handling and unwinding which cannot be reached by simply
injecting errors in IOCTL arguments. This can help increase code
robustness.

v2: Add xe_pt_update_ops_{prepare,run} (Matthew Brost)

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241113162212.2154103-1-francois.dugast@intel.com
Signed-off-by: Francois Dugast <francois.dugast@intel.com>

drm/xe/oa: Fix "Missing outer runtime PM protection" warning

Fix the following drm_WARN:

[953.586396] xe 0000:00:02.0: [drm] Missing outer runtime PM protection
...
<4> [953.587090]  ? xe_pm_runtime_get_noresume+0x8d/0xa0 [xe]
<4> [953.587208]  guc_exec_queue_add_msg+0x28/0x130 [xe]
<4> [953.587319]  guc_exec_queue_fini+0x3a/0x40 [xe]
<4> [953.587425]  xe_exec_queue_destroy+0xb3/0xf0 [xe]
<4> [953.587515]  xe_oa_release+0x9c/0xc0 [xe]

Suggested-by: John Harrison <john.c.harrison@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241109032003.3093811-1-ashutosh.dixit@intel.com

drm/xe: Sample gpu timestamp closer to exec queues

Move the force_wake_get to the beginning of the function so the gpu
timestamp can be closer to the sampled value for exec queues. This
avoids additional delays waiting for force wake ack which can make the
proportion between cycles/total_cycles fluctuate around the real value.
For a gputop-like application getting 2 samples to calculate the
utilization:

sample 0:
read_exec_queue_timestamp
<<<< (A)
read_gpu_timestamp
sample 1:
read_exec_queue_timestamp
<<<<< (B)
read_gpu_timestamp

In the above case, utilization can be bigger or smaller than it should
be, depending on if (A) or (B) receives additional delay, respectively.

With this a LNL system that was failing on
`xe_drm_fdinfo --r utilization-single-full-load` after ~60 iterations,
get to run to 100 without a failure. This is still not perfect, and it's
easy to introduce errors by just loading the CPU with `stress --cpu
$(nproc)` - the same igt test in this case fails after 2 or 3
iterations. That will be dealt with in the test itself, using a longer
sampling period.

v2: Rename function and add another to get "any engine", preparing for
caching the hwe in future (Umesh / Jonathan)

Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241108053318.3483678-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Wait on killed exec queues

When an exec queue is killed it triggers an async process of asking the
GuC to schedule the context out. The timestamp in the context image is
only updated when this process completes. In case a userspace process
kills an exec and tries to read the timestamp, it may not get an updated
runtime.

Add synchronization between the process reading the fdinfo and the exec
queue being killed. After reading all the timestamps, wait on exec
queues in the process of being killed. When that wait is over,
xe_exec_queue_fini() was already called and updated the timestamps.

v2: Do not update pending_removal before validating user args
    (Matthew Auld)
v3: Move wait on pending to be done before getting any timestamp
    so it's more likely for the gpu and exec queue timestamps to
    be closer together

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2667
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241108053318.3483678-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: handle flat ccs during hibernation on igpu

Starting from LNL, CCS has moved over to flat CCS model where there is
now dedicated memory reserved for storing compression state. On
platforms like LNL this reserved memory lives inside graphics stolen
memory, which is not treated like normal RAM and is therefore skipped by
the core kernel when creating the hibernation image. Currently if
something was compressed and we enter hibernation all the corresponding
CCS state is lost on such HW, resulting in corrupted memory. To fix this
evict user buffers from TT -> SYSTEM to ensure we take a snapshot of the
raw CCS state when entering hibernation, where upon resuming we can
restore the raw CCS state back when next validating the buffer. This has
been confirmed to fix display corruption on LNL when coming back from
hibernation.

Fixes: cbdc52c11c9b ("drm/xe/xe2: Support flat ccs")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3409
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241112162827.116523-2-matthew.auld@intel.com

drm/xe/guc: Support crash dump notification from GuC

Add support for the two crash dump notifications from GuC. Either one
means GuC is toast, so just capture state trigger a reset.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241108212737.2044007-3-John.C.Harrison@Intel.com

drm/xe/guc: Reduce default GuC log verbosity

Drop the default verbosity from 5 (max) to 3 as the extra verbosity
generally doesn't provide anything vitally important but does cause
rapid log overflow.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241108212737.2044007-2-John.C.Harrison@Intel.com

drm/xe/gsc: Improve SW proxy error checking and logging

If an error occurs in the GSC<->CSME handshake, the GSC will send a
PROXY_END msg to the driver with the status set to an error code. We
currently don't check the status when receiving a PROXY_END message and
instead check the proxy initialization status in the FWSTS reg;
therefore, while still catching any initialization failures, we lose the
actual returned error code. This can be easily improved by checking the
status value and printing it to dmesg if it's an error.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106002402.486700-1-daniele.ceraolospurio@intel.com

drm/xe: Take job list lock in xe_sched_first_pending_job

Access to the pending_list should always happens under job_list_lock.

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105160327.2970277-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>

drm/xe/guc: Do not assert CTB state while sending MMIO

During VF post-migration recovery, MMIO communication channel to GuC
is used, despite CTB channel being enabled. This behavior is rooted
in the save-restore architecture specification.

Therefore, a VF driver cannot assert that CTB is disabled while sending
MMIO messages to GuC. Such assertion needs to be PF only, or be removed.

This patch simply removes the assertion.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241107151357.1623733-2-tomasz.lis@intel.com

drm/xe/guc: Prefer GT oriented logs in submit code

For better diagnostics, use xe_gt_err() instead of drm_err().

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241107194741.2167-3-michal.wajdeczko@intel.com

drm/xe/guc: Prefer GT oriented asserts in submit code

For better diagnostics, use xe_gt_assert() instead of xe_assert().

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241107194741.2167-2-michal.wajdeczko@intel.com

drm/xe: improve hibernation on igpu

The GGTT looks to be stored inside stolen memory on igpu which is not
treated as normal RAM. The core kernel skips this memory range when
creating the hibernation image, therefore when coming back from
hibernation the GGTT programming is lost. This seems to cause issues
with broken resume where GuC FW fails to load:

[drm] *ERROR* GT0: load failed: status = 0x400000A0, time = 10ms, freq = 1250MHz (req 1300MHz), done = -1
[drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
[drm] *ERROR* GT0: firmware signature verification failed
[drm] *ERROR* CRITICAL: Xe has declared device 0000:00:02.0 as wedged.

Current GGTT users are kernel internal and tracked as pinned, so it
should be possible to hook into the existing save/restore logic that we
use for dgpu, where the actual evict is skipped but on restore we
importantly restore the GGTT programming. This has been confirmed to
fix hibernation on at least ADL and MTL, though likely all igpu
platforms are affected.

This also means we have a hole in our testing, where the existing s4
tests only really test the driver hooks, and don't go as far as actually
rebooting and restoring from the hibernation image and in turn powering
down RAM (and therefore losing the contents of stolen).

v2 (Brost)
- Remove extra newline and drop unnecessary parentheses.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3275
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241101170156.213490-2-matthew.auld@intel.com

drm/xe: Add gt_id to xe_sched_job traces

In order to uniquely identify the jobs, xe_sched_job tracepoints need
to have the tuple (gt_id, guc_id). Find a "hole" where the next entry is
unaligned and add one more field.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241107060606.3130885-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/pf: Adjust scheduling priority based on policy change

Local values of scheduling priorities need to be adjusted when
changing the schedule_if_idle policy as those are related.

Disabling schedule_if_idle policy forces the low priority, and
enabling schedule_if_idle policy forces the normal priority.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-6-michal.wajdeczko@intel.com

drm/xe/pf: Allow to control scheduling priority using debugfs

Add 'sched_priority' attribute that will represents the scheduling
prority of the VF or PF. Values {0,1,2} correspond to priorities
{LOW,NORMAL,HIGH} respectively.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-5-michal.wajdeczko@intel.com

drm/xe/pf: Add functions to configure VF scheduling priority

Add functions to configure PF or VF scheduling priority using the
VF_CFG_SCHED_PRIORITY KLV.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-4-michal.wajdeczko@intel.com

drm/xe/guc: Add VF_CFG_SCHED_PRIORITY to KLV helper

Recognize new VF_CFG_SCHED_PRIORITY key in to_string() helper.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-3-michal.wajdeczko@intel.com

drm/xe/guc: Add VF_CFG_SCHED_PRIORITY_KEY KLV definition

This KLV allows to set the scheduling priority for each VF, also
for the PF.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106151301.2079-2-michal.wajdeczko@intel.com

drm/xe: Ensure all locks released in exec IOCTL

In couple of places the wrong error handling goto was used to release
locks. Fix these to ensure all locks dropped on exec IOCTL errors.

Cc: Francois Dugast <francois.dugast@intel.com>
Fixes: d16ef1a18e39 ("drm/xe/exec: Switch hw engine group execution mode upon job submission")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106224944.30130-1-matthew.brost@intel.com

drm/xe/guc: Don't treat GuC generic CAT error as protocol error

GuC uses GUC_ID_UNKNOWN if it can not map the CAT fault to any
context. We shouldn't treat that as G2H protocol error that would
justify a GT reset, as it may happen due to some VF activity.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105204557.1991-1-michal.wajdeczko@intel.com

drm/xe/guc: Don't read data from G2H prior to length check

While highly unlikely, incoming G2H message might be too short
so we shouldn't read any data from it prior to checking a length.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105173032.1947-4-michal.wajdeczko@intel.com

drm/xe/guc: Drop redundant logs about invalid G2H length

We are now logging details of the failed G2H message (including
its length) at the GuC CT component. Drop now redundant log from
the GuC submit code.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105173032.1947-3-michal.wajdeczko@intel.com

drm/xe/guc: Log content of the failed G2H message

We are already logging an error once we failed to process a G2H
message, but then it's quite hard to extract the content of the
broken G2H message from the captured snapshot. Extend our error
log with the raw hexdump of the G2H message.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105173032.1947-2-michal.wajdeczko@intel.com

drm/xe/vf: Defer fixups if migrated twice fast

If another VF migration happened during post-migration recovery,
then the current worker should be finished to allow the next
one start swiftly and cleanly.

Check for defer in two places: before fixups, and before
sending RESFIX_DONE.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-6-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/vf: Start post-migration fixups with provisioning query

During post-migration recovery, only MMIO communication to GuC is
allowed. The VF KMD needs to use that channel to ask for the new
provisioning, which includes a new GGTT range assigned to the VF.

v2: query config only instead of handshake; no need to get pm ref as
it's now kept through whole recovery (Michal)
v3: switched names of 'err' and 'ret' (Michal)

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-5-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/vf: Send RESFIX_DONE message at end of VF restore

After restore, GuC will not answer to any messages from VF KMD until
fixups are applied. When that is done, VF KMD sends RESFIX_DONE
message to GuC, at which point GuC resumes normal operation.

This patch implements sending the RESFIX_DONE message at end of
post-migration recovery.

v2: keep pm ref during whole recovery, style fixes (Michal)
v3: assert removal to separate patch, debug message per GuC instead
of one, comments changes (Michal)
v4: improve one debug message (Michal)

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-4-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/vf: Document SRIOV VF restore flow

This adds a documentation chapter, containing high level flow
of VF restore procedure.

v2: Better describe initial conditions, include GuC states on
sequence diagram (Michal)
v3: moved DOC to .c (Michal)

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-3-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/vf: React to MIGRATED interrupt

To properly support VF Save/Restore procedure, fixups need to be
applied after PF driver finishes its part of VF Restore. The fixups
are required to adjust the ongoing execution for a hardware switch
that happened, because some GFX resources are not fully virtualized,
and assigned to a VF as range from a global pool. The VF on which
a VM is restored will often have different ranges provisioned than
the VF on which save process happened. Those resource fixups are
applied by the VF driver within a restored VM.

A VF driver gets informed that it was migrated by receiving an
interrupt from each GuC. The interrupt assigned for that purpose
is "GUC SW interrupt 0". Seeing that fields set from within the
irq handler should be the trigger for fixups.

The VF can safely do post-migration fixups on resources associated
to each GuC only after that GuC issued the MIGRATED interrupt.

This change introduces a worker to be used for post-migration fixups,
and a mechanism to schedule said worker when all GuCs sent the irq.

v2: renamed and moved functions, updated logged messages, removed
  unused includes, used anon struct (Michal)
v3: ordering, kerneldoc, asserts, debug messages,
  on_all_tiles -> on_all_gts (Michal)
v4: fixed missing header include
v5: Explained what fixups are, explained which IRQ is used, style
  fixes (Michal)

Bspec: 50868
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-2-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe: Reword exec_queue and vm lock doc

Reword documentation to possibly what it meant to be.

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Stop accumulating LRC timestamp on job_free

The exec queue timestamp is only really useful when it's being queried
through the fdinfo. There's no need to update it so often, on every
job_free. Tracing a simple app like vkcube running shows an update
rate of ~ 120Hz. In case of discrete, the BO is on vram, creating a lot
of pcie transactions.

The update on job_free() is used to cover a gap: if exec
queue is created and destroyed rapidly, before a new query, the
timestamp still needs to be accumulated and accounted for in the xef.

Initial implementation in commit 6109f24f87d7 ("drm/xe: Add helper to
accumulate exec queue runtime") couldn't do it on the exec_queue_fini
since the xef could be gone at that point. However since commit
ce8c161cbad4 ("drm/xe: Add ref counting for xe_file") the xef is
refcounted and the exec queue always holds a reference, making this safe
now.

Improve the fix in commit 2149ded63079 ("drm/xe: Fix use after free when
client stats are captured") by reducing the frequency in which the
update is needed.

Fixes: 2149ded63079 ("drm/xe: Fix use after free when client stats are captured")
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Add trace to lrc timestamp update

Help debugging when LRC timestamp is updated for a exec queue.

Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe/pf: Fix potential GGTT allocation leak

In unlikely event that we fail during sending the new VF GGTT
configuration to the GuC, we will free only the GGTT node data
struct but will miss to release the actual GGTT allocation.

This will later lead to list corruption, GGTT space leak and
finally risking crash when unloading the driver:

[ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-EIO)
[ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT

[ ] list_add corruption. next->prev should be prev (ffff88813cfcd628), but was 0000000000000000. (next=ffff88813cfe2028).
[ ] RIP: 0010:__list_add_valid_or_report+0x6b/0xb0
[ ] Call Trace:
[ ]  drm_mm_insert_node_in_range+0x2c0/0x4e0
[ ]  xe_ggtt_node_insert+0x46/0x70 [xe]
[ ]  pf_provision_vf_ggtt+0x7f5/0xa70 [xe]
[ ]  xe_gt_sriov_pf_config_set_ggtt+0x5e/0x770 [xe]
[ ]  ggtt_set+0x4b/0x70 [xe]
[ ]  simple_attr_write_xsigned.constprop.0.isra.0+0xb0/0x110

[ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-ENOSPC)
[ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT

[ ] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP NOPTI
[ ] RIP: 0010:drm_mm_remove_node+0x1b7/0x390
[ ] Call Trace:
[ ]  <TASK>
[ ]  ? die_addr+0x2e/0x80
[ ]  ? exc_general_protection+0x1a1/0x3e0
[ ]  ? asm_exc_general_protection+0x22/0x30
[ ]  ? drm_mm_remove_node+0x1b7/0x390
[ ]  ggtt_node_remove+0xa5/0xf0 [xe]
[ ]  xe_ggtt_node_remove+0x35/0x70 [xe]
[ ]  xe_ttm_bo_destroy+0x123/0x220 [xe]
[ ]  intel_user_framebuffer_destroy+0x44/0x70 [xe]
[ ]  intel_plane_destroy_state+0x3b/0xc0 [xe]
[ ]  drm_atomic_state_default_clear+0x1cd/0x2f0
[ ]  intel_atomic_state_clear+0x9/0x20 [xe]
[ ]  __drm_atomic_state_free+0x1d/0xb0

Fix that by using pf_release_ggtt() on the error path, which now
works regardless if the node has GGTT allocation or not.

Fixes: 34e804220f69 ("drm/xe: Make xe_ggtt_node struct independent")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104144901.1903-1-michal.wajdeczko@intel.com

drm/xe: Drop VM dma-resv lock on xe_sync_in_fence_get failure in exec IOCTL

Upon failure all locks need to be dropped before returning to the user.

Fixes: 58480c1c912f ("drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-3-matthew.brost@intel.com

drm/xe: Fix possible exec queue leak in exec IOCTL

In a couple of places after an exec queue is looked up the exec IOCTL
returns on input errors without dropping the exec queue ref. Fix this
ensuring the exec queue ref is dropped on input error.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-2-matthew.brost@intel.com

drm/xe: Fix case for asserts in documentation

The rendered html documentation for "Xe ASSERTs" doesn't look nice with
the mixed caps and gives the impression it was a typo. Use Title Case
Style.

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105071539.2623727-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Wire up devcoredump in documentation

Add a documentation page to detail the device coredump as implemented by
xe - it was documented in the source code, but not visible in the
rendered (html) documentation.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241102161254.1818604-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Improve devcoredump documentation

Let the introduction be useful for both userspace and kernel. Also
improve the formatting to wire up later to the documentation build.

Reviewed-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241102161254.1818604-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Fix build error for XE_IOCTL_DBG macro

If CONFIG_DRM_USE_DYNAMIC_DEBUG is set, 'drm_dbg' function is replaced
with '__dynamic_func_call_cls', which is replaced with a do while
statement. So in the previous code, there are the following build
errors.

include/linux/dynamic_debug.h:221:58: error: expected expression before ‘do’
  221 | #define __dynamic_func_call_cls(id, cls, fmt, func, ...) do {   \
      |                                                          ^~
include/linux/dynamic_debug.h:248:9: note: in expansion of macro ‘__dynamic_func_call_cls’
  248 |         __dynamic_func_call_cls(__UNIQUE_ID(ddebug), cls, fmt, func, ##__VA_ARGS__)
      |         ^~~~~~~~~~~~~~~~~~~~~~~
include/drm/drm_print.h:425:9: note: in expansion of macro ‘_dynamic_func_call_cls’
  425 |         _dynamic_func_call_cls(cat, fmt, __drm_dev_dbg,         \
      |         ^~~~~~~~~~~~~~~~~~~~~~
include/drm/drm_print.h:504:9: note: in expansion of macro ‘drm_dev_dbg’
  504 |         drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, ##__VA_ARGS__)
      |         ^~~~~~~~~~~
include/drm/drm_print.h:522:33: note: in expansion of macro ‘drm_dbg_driver’
  522 | #define drm_dbg(drm, fmt, ...)  drm_dbg_driver(drm, fmt, ##__VA_ARGS__)
      |                                 ^~~~~~~~~~~~~~
drivers/gpu/drm/xe/xe_macros.h:14:21: note: in expansion of macro ‘drm_dbg’
   14 |         ((cond) && (drm_dbg(&(xe)->drm, \
      |                     ^~~~~~~
drivers/gpu/drm/xe/xe_bo.c:2029:13: note: in expansion of macro ‘XE_IOCTL_DBG’
2029 |         if (XE_IOCTL_DBG(xe, !gem_obj))

The problem is that XE_IOCTL_DBG uses this function for conditional expr.
Use a statement expression so it can also be used on conditional checks.

v2: Modify to print when only cond is true.
v3: Modify to evaluate cond only once.

Signed-off-by: Gyeyoung Baek <gye976@gmail.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241102022204.155039-1-gye976@gmail.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Fix drm-next merge

Fix merge in commit c787c2901e2c ("Merge drm/drm-next into
drm-xe-next"). That workaround needs to be done only once.

While at it, also remove undesired newline before checking for error.

Fixes: c787c2901e2c ("Merge drm/drm-next into drm-xe-next")
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104183104.2265275-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

Merge drm/drm-next into drm-xe-next

Backmerging to get up-to-date and to bring in a fix that was
merged through drm-misc-fixes.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Backmerge v6.12-rc6 of git://git./linux/kernel/git/torvalds/linux into drm-next

Backmerge Linus tree for some drm-fixes needed for msm and xe merges.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Linux 6.12-rc6

Merge tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git./linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"17 hotfixes.  9 are cc:stable.  13 are MM and 4 are non-MM.

  The usual collection of singletons - please see the changelogs"

* tag 'mm-hotfixes-stable-2024-11-03-10-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify()
  mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats
  mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes
  mm: shrinker: avoid memleak in alloc_shrinker_info
  .mailmap: update e-mail address for Eugen Hristev
  vmscan,migrate: fix page count imbalance on node stats when demoting pages
  mailmap: update Jarkko's email addresses
  mm: allow set/clear page_type again
  nilfs2: fix potential deadlock with newly created symlinks
  Squashfs: fix variable overflow in squashfs_readpage_block
  kasan: remove vmalloc_percpu test
  tools/mm: -Werror fixes in page-types/slabinfo
  mm, swap: avoid over reclaim of full clusters
  mm: fix PSWPIN counter for large folios swap-in
  mm: avoid VM_BUG_ON when try to map an anon large folio to zero page.
  mm/codetag: fix null pointer check logic for ref and tag
  mm/gup: stop leaking pinned pages in low memory conditions

Merge tag 'phy-fixes-6.12' of git://git./linux/kernel/git/phy/linux-phy

Pull phy fixes from Vinod Koul:

- Qualcomm QMP driver fixes for null deref on suspend, bogus supplies
   fix and reset entries fix

- BCM usb driver init array fix

- cadence array offset fix

- starfive link configuration fix

- config dependency fix for rockchip driver

- freescale reset signal fix before pll lock

- tegra driver fix for error pointer check

* tag 'phy-fixes-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: tegra: xusb: Add error pointer check in xusb.c
  dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Fix X1E80100 resets entries
  phy: freescale: imx8m-pcie: Do CMN_RST just before PHY PLL lock check
  phy: phy-rockchip-samsung-hdptx: Depend on CONFIG_COMMON_CLK
  phy: ti: phy-j721e-wiz: fix usxgmii configuration
  phy: starfive: jh7110-usb: Fix link configuration to controller
  phy: qcom: qmp-pcie: drop bogus x1e80100 qref supplies
  phy: qcom: qmp-combo: move driver data initialisation earlier
  phy: qcom: qmp-usbc: fix NULL-deref on runtime suspend
  phy: qcom: qmp-usb-legacy: fix NULL-deref on runtime suspend
  phy: qcom: qmp-usb: fix NULL-deref on runtime suspend
  dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: add missing x1e80100 pipediv2 clocks
  phy: usb: disable COMMONONN for dual mode
  phy: cadence: Sierra: Fix offset of DEQ open eye algorithm control register
  phy: usb: Fix missing elements in BCM4908 USB init array

Merge tag 'dmaengine-fix-6.12' of git://git./linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:

- TI driver fix to set EOP for cyclic BCDMA transfers

- sh rz-dmac driver fix for handling config with zero address

* tag 'dmaengine-fix-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: ti: k3-udma: Set EOP for all TRs in cyclic BCDMA transfer
dmaengine: sh: rz-dmac: handle configs where one address is zero

Merge tag 'driver-core-6.12-rc6' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core revert from Greg KH:
"Here is a single driver core revert for 6.12-rc6. It reverts a change
  that came in -rc1 that was supposed to resolve a reported problem, but
  caused another one, so revert it for now so that we can get this all
  worked out properly in 6.13.

  The revert has been in linux-next all week with no reported issues"

* tag 'driver-core-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Revert "driver core: Fix uevent_show() vs driver detach race"

Merge tag 'usb-6.12-rc6' of git://git./linux/kernel/git/gregkh/usb

Pull USB / Thunderbolt fixes from Greg KH:
"Here are some small USB and Thunderbolt driver fixes for 6.12-rc6 that
  have been sitting in my tree this week. Included in here are the
  following:

   - thunderbolt driver fixes for reported issues

   - USB typec driver fixes

   - xhci driver fixes for reported problems

   - dwc2 driver revert for a broken change

   - usb phy driver fix

   - usbip tool fix

  All of these have been in linux-next this week with no reported
  issues"

* tag 'usb-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: tcpm: restrict SNK_WAIT_CAPABILITIES_TIMEOUT transitions to non self-powered devices
  usb: phy: Fix API devm_usb_put_phy() can not release the phy
  usb: typec: use cleanup facility for 'altmodes_node'
  usb: typec: fix unreleased fwnode_handle in typec_port_register_altmodes()
  usb: typec: qcom-pmic-typec: fix missing fwnode removal in error path
  usb: typec: qcom-pmic-typec: use fwnode_handle_put() to release fwnodes
  usb: acpi: fix boot hang due to early incorrect 'tunneled' USB3 device links
  Revert "usb: dwc2: Skip clock gating on Broadcom SoCs"
  xhci: Fix Link TRB DMA in command ring stopped completion event
  xhci: Use pm_runtime_get to prevent RPM on unsupported systems
  usbip: tools: Fix detach_port() invalid port error path
  thunderbolt: Honor TMU requirements in the domain when setting TMU mode
  thunderbolt: Fix KASAN reported stack out-of-bounds read in tb_retimer_scan()

mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify()

When the MM_WALK capability is enabled, memory that is mostly accessed by
a VM appears younger than it really is, therefore this memory will be less
likely to be evicted. Therefore, the presence of a running VM can
significantly increase swap-outs for non-VM memory, regressing the
performance for the rest of the system.

Fix this regression by always calling {ptep,pmdp}_clear_young_notify()
whenever we clear the young bits on PMDs/PTEs.

[jthoughton@google.com: fix link-time error]
Link: https://lkml.kernel.org/r/20241019012940.3656292-3-jthoughton@google.com
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Reported-by: David Stevens <stevensd@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: <stable@vger.kernel.org>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: multi-gen LRU: remove MM_LEAF_OLD and MM_NONLEAF_TOTAL stats

Patch series "mm: multi-gen LRU: Have secondary MMUs participate in
MM_WALK".

Today, the MM_WALK capability causes MGLRU to clear the young bit from
PMDs and PTEs during the page table walk before eviction, but MGLRU does
not call the clear_young() MMU notifier in this case.  By not calling this
notifier, the MM walk takes less time/CPU, but it causes pages that are
accessed mostly through KVM / secondary MMUs to appear younger than they
should be.

We do call the clear_young() notifier today, but only when attempting to
evict the page, so we end up clearing young/accessed information less
frequently for secondary MMUs than for mm PTEs, and therefore they appear
younger and are less likely to be evicted.  Therefore, memory that is
*not* being accessed mostly by KVM will be evicted *more* frequently,
worsening performance.

ChromeOS observed a tab-open latency regression when enabling MGLRU with a
setup that involved running a VM:

Tab-open latency histogram (ms)
Version p50 mean p95 p99 max
base 1315 1198 2347 3454 10319
mglru 2559 1311 7399 12060 43758
fix 1119 926 2470 4211 6947

This series replaces the final non-selftest patchs from this series[1],
which introduced a similar change (and a new MMU notifier) with KVM
optimizations.  I'll send a separate series (to Sean and Paolo) for the
KVM optimizations.

This series also makes proactive reclaim with MGLRU possible for KVM
memory.  I have verified that this functions correctly with the selftest
from [1], but given that that test is a KVM selftest, I'll send it with
the rest of the KVM optimizations later.  Andrew, let me know if you'd
like to take the test now anyway.

[1]: https://lore.kernel.org/linux-mm/20240926013506.860253-18-jthoughton@google.com/

This patch (of 2):

The removed stats, MM_LEAF_OLD and MM_NONLEAF_TOTAL, are not very helpful
and become more complicated to properly compute when adding
test/clear_young() notifiers in MGLRU's mm walk.

Link: https://lkml.kernel.org/r/20241019012940.3656292-1-jthoughton@google.com
Link: https://lkml.kernel.org/r/20241019012940.3656292-2-jthoughton@google.com
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Stevens <stevensd@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Merge tag 'char-misc-6.12-rc6' of git://git./linux/kernel/git/gregkh/char-misc

Pull misc driver fixes from Greg KH:
"Here are some small char/misc/iio fixes for 6.12-rc6 that resolve
  some reported issues. Included in here are the following:

   - small IIO driver fixes for many reported issues

   - mei driver fix for a suddenly much reported issue for an "old"
     issue.

   - MAINTAINERS update for a developer who has moved companies and
     forgot to update their old entry.

  All of these have been in linux-next this week with no reported
  issues"

* tag 'char-misc-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  mei: use kvmalloc for read buffer
  MAINTAINERS: add netup_unidvb maintainer
  iio: dac: Kconfig: Fix build error for ltc2664
  iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr()
  staging: iio: frequency: ad9832: fix division by zero in ad9832_calc_freqreg()
  docs: iio: ad7380: fix supply for ad7380-4
  iio: adc: ad7380: fix supplies for ad7380-4
  iio: adc: ad7380: add missing supplies
  iio: adc: ad7380: use devm_regulator_get_enable_read_voltage()
  dt-bindings: iio: adc: ad7380: fix ad7380-4 reference supply
  iio: light: veml6030: fix microlux value calculation
  iio: gts-helper: Fix memory leaks for the error path of iio_gts_build_avail_scale_table()
  iio: gts-helper: Fix memory leaks in iio_gts_build_avail_scale_table()

Merge tag 'input-for-v6.12-rc5' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

- a fix for regression in input core introduced in 6.11 preventing
   re-registering input handlers

- a fix for adp5588-keys driver tyring to disable interrupt 0 at
   suspend when devices is used without interrupt

- a fix for edt-ft5x06 to stop leaking regmap structure when probing
   fails and to make sure it is not released too early on removal.

* tag 'input-for-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: fix regression when re-registering input handlers
  Input: adp5588-keys - do not try to disable interrupt 0
  Input: edt-ft5x06 - fix regmap leak when probe fails

Merge tag 'kbuild-fixes-v6.12-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- Fix a memory leak in modpost

- Resolve build issues when cross-compiling RPM and Debian packages

- Fix another regression in Kconfig

- Fix incorrect MODULE_ALIAS() output in modpost

* tag 'kbuild-fixes-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  modpost: fix input MODULE_DEVICE_TABLE() built for 64-bit on 32-bit host
  modpost: fix acpi MODULE_DEVICE_TABLE built with mismatched endianness
  kconfig: show sub-menu entries even if the prompt is hidden
  kbuild: deb-pkg: add pkg.linux-upstream.nokerneldbg build profile
  kbuild: deb-pkg: add pkg.linux-upstream.nokernelheaders build profile
  kbuild: rpm-pkg: disable kernel-devel package when cross-compiling
  sumversion: Fix a memory leak in get_src_version()

Merge tag 'x86-urgent-2024-11-03' of git://git./linux/kernel/git/tip/tip

Pull x86 fix from Thomas Gleixner:
"A trivial compile test fix for x86:

  When CONFIG_AMD_NB is not set a COMPILE_TEST of an AMD specific driver
  fails due to a missing inline stub. Add the stub to cure it"

* tag 'x86-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/amd_nb: Fix compile-testing without CONFIG_AMD_NB

Merge tag 'timers-urgent-2024-11-03' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
"A single fix for posix CPU timers.

  When a thread is cloned, the posix CPU timers are not inherited.

  If the parent has a CPU timer armed the corresponding tick dependency
  in the tasks tick_dep_mask is set and copied to the new thread, which
  means the new thread and all decendants will prevent the system to go
  into full NOHZ operation.

  Clear the tick dependency mask in copy_process() to fix this"

* tag 'timers-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone

Merge tag 'sched-urgent-2024-11-03' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:

- Plug a race between pick_next_task_fair() and try_to_wake_up() where
   both try to write to the same task, even though both paths hold a
   runqueue lock, but obviously from different runqueues.

   The problem is that the store to task::on_rq in __block_task() is
   visible to try_to_wake_up() which assumes that the task is not
   queued. Both sides then operate on the same task.

   Cure it by rearranging __block_task() so the the store to task::on_rq
   is the last operation on the task.

- Prevent a potential NULL pointer dereference in task_numa_work()

   task_numa_work() iterates the VMAs of a process. A concurrent unmap
   of the address space can result in a NULL pointer return from
   vma_next() which is unchecked.

   Add the missing NULL pointer check to prevent this.

- Operate on the correct scheduler policy in task_should_scx()

   task_should_scx() returns true when a task should be handled by sched
   EXT. It checks the tasks scheduling policy.

   This fails when the check is done before a policy has been set.

   Cure it by handing the policy into task_should_scx() so it operates
   on the requested value.

- Add the missing handling of sched EXT in the delayed dequeue
   mechanism. This was simply forgotten.

* tag 'sched-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/ext: Fix scx vs sched_delayed
  sched: Pass correct scheduling policy to __setscheduler_class
  sched/numa: Fix the potential null pointer dereference in task_numa_work()
  sched: Fix pick_next_task_fair() vs try_to_wake_up() race

Merge tag 'perf-urgent-2024-11-03' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Thomas Gleixner:
"perf_event_clear_cpumask() uses list_for_each_entry_rcu() without
  being in a RCU read side critical section, which triggers a
  'suspicious RCU usage' warning.

  It turns out that the list walk does not be RCU protected because the
  write side lock is held in this context.

  Change it to a regular list walk"

* tag 'perf-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix missing RCU reader protection in perf_event_clear_cpumask()

Merge tag 'irq-urgent-2024-11-03' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:

- Fix an off-by-one error in the failure path of msi_domain_alloc(),
   which causes the cleanup loop to terminate early and leaking the
   first allocated interrupt.

- Handle a corner case in GIC-V4 versus a lazily mapped Virtual
   Processing Element (VPE). If the VPE has not been mapped because the
   guest has not yet emitted a mapping command, then the set_affinity()
   callback returns an error code, which causes the vCPU management to
   fail.

   Return success in this case without touching the hardware. This will
   be done later when the guest issues the mapping command.

* tag 'irq-urgent-2024-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v4: Correctly deal with set_affinity on lazily-mapped VPEs
  genirq/msi: Fix off-by-one error in msi_domain_alloc()

modpost: fix input MODULE_DEVICE_TABLE() built for 64-bit on 32-bit host

When building a 64-bit kernel on a 32-bit build host, incorrect
input MODULE_ALIAS() entries may be generated.

For example, when compiling a 64-bit kernel with CONFIG_INPUT_MOUSEDEV=m
on a 64-bit build machine, you will get the correct output:

  $ grep MODULE_ALIAS drivers/input/mousedev.mod.c
  MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*110,*r*0,*1,*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*r*8,*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*14A,*r*a*0,*1,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*145,*r*a*0,*1,*18,*1C,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*110,*r*a*0,*1,*m*l*s*f*w*");

However, building the same kernel on a 32-bit machine results in
incorrect output:

  $ grep MODULE_ALIAS drivers/input/mousedev.mod.c
  MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*110,*130,*r*0,*1,*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*2,*k*r*8,*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*14A,*16A,*r*a*0,*1,*20,*21,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*145,*165,*r*a*0,*1,*18,*1C,*20,*21,*38,*3C,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*3,*k*110,*130,*r*a*0,*1,*20,*21,*m*l*s*f*w*");

A similar issue occurs with CONFIG_INPUT_JOYDEV=m. On a 64-bit build
machine, the output is:

  $ grep MODULE_ALIAS drivers/input/joydev.mod.c
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*0,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*2,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*8,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*6,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*k*120,*r*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*k*130,*r*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*k*2C0,*r*a*m*l*s*f*w*");

However, on a 32-bit machine, the output is incorrect:

  $ grep MODULE_ALIAS drivers/input/joydev.mod.c
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*0,*20,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*2,*22,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*8,*28,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*3,*k*r*a*6,*26,*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*k*11F,*13F,*r*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*k*11F,*13F,*r*a*m*l*s*f*w*");
  MODULE_ALIAS("input:b*v*p*e*-e*1,*k*2C0,*2E0,*r*a*m*l*s*f*w*");

When building a 64-bit kernel, BITS_PER_LONG is defined as 64. However,
on a 32-bit build machine, the constant 1L is a signed 32-bit value.
Left-shifting it beyond 32 bits causes wraparound, and shifting by 31
or 63 bits makes it a negative value.

The fix in commit e0e92632715f ("[PATCH] PATCH: 1 line 2.6.18 bugfix:
modpost-64bit-fix.patch") is incorrect; it only addresses cases where
a 64-bit kernel is built on a 64-bit build machine, overlooking cases
on a 32-bit build machine.

Using 1ULL ensures a 64-bit width on both 32-bit and 64-bit machines,
avoiding the wraparound issue.

Fixes: e0e92632715f ("[PATCH] PATCH: 1 line 2.6.18 bugfix: modpost-64bit-fix.patch")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

modpost: fix acpi MODULE_DEVICE_TABLE built with mismatched endianness

When CONFIG_SATA_AHCI_PLATFORM=m, modpost outputs incorect acpi
MODULE_ALIAS() if the endianness of the target and the build machine
do not match.

When the endianness of the target kernel and the build machine match,
the output is correct:

  $ grep 'MODULE_ALIAS("acpi' drivers/ata/ahci_platform.mod.c
  MODULE_ALIAS("acpi*:APMC0D33:*");
  MODULE_ALIAS("acpi*:010601:*");

However, when building a little-endian kernel on a big-endian machine
(or vice versa), the output is incorrect:

  $ grep 'MODULE_ALIAS("acpi' drivers/ata/ahci_platform.mod.c
  MODULE_ALIAS("acpi*:APMC0D33:*");
  MODULE_ALIAS("acpi*:0601??:*");

The 'cls' and 'cls_msk' fields are 32-bit.

DEF_FIELD() must be used instead of DEF_FIELD_ADDR() to correctly handle
endianness of these 32-bit fields.

The check 'if (cls)' was unnecessary; it never became NULL, as it was
the pointer to 'symval' plus the offset to the 'cls' field.

Fixes: 26095a01d359 ("ACPI / scan: Add support for ACPI _CLS device matching")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

Input: fix regression when re-registering input handlers

Commit d469647bafd9 ("Input: simplify event handling logic") introduced
code that would set handler->events() method to either
input_handler_events_filter() or input_handler_events_default() or
input_handler_events_null(), depending on the kind of input handler
(a filter or a regular one) we are dealing with. Unfortunately this
breaks cases when we try to re-register the same filter (as is the case
with sysrq handler): after initial registration the handler will have 2
event handling methods defined, and will run afoul of the check in
input_handler_check_methods():

input: input_handler_check_methods: only one event processing method can be defined (sysrq)
sysrq: Failed to register input handler, error -22

Fix this by adding handle_events() method to input_handle structure and
setting it up when registering a new input handle according to event
handling methods defined in associated input_handler structure, thus
avoiding modifying the input_handler structure.

Reported-by: "Ned T. Crigler" <crigler@gmail.com>
Reported-by: Christian Heusel <christian@heusel.eu>
Tested-by: "Ned T. Crigler" <crigler@gmail.com>
Tested-by: Peter Seiderer <ps.report@gmx.net>
Fixes: d469647bafd9 ("Input: simplify event handling logic")
Link: https://lore.kernel.org/r/Zx2iQp6csn42PJA7@xavtug
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'nfsd-6.12-3' of git://git./linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

- Fix two async COPY bugs found during NFS bake-a-thon

- Fix an svcrdma memory leak

* tag 'nfsd-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  rpcrdma: Always release the rpcrdma_device's xa_array
  NFSD: Never decrement pending_async_copies on error
  NFSD: Initialize struct nfsd4_copy earlier

Merge tag 'xfs-6.12-fixes-6' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Carlos Maiolino:

- fix a sysbot reported crash on filestreams

- Reduce cpu time spent searching for extents in a very fragmented FS

- Check for delayed allocations before setting extsize

* tag 'xfs-6.12-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: streamline xfs_filestream_pick_ag
  xfs: fix finding a last resort AG in xfs_filestream_pick_ag
  xfs: Reduce unnecessary searches when searching for the best extents
  xfs: Check for delayed allocations before setting extsize

Merge tag 'linux_kselftest-fixes-6.12-rc6' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

- fix syntax error in frequency calculation arithmetic expression in
   intel_pstate run.sh

- add missing cpupower dependency check intel_pstate run.sh

- fix idmap_mount_tree_invalid test failure due to incorrect argument

- fix watchdog-test run leaving the watchdog timer enabled causing
   system reboot. With this fix, the test disables the watchdog timer
   when it gets terminated with SIGTERM, SIGKILL, and SIGQUIT in
   addition to SIGINT

* tag 'linux_kselftest-fixes-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/watchdog-test: Fix system accidentally reset after watchdog-test
  selftests/intel_pstate: check if cpupower is installed
  selftests/intel_pstate: fix operand expected error
  selftests/mount_setattr: fix idmap_mount_tree_invalid failed to run

Merge tag 'rust-fixes-6.12-3' of https://github.com/Rust-for-Linux/linux

Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:

   - Avoid build errors with old 'rustc's without LLVM patch version
     (important since it impacts people that do not even enable Rust)

   - Update LLVM version for 'HAVE_CFI_ICALL_NORMALIZE_INTEGERS' in
     'depends on' condition (the fix was eventually backported rather
     than land in LLVM 19)"

* tag 'rust-fixes-6.12-3' of https://github.com/Rust-for-Linux/linux:
  cfi: tweak llvm version for HAVE_CFI_ICALL_NORMALIZE_INTEGERS
  kbuild: rust: avoid errors with old `rustc`s without LLVM patch version

Merge tag 'pci-v6.12-fixes-2' of git://git./linux/kernel/git/pci/pci

Pull pci fix from Bjorn Helgaas:

- Enable device-specific ACS-like functionality even if the device
   doesn't advertise an ACS capability, which got broken when adding
   fancy ACS kernel parameter (Jason Gunthorpe)

* tag 'pci-v6.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: Fix pci_enable_acs() support for the ACS quirks

Merge tag 'drm-fixes-2024-11-02' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Regular fixes pull, nothing too out of the ordinary, the mediatek
  fixes came in a batch that I might have preferred a bit earlier but
  all seem fine, otherwise regular xe/amdgpu and a few misc ones.

  xe:
   - Fix missing HPD interrupt enabling, bringing one PM refactor with it
   - Workaround LNL GGTT invalidation not being visible to GuC
   - Avoid getting jobs stuck without a protecting timeout

  ivpu:
   - Fix firewall IRQ handling

  panthor:
   - Fix firmware initialization wrt page sizes
   - Fix handling and reporting of dead job groups

  sched:
   - Guarantee forward progress via WC_MEM_RECLAIM

  tests:
   - Fix memory leak in drm_display_mode_from_cea_vic()

  amdgpu:
   - DCN 3.5 fix
   - Vangogh SMU KASAN fix
   - SMU 13 profile reporting fix

  mediatek:
   - Fix degradation problem of alpha blending
   - Fix color format MACROs in OVL
   - Fix get efuse issue for MT8188 DPTX
   - Fix potential NULL dereference in mtk_crtc_destroy()
   - Correct dpi power-domains property
   - Add split subschema property constraints"

* tag 'drm-fixes-2024-11-02' of https://gitlab.freedesktop.org/drm/kernel: (27 commits)
  drm/xe: Don't short circuit TDR on jobs not started
  drm/xe: Add mmio read before GGTT invalidate
  drm/tests: hdmi: Fix memory leaks in drm_display_mode_from_cea_vic()
  drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic()
  drm/tests: helpers: Add helper for drm_display_mode_from_cea_vic()
  drm/panthor: Report group as timedout when we fail to properly suspend
  drm/panthor: Fail job creation when the group is dead
  drm/panthor: Fix firmware initialization on systems with a page size > 4k
  accel/ivpu: Fix NOC firewall interrupt handling
  drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume
  drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling
  drm/xe: Remove runtime argument from display s/r functions
  drm/amdgpu/smu13: fix profile reporting
  drm/amd/pm: Vangogh: Fix kernel memory out of bounds write
  Revert "drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35"
  drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM
  drm/tegra: Fix NULL vs IS_ERR() check in probe()
  dt-bindings: display: mediatek: split: add subschema property constraints
  dt-bindings: display: mediatek: dpi: correct power-domains property
  drm/mediatek: Fix potential NULL dereference in mtk_crtc_destroy()
  ...

Merge tag 'cxl-fixes-6.12-rc6' of git://git./linux/kernel/git/cxl/cxl

Pull cxl fixes from Ira Weiny:
"The bulk of these fixes center around an initialization order bug
  reported by Gregory Price and some additional fall out from the
  debugging effort.

  In summary, cxl_acpi and cxl_mem race and previously worked because of
  a bus_rescan_devices() while testing without modules built in.

  Unfortunately with modules built in the rescan would fail due to the
  cxl_port driver being registered late via the build order. Furthermore
  it was found bus_rescan_devices() did not guarantee a probe barrier
  which CXL was expecting. Additional fixes to cxl-test and decoder
  allocation came along as they were found in this debugging effort.

  The other fixes are pretty minor but one affects trace point data seen
  by user space.

  Summary:

   - Fix crashes when running with cxl-test code

   - Fix Trace DRAM Event Record field decodes

   - Fix module/built in initialization order errors

   - Fix use after free on decoder shutdowns

   - Fix out of order decoder allocations

   - Improve cxl-test to better reflect real world systems"

* tag 'cxl-fixes-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/test: Improve init-order fidelity relative to real-world systems
  cxl/port: Prevent out-of-order decoder allocation
  cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
  cxl/acpi: Ensure ports ready at cxl_acpi_probe() return
  cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()
  cxl/port: Fix CXL port initialization order when the subsystem is built-in
  cxl/events: Fix Trace DRAM Event Record
  cxl/core: Return error when cxl_endpoint_gather_bandwidth() handles a non-PCI device

Merge tag 'block-6.12-20241101' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- Fixup for a recent blk_rq_map_user_bvec() patch

- NVMe pull request via Keith:
     - Spec compliant identification fix (Keith)
     - Module parameter to enable backward compatibility on unusual
       namespace formats (Keith)
     - Target double free fix when using keys (Vitaliy)
     - Passthrough command error handling fix (Keith)

* tag 'block-6.12-20241101' of git://git.kernel.dk/linux:
  nvme: re-fix error-handling for io_uring nvme-passthrough
  nvmet-auth: assign dh_key to NULL after kfree_sensitive
  nvme: module parameter to disable pi with offsets
  block: fix queue limits checks in blk_rq_map_user_bvec for real
  nvme: enhance cns version checking

Merge tag 'io_uring-6.12-20241101' of git://git.kernel.dk/linux

Pull io_uring fix from Jens Axboe:

- Fix not honoring IOCB_NOWAIT for starting buffered writes in terms of
   calling sb_start_write(), leading to a deadlock if someone is
   attempting to freeze the file system with writes in progress, as each
   side will end up waiting for the other to make progress.

* tag 'io_uring-6.12-20241101' of git://git.kernel.dk/linux:
  io_uring/rw: fix missing NOWAIT check for O_DIRECT start write

Merge tag 'acpi-6.12-rc6' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
"Make the ACPI CPPC library use a raw spinlock for operations carried
  out in scheduler context via the schedutil governor and the ACPI CPPC
  cpufreq driver (Pierre Gondois)"

* tag 'acpi-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: CPPC: Make rmw_lock a raw_spin_lock